
CLIP Image Vectorization Service

Service for generating image embeddings using the CLIP (Contrastive Language-Image Pre-training) model.

Features

  • 🖼️ Single image embedding generation
  • 📦 Batch image processing
  • 🔗 Image embedding from URL
  • 🚀 GPU support for fast inference
  • 📊 Metadata support for enriched vectors

Quick Start

Local Development

cd clip_service
pip install -r requirements.txt

# Set environment variables
export CLIP_MODEL="openai/clip-vit-base-patch32"
export CLIP_DEVICE="cuda"  # or "cpu"

# Run service
uvicorn main:app --reload --port 8002

The service will be available at http://localhost:8002

API Endpoints

Health Check

GET /
GET /health
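
A quick way to confirm the service is running (the response body format isn't documented here, so this example only prints whatever JSON comes back):

import requests

resp = requests.get("http://localhost:8002/health")
resp.raise_for_status()  # non-2xx means the service is unhealthy or unreachable
print(resp.json())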

Single Image Embedding

POST /embed
Content-Type: multipart/form-data

file: <image file>
metadata: (optional) JSON string

Batch Image Embedding

POST /embed/batch
Content-Type: multipart/form-data

files: <image files>
metadata_list: (optional) JSON string with list of metadata
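
As an example, here is a minimal sketch of a batch request using Python's requests library; the embeddings field in the response is an assumption, so adjust it to the actual response schema:

import json
import requests

paths = ["part1.jpg", "part2.jpg"]
files = [("files", (p, open(p, "rb"), "image/jpeg")) for p in paths]
metadata_list = json.dumps([{"part_number": "12345"}, {"part_number": "67890"}])

resp = requests.post(
    "http://clip-service:8002/embed/batch",
    files=files,
    data={"metadata_list": metadata_list},
)
embeddings = resp.json()["embeddings"]  # assumed response field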

Image from URL

POST /embed/url
Content-Type: application/json

{
  "url": "https://example.com/image.jpg",
  "metadata": {...}
}
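
The same request from Python (the embedding response field is assumed to match the /embed endpoint):

import requests

resp = requests.post(
    "http://clip-service:8002/embed/url",
    json={
        "url": "https://example.com/image.jpg",
        "metadata": {"part_number": "12345"},
    },
)
embedding = resp.json()["embedding"]  # assumed response field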

Deployment to GCP

⚠️ IMPORTANT: GCP GPU instances are expensive. The deployment script REQUIRES a GPU and will fail if one is not available. The service MUST run on GPU; no CPU fallback is allowed.

Build Docker Image

First, build the Docker image via Cloud Build:

cd clip_service
./build_image.sh

This will:

  1. Build Docker image using Dockerfile.gpu
  2. Push to Container Registry: gcr.io/$PROJECT_ID/clip-service:latest
  3. Show build progress and logs

Options:

# Build with specific tag
./build_image.sh --tag v1.0.0

# Submit build without waiting (asynchronous)
./build_image.sh --async

Deploy to GPU VM (Production)

CLIP requires a GPU for optimal performance. Deploy on a GCP GPU VM:

cd clip_service
./deploy_gpu_vm.sh

This will:

  1. Build Docker image (if not using --skip-build)
  2. Check if VM exists:
    • If VM exists: Update Docker service only (no VM recreation)
    • If VM doesn't exist: Create a new GPU VM (T4 by default) in the us-east1 region
  3. Install NVIDIA drivers and Docker (only if needed)
  4. Stop and remove old container (if exists)
  5. Pull latest image from Container Registry
  6. Create new container with GPU support (using --force-recreate)
  7. Verify GPU usage - the deployment fails if the service doesn't use the GPU
  8. Configure firewall rules
  9. Return the service URL

⚠️ Critical: The deployment script ensures:

  • ✅ The VM is reused - it is not recreated on each deployment (saves time and preserves data)
  • ✅ The old container is removed before the new one starts
  • ✅ The latest image is pulled from the registry
  • ✅ A new container is created (not just restarted)
  • ✅ The service MUST use GPU (verified at the end; a minimal check of this kind is sketched below)
  • ✅ If no GPU is available, the deployment fails (no CPU fallback)
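
The script's actual verification isn't shown here; a minimal sketch of this kind of check, assuming PyTorch is available inside the container:

import torch

# Fail fast if CUDA is not visible to PyTorch inside the container
assert torch.cuda.is_available(), "No GPU visible - aborting deployment"
print(f"Using GPU: {torch.cuda.get_device_name(0)}")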

VM Management:

  • First run: Creates new GPU VM
  • Subsequent runs: Reuses existing VM, only updates Docker service
  • To recreate VM: Answer "y" when asked "Recreate VM? (y/N)"

Options:

# Skip image build (use existing image)
./deploy_gpu_vm.sh --skip-build

# Use specific image
./deploy_gpu_vm.sh --image gcr.io/$PROJECT_ID/clip-service:v1.0.0

Workflow:

  1. Build image: ./build_image.sh --tag v1.0.0
  2. Deploy: ./deploy_gpu_vm.sh --skip-build --image gcr.io/$PROJECT_ID/clip-service:v1.0.0

Or build and deploy in one command:

./deploy_gpu_vm.sh  # Will build image first, then deploy

Configuration (set in .env.deploy):

CLIP_VM_NAME=clip-service
CLIP_VM_MACHINE_TYPE=n1-standard-4
CLIP_GPU_TYPE=nvidia-tesla-t4
CLIP_GPU_COUNT=1
CLIP_MODEL=openai/clip-vit-base-patch32
CLIP_PORT=8002

GPU Options:

  • T4 GPU (Recommended): ~$0.35/hour, 16GB memory
  • L4 GPU: ~$0.50/hour, 24GB memory
  • A100 GPU: ~$2.90/hour, 40GB memory (high-throughput)

Service Management:

# Check service status
gcloud compute ssh clip-service --zone=us-east1-b \
    --command="cd /opt/clip-service && docker compose ps"

# View logs
gcloud compute ssh clip-service --zone=us-east1-b \
    --command="cd /opt/clip-service && docker compose logs -f"

# Restart service
gcloud compute ssh clip-service --zone=us-east1-b \
    --command="cd /opt/clip-service && docker compose restart"

Check API:

cd clip_service
./check_api.sh

Configuration

Environment variables:

  • CLIP_MODEL: Hugging Face model name (default: openai/clip-vit-base-patch32)
  • CLIP_DEVICE: Device to use (cuda or cpu, default: cuda if available)
  • PORT: Service port (default: 8002)
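
A minimal sketch of how the service might resolve these variables at startup (the fallback logic here is an assumption, not the service's actual code):

import os
import torch

MODEL_NAME = os.getenv("CLIP_MODEL", "openai/clip-vit-base-patch32")
DEVICE = os.getenv("CLIP_DEVICE", "cuda" if torch.cuda.is_available() else "cpu")
PORT = int(os.getenv("PORT", "8002"))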

Model Options

Available CLIP models from Hugging Face:

  • openai/clip-vit-base-patch32 - Base model (fast, smaller)
  • openai/clip-vit-large-patch14 - Large model (slower, better quality)
  • laion/CLIP-ViT-B-32-laion2B-s34B-b79K - Alternative base model
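
Switching models is a matter of changing CLIP_MODEL; under the hood this maps to a Hugging Face transformers load roughly like the sketch below. Note that different models produce different embedding sizes, which must match your Weaviate schema:

from transformers import CLIPModel, CLIPProcessor

model_name = "openai/clip-vit-large-patch14"
model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(model_name)
print(model.config.projection_dim)  # embedding dimension (512 for base, 768 for large)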

Integration with Weaviate

Use this service to generate embeddings for images, then store them in Weaviate:

import requests

# Generate embedding
with open("image.jpg", "rb") as f:
    response = requests.post(
        "http://clip-service:8002/embed",
        files={"file": f},
        data={"metadata": '{"part_number": "12345", "description": "..."}'},
    )

embedding = response.json()["embedding"]

# Store in Weaviate
# (see data_preparation/init_weaviate_images.py)
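
For completeness, a minimal sketch of the Weaviate write using the weaviate-client v3 API; the Image class name and its properties are assumptions (see data_preparation/init_weaviate_images.py for the actual schema):

import weaviate

client = weaviate.Client("http://weaviate:8080")

client.data_object.create(
    data_object={"part_number": "12345", "description": "..."},
    class_name="Image",  # assumed class name
    vector=embedding,    # vector returned by the CLIP service above
)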

Performance

  • CPU: ~100-200ms per image
  • GPU (T4): ~10-20ms per image
  • Batch processing: Significantly faster on GPU
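
These figures depend on hardware, image size, and network overhead; a rough way to measure end-to-end latency against a running instance (a quick sketch, not a rigorous benchmark):

import time
import requests

with open("image.jpg", "rb") as f:
    data = f.read()

n = 20
start = time.perf_counter()
for _ in range(n):
    requests.post("http://localhost:8002/embed", files={"file": ("image.jpg", data)})
elapsed = time.perf_counter() - start
print(f"{1000 * elapsed / n:.1f} ms per image (including HTTP overhead)")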
