MLflow Service

MLflow Tracking Server for CROP AI Pipeline monitoring and experiment management.

Overview

MLflow provides centralized tracking for machine learning experiments and data pipelines in the CROP ecosystem. It tracks:

  • Pipeline runs (PDF processing, embedding generation, RAG preparation)
  • Model experiments (embeddings, CLIP, LLaMA fine-tuning)
  • Artifacts (processed data, model checkpoints)
  • Metrics and parameters for reproducibility

Production Deployment

Cloud Run Service

Property          Value
URL               https://mlflow-service-atife5uvka-ue.a.run.app
Region            us-east1
Artifact Storage  gs://mlflow-artifacts-noted-bliss-466410-q6/mlflow-artifacts/
Backend Store     Cloud SQL / SQLite

Accessing the UI

Open in browser: https://mlflow-service-atife5uvka-ue.a.run.app

The MLflow UI provides:

  • Experiments: View all experiments with searchable runs
  • Runs: Compare runs, view metrics, parameters, and artifacts
  • Models: Model registry for versioning and deployment
  • Artifacts: Browse stored artifacts (models, data, etc.)

Local Development

Using Docker Compose

MLflow is included in the main docker-compose.yml:

cd CROP-pdf-parser-service
docker-compose up mlflow

The local MLflow UI will then be available at http://localhost:5000.

Configuration

# docker-compose.yml
mlflow:
  image: ghcr.io/mlflow/mlflow:v2.8.1
  ports:
    - "5000:5000"
  environment:
    BACKEND_STORE_URI: sqlite:///mlflow.db
    DEFAULT_ARTIFACT_ROOT: /mlflow/artifacts

Environment Variables

Variable                  Description                Default
MLFLOW_TRACKING_URI       MLflow server URL          http://localhost:5000
MLFLOW_BACKEND_STORE_URI  Database URI for metadata  sqlite:///mlflow.db
MLFLOW_ARTIFACT_ROOT      Root path for artifacts    /mlflow/artifacts
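
As a minimal sketch of how client code might resolve the tracking server from MLFLOW_TRACKING_URI with the local default as a fallback, consider the helper below. The name `resolve_tracking_uri` is hypothetical and not part of MLflow or of this service.

```python
import os

# Default taken from the environment variable table in this section.
DEFAULT_TRACKING_URI = "http://localhost:5000"


def resolve_tracking_uri(env=None):
    """Return the tracking URI from the environment, or the local default.

    `env` is any mapping; it defaults to os.environ. This helper is a
    hypothetical illustration, not an MLflow API.
    """
    env = os.environ if env is None else env
    uri = env.get("MLFLOW_TRACKING_URI", "").strip()
    # Drop a trailing slash so API paths can be appended consistently.
    return (uri or DEFAULT_TRACKING_URI).rstrip("/")


print(resolve_tracking_uri({}))  # → http://localhost:5000
```

The MLflow Python client also reads MLFLOW_TRACKING_URI on its own when `mlflow.set_tracking_uri` is not called, so a helper like this is mainly useful for code that talks to the REST API directly.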

API Endpoints

MLflow exposes a REST API for programmatic access:

List Experiments

curl "https://mlflow-service-atife5uvka-ue.a.run.app/api/2.0/mlflow/experiments/search?max_results=100"
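
The same request can be made from Python with only the standard library. This is a sketch under assumptions: the function names are hypothetical, and the response is assumed to list experiments under an `experiments` key, as in the MLflow REST API.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://mlflow-service-atife5uvka-ue.a.run.app"


def experiments_search_url(base_url=BASE_URL, max_results=100):
    """Build the experiments/search URL used in the curl example."""
    query = urllib.parse.urlencode({"max_results": max_results})
    return f"{base_url}/api/2.0/mlflow/experiments/search?{query}"


def list_experiment_names(base_url=BASE_URL, max_results=100, timeout=10):
    """Fetch experiments and return their names (performs a network call)."""
    url = experiments_search_url(base_url, max_results)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    # An empty server may omit the key entirely, hence the default.
    return [exp["name"] for exp in payload.get("experiments", [])]


print(experiments_search_url())
```

Calling `list_experiment_names()` requires network access to the service; `experiments_search_url()` only builds the URL.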

Search Runs

curl -X POST "https://mlflow-service-atife5uvka-ue.a.run.app/api/2.0/mlflow/runs/search" \
  -H "Content-Type: application/json" \
  -d '{"experiment_ids": ["0"], "max_results": 10}'
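
A Python equivalent of the POST above might look like the following sketch. `build_search_runs_request` and `search_runs` are hypothetical helper names; the body fields mirror the curl example.

```python
import json
import urllib.request


def build_search_runs_request(base_url, experiment_ids, max_results=10, order_by=None):
    """Build (but do not send) a runs/search POST request."""
    body = {"experiment_ids": list(experiment_ids), "max_results": max_results}
    if order_by:
        body["order_by"] = list(order_by)
    return urllib.request.Request(
        f"{base_url}/api/2.0/mlflow/runs/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_runs(base_url, experiment_ids, timeout=10, **kwargs):
    """Send the request and return the list of runs (performs a network call)."""
    request = build_search_runs_request(base_url, experiment_ids, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # The "runs" key is absent when no run matches.
        return json.load(resp).get("runs", [])
```

Separating request construction from sending keeps the payload easy to inspect without contacting the server.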

Get Run Details

curl "https://mlflow-service-atife5uvka-ue.a.run.app/api/2.0/mlflow/runs/get?run_id=<RUN_ID>"
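
The runs/get response nests parameters and metrics as lists of key/value entries under `run.data`. The sketch below flattens such a response into plain dictionaries. `summarize_run` is a hypothetical helper, and the sample payload is illustrative only (the run ID `abc123` is made up).

```python
def summarize_run(payload):
    """Flatten a runs/get response into run_id, status, params and metrics."""
    run = payload["run"]
    info = run.get("info", {})
    data = run.get("data", {})
    return {
        "run_id": info.get("run_id"),
        "status": info.get("status"),
        "params": {p["key"]: p["value"] for p in data.get("params", [])},
        "metrics": {m["key"]: m["value"] for m in data.get("metrics", [])},
    }


# Illustrative payload shaped like a runs/get response; values are made up.
sample = {
    "run": {
        "info": {"run_id": "abc123", "status": "FINISHED"},
        "data": {
            "params": [{"key": "document_number", "value": "SPD00805"}],
            "metrics": [{"key": "total_pages", "value": 100.0}],
        },
    }
}
print(summarize_run(sample))
```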

Integration with Data Preparation Service

The Data Preparation Service uses MLflow for pipeline tracking:

import mlflow

# Set tracking URI
mlflow.set_tracking_uri("https://mlflow-service-atife5uvka-ue.a.run.app")

# Start a run
with mlflow.start_run(run_name="pdf-processing"):
    mlflow.log_param("pdf_path", "pdfs/manual.pdf")
    mlflow.log_param("document_number", "SPD00805")

    # Process PDF...

    mlflow.log_metric("total_pages", 100)
    mlflow.log_metric("documents_processed", 3)
    mlflow.log_metric("rag_documents_created", 150)

Tracked Metrics

Metric                 Description
pdf_size_bytes         Size of downloaded PDF
total_pages            Total pages in PDF
document_count         Number of documents found
documents_processed    Documents successfully processed
pages_processed        Pages successfully processed
rag_documents_created  RAG documents created and stored
images_stored          Images with CLIP embeddings stored
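
To keep metric names consistent across runs, a pipeline could validate them against the list above before logging. The sketch below assumes this is desirable. `prepare_metrics` and `log_pipeline_metrics` are hypothetical helpers, not part of the Data Preparation Service; `mlflow.log_metrics` is the standard MLflow call for logging a dictionary of metrics.

```python
# Metric names taken from the tracked metrics table.
TRACKED_METRICS = (
    "pdf_size_bytes",
    "total_pages",
    "document_count",
    "documents_processed",
    "pages_processed",
    "rag_documents_created",
    "images_stored",
)


def prepare_metrics(values):
    """Reject unknown metric names and convert values to float."""
    unknown = sorted(set(values) - set(TRACKED_METRICS))
    if unknown:
        raise ValueError(f"Unknown metric names: {unknown}")
    return {name: float(value) for name, value in values.items()}


def log_pipeline_metrics(values):
    """Log validated metrics to the active MLflow run."""
    import mlflow  # imported lazily so prepare_metrics works without mlflow

    mlflow.log_metrics(prepare_metrics(values))
```

Inside a `with mlflow.start_run(...)` block, `log_pipeline_metrics({"total_pages": 100, "documents_processed": 3})` would then log both values in one call.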

Deployment to GCP Cloud Run

Prerequisites

  1. GCP project with billing enabled
  2. gcloud CLI installed and authenticated
  3. Cloud SQL instance or persistent storage for backend

Deploy MLflow Service

# Build and push Docker image
gcloud builds submit --tag gcr.io/$PROJECT_ID/mlflow-service

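# Deploy to Cloud Run
# Note: the Cloud Run container filesystem is in-memory and per-instance, so the
# SQLite file below is lost when an instance restarts. Point BACKEND_STORE_URI at
# Cloud SQL for durable metadata.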
gcloud run deploy mlflow-service \
  --image gcr.io/$PROJECT_ID/mlflow-service \
  --region us-east1 \
  --platform managed \
  --allow-unauthenticated \
  --memory 1Gi \
  --cpu 1 \
  --set-env-vars "BACKEND_STORE_URI=sqlite:///mlflow.db,DEFAULT_ARTIFACT_ROOT=gs://mlflow-artifacts-$PROJECT_ID/mlflow-artifacts/"

Using GCS for Artifacts

For production, store artifacts in Google Cloud Storage:

# Create GCS bucket
gsutil mb gs://mlflow-artifacts-$PROJECT_ID

# Set artifact root
DEFAULT_ARTIFACT_ROOT=gs://mlflow-artifacts-$PROJECT_ID/mlflow-artifacts/

Use Cases

1. PDF Processing Pipeline Tracking

Track each PDF processing run with:

  • Input parameters (PDF path, document number)
  • Processing metrics (pages, documents, time)
  • Output artifacts (processed data)
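
Processing time can be captured with a small context manager that reports elapsed seconds to a logging callable. This is a sketch under assumptions: `timed_step` is a hypothetical helper, and the `_seconds` metric suffix is a naming choice made here, not an existing convention in the pipeline.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_step(name, log_metric):
    """Measure the wrapped block and report `<name>_seconds` via log_metric.

    `log_metric` is any callable taking (key, value), for example
    mlflow.log_metric when used inside an active run.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        # Reported even if the block raises, so failed runs still carry timing.
        log_metric(f"{name}_seconds", time.perf_counter() - start)


# Offline demonstration with a plain dict standing in for MLflow.
recorded = {}
with timed_step("pdf_processing", recorded.__setitem__):
    time.sleep(0.01)
print(sorted(recorded))
```

Within a run, the same helper could be used as `with timed_step("pdf_processing", mlflow.log_metric): ...`.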

2. Embedding Model Experiments

Track embedding model experiments:

  • Model parameters (model name, dimensions)
  • Performance metrics (latency, throughput)
  • Model artifacts (checkpoints, configs)
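
Latency and throughput can be derived from per-call timings before logging. The following is a minimal sketch; `latency_summary` and the metric names it returns are hypothetical and chosen only for illustration.

```python
def latency_summary(latencies_s, batch_size=1):
    """Summarize per-call latencies (seconds) into loggable metrics.

    `batch_size` is the number of items embedded per call.
    """
    if not latencies_s:
        raise ValueError("latencies_s must not be empty")
    total_time = sum(latencies_s)
    return {
        "latency_mean_ms": 1000.0 * total_time / len(latencies_s),
        "latency_max_ms": 1000.0 * max(latencies_s),
        "throughput_items_per_s": len(latencies_s) * batch_size / total_time,
    }


# Two calls of 0.5 s and 1.5 s, each embedding 4 items.
print(latency_summary([0.5, 1.5], batch_size=4))
```

The resulting dictionary can be passed to `mlflow.log_metrics` inside a run, alongside `mlflow.log_param` calls for the model name and dimensions.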

3. LLaMA Fine-tuning

Track fine-tuning experiments:

  • Training parameters (learning rate, epochs)
  • Evaluation metrics (loss, accuracy)
  • Model checkpoints

Monitoring

Viewing Pipeline Runs

  1. Open MLflow UI: https://mlflow-service-atife5uvka-ue.a.run.app
  2. Select experiment (e.g., "Default")
  3. Browse runs with filters and search
  4. Compare runs side-by-side

Pipeline Health

Check recent runs via API:

curl -X POST "https://mlflow-service-atife5uvka-ue.a.run.app/api/2.0/mlflow/runs/search" \
  -H "Content-Type: application/json" \
  -d '{
    "experiment_ids": ["0"],
    "max_results": 10,
    "order_by": ["start_time DESC"]
  }'
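
The runs returned by this query can be reduced to a simple health summary by counting run statuses. This sketch assumes each run carries `info.status` with values such as FINISHED, FAILED, or RUNNING, as in the MLflow REST API. `summarize_health` is a hypothetical helper.

```python
from collections import Counter


def summarize_health(runs):
    """Count runs by status and compute the share of failed runs."""
    statuses = Counter(run.get("info", {}).get("status", "UNKNOWN") for run in runs)
    total = sum(statuses.values())
    failed = statuses.get("FAILED", 0)
    return {
        "total": total,
        "by_status": dict(statuses),
        "failure_rate": failed / total if total else 0.0,
    }


# Illustrative input shaped like the "runs" list of a runs/search response.
example_runs = [
    {"info": {"status": "FINISHED"}},
    {"info": {"status": "FAILED"}},
    {"info": {"status": "FINISHED"}},
    {"info": {"status": "RUNNING"}},
]
print(summarize_health(example_runs))
```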

Troubleshooting

Common Issues

  1. Connection refused: Ensure MLflow server is running and accessible
  2. Artifact upload failed: Check GCS permissions and bucket access
  3. Run not found: Verify experiment ID and run ID
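
For the first issue, a quick reachability check can help separate network problems from server errors. The sketch below assumes the server exposes the `/health` endpoint that the MLflow tracking server provides; `check_server` is a hypothetical helper.

```python
import urllib.error
import urllib.request


def check_server(base_url, timeout=5):
    """Return (ok, detail) describing whether the MLflow server responds."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, DNS failure, timeout, and similar.
        return False, f"unreachable: {exc}"
```

For example, `check_server("http://localhost:5000")` returns `(True, "HTTP 200")` when a local server is running, and a failing tuple with the underlying error otherwise.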

Logs

Cloud Run logs:

gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=mlflow-service" --limit=50

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    MLflow Tracking Server                    │
│              (Cloud Run: us-east1)                          │
├─────────────────────────────────────────────────────────────┤
│  UI: https://mlflow-service-atife5uvka-ue.a.run.app        │
│  API: /api/2.0/mlflow/*                                     │
└────────────────────────┬────────────────────────────────────┘
                         │
         ┌───────────────┼───────────────┐
         │               │               │
         ▼               ▼               ▼
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│   Backend   │  │  Artifacts  │  │   Clients   │
│   Store     │  │   (GCS)     │  │             │
├─────────────┤  ├─────────────┤  ├─────────────┤
│ SQLite/     │  │ gs://mlflow │  │ Data Prep   │
│ Cloud SQL   │  │ -artifacts  │  │ AI Service  │
│             │  │             │  │ Notebooks   │
└─────────────┘  └─────────────┘  └─────────────┘

Status

Component               Status
Cloud Run Service       Active
Artifact Storage (GCS)  Active
Backend Store           Active
UI Access               Public
