MLflow Service
MLflow Tracking Server for CROP AI Pipeline monitoring and experiment management.
Overview
MLflow provides centralized tracking for machine learning experiments and data pipelines in the CROP ecosystem. It tracks:
- Pipeline runs (PDF processing, embedding generation, RAG preparation)
- Model experiments (embeddings, CLIP, LLaMA fine-tuning)
- Artifacts (processed data, model checkpoints)
- Metrics and parameters for reproducibility
Production Deployment
Cloud Run Service
| Property | Value |
|---|---|
| URL | https://mlflow-service-atife5uvka-ue.a.run.app |
| Region | us-east1 |
| Artifact Storage | gs://mlflow-artifacts-noted-bliss-466410-q6/mlflow-artifacts/ |
| Backend Store | Cloud SQL / SQLite |
Accessing the UI
Open in browser: https://mlflow-service-atife5uvka-ue.a.run.app
The MLflow UI provides:
- Experiments: View all experiments with searchable runs
- Runs: Compare runs, view metrics, parameters, and artifacts
- Models: Model registry for versioning and deployment
- Artifacts: Browse stored artifacts (models, data, etc.)
Local Development
Using Docker Compose
MLflow is included in the main docker-compose.yml:
cd CROP-pdf-parser-service
docker-compose up mlflow
Local MLflow will be available at: http://localhost:5000
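Once the container is up, a quick reachability check can confirm the server is responding before clients are pointed at it. The sketch below assumes the tracking server exposes a `/health` endpoint, as recent MLflow releases do; `health_url` and `is_mlflow_up` are hypothetical helper names, not part of the CROP codebase.

```python
import urllib.error
import urllib.request


def health_url(base_url: str) -> str:
    """Build the health-check URL, tolerating a trailing slash on the base."""
    return base_url.rstrip("/") + "/health"


def is_mlflow_up(base_url: str = "http://localhost:5000", timeout: float = 3.0) -> bool:
    """Return True if the server answers the health check with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or a non-2xx response.
        return False
```

Calling `is_mlflow_up()` with no arguments targets the local Docker Compose address given above; pass the Cloud Run URL to check the production service instead.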
Configuration
# docker-compose.yml
mlflow:
  image: ghcr.io/mlflow/mlflow:v2.8.1
  ports:
    - "5000:5000"
  environment:
    BACKEND_STORE_URI: sqlite:///mlflow.db
    DEFAULT_ARTIFACT_ROOT: /mlflow/artifacts
Environment Variables
| Variable | Description | Default |
|---|---|---|
| MLFLOW_TRACKING_URI | MLflow server URL | http://localhost:5000 |
| MLFLOW_BACKEND_STORE_URI | Database URI for metadata | sqlite:///mlflow.db |
| MLFLOW_ARTIFACT_ROOT | Root path for artifacts | /mlflow/artifacts |
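The MLflow Python client reads `MLFLOW_TRACKING_URI` from the environment on its own, but a client service may want to resolve the value explicitly so it can log or validate it at startup. A minimal sketch, assuming the default from the table above; `resolve_tracking_uri` is a hypothetical helper name.

```python
import os

# Default taken from the table above; used when the variable is unset or blank.
DEFAULT_TRACKING_URI = "http://localhost:5000"


def resolve_tracking_uri(env=None) -> str:
    """Return the tracking URI from the environment, or the local default."""
    env = os.environ if env is None else env
    uri = env.get("MLFLOW_TRACKING_URI", "").strip()
    return uri or DEFAULT_TRACKING_URI
```

The result can then be passed to `mlflow.set_tracking_uri(...)`, so the same code runs against the local server and the Cloud Run deployment.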
API Endpoints
MLflow exposes a REST API for programmatic access:
List Experiments
curl "https://mlflow-service-atife5uvka-ue.a.run.app/api/2.0/mlflow/experiments/search?max_results=100"
Search Runs
curl -X POST "https://mlflow-service-atife5uvka-ue.a.run.app/api/2.0/mlflow/runs/search" \
  -H "Content-Type: application/json" \
  -d '{"experiment_ids": ["0"], "max_results": 10}'
Get Run Details
curl "https://mlflow-service-atife5uvka-ue.a.run.app/api/2.0/mlflow/runs/get?run_id=<RUN_ID>"
Integration with Data Preparation Service
The Data Preparation Service uses MLflow for pipeline tracking:
import mlflow

# Set tracking URI
mlflow.set_tracking_uri("https://mlflow-service-atife5uvka-ue.a.run.app")

# Start a run
with mlflow.start_run(run_name="pdf-processing"):
    mlflow.log_param("pdf_path", "pdfs/manual.pdf")
    mlflow.log_param("document_number", "SPD00805")
    # Process PDF...
    mlflow.log_metric("total_pages", 100)
    mlflow.log_metric("documents_processed", 3)
    mlflow.log_metric("rag_documents_created", 150)
Tracked Metrics
| Metric | Description |
|---|---|
| pdf_size_bytes | Size of downloaded PDF |
| total_pages | Total pages in PDF |
| document_count | Number of documents found |
| documents_processed | Documents successfully processed |
| pages_processed | Pages successfully processed |
| rag_documents_created | RAG documents created and stored |
| images_stored | Images with CLIP embeddings stored |
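Because metric names are free-form strings, a typo silently creates a new metric instead of failing. One way to guard against that is to validate names against the table above before logging. This is a sketch only: `TRACKED_METRICS`, `validate_metrics`, and `log_pipeline_metrics` are hypothetical names, and the Data Preparation Service may handle this differently.

```python
# Metric names from the table above.
TRACKED_METRICS = {
    "pdf_size_bytes",
    "total_pages",
    "document_count",
    "documents_processed",
    "pages_processed",
    "rag_documents_created",
    "images_stored",
}


def validate_metrics(metrics: dict) -> dict:
    """Reject unknown metric names and coerce values to float."""
    unknown = sorted(set(metrics) - TRACKED_METRICS)
    if unknown:
        raise ValueError(f"Unknown metric names: {unknown}")
    return {name: float(value) for name, value in metrics.items()}


def log_pipeline_metrics(metrics: dict) -> None:
    """Log validated metrics to the active MLflow run in one call."""
    import mlflow  # imported lazily so validation works without MLflow installed

    mlflow.log_metrics(validate_metrics(metrics))
```

`log_pipeline_metrics` is meant to be called inside a `with mlflow.start_run(...)` block, like the individual `log_metric` calls shown earlier.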
Deployment to GCP Cloud Run
Prerequisites
- GCP project with billing enabled
- gcloud CLI installed and authenticated
- Cloud SQL instance or persistent storage for backend
Deploy MLflow Service
# Build and push Docker image
gcloud builds submit --tag gcr.io/$PROJECT_ID/mlflow-service
# Deploy to Cloud Run
gcloud run deploy mlflow-service \
  --image gcr.io/$PROJECT_ID/mlflow-service \
  --region us-east1 \
  --platform managed \
  --allow-unauthenticated \
  --memory 1Gi \
  --cpu 1 \
  --set-env-vars "BACKEND_STORE_URI=sqlite:///mlflow.db,DEFAULT_ARTIFACT_ROOT=gs://mlflow-artifacts-$PROJECT_ID/mlflow-artifacts/"
Using GCS for Artifacts
For production, store artifacts in Google Cloud Storage:
# Create GCS bucket
gsutil mb gs://mlflow-artifacts-$PROJECT_ID
# Set artifact root
DEFAULT_ARTIFACT_ROOT=gs://mlflow-artifacts-$PROJECT_ID/mlflow-artifacts/
Use Cases
1. PDF Processing Pipeline Tracking
Track each PDF processing run with:
- Input parameters (PDF path, document number)
- Processing metrics (pages, documents, time)
- Output artifacts (processed data)
2. Embedding Model Experiments
Track embedding model experiments:
- Model parameters (model name, dimensions)
- Performance metrics (latency, throughput)
- Model artifacts (checkpoints, configs)
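The latency and throughput figures mentioned above can be collected with a small timing harness before being logged. The sketch below is illustrative: `benchmark_embedder`, `embed_fn`, and the metric names are hypothetical, and `embed_fn` stands in for whatever embedding call is being evaluated.

```python
import time


def benchmark_embedder(embed_fn, texts, batch_size: int = 8) -> dict:
    """Time embed_fn over batches and return latency/throughput metrics."""
    if not texts:
        raise ValueError("texts must not be empty")
    latencies = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        t0 = time.perf_counter()
        embed_fn(batch)
        latencies.append(time.perf_counter() - t0)
    total = sum(latencies)
    return {
        "mean_batch_latency_s": total / len(latencies),
        "throughput_texts_per_s": len(texts) / total if total > 0 else float("inf"),
        "batches": float(len(latencies)),
    }
```

The returned dictionary has the shape `mlflow.log_metrics` expects, so it can be logged directly inside a run alongside parameters such as the model name and embedding dimension.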
3. LLaMA Fine-tuning
Track fine-tuning experiments:
- Training parameters (learning rate, epochs)
- Evaluation metrics (loss, accuracy)
- Model checkpoints
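For fine-tuning runs, loss and accuracy are most useful when logged per epoch, so the MLflow UI can plot them as curves. A minimal sketch, assuming the training loop yields a history dictionary of per-epoch values; `epoch_records` and `log_history` are hypothetical names.

```python
def epoch_records(history: dict) -> list:
    """Flatten {"loss": [...], "accuracy": [...]} into (name, value, step) tuples."""
    records = []
    for name, values in history.items():
        for step, value in enumerate(values, start=1):
            records.append((name, float(value), step))
    return records


def log_history(history: dict) -> None:
    """Log each per-epoch value to the active MLflow run with its step index."""
    import mlflow  # imported lazily so epoch_records works without MLflow installed

    for name, value, step in epoch_records(history):
        mlflow.log_metric(name, value, step=step)
```

Passing `step` is what lets the UI draw a metric as a series rather than a single final value.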
Monitoring
Viewing Pipeline Runs
- Open MLflow UI: https://mlflow-service-atife5uvka-ue.a.run.app
- Select experiment (e.g., "Default")
- Browse runs with filters and search
- Compare runs side-by-side
Pipeline Health
Check recent runs via API:
curl -X POST "https://mlflow-service-atife5uvka-ue.a.run.app/api/2.0/mlflow/runs/search" \
  -H "Content-Type: application/json" \
  -d '{
    "experiment_ids": ["0"],
    "max_results": 10,
    "order_by": ["start_time DESC"]
  }'
Troubleshooting
Common Issues
- Connection refused: Ensure MLflow server is running and accessible
- Artifact upload failed: Check GCS permissions and bucket access
- Run not found: Verify experiment ID and run ID
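When diagnosing a "Run not found" error from a script, it helps to build the lookup URL safely and to turn the server's error payload into a readable hint. The sketch below assumes MLflow's JSON error shape with an `error_code` field (for example `RESOURCE_DOES_NOT_EXIST`); `run_get_url` and `explain_error` are hypothetical helper names.

```python
import json
import urllib.parse


def run_get_url(base_url: str, run_id: str) -> str:
    """Build the runs/get URL with the run ID safely URL-encoded."""
    query = urllib.parse.urlencode({"run_id": run_id})
    return f"{base_url.rstrip('/')}/api/2.0/mlflow/runs/get?{query}"


def explain_error(status: int, body: str) -> str:
    """Map an HTTP status and response body to a troubleshooting hint."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = {}
    code = payload.get("error_code", "") if isinstance(payload, dict) else ""
    if code == "RESOURCE_DOES_NOT_EXIST" or status == 404:
        return "Run not found: verify experiment ID and run ID"
    return f"Unexpected response (HTTP {status})"
```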
Logs
Cloud Run logs:
gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=mlflow-service" --limit=50
Related Documentation
- Data Preparation Service - Uses MLflow for pipeline tracking
- AI Service - Model experiment tracking
- MLflow Official Docs - MLflow documentation
- MLflow REST API - API reference
Architecture
┌─────────────────────────────────────────────────────────────┐
│                   MLflow Tracking Server                    │
│                    (Cloud Run: us-east1)                    │
├─────────────────────────────────────────────────────────────┤
│  UI:  https://mlflow-service-atife5uvka-ue.a.run.app        │
│  API: /api/2.0/mlflow/*                                     │
└────────────────────────┬────────────────────────────────────┘
                         │
         ┌───────────────┼───────────────┐
         │               │               │
         ▼               ▼               ▼
  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
  │   Backend   │ │  Artifacts  │ │   Clients   │
  │    Store    │ │    (GCS)    │ │             │
  ├─────────────┤ ├─────────────┤ ├─────────────┤
  │  SQLite/    │ │ gs://mlflow │ │  Data Prep  │
  │  Cloud SQL  │ │ -artifacts  │ │  AI Service │
  │             │ │             │ │  Notebooks  │
  └─────────────┘ └─────────────┘ └─────────────┘
Status
| Component | Status |
|---|---|
| Cloud Run Service | Active |
| Artifact Storage (GCS) | Active |
| Backend Store | Active |
| UI Access | Public |