Metrics Quick Reference
Accessing Metrics
# View all metrics
curl http://search-service:3001/metrics
# View specific metric
curl http://search-service:3001/metrics | grep autocomplete_result_count
# View with Prometheus
http://prometheus:9090/metrics (configured as scrape target)
Key Metrics at a Glance
Autocomplete Quality (Most Important)
| Metric | Type | Range | Target | Alert |
|---|---|---|---|---|
| autocomplete_zero_results_total | Counter | Rate (%) | < 5% | > 10% |
| autocomplete_empty_fallback_total | Counter | Rate (%) | < 2% | > 5% |
| autocomplete_result_count | Histogram | P95 | >= 5 | < 3 |
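
The two rate rows above are a counter divided by total query volume over the same window, expressed as a percentage. A minimal TypeScript sketch of that arithmetic follows. It assumes the denominator is the increase of autocomplete_queries_total, the counter used in the PromQL queries in this document. The helper name `ratePercent` and the sample numbers are illustrative, not part of the service.

```typescript
// Hypothetical helper: share of queries that incremented a given counter
// (zero results, empty fallback) during one window, as a percentage.
function ratePercent(eventIncrease: number, totalIncrease: number): number {
  // No traffic in the window: report 0 instead of dividing by zero.
  if (totalIncrease <= 0) return 0;
  return (eventIncrease * 100) / totalIncrease;
}

// Illustrative numbers only, not real measurements.
const zeroResultRate = ratePercent(30, 1000); // compare against the < 5% target
const fallbackRate = ratePercent(60, 1000); // compare against the > 5% alert
console.log(zeroResultRate, fallbackRate);
```

Returning 0 for an empty window is a design choice for this sketch; a dashboard may prefer to show "no data" instead.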
Fitment Coverage (Secondary)
| Metric | Type | Range | Target | Alert |
|---|---|---|---|---|
| parts_fitment_coverage_ratio | Gauge | 0-1 | >= 0.8 | < 0.6 |
| equipment_fitment_coverage_ratio | Gauge | 0-1 | >= 0.8 | < 0.5 |
| fitment_data_points_total / parts_with_fitment_total | Calculated | Ratio | >= 2 | < 1.5 |
Data Quality (Tertiary)
| Metric | Type | Range | Target | Alert |
|---|---|---|---|---|
| catalog_ready_ratio | Gauge | 0-1 | >= 0.8 | < 0.6 |
| missing_image_ratio | Gauge | 0-1 | < 0.2 | > 0.3 |
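
The gauge rows in the Fitment Coverage and Data Quality tables pair a target with an alert threshold, and the direction differs by metric: coverage ratios should be high, while missing_image_ratio should be low. The sketch below shows one way to classify a gauge reading against those numbers. The `classify` function, the `Threshold` shape, and the "degraded" label for the band between target and alert are assumptions made for illustration; the source does not name that band.

```typescript
type Status = "ok" | "degraded" | "alert";

interface Threshold {
  target: number;
  alert: number;
  higherIsBetter: boolean;
}

// Threshold values copied from the tables above.
const thresholds: Record<string, Threshold> = {
  parts_fitment_coverage_ratio: { target: 0.8, alert: 0.6, higherIsBetter: true },
  equipment_fitment_coverage_ratio: { target: 0.8, alert: 0.5, higherIsBetter: true },
  catalog_ready_ratio: { target: 0.8, alert: 0.6, higherIsBetter: true },
  missing_image_ratio: { target: 0.2, alert: 0.3, higherIsBetter: false },
};

// Hypothetical classifier: "degraded" is the band between target and alert.
function classify(metric: string, value: number): Status {
  const t = thresholds[metric];
  if (!t) throw new Error(`unknown metric: ${metric}`);
  if (t.higherIsBetter) {
    if (value < t.alert) return "alert";
    return value >= t.target ? "ok" : "degraded";
  }
  if (value > t.alert) return "alert";
  return value < t.target ? "ok" : "degraded";
}

console.log(classify("catalog_ready_ratio", 0.7));
console.log(classify("missing_image_ratio", 0.35));
```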
Common PromQL Queries
Zero Result Rate (24h)
(sum(increase(autocomplete_zero_results_total[24h])) / sum(increase(autocomplete_queries_total[24h]))) * 100
Fallback Rate (5m)
(rate(autocomplete_empty_fallback_total[5m]) / rate(autocomplete_queries_total[5m])) * 100
Result Count P95
histogram_quantile(0.95, rate(autocomplete_result_count_bucket[5m]))
Query Latency P95
histogram_quantile(0.95, rate(autocomplete_query_duration_seconds_bucket[5m])) * 1000
Fitment Coverage Trend
parts_fitment_coverage_ratio
Catalog Ready Percentage
catalog_ready_ratio * 100
Grafana Dashboard
Import: scripts/grafana-dashboard-autocomplete-metrics.json
Quick Links:
- Query Volume: Top left panel
- Zero Result Rate: Top right gauge
- Result Distribution: Middle left
- Query Latency: Middle right
- Suggestion Types: Bottom left
- Fallback Rate: Bottom middle
- Fitment Coverage: Bottom right panels
Alerts Summary
Critical (P1)
- Zero result rate > 10% for 5 minutes
- Fallback rate > 5% for 5 minutes
- Query latency P95 > 500ms for 5 minutes
Warning (P2)
- Fitment coverage < 60%
- Catalog ready ratio < 60%
- Missing image ratio > 30%
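
The "for 5 minutes" qualifier on the critical alerts means the condition must hold continuously for the whole window before the alert fires; a single spike does not count. In practice this would be handled by the `for` clause of a Prometheus alerting rule rather than by application code. The TypeScript sketch below only illustrates the behavior. The `Sample` shape and the `sustainedBreach` function are hypothetical names.

```typescript
interface Sample {
  timestampMs: number;
  value: number;
}

// Returns true when the most recent run of samples above the threshold
// has lasted at least forMs. Samples must be sorted by ascending time.
function sustainedBreach(samples: Sample[], threshold: number, forMs: number): boolean {
  let breachStart: number | null = null;
  let lastTs = 0;
  for (const s of samples) {
    lastTs = s.timestampMs;
    if (s.value > threshold) {
      // Start of a new breach run, or continuation of the current one.
      if (breachStart === null) breachStart = s.timestampMs;
    } else {
      // Any sample back under the threshold resets the run.
      breachStart = null;
    }
  }
  if (breachStart === null) return false;
  return lastTs - breachStart >= forMs;
}

// Illustrative zero result rates (%), one sample per minute.
const minute = 60_000;
const steady = [12, 11, 13, 12, 14, 12].map((value, i) => ({ timestampMs: i * minute, value }));
const dipped = [12, 11, 13, 4, 14, 12].map((value, i) => ({ timestampMs: i * minute, value }));

console.log(sustainedBreach(steady, 10, 5 * minute)); // condition held for the full window
console.log(sustainedBreach(dipped, 10, 5 * minute)); // run was reset by the dip
```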
Fitment Coverage Calculator
# Dry run (no changes)
bun scripts/calculate-fitment-coverage.ts --dry-run
# Run for specific manufacturer
bun scripts/calculate-fitment-coverage.ts --manufacturer=nh
# Update metrics
bun scripts/calculate-fitment-coverage.ts
Schedule: Run hourly via cron or Cloud Run Job
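
The internals of calculate-fitment-coverage.ts are not shown here. As a rough sketch of the kind of calculation such a job performs, the TypeScript below computes, per manufacturer, the fraction of parts that have at least one fitment record. The `PartRecord` shape, the `coverageByManufacturer` function, and the sample data are all assumptions for illustration and may not match the real script.

```typescript
// Hypothetical input shape: one row per part.
interface PartRecord {
  manufacturer: string;
  fitmentCount: number; // number of fitment data points attached to the part
}

// Fraction of parts with at least one fitment record, keyed by manufacturer.
function coverageByManufacturer(parts: PartRecord[]): Record<string, number> {
  const totals: Record<string, { total: number; withFitment: number }> = {};
  for (const part of parts) {
    const entry = totals[part.manufacturer] ?? { total: 0, withFitment: 0 };
    entry.total += 1;
    if (part.fitmentCount > 0) entry.withFitment += 1;
    totals[part.manufacturer] = entry;
  }
  const ratios: Record<string, number> = {};
  for (const manufacturer of Object.keys(totals)) {
    const { total, withFitment } = totals[manufacturer];
    ratios[manufacturer] = withFitment / total;
  }
  return ratios;
}

// Made-up sample data.
const sampleParts: PartRecord[] = [
  { manufacturer: "nh", fitmentCount: 2 },
  { manufacturer: "nh", fitmentCount: 0 },
  { manufacturer: "nh", fitmentCount: 1 },
  { manufacturer: "jd", fitmentCount: 3 },
  { manufacturer: "jd", fitmentCount: 0 },
];

const coverage = coverageByManufacturer(sampleParts);
console.log(coverage);
// A real job would then publish each ratio to the coverage gauge,
// skipping that step when run with --dry-run.
```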
Metric Recording in Code
import {
  autocompleteResultCount,
  autocompleteZeroResults,
  fitmentCoverageRatio,
} from '../metrics/registry';
// Record result count
autocompleteResultCount.observe({ query_type: 'textual', mode: 'auto' }, 5);
// Record zero results
autocompleteZeroResults.inc({ query_length: 'short', query_type: 'textual' });
// Update fitment ratio (gauge)
fitmentCoverageRatio.set({ manufacturer: 'nh' }, 0.85);
Troubleshooting
Metrics Not Showing in Prometheus
- Check that the service is running: curl http://localhost:3001/metrics
- Check the Prometheus config: http://prometheus:9090/config
- Check the scrape targets: http://prometheus:9090/targets
High Cardinality Warnings
Review the metric labels in METRICS.md. All labels use fixed enum values, so label cardinality stays bounded.
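
Fixed enum labels keep cardinality bounded because raw user input never becomes a label value. A minimal TypeScript sketch of that idea is below: it maps a query string to a small set of length buckets before the value is used as a query_length label. Only 'short' appears in this document; the 'medium' and 'long' values, the cutoffs, and the `bucketQueryLength` name are assumptions for illustration.

```typescript
// Assumed label values; only "short" is confirmed by the examples in this document.
type QueryLength = "short" | "medium" | "long";

// Hypothetical bucketing; the service's real boundaries may differ.
function bucketQueryLength(query: string): QueryLength {
  const length = query.trim().length;
  if (length <= 3) return "short";
  if (length <= 10) return "medium";
  return "long";
}

// The label takes one of three values no matter what users type.
console.log(bucketQueryLength("nh"));
console.log(bucketQueryLength("hydraulic pump seal"));
```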
Latency Spike in Metrics
Metric recording overhead is < 0.1ms per request. If queries slow down:
- Check Elasticsearch latency
- Check query complexity (number of parallel queries)
- Use the autocompleteESLatency histogram to debug
File Structure
microservices/
├── services/search/
│   ├── src/
│   │   ├── metrics/
│   │   │   └── registry.ts (20 new metrics)
│   │   └── routes/
│   │       ├── metrics.ts (exports)
│   │       └── autocomplete.ts (recording logic)
│   └── scripts/
│       └── calculate-fitment-coverage.ts
└── docs/
    ├── METRICS.md (reference)
    ├── METRICS_INTEGRATION.md (guide)
    ├── METRICS_QUICK_REFERENCE.md (this file)
    └── METRICS_IMPLEMENTATION_SUMMARY.md
Next Steps
- Deploy to development environment
- Verify Prometheus is scraping the /metrics endpoint
- Import Grafana dashboard
- Review metric values
- Configure alerts in Alertmanager
- Schedule fitment coverage job
Contacts
See documentation:
- Complete reference: METRICS.md
- Implementation details: METRICS_INTEGRATION.md
- Summary: METRICS_IMPLEMENTATION_SUMMARY.md
Metrics Integration Guide
This guide explains how to integrate metrics into the Search service and use them for monitoring autocomplete quality and fitment coverage.
Category Field Migration: snake_case → camelCase
Migration Date: 2025-11-26
Status: Code Changes Complete - Pending Database/Index Migration