
Prometheus Metrics Documentation

This document describes all custom Prometheus metrics collected by the Search service for monitoring autocomplete quality, fitment coverage, and system health.

Overview

All metrics follow Prometheus naming conventions and use one of three metric types:

  • Counter: Monotonically increasing value (total requests, errors, etc.)
  • Gauge: Point-in-time measurement (document count, ratios, etc.)
  • Histogram: Distribution of values with quantile aggregation (latency, sizes, etc.)
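
Most of the example queries in this document read the `_bucket`, `_sum`, and `_count` series that a histogram exposes. The following is a minimal sketch of that data model in plain JavaScript, not the client library's actual implementation; `createHistogram` and `snapshot` are illustrative names.

```javascript
// Simplified model of a Prometheus histogram: cumulative buckets plus sum and count.
function createHistogram(upperBounds) {
  const bounds = [...upperBounds].sort((a, b) => a - b);
  // One counter per finite bound, plus the implicit +Inf bucket at the end.
  const counts = new Array(bounds.length + 1).fill(0);
  let sum = 0;
  let count = 0;
  return {
    observe(value) {
      sum += value;
      count += 1;
      // Cumulative: a value increments every bucket whose bound is >= value.
      bounds.forEach((le, i) => {
        if (value <= le) counts[i] += 1;
      });
      counts[bounds.length] += 1; // +Inf matches every observation
    },
    snapshot() {
      const buckets = {};
      bounds.forEach((le, i) => {
        buckets[String(le)] = counts[i];
      });
      buckets['+Inf'] = counts[bounds.length];
      return { buckets, sum, count };
    },
  };
}

// Five requests returning 0, 2, 4, 12 and 75 suggestions.
const h = createHistogram([0, 1, 3, 5, 10, 20, 50]);
[0, 2, 4, 12, 75].forEach((n) => h.observe(n));
const snap = h.snapshot();
// snap.buckets -> { 0: 1, 1: 1, 3: 2, 5: 3, 10: 3, 20: 4, 50: 4, '+Inf': 5 }
// snap.sum -> 93, snap.count -> 5
```

Because buckets are cumulative, the share of requests with at most one result is `buckets['1'] / count`, here 1 of 5.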

Metrics endpoint: GET /metrics


Autocomplete Quality Metrics

autocomplete_result_count (Histogram)

Type: Histogram
Labels: query_type, mode
Buckets: [0, 1, 3, 5, 10, 20, 50]

Number of suggestions returned per query. Tracks distribution to identify queries with poor or excessive results.

Query Type Values:

  • short - Queries <= 3 characters
  • pn_like - Part number pattern queries
  • textual - Natural language queries
  • equipment - Equipment-focused queries

Mode Values:

  • auto - Automatic mode selection
  • parts - Parts-only search
  • equipment - Equipment-centric search

Example Queries:

# Mean result count
rate(autocomplete_result_count_sum[5m]) / rate(autocomplete_result_count_count[5m])

# P95 result count
histogram_quantile(0.95, sum by (le) (rate(autocomplete_result_count_bucket[5m])))

# Percentage of queries returning 0-1 results
sum(rate(autocomplete_result_count_bucket{le="1"}[5m])) / sum(rate(autocomplete_result_count_count[5m])) * 100
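
The P95 query above is an estimate: `histogram_quantile` interpolates linearly inside the bucket that contains the requested rank. The sketch below mirrors that idea in simplified form. It assumes cumulative counts sorted by `le` with a final `+Inf` entry and ignores edge cases the real function handles; `histogramQuantile` is an illustrative name.

```javascript
// buckets: [{ le, count }] sorted by le ascending, cumulative, last le = Infinity.
function histogramQuantile(q, buckets) {
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;
  const rank = q * total;
  // First bucket whose cumulative count reaches the rank.
  const i = buckets.findIndex((b) => b.count >= rank);
  // Rank falls in the +Inf bucket: report the highest finite bound.
  if (i === buckets.length - 1) return buckets[buckets.length - 2].le;
  const upper = buckets[i].le;
  if (i === 0 && upper <= 0) return upper;
  const lower = i === 0 ? 0 : buckets[i - 1].le;
  const below = i === 0 ? 0 : buckets[i - 1].count;
  const inBucket = buckets[i].count - below;
  if (inBucket === 0) return lower;
  // Linear interpolation between the bucket's lower and upper bound.
  return lower + (upper - lower) * ((rank - below) / inBucket);
}

const resultCountBuckets = [
  { le: 0, count: 1 }, { le: 1, count: 1 }, { le: 3, count: 2 },
  { le: 5, count: 3 }, { le: 10, count: 3 }, { le: 20, count: 4 },
  { le: 50, count: 4 }, { le: Infinity, count: 5 },
];
const p50 = histogramQuantile(0.5, resultCountBuckets); // 4
const p95 = histogramQuantile(0.95, resultCountBuckets); // 50
```

The P95 here is capped at 50, the highest finite bucket bound, even though one observation was larger. Quantiles are only as precise as the bucket layout allows.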

autocomplete_zero_results_total (Counter)

Type: Counter
Labels: query_length, query_type

Total queries returning zero results. Indicates potential quality issues or gaps in autocomplete coverage.

Query Length Values:

  • short - 1-3 characters
  • medium - 4-10 characters
  • long - 11+ characters
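
The mapping above can be sketched as a small helper. `queryLengthLabel` is a hypothetical name, and treating an empty string as `short` is an assumption, since the documented ranges start at one character.

```javascript
// Map a raw query string onto the query_length label values.
function queryLengthLabel(query) {
  const n = query.length;
  if (n <= 3) return 'short'; // 1-3 characters (empty input also lands here)
  if (n <= 10) return 'medium'; // 4-10 characters
  return 'long'; // 11+ characters
}
```

Keeping the label to three fixed values means the counter's cardinality does not grow with the variety of user input.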

Example Queries:

# Zero result rate (5-minute window)
rate(autocomplete_zero_results_total[5m])

# Zero result rate by query length
rate(autocomplete_zero_results_total{query_length="short"}[5m])

# Percentage of queries with zero results
(sum(rate(autocomplete_zero_results_total[5m])) / sum(rate(autocomplete_queries_total[5m]))) * 100

autocomplete_empty_fallback_total (Counter)

Type: Counter
Labels: intent

Times enhanced autocomplete returned empty results and triggered legacy fallback. Lower values indicate stable enhanced mode.

Intent Values:

  • short - Short queries
  • pn_like - Part number queries
  • textual - Text queries
  • equipment - Equipment queries

Example Queries:

# Fallback rate
rate(autocomplete_empty_fallback_total[5m])

# Fallback percentage of total queries
(sum(rate(autocomplete_empty_fallback_total[5m])) / sum(rate(autocomplete_queries_total[5m]))) * 100

# Alert on high fallback rate
rate(autocomplete_empty_fallback_total[5m]) > 0.1

autocomplete_suggestion_types_total (Counter)

Type: Counter
Labels: type

Distribution of suggestion types returned. Indicates which fields provide results and their relative importance.

Type Values:

  • sku - Part number/SKU matches
  • category - Product category matches
  • manufacturer - Manufacturer matches
  • equipment - Equipment model matches
  • description - Description field matches
  • brand - Brand matches

Example Queries:

# Top suggestion types by count
topk(5, rate(autocomplete_suggestion_types_total[5m]))

# SKU suggestion percentage
(sum(rate(autocomplete_suggestion_types_total{type="sku"}[5m])) /
 sum(rate(autocomplete_suggestion_types_total[5m]))) * 100

# Equipment suggestion availability
rate(autocomplete_suggestion_types_total{type="equipment"}[5m])

autocomplete_ranking_score (Histogram)

Type: Histogram
Labels: suggestion_type
Buckets: [0, 0.1, 0.2, 0.3, 0.5, 0.7, 0.9, 1.0]

Relevance/ranking scores of returned suggestions. Higher scores indicate better ranking quality.

Note: Observations are sampled at 20% to reduce recording overhead.

Example Queries:

# Average score per suggestion type
sum by (suggestion_type) (rate(autocomplete_ranking_score_sum[5m])) / sum by (suggestion_type) (rate(autocomplete_ranking_score_count[5m]))

# P95 score by suggestion type
histogram_quantile(0.95, sum by (le, suggestion_type) (rate(autocomplete_ranking_score_bucket[5m])))

# Percentage of suggestions with score above 0.7 (good)
(1 - sum(rate(autocomplete_ranking_score_bucket{le="0.7"}[5m])) / sum(rate(autocomplete_ranking_score_count[5m]))) * 100
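
A share of observations above a threshold comes from the cumulative bucket counts, not from a quantile. The sketch below shows the arithmetic with made-up counts; `percentAbove` is an illustrative name, and the threshold has to coincide with a configured bucket bound.

```javascript
// buckets: [{ le, count }] with cumulative counts and a final le = Infinity entry.
function percentAbove(bound, buckets) {
  const total = buckets.find((b) => b.le === Infinity).count;
  const atOrBelow = buckets.find((b) => b.le === bound);
  if (!atOrBelow) throw new Error(`no bucket with le=${bound}`);
  if (total === 0) return NaN;
  // Everything not counted in the le=bound bucket is strictly above the bound.
  return ((total - atOrBelow.count) / total) * 100;
}

// Hypothetical snapshot: 75 of 100 sampled scores are <= 0.7.
const scoreBuckets = [
  { le: 0.5, count: 40 },
  { le: 0.7, count: 75 },
  { le: 0.9, count: 95 },
  { le: 1.0, count: 100 },
  { le: Infinity, count: 100 },
];
const goodShare = percentAbove(0.7, scoreBuckets); // 25 (percent)
```

Since `le` is inclusive, a score of exactly 0.7 counts as "at or below", so the result is the share strictly above 0.7.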

Autocomplete Performance Metrics

autocomplete_elasticsearch_latency_ms (Histogram)

Type: Histogram
Labels: section_type
Buckets: [10, 25, 50, 100, 200, 500, 1000, 2000]

Time spent in Elasticsearch queries for each section type (milliseconds).

Section Type Values:

  • sku - Part number/SKU search
  • equipment - Equipment model search
  • category - Category search
  • name - Description/name search
  • brand - Brand/manufacturer search

Example Queries:

# P95 latency by section
histogram_quantile(0.95, sum by (le, section_type) (rate(autocomplete_elasticsearch_latency_ms_bucket[5m])))

# Average latency per section
sum by (section_type) (rate(autocomplete_elasticsearch_latency_ms_sum[5m])) / sum by (section_type) (rate(autocomplete_elasticsearch_latency_ms_count[5m]))

# Alert on high SKU latency
histogram_quantile(0.95, rate(autocomplete_elasticsearch_latency_ms_bucket{section_type="sku"}[5m])) > 100

autocomplete_parallel_queries_count (Histogram)

Type: Histogram
Buckets: [1, 2, 3, 4, 5, 10]

Number of parallel Elasticsearch queries in a single autocomplete request. Tracks query efficiency and concurrency.

Example Queries:

# Average number of parallel queries
rate(autocomplete_parallel_queries_count_sum[5m]) / rate(autocomplete_parallel_queries_count_count[5m])

# P95 parallel query count
histogram_quantile(0.95, rate(autocomplete_parallel_queries_count_bucket[5m]))

autocomplete_cache_hits_total (Counter)

Type: Counter
Labels: cache_type

Cache hits for autocomplete queries. Currently informational; used for future optimization planning.

Cache Type Values:

  • query_result - Full query result cache
  • section_data - Individual section data cache
  • suggestion_dedup - Deduplication cache

Example Queries:

# Cache hit rate
(sum(rate(autocomplete_cache_hits_total[5m])) / sum(rate(autocomplete_queries_total[5m]))) * 100

Fitment Coverage Metrics

parts_fitment_coverage_ratio (Gauge)

Type: Gauge
Labels: manufacturer, collection
Range: 0-1 (ratio)

Ratio of parts with equipment fitment data. Calculated as: parts_with_fitment / total_parts

Collection Values:

  • nh_unified - New Holland unified collection
  • mchale - McHale collection
  • hotsy - HOTSY collection

Interpretation:

  • 0.0 = No parts have fitment data
  • 1.0 = All parts have fitment data
  • 0.7+ = Good coverage

Example Queries:

# NH fitment coverage
parts_fitment_coverage_ratio{manufacturer="nh"}

# Coverage by all manufacturers
parts_fitment_coverage_ratio

# Alert on coverage drop
(parts_fitment_coverage_ratio < 0.6)

# Comparison between manufacturers
avg by (manufacturer) (parts_fitment_coverage_ratio)

parts_with_fitment_total (Gauge)

Type: Gauge
Labels: manufacturer, collection

Absolute count of parts with equipment fitment data.

Example Queries:

# NH parts with fitment
parts_with_fitment_total{manufacturer="nh"}

# Total across all
sum(parts_with_fitment_total)

# Track changes over time (delta, not rate, because this is a gauge)
delta(parts_with_fitment_total[1h])

fitment_data_points_total (Gauge)

Type: Gauge
Labels: manufacturer, collection

Total equipment fitment data points (individual fitment entries) across all parts.

Interpretation:

  • Indicates richness of fitment data
  • Higher values = more detailed fitment coverage
  • Use the ratio fitment_data_points_total / parts_with_fitment_total for average fitments per part

Example Queries:

# Average fitments per part
fitment_data_points_total / parts_with_fitment_total

# Total fitment data points
sum(fitment_data_points_total)

# Alert on data loss
(fitment_data_points_total / parts_with_fitment_total) < 2

equipment_fitment_coverage_ratio (Gauge)

Type: Gauge
Range: 0-1 (ratio)

Ratio of search queries whose results include parts with equipment fitment data. Measures coverage from the query perspective.

Interpretation:

  • 0.0 = No queries return parts with fitment data
  • 1.0 = All queries return parts with fitment data
  • 0.5+ = Reasonable coverage

Example Queries:

# Current fitment coverage
equipment_fitment_coverage_ratio

# Alert on coverage below threshold
(equipment_fitment_coverage_ratio < 0.5)

# Track coverage over time
equipment_fitment_coverage_ratio

Equipment Fitment Query Metrics

equipment_fitment_queries_total (Counter)

Type: Counter
Labels: mode

Total equipment fitment queries (part -> equipment search operations).

Mode Values:

  • browse - Equipment browsing queries
  • search - Search-based equipment queries
  • related - Related equipment queries

Example Queries:

# Equipment query rate
rate(equipment_fitment_queries_total[5m])

# Query volume by mode
sum by (mode) (rate(equipment_fitment_queries_total[5m]))

equipment_fitment_query_latency_ms (Histogram)

Type: Histogram
Labels: None
Buckets: [10, 50, 100, 200, 500, 1000]

Latency of equipment fitment queries (milliseconds).

Example Queries:

# P95 equipment query latency
histogram_quantile(0.95, rate(equipment_fitment_query_latency_ms_bucket[5m]))

# Average latency
rate(equipment_fitment_query_latency_ms_sum[5m]) / rate(equipment_fitment_query_latency_ms_count[5m])

# Alert on slow queries
histogram_quantile(0.95, rate(equipment_fitment_query_latency_ms_bucket[5m])) > 500

Index Health Metrics

index_document_count (Gauge)

Type: Gauge
Labels: index_name

Total number of documents in search index.

Index Name Values:

  • parts_current - Current parts index (via alias)

Example Queries:

# Document count
index_document_count

# Alert on unexpected count drop
(index_document_count < 5000)

index_size_bytes (Gauge)

Type: Gauge
Labels: index_name

Total size of index in bytes.

Example Queries:

# Index size in GB
index_size_bytes / 1024 / 1024 / 1024

# Alert on excessive growth
(index_size_bytes / 1024 / 1024 / 1024) > 100

Data Quality Metrics

quality_score_distribution (Histogram)

Type: Histogram
Labels: manufacturer
Buckets: [0, 20, 40, 60, 80, 100]

Distribution of part quality scores (0-100).

Score Ranges:

  • 0-20: Poor
  • 21-40: Fair
  • 41-60: Good
  • 61-80: Very Good
  • 81-100: Excellent
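
As a sketch, the ranges above and the `catalogReady` threshold described under `catalog_ready_ratio` map to two small helpers. `scoreBand` and `isCatalogReady` are hypothetical names, and the handling of fractional scores between bands (for example 20.5) is an assumption.

```javascript
// Classify a 0-100 quality score into the documented bands.
function scoreBand(score) {
  if (!Number.isFinite(score) || score < 0 || score > 100) {
    throw new RangeError(`score out of range: ${score}`);
  }
  if (score <= 20) return 'Poor';
  if (score <= 40) return 'Fair';
  if (score <= 60) return 'Good';
  if (score <= 80) return 'Very Good';
  return 'Excellent';
}

// catalogReady is documented as quality_score >= 80.
function isCatalogReady(score) {
  return score >= 80;
}
```

Under these definitions a score of exactly 80 is catalog-ready but still falls in the "Very Good" band, because the "Excellent" band starts at 81.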

Example Queries:

# Average quality score
rate(quality_score_distribution_sum[5m]) / rate(quality_score_distribution_count[5m])

# Average by manufacturer
sum by (manufacturer) (rate(quality_score_distribution_sum[5m])) / sum by (manufacturer) (rate(quality_score_distribution_count[5m]))

# P95 score
histogram_quantile(0.95, rate(quality_score_distribution_bucket[5m]))

catalog_ready_ratio (Gauge)

Type: Gauge
Labels: manufacturer
Range: 0-1 (ratio)

Ratio of parts with catalogReady=true (quality_score >= 80).

Interpretation:

  • Fraction of production-ready parts (multiply by 100 for a percentage)
  • 0.8+ = Strong data quality
  • < 0.5 = Quality concerns

Example Queries:

# Catalog ready by manufacturer
catalog_ready_ratio

# Alert on quality drop
(catalog_ready_ratio < 0.6)

# Track over time
catalog_ready_ratio

missing_image_ratio (Gauge)

Type: Gauge
Labels: manufacturer
Range: 0-1 (ratio)

Ratio of parts missing primary images.

Interpretation:

  • 0.0 = All parts have images
  • 1.0 = No parts have images
  • < 0.2 = Good image coverage

Example Queries:

# Missing image ratio
missing_image_ratio

# Alert on high missing rate
(missing_image_ratio > 0.3)

# Comparison across manufacturers
missing_image_ratio

Common Alert Patterns

Autocomplete Quality

# Zero result rate > 10%
- alert: AutocompleteZeroResultsHigh
  expr: (sum(rate(autocomplete_zero_results_total[5m])) / sum(rate(autocomplete_queries_total[5m]))) > 0.1

# Fallback rate > 5%
- alert: AutocompleteFallbackHigh
  expr: (sum(rate(autocomplete_empty_fallback_total[5m])) / sum(rate(autocomplete_queries_total[5m]))) > 0.05

# P95 latency > 500ms
- alert: AutocompleteSlowLatency
  expr: histogram_quantile(0.95, rate(autocomplete_query_duration_seconds_bucket[5m])) > 0.5

Fitment Coverage

# Fitment coverage drop
- alert: FitmentCoverageLow
  expr: parts_fitment_coverage_ratio < 0.6

# Equipment coverage drop
- alert: EquipmentFitmentCoverageLow
  expr: equipment_fitment_coverage_ratio < 0.5

Data Quality

# Catalog ready ratio < 60%
- alert: CatalogQualityLow
  expr: catalog_ready_ratio < 0.6

# Missing images > 30%
- alert: MissingImageRatioHigh
  expr: missing_image_ratio > 0.3

Dashboard Queries

Autocomplete Quality Dashboard

# Top panel: Query volume and success rate
sum(rate(autocomplete_queries_total[5m])) by (status)

# Second panel: Result distribution
histogram_quantile(0.95, rate(autocomplete_result_count_bucket[5m]))

# Third panel: Zero result rate
(sum(rate(autocomplete_zero_results_total[5m])) / sum(rate(autocomplete_queries_total[5m]))) * 100

# Fourth panel: Suggestion type distribution
sum by (type) (rate(autocomplete_suggestion_types_total[5m]))

# Fifth panel: Query latency
histogram_quantile(0.95, rate(autocomplete_query_duration_seconds_bucket[5m]))

# Sixth panel: Fallback rate
(sum(rate(autocomplete_empty_fallback_total[5m])) / sum(rate(autocomplete_queries_total[5m]))) * 100

Fitment Coverage Dashboard

# Top panel: Fitment coverage by manufacturer
avg by (manufacturer) (parts_fitment_coverage_ratio)

# Second panel: Parts with fitment
sum by (manufacturer) (parts_with_fitment_total)

# Third panel: Equipment fitment coverage
equipment_fitment_coverage_ratio

# Fourth panel: Fitments per part (richness)
fitment_data_points_total / parts_with_fitment_total

# Fifth panel: Equipment query latency P95
histogram_quantile(0.95, rate(equipment_fitment_query_latency_ms_bucket[5m]))

Data Quality Dashboard

# Top panel: Catalog ready by manufacturer
avg by (manufacturer) (catalog_ready_ratio)

# Second panel: Average quality score
sum by (manufacturer) (rate(quality_score_distribution_sum[5m])) / sum by (manufacturer) (rate(quality_score_distribution_count[5m]))

# Third panel: Missing image ratio
avg by (manufacturer) (missing_image_ratio)

# Fourth panel: Index health
index_document_count

# Fifth panel: Index size
index_size_bytes / 1024 / 1024 / 1024

Metric Recording Guidelines

Low Cardinality

Keep label cardinality low to prevent memory issues:

  • Use fixed set of label values (enum-like)
  • Avoid high-cardinality labels like part_id, query_hash
  • Group similar values (query_length: short/medium/long)
  • Sample high-frequency metrics (20% for suggestion types)
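
To make the cardinality budget concrete, the sketch below estimates the upper bound on time series a metric can create from its label value sets. `estimateSeries` is an illustrative helper. The estimate assumes every label combination is eventually observed, and that a histogram exposes one series per configured bucket plus `+Inf`, `_sum`, and `_count`.

```javascript
// Upper bound on series per metric: product of label value counts,
// times the number of series each combination exposes.
function estimateSeries({ type, labelValues = {}, buckets = [] }) {
  const combos = Object.values(labelValues).reduce(
    (n, values) => n * values.length,
    1
  );
  // Histogram: finite buckets + the +Inf bucket + _sum + _count.
  const perCombo = type === 'histogram' ? buckets.length + 3 : 1;
  return combos * perCombo;
}

const resultCountSeries = estimateSeries({
  type: 'histogram',
  labelValues: {
    query_type: ['short', 'pn_like', 'textual', 'equipment'],
    mode: ['auto', 'parts', 'equipment'],
  },
  buckets: [0, 1, 3, 5, 10, 20, 50],
}); // 4 * 3 * (7 + 3) = 120

const zeroResultsSeries = estimateSeries({
  type: 'counter',
  labelValues: {
    query_length: ['short', 'medium', 'long'],
    query_type: ['short', 'pn_like', 'textual', 'equipment'],
  },
}); // 3 * 4 = 12
```

A label such as `part_id` with thousands of values would multiply these numbers by the same factor, which is why it is excluded.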

Sampling

To reduce overhead on frequent operations:

// Sample 20% of requests
if (Math.random() < 0.2) {
  // Record detailed metric
  autocompleteSuggestionTypes.inc({ type });
}
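
One caveat when sampling a counter: the recorded total under-counts by the sampling factor, so a 20% sample shows roughly a fifth of the real volume. A possible workaround, sketched below, is to increment by `1 / sampleRate` on each sampled call so the expected total stays correct. `makeSampledCounter` is a hypothetical wrapper, and it assumes the wrapped counter accepts `inc(labels, value)`.

```javascript
// Wrap a counter so that only a fraction of calls are recorded,
// each weighted by 1 / sampleRate to keep the expected total unbiased.
function makeSampledCounter(counter, sampleRate, random = Math.random) {
  if (!(sampleRate > 0 && sampleRate <= 1)) {
    throw new RangeError('sampleRate must be in (0, 1]');
  }
  const weight = 1 / sampleRate;
  return {
    inc(labels) {
      if (random() < sampleRate) counter.inc(labels, weight);
    },
  };
}

// Demo with a stand-in counter and a scripted random source.
const fakeCounter = {
  total: 0,
  inc(labels, value = 1) {
    this.total += value;
  },
};
const draws = [0.1, 0.5, 0.9, 0.19, 0.2];
const sampled = makeSampledCounter(fakeCounter, 0.2, () => draws.shift());
for (let i = 0; i < 5; i += 1) sampled.inc({ type: 'sku' });
// Two of the five draws fall below 0.2, each adding 5: fakeCounter.total is 10.
```

Over a small number of calls the weighted total can differ noticeably from the true count, as it does here. The correction only holds on average, so it suits high-volume metrics where ratios matter more than exact totals.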

Performance

Metric recording should add less than 1 ms per request:

// Fast path - aggregate only totals
autocompleteQueriesTotal.inc({ status: 'success' });

// Slower path - histograms (reserve for important metrics)
autocompleteQueryDuration.observe({ intent }, durationSec);
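
One way to check the 1 ms budget is to time the recording call in isolation. The sketch below uses Node's `process.hrtime.bigint()`; `measureRecordingMs` is an illustrative helper, and the callback stands in for whatever metric calls a request makes.

```javascript
// Average wall-clock cost, in milliseconds, of one call to record().
function measureRecordingMs(record, iterations = 1000) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i += 1) record();
  const elapsedNs = process.hrtime.bigint() - start;
  return Number(elapsedNs) / 1e6 / iterations;
}

// Stand-in for the real metric calls made during one request.
let recorded = 0;
const avgMs = measureRecordingMs(() => {
  recorded += 1;
}, 500);
const withinBudget = avgMs < 1;
```

Averaging over many iterations smooths out timer resolution and JIT warm-up, but it measures steady-state cost only, not the first call.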

Implementation Examples

Recording Result Count

autocompleteResultCount.observe(
  { query_type: intent, mode: resolvedMode },
  suggestions.length
);

Recording Zero Results

if (suggestions.length === 0) {
  // Map query length onto the documented short/medium/long values
  const queryLength =
    query.length <= 3 ? 'short' : query.length <= 10 ? 'medium' : 'long';
  autocompleteZeroResults.inc({ query_length: queryLength, query_type: intent });
}

Recording Fitment Coverage (periodic job)

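const total = await countTotalParts();
const withFitment = await countPartsWithFitment();
// countFitmentDataPoints() is a hypothetical helper, named here for illustration
const totalDataPoints = await countFitmentDataPoints();
// Guard against division by zero on an empty collection
const ratio = total > 0 ? withFitment / total : 0;

// Set both documented labels (manufacturer, collection)
const labels = { manufacturer: 'nh', collection: 'nh_unified' };
fitmentCoverageRatio.set(labels, ratio);
fitmentPartsTotal.set(labels, withFitment);
fitmentDataPointsTotal.set(labels, totalDataPoints);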

Future Improvements

  1. Query Result Caching: Track autocomplete_cache_hits_total once the cache is implemented
  2. A/B Testing Metrics: Track variant-specific metrics for ranking experiments
  3. User Behavior: Session-level metrics (session conversion, suggestion click-through)
  4. Index Performance: Shard-level metrics, query complexity tracking
  5. Automated Alerts: Machine learning-based anomaly detection
