
Media Coverage API

Overview

The Media Coverage API provides comprehensive analytics on media richness across the NHL parts catalog (3,740 parts). It tracks three media types: gallery images, 360° views, and PDF documents, enabling data-driven decisions for content enrichment strategies.

Key Features:

  • Media coverage statistics with detailed breakdowns
  • Image type distribution (MARKETING, FRONT, BACK, etc.)
  • 360° view frame analysis and quality levels
  • PDF document categorization (ready for future data)
  • High-quality parts gap identification
  • Environment-aware data access (prod/dev/stage)

Base URL: http://localhost:3005/api/health/media (development)

Production URL: https://health-analytics-service-[hash].run.app/api/health/media

Quick Start

Basic Coverage Summary

# Get overall media coverage statistics
curl http://localhost:3005/api/health/media/coverage

# Response (3,740 NHL parts analyzed)
{
  "success": true,
  "data": {
    "summary": {
      "totalParts": 3740,
      "withAnyMedia": 2544,
      "withoutMedia": 1196,
      "coveragePercentage": 68.02
    },
    "images": {
      "coverage": {
        "count": 2501,
        "percentage": 66.87
      },
      "gallery": {
        "count": 2501,
        "percentage": 66.87
      },
      "view360": {
        "count": 2507,
        "percentage": 67.03,
        "withFrames": 2507,
        "avgFrameCount": 24
      }
    },
    "documents": {
      "coverage": {
        "count": 0,
        "percentage": 0
      },
      "byType": {
        "manuals": 0,
        "datasheets": 0,
        "certifications": 0
      }
    },
    "qualityCorrelation": {
      "withMediaAvgQuality": 72.5,
      "withoutMediaAvgQuality": 45.2,
      "delta": 27.3
    }
  },
  "meta": {
    "timestamp": "2025-11-17T10:30:00Z",
    "environment": "prod",
    "database": "crop",
    "collection": "nh_unified"
  }
}

Get Image Type Distribution

# Analyze gallery image types
curl http://localhost:3005/api/health/media/distribution

# Response
{
  "success": true,
  "data": {
    "imageTypes": {
      "marketing": {
        "count": 1847,
        "percentage": 49.39,
        "avgPerPart": 1.2
      },
      "front": {
        "count": 2234,
        "percentage": 59.73,
        "avgPerPart": 1.1
      },
      "back": {
        "count": 1823,
        "percentage": 48.74,
        "avgPerPart": 1.0
      },
      "left": {
        "count": 156,
        "percentage": 4.17,
        "avgPerPart": 0.8
      },
      "right": {
        "count": 142,
        "percentage": 3.80,
        "avgPerPart": 0.9
      },
      "angle1": {
        "count": 89,
        "percentage": 2.38,
        "avgPerPart": 1.3
      },
      "angle2": {
        "count": 67,
        "percentage": 1.79,
        "avgPerPart": 1.1
      }
    },
    "view360Distribution": {
      "by_frame_count": {
        "24_frames": 2203,
        "36_frames": 156,
        "48_frames": 89,
        "other": 59
      },
      "by_quality": {
        "high": 2340,
        "standard": 167
      }
    },
    "totalParts": 3740
  }
}

Find Quality Gaps

# Get high-quality parts missing media (prime enrichment targets)
curl "http://localhost:3005/api/health/media/gaps?minQuality=80&limit=20"

# Response
{
  "success": true,
  "data": {
    "gaps": [
      {
        "partNumber": "87840296",
        "sku": "117-2295-001",
        "title": "Hydraulic Filter Element",
        "qualityScore": 85,
        "missingMedia": {
          "gallery": false,
          "view360": true,
          "documents": true
        },
        "existingMedia": {
          "galleryCount": 2,
          "imageTypes": ["front", "back"]
        }
      }
    ],
    "totalGaps": 297,
    "showing": 20,
    "filters": {
      "minQuality": 80,
      "mediaType": "all"
    }
  }
}

Endpoints

GET /api/health/media/coverage

Get comprehensive media coverage summary for NHL parts.

Query Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| environment | string | prod | Data environment: prod, dev, stage |

Example Request:

curl -X GET \
  'http://localhost:3005/api/health/media/coverage?environment=prod' \
  -H 'Accept: application/json'

Response Schema:

{
  success: boolean;
  data: {
    summary: {
      totalParts: number;           // Total NHL parts analyzed
      withAnyMedia: number;          // Parts with any media type
      withoutMedia: number;          // Parts with zero media
      coveragePercentage: number;    // (withAnyMedia / totalParts) * 100
    };
    images: {
      coverage: {
        count: number;               // Parts with gallery images
        percentage: number;          // % of total parts
      };
      gallery: {
        count: number;               // Parts with gallery images
        percentage: number;
      };
      view360: {
        count: number;               // Parts with 360° views
        percentage: number;
        withFrames: number;          // 360s with frame data
        avgFrameCount: number;       // Average frames per 360
      };
    };
    documents: {
      coverage: {
        count: number;               // Parts with PDFs (currently 0)
        percentage: number;
      };
      byType: {
        manuals: number;             // User manuals count
        datasheets: number;          // Spec sheets count
        certifications: number;      // Certificates count
      };
    };
    qualityCorrelation: {
      withMediaAvgQuality: number;   // Avg quality score (parts with media)
      withoutMediaAvgQuality: number; // Avg quality score (parts without media)
      delta: number;                 // withMediaAvgQuality - withoutMediaAvgQuality (positive = parts with media score higher)
    };
  };
  meta: {
    timestamp: string;               // ISO 8601 timestamp
    environment: string;             // Data source environment
    database: string;                // MongoDB database name
    collection: string;              // MongoDB collection name
  };
}

Error Codes:

| Code | Description |
|------|-------------|
| 200 | Success |
| 400 | Invalid environment parameter |
| 500 | MongoDB connection error |
| 503 | Service unavailable |

Response Time: ~150-300ms (aggregates 3,740 documents)

cURL Examples:

# Production data
curl http://localhost:3005/api/health/media/coverage

# Development environment
curl 'http://localhost:3005/api/health/media/coverage?environment=dev'

# With pretty-print
curl http://localhost:3005/api/health/media/coverage | jq
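
The summary fields are tied together by simple arithmetic (the schema comments give coveragePercentage as (withAnyMedia / totalParts) * 100). The sketch below is a client-side sanity check, not part of the API. The names checkSummary and roundTo2 are hypothetical, and it assumes that withoutMedia equals totalParts minus withAnyMedia and that percentages are rounded to two decimals, as the example response suggests.

```typescript
// Hypothetical client-side helper; the service itself does not expose this.
interface CoverageSummary {
  totalParts: number;
  withAnyMedia: number;
  withoutMedia: number;
  coveragePercentage: number;
}

// Round to two decimals (assumed precision, based on the example response).
function roundTo2(value: number): number {
  return Math.round(value * 100) / 100;
}

// Returns a list of problems; an empty list means the summary is self-consistent.
function checkSummary(s: CoverageSummary): string[] {
  const problems: string[] = [];
  if (s.withAnyMedia + s.withoutMedia !== s.totalParts) {
    problems.push('withAnyMedia + withoutMedia does not equal totalParts');
  }
  const expected =
    s.totalParts === 0 ? 0 : roundTo2((s.withAnyMedia / s.totalParts) * 100);
  if (Math.abs(expected - s.coveragePercentage) > 0.005) {
    problems.push(`coveragePercentage should be ${expected}`);
  }
  return problems;
}

// Values taken from the Quick Start response.
const summaryProblems = checkSummary({
  totalParts: 3740,
  withAnyMedia: 2544,
  withoutMedia: 1196,
  coveragePercentage: 68.02,
});
```

A check like this can run in an integration test to catch a mismatched aggregation before dashboards display it.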

GET /api/health/media/distribution

Get detailed distribution of image types and 360° view characteristics.

Query Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| environment | string | prod | Data environment: prod, dev, stage |
| groupBy | string | type | Group results by: type, quality, frames |

Example Request:

curl -X GET \
  'http://localhost:3005/api/health/media/distribution?groupBy=type' \
  -H 'Accept: application/json'

Response Schema:

{
  success: boolean;
  data: {
    imageTypes: {
      [key in 'marketing' | 'front' | 'back' | 'left' | 'right' | 'angle1' | 'angle2']: {
        count: number;              // Parts with this image type
        percentage: number;         // % of total parts
        avgPerPart: number;         // Average images per part
      };
    };
    view360Distribution: {
      by_frame_count: {
        '24_frames': number;        // Standard quality (24 frames)
        '36_frames': number;        // Higher quality (36 frames)
        '48_frames': number;        // Premium quality (48 frames)
        'other': number;            // Non-standard frame counts
      };
      by_quality: {
        high: number;               // 36+ frames
        standard: number;           // <36 frames
      };
      by_grid: {
        '4x6': number;              // 4 rows × 6 columns
        '6x6': number;              // 6 rows × 6 columns
        '8x6': number;              // 8 rows × 6 columns
        'other': number;            // Non-standard grids
      };
    };
    totalParts: number;
  };
  meta: {
    timestamp: string;
    environment: string;
  };
}

Error Codes:

| Code | Description |
|------|-------------|
| 200 | Success |
| 400 | Invalid groupBy parameter |
| 500 | MongoDB query error |

Response Time: ~200-400ms (processes image arrays)

cURL Examples:

# Group by image type (default)
curl http://localhost:3005/api/health/media/distribution

# Group by 360° quality levels
curl 'http://localhost:3005/api/health/media/distribution?groupBy=quality'

# Group by frame counts
curl 'http://localhost:3005/api/health/media/distribution?groupBy=frames'
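
To make the bucket definitions concrete, here is a minimal sketch that sorts raw frame counts into the by_frame_count and by_quality groups, using only the thresholds stated in the schema comments (36+ frames is high, fewer is standard). The function names are hypothetical and the service's actual bucketing logic is not shown in this document, so treat this as an illustration of the documented rules only.

```typescript
type FrameBucket = '24_frames' | '36_frames' | '48_frames' | 'other';
type SpinQuality = 'high' | 'standard';

// Exact matches get a named bucket; every other frame count is "other".
function frameBucket(frameCount: number): FrameBucket {
  if (frameCount === 24) return '24_frames';
  if (frameCount === 36) return '36_frames';
  if (frameCount === 48) return '48_frames';
  return 'other';
}

// Threshold from the schema comments: 36+ frames is high, below 36 is standard.
function spinQuality(frameCount: number): SpinQuality {
  return frameCount >= 36 ? 'high' : 'standard';
}

// Tally a list of per-part frame counts into both groupings.
function summarizeFrames(frameCounts: number[]) {
  const byFrameCount: Record<FrameBucket, number> = {
    '24_frames': 0,
    '36_frames': 0,
    '48_frames': 0,
    other: 0,
  };
  const byQuality: Record<SpinQuality, number> = { high: 0, standard: 0 };
  for (const n of frameCounts) {
    byFrameCount[frameBucket(n)] += 1;
    byQuality[spinQuality(n)] += 1;
  }
  return { by_frame_count: byFrameCount, by_quality: byQuality };
}
```

Both groupings count the same parts, so their totals should always match.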

GET /api/health/media/gaps

Identify high-quality parts missing media (enrichment opportunities).

Query Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| minQuality | number | 70 | Minimum quality score threshold (0-100) |
| mediaType | string | all | Filter by media type: all, gallery, view360, documents |
| limit | number | 50 | Maximum results to return (1-500) |
| offset | number | 0 | Pagination offset |
| sortBy | string | quality | Sort field: quality, partNumber, sku |
| sortOrder | string | desc | Sort direction: asc, desc |
| environment | string | prod | Data environment: prod, dev, stage |

Example Request:

curl -X GET \
  'http://localhost:3005/api/health/media/gaps?minQuality=80&mediaType=view360&limit=10' \
  -H 'Accept: application/json'

Response Schema:

{
  success: boolean;
  data: {
    gaps: Array<{
      partNumber: string;           // Canonical part number
      sku: string;                  // SKU identifier (e.g., "117-2295-001")
      title: string;                // Product title
      qualityScore: number;         // Overall quality score (0-100)
      missingMedia: {
        gallery: boolean;           // true if missing gallery images
        view360: boolean;           // true if missing 360° view
        documents: boolean;         // true if missing PDFs
      };
      existingMedia: {
        galleryCount: number;       // Number of existing gallery images
        imageTypes: string[];       // List of available image types
        has360: boolean;            // Has 360° view
        frameCount?: number;        // 360° frame count (if available)
      };
      enrichmentPriority: 'high' | 'medium' | 'low'; // Calculated priority
      estimatedImpact: number;      // Potential quality score increase
    }>;
    totalGaps: number;              // Total parts matching criteria
    showing: number;                // Number of results returned
    pagination: {
      limit: number;
      offset: number;
      hasMore: boolean;
    };
    filters: {
      minQuality: number;
      mediaType: string;
    };
  };
  meta: {
    timestamp: string;
    environment: string;
  };
}

Error Codes:

| Code | Description |
|------|-------------|
| 200 | Success |
| 400 | Invalid query parameters (e.g., minQuality out of range) |
| 500 | MongoDB query error |

Response Time: ~300-600ms (quality score filtering)

cURL Examples:

# Top 20 high-quality parts missing any media
curl 'http://localhost:3005/api/health/media/gaps?minQuality=80&limit=20'

# Parts missing 360° views specifically
curl 'http://localhost:3005/api/health/media/gaps?mediaType=view360&minQuality=75'

# Paginated results (page 2)
curl 'http://localhost:3005/api/health/media/gaps?limit=50&offset=50'

# Sort by SKU alphabetically
curl 'http://localhost:3005/api/health/media/gaps?sortBy=sku&sortOrder=asc'

# Development environment gaps
curl 'http://localhost:3005/api/health/media/gaps?environment=dev&minQuality=70'
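
Client code can reject out-of-range values before the request is sent rather than waiting for a 400. The sketch below builds a gaps URL from the documented parameters and ranges. The helper names buildGapsUrl and pageToOffset are hypothetical, and the zero-based page convention is an assumption chosen to match the "page 2" example (limit=50, offset=50).

```typescript
interface GapsQuery {
  minQuality?: number; // 0-100 (server default 70)
  mediaType?: 'all' | 'gallery' | 'view360' | 'documents';
  limit?: number; // 1-500 (server default 50)
  offset?: number; // >= 0 (server default 0)
  sortBy?: 'quality' | 'partNumber' | 'sku';
  sortOrder?: 'asc' | 'desc';
  environment?: 'prod' | 'dev' | 'stage';
}

// Validate the documented ranges, then serialize only the parameters that were set.
function buildGapsUrl(baseUrl: string, query: GapsQuery = {}): string {
  const { minQuality, limit, offset } = query;
  if (minQuality !== undefined && (minQuality < 0 || minQuality > 100)) {
    throw new RangeError('minQuality must be between 0 and 100');
  }
  if (limit !== undefined && (!Number.isInteger(limit) || limit < 1 || limit > 500)) {
    throw new RangeError('limit must be an integer between 1 and 500');
  }
  if (offset !== undefined && (!Number.isInteger(offset) || offset < 0)) {
    throw new RangeError('offset must be a non-negative integer');
  }
  const pairs = Object.entries(query)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  const path = `${baseUrl}/gaps`;
  return pairs.length > 0 ? `${path}?${pairs.join('&')}` : path;
}

// Zero-based pages: page 0 is the first page, page 1 starts at offset = limit.
function pageToOffset(page: number, limit: number): number {
  return page * limit;
}
```

Omitted parameters are left out of the query string so the server defaults apply.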

Data Model

Media Coverage Statistics

The API tracks three media dimensions:

1. Gallery Images

  • Static product photos in various angles
  • Types: MARKETING, FRONT, BACK, LEFT, RIGHT, ANGLE1, ANGLE2
  • Stored in GCS bucket: gs://crop_parts/newholland/images/
  • Format: {partNumber}/{partNumber}_{TYPE}.jpg (e.g., 87840296/87840296_FRONT.jpg)
  • Clear background (CB) variants: {partNumber}/{partNumber}_CB_{TYPE}.jpg
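
The naming convention above can be expressed as a small path builder. This is a sketch that follows the documented format strings only. The function names are hypothetical, and it assumes the bucket prefix and the relative path are joined directly.

```typescript
type ImageType = 'MARKETING' | 'FRONT' | 'BACK' | 'LEFT' | 'RIGHT' | 'ANGLE1' | 'ANGLE2';

// Bucket prefix as documented above.
const IMAGE_BUCKET_PREFIX = 'gs://crop_parts/newholland/images/';

// Builds {partNumber}/{partNumber}_{TYPE}.jpg, or the _CB_ variant when requested.
function galleryImagePath(partNumber: string, type: ImageType, clearBackground = false): string {
  const variant = clearBackground ? 'CB_' : '';
  return `${partNumber}/${partNumber}_${variant}${type}.jpg`;
}

// Full gs:// URI (assumes prefix and relative path are simply concatenated).
function galleryImageUri(partNumber: string, type: ImageType, clearBackground = false): string {
  return IMAGE_BUCKET_PREFIX + galleryImagePath(partNumber, type, clearBackground);
}
```

For example, galleryImagePath('87840296', 'FRONT') reproduces the sample path 87840296/87840296_FRONT.jpg.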

2. 360° Views

  • Interactive spin views with multiple frames
  • Frame counts: 24 (standard), 36 (high), 48+ (premium)
  • Grid layouts: 4×6, 6×6, 8×6 (rows × columns)
  • Each frame stored separately with grid coordinates
  • Status levels: gcp (hosted), external (third-party), url_only, not_available, none

3. PDF Documents (infrastructure ready, data pending)

  • Manuals: Installation, service, parts manuals
  • Datasheets: Technical specifications, dimensions
  • Certifications: Safety certificates, compliance docs
  • Will be stored in documents.manuals[], documents.datasheets[], documents.certifications[]

Image Types

| Type | Description | Use Case | Coverage |
|------|-------------|----------|----------|
| MARKETING | Lifestyle/promotional images | Hero images, catalog covers | 49.39% |
| FRONT | Front-facing product photo | Primary product view | 59.73% |
| BACK | Rear view | Installation reference | 48.74% |
| LEFT | Left side view | Lateral inspection | 4.17% |
| RIGHT | Right side view | Lateral inspection | 3.80% |
| ANGLE1 | Angle view 1 (R01_C24) | Contextual perspective | 2.38% |
| ANGLE2 | Angle view 2 (R02_C24) | Alternative perspective | 1.79% |

Priority Order for Content Teams:

  1. FRONT (most important - primary product identification)
  2. BACK (second priority - installation reference)
  3. MARKETING (third - visual appeal)
  4. LEFT/RIGHT (nice-to-have - detailed inspection)
  5. ANGLE1/ANGLE2 (optional - additional context)
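
The priority order above can drive a simple "what to shoot next" helper. This sketch takes the image types a part already has (for example the imageTypes array from a gap record) and returns the missing types in priority order. The names are hypothetical, and placing LEFT before RIGHT and ANGLE1 before ANGLE2 is an arbitrary tie-break, since the list gives each pair equal priority.

```typescript
// Priority order from the list above, lowercased to match gap records.
const SHOOT_PRIORITY = ['front', 'back', 'marketing', 'left', 'right', 'angle1', 'angle2'] as const;
type GalleryType = (typeof SHOOT_PRIORITY)[number];

// Returns the missing image types, most important first, capped at `max` entries.
function nextShots(existing: readonly string[], max: number = SHOOT_PRIORITY.length): GalleryType[] {
  const have = new Set(existing.map((t) => t.toLowerCase()));
  return SHOOT_PRIORITY.filter((t) => !have.has(t)).slice(0, max);
}
```

A part with only front and back images would get marketing as its next recommended shot.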

360° Views

Frame Counts and Quality Levels:

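| Frames | Quality | Grid | Description | Coverage |
|--------|---------|------|-------------|----------|
| 24 | Standard | 4×6 | Basic spin view, sufficient for most parts | 58.90% |
| 36 | High | 6×6 | Smooth rotation, better detail | 4.17% |
| 48+ | Premium | 8×6 | Professional quality, maximum detail | 2.38% |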

Status Levels:

  • gcp: Frames hosted in GCS (best - fast loading, consistent CDN)
  • external: Third-party hosting (caution - external dependencies)
  • url_only: Reference URL available (needs migration to GCS)
  • not_available: Data exists but not accessible (requires investigation)
  • none: No 360° view available (enrichment opportunity)

Frame Structure:

{
  "view360": {
    "status": "gcp",
    "frameCount": 24,
    "rows": 4,
    "columns": 6,
    "frames": [
      {
        "url": "https://storage.googleapis.com/.../frame_001.jpg",
        "row": 0,
        "col": 0
      },
      // ... 23 more frames
    ]
  }
}
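
The frame structure implies a few invariants: rows × columns should equal frameCount, the frames array should have that many entries, and every frame should sit inside the grid. The sketch below checks them. The names are hypothetical, and frameCoordinates assumes zero-based, row-major ordering, which the sample frame (row 0, col 0) is consistent with but does not prove.

```typescript
interface SpinFrame {
  url: string;
  row: number;
  col: number;
}

interface View360 {
  status: string;
  frameCount: number;
  rows: number;
  columns: number;
  frames: SpinFrame[];
}

// Assumes row-major order: index 0 is (0, 0), index 1 is (0, 1), and so on.
function frameCoordinates(index: number, columns: number): { row: number; col: number } {
  return { row: Math.floor(index / columns), col: index % columns };
}

// Returns a list of problems; an empty list means the view passes all checks.
function validateView360(view: View360): string[] {
  const problems: string[] = [];
  if (view.rows * view.columns !== view.frameCount) {
    problems.push('rows x columns does not equal frameCount');
  }
  if (view.frames.length !== view.frameCount) {
    problems.push('frames array length does not equal frameCount');
  }
  for (const frame of view.frames) {
    const rowOk = frame.row >= 0 && frame.row < view.rows;
    const colOk = frame.col >= 0 && frame.col < view.columns;
    if (!rowOk || !colOk) {
      problems.push(`frame outside grid: row ${frame.row}, col ${frame.col}`);
    }
  }
  return problems;
}
```

The same length check is what the frame-count mismatch diagnosis in Troubleshooting runs on the database side.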

PDF Documents

Current State: Zero documents across all 3,740 parts. Infrastructure is ready, awaiting data pipeline.

Future Structure:

{
  "documents": {
    "manuals": [
      {
        "type": "installation",
        "title": "Installation Guide - Hydraulic Filter",
        "url": "https://storage.googleapis.com/.../87840296_install.pdf",
        "language": "en",
        "pageCount": 12,
        "fileSize": 2457600
      }
    ],
    "datasheets": [
      {
        "type": "specifications",
        "title": "Technical Specifications",
        "url": "https://storage.googleapis.com/.../87840296_specs.pdf"
      }
    ],
    "certifications": [
      {
        "type": "safety",
        "title": "CE Certification",
        "url": "https://storage.googleapis.com/.../87840296_ce.pdf",
        "issuer": "TUV",
        "validUntil": "2026-12-31"
      }
    ]
  }
}

Migration Path: See MEDIA_PDF_MIGRATION.md for data integration plan.


Use Cases

For Product Managers

Question: "Which product categories need more images?"

# Get distribution data
curl http://localhost:3005/api/health/media/distribution | jq

# Analyze gaps by quality
# (assumes each gap record includes a category field; the gaps response schema does not list one)
curl 'http://localhost:3005/api/health/media/gaps?minQuality=80' | \
  jq '.data.gaps | group_by(.category) | map({category: .[0].category, count: length})'

Insight: FRONT images cover 59.73% of parts, but LEFT/RIGHT only 4%. Focus photography budget on lateral views.

Question: "What's our 360° view penetration?"

# Coverage summary
curl http://localhost:3005/api/health/media/coverage | \
  jq '.data.images.view360'

# Result: 67.03% have 360° views (2,507 parts)

Decision: 67% coverage is strong. Prioritize enriching the remaining 33% (1,233 parts) vs re-shooting existing 360s.


For Content Teams

Question: "What should we photograph next?"

# Get top 50 high-quality parts without media
# (the surrounding [...] collects the results into one valid JSON array)
curl 'http://localhost:3005/api/health/media/gaps?minQuality=85&limit=50' | \
  jq '[.data.gaps[] | {sku, title, priority: .enrichmentPriority}]' > shoot_list.json

Output: Prioritized list with SKU, title, and calculated priority (high/medium/low).

Question: "Which parts have incomplete image sets?"

# Parts with only 1-2 images (incomplete)
curl 'http://localhost:3005/api/health/media/gaps' | \
  jq '.data.gaps[] | select(.existingMedia.galleryCount < 3 and .existingMedia.galleryCount > 0)'

Action: Schedule re-shoots for parts with incomplete image sets to reach 3+ angles.


For Analytics Teams

Question: "Does media richness correlate with quality scores?"

# Get correlation data
curl http://localhost:3005/api/health/media/coverage | \
  jq '.data.qualityCorrelation'

# Result:
# {
#   "withMediaAvgQuality": 72.5,
#   "withoutMediaAvgQuality": 45.2,
#   "delta": 27.3
# }

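Insight: Parts with media score 27.3 points higher on average (72.5 vs 45.2). This is a gap between group averages rather than proof that media raises quality, but it gives a measurable baseline for tracking the return on media investment.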

Question: "What's the distribution of media types?"

# Detailed breakdown
curl http://localhost:3005/api/health/media/distribution | \
  jq '.data.imageTypes | to_entries | map({type: .key, coverage: .value.percentage}) | sort_by(.coverage) | reverse'

Visualization: Feed into BI dashboards (Looker, Tableau, etc.) for stakeholder reports.


Frontend Integration

React Example

Hook-based data fetching:

import { useQuery } from '@tanstack/react-query';

interface MediaCoverage {
  summary: {
    totalParts: number;
    coveragePercentage: number;
  };
  images: {
    gallery: { count: number; percentage: number };
    view360: { count: number; percentage: number; avgFrameCount: number };
  };
}

function useMediaCoverage(environment: 'prod' | 'dev' | 'stage' = 'prod') {
  return useQuery<MediaCoverage>({
    queryKey: ['media-coverage', environment],
    queryFn: async () => {
      const response = await fetch(
        `/api/health/media/coverage?environment=${environment}`
      );
      if (!response.ok) {
        throw new Error('Failed to fetch media coverage');
      }
      const json = await response.json();
      return json.data;
    },
    staleTime: 5 * 60 * 1000, // 5 minutes
    cacheTime: 10 * 60 * 1000, // 10 minutes (this option is named gcTime in TanStack Query v5)
  });
}

// Usage in component
function MediaDashboard() {
  const { data, isLoading, error } = useMediaCoverage('prod');

  if (isLoading) return <Spinner />;
  if (error) return <ErrorMessage error={error} />;
  if (!data) return null; // guards the data dereferences below under strict null checks

  return (
    <div className="grid grid-cols-3 gap-4">
      <MetricCard
        title="Total Parts"
        value={data.summary.totalParts.toLocaleString()}
      />
      <MetricCard
        title="Media Coverage"
        value={`${data.summary.coveragePercentage.toFixed(1)}%`}
        trend={data.summary.coveragePercentage > 65 ? 'up' : 'down'}
      />
      <MetricCard
        title="360° Views"
        value={data.images.view360.count.toLocaleString()}
        subtitle={`Avg ${data.images.view360.avgFrameCount} frames`}
      />
    </div>
  );
}

Response Handling

Display coverage stats in UI:

// Progress bar component
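function CoverageProgress({ percentage }: { percentage: number }) {
  // Use complete class names: Tailwind cannot detect classes built by string interpolation
  const barColor =
    percentage >= 80 ? 'bg-green-500' : percentage >= 60 ? 'bg-yellow-500' : 'bg-red-500';
  // Clamp so out-of-range input cannot overflow the track
  const width = Math.min(100, Math.max(0, percentage));

  return (
    <div className="w-full bg-gray-200 rounded-full h-4">
      <div
        className={`h-4 rounded-full ${barColor}`}
        style={{ width: `${width}%` }}
      >
        <span className="pl-2 text-white text-xs font-bold">
          {percentage.toFixed(1)}%
        </span>
      </div>
    </div>
  );
}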

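// Image type breakdown chart
// Assumes the Recharts library, whose component names match the ones used below
import { Bar, BarChart, Tooltip, XAxis, YAxis } from 'recharts';

// Shape of the /distribution payload fields this chart reads
interface ImageTypeDistribution {
  imageTypes: Record<string, { count: number; percentage: number; avgPerPart: number }>;
}

function ImageTypeChart({ distribution }: { distribution: ImageTypeDistribution }) {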
  const chartData = Object.entries(distribution.imageTypes).map(([type, stats]) => ({
    name: type.toUpperCase(),
    value: stats.percentage,
    count: stats.count,
  }));

  return (
    <BarChart data={chartData} width={600} height={400}>
      <XAxis dataKey="name" />
      <YAxis label={{ value: 'Coverage %', angle: -90 }} />
      <Tooltip
        content={({ payload }) => (
          <div className="bg-white p-2 border rounded shadow">
            <p className="font-bold">{payload?.[0]?.payload?.name}</p>
            <p>{payload?.[0]?.payload?.count} parts ({payload?.[0]?.value?.toFixed(1)}%)</p>
          </div>
        )}
      />
      <Bar dataKey="value" fill="#3b82f6" />
    </BarChart>
  );
}

Gap identification table:

import { useState } from 'react';
import { useQuery } from '@tanstack/react-query';

function MediaGapsTable() {
  const [minQuality, setMinQuality] = useState(80);
  const [page, setPage] = useState(0);
  const pageSize = 20;

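  const { data } = useQuery({
    queryKey: ['media-gaps', minQuality, page],
    queryFn: async () => {
      const response = await fetch(
        `/api/health/media/gaps?minQuality=${minQuality}&limit=${pageSize}&offset=${page * pageSize}`
      );
      // fetch() resolves on HTTP errors, so surface 4xx/5xx responses explicitly
      if (!response.ok) {
        throw new Error(`Failed to fetch media gaps (HTTP ${response.status})`);
      }
      return response.json();
    },
  });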

  return (
    <div>
      <div className="mb-4 flex items-center gap-4">
        <label>
          Min Quality:
          <input
            type="range"
            min="0"
            max="100"
            value={minQuality}
            onChange={(e) => {
              setMinQuality(Number(e.target.value));
              setPage(0); // a new filter invalidates the current page offset
            }}
            className="ml-2"
          />
          <span className="ml-2 font-bold">{minQuality}</span>
        </label>
      </div>

      <table className="w-full border-collapse">
        <thead>
          <tr className="bg-gray-100">
            <th className="p-2 text-left">SKU</th>
            <th className="p-2 text-left">Title</th>
            <th className="p-2 text-center">Quality</th>
            <th className="p-2 text-center">Missing</th>
            <th className="p-2 text-center">Priority</th>
          </tr>
        </thead>
        <tbody>
          {data?.data.gaps.map((gap) => (
            <tr key={gap.sku} className="border-b hover:bg-gray-50">
              <td className="p-2 font-mono">{gap.sku}</td>
              <td className="p-2">{gap.title}</td>
              <td className="p-2 text-center">
                <QualityBadge score={gap.qualityScore} />
              </td>
              <td className="p-2 text-center">
                <MediaGapBadges missing={gap.missingMedia} />
              </td>
              <td className="p-2 text-center">
                <PriorityBadge level={gap.enrichmentPriority} />
              </td>
            </tr>
          ))}
        </tbody>
      </table>

      <div className="mt-4 flex justify-between">
        <button
          onClick={() => setPage(p => Math.max(0, p - 1))}
          disabled={page === 0}
          className="px-4 py-2 bg-blue-500 text-white rounded disabled:opacity-50"
        >
          Previous
        </button>
        <span>Page {page + 1}</span>
        <button
          onClick={() => setPage(p => p + 1)}
          disabled={!data?.data.pagination.hasMore}
          className="px-4 py-2 bg-blue-500 text-white rounded disabled:opacity-50"
        >
          Next
        </button>
      </div>
    </div>
  );
}

Performance

Caching Strategy

TTLs (Time-To-Live):

| Endpoint | Cache TTL | Rationale |
|----------|-----------|-----------|
| /coverage | 10 minutes | Data changes infrequently (batch updates) |
| /distribution | 15 minutes | Image type distribution very stable |
| /gaps | 5 minutes | Quality scores update more frequently |

Cache Keys:

media:coverage:{environment}:{timestamp_rounded_10min}
media:distribution:{environment}:{groupBy}:{timestamp_rounded_15min}
media:gaps:{environment}:{minQuality}:{mediaType}:{sortBy}:{sortOrder}:{limit}:{offset}:{timestamp_rounded_5min}
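
The timestamp_rounded segments are not defined further in this document. One plausible reading, sketched below, floors the current Unix time (in milliseconds) to the start of the TTL window, so every request inside the same window shares a key. The function names and the millisecond unit are assumptions for illustration.

```typescript
// Floor a Unix-millisecond timestamp to the start of its TTL window.
function roundedWindowStart(nowMs: number, ttlMinutes: number): number {
  const windowMs = ttlMinutes * 60 * 1000;
  return Math.floor(nowMs / windowMs) * windowMs;
}

// Coverage uses a 10-minute TTL, per the table above.
function coverageCacheKey(environment: string, nowMs: number): string {
  return `media:coverage:${environment}:${roundedWindowStart(nowMs, 10)}`;
}
```

With this scheme a key rolls over on the window boundary even without a Redis TTL, though setting a TTL as well keeps old windows from accumulating.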

Implementation (Redis):

import { createClient } from 'redis';

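const redis = createClient({ url: process.env.REDIS_URL });
// node-redis v4+ clients must connect before issuing commands
// (top-level await needs an ES module; otherwise connect during service startup)
await redis.connect();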

async function getCachedCoverage(environment: string) {
  const cacheKey = `media:coverage:${environment}`;

  // Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Cache miss - fetch from MongoDB
  const data = await fetchCoverageFromMongo(environment);

  // Store in cache (10 min TTL)
  await redis.setEx(cacheKey, 600, JSON.stringify(data));

  return data;
}

Invalidation:

  • Automatic: TTL expiration
  • Manual: After bulk data sync (bun scripts/sync-mongodb-to-es.ts)
  • Webhook: When GCS image manifest updates

Stale-While-Revalidate:

// Serve stale data while fetching fresh data in background
async function getCoverageWithSWR(environment: string) {
  const cacheKey = `media:coverage:${environment}`;
  const lockKey = `${cacheKey}:lock`;

  const cached = await redis.get(cacheKey);
  const ttl = await redis.ttl(cacheKey);

  // If cached and fresh, return immediately
  if (cached && ttl > 60) {
    return JSON.parse(cached);
  }

  // If cached but stale, return stale + refresh in background
  if (cached && ttl > 0) {
    // Try to acquire lock for background refresh
    const lockAcquired = await redis.set(lockKey, '1', {
      NX: true,
      EX: 30
    });

    if (lockAcquired) {
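      // Refresh in background (no await). Catch failures so they do not become
      // unhandled rejections (assumes the helper returns a Promise).
      refreshCoverageInBackground(environment, cacheKey, lockKey).catch((err) =>
        console.error('Background coverage refresh failed', err)
      );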
    }

    return JSON.parse(cached);
  }

  // Cache miss - fetch and wait
  return getCachedCoverage(environment);
}
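
The branching in getCoverageWithSWR reduces to a three-way decision on the cached value and its remaining TTL. The sketch below isolates that decision as a pure function so it can be unit tested without Redis. The name classifyCacheState is hypothetical. It relies on the Redis TTL command returning -2 for a missing key and -1 for a key with no expiry, and, like the code above, it treats both as a miss.

```typescript
type CacheState = 'fresh' | 'stale' | 'miss';

// Mirrors the checks above: ttl > threshold is fresh, 0 < ttl <= threshold is stale,
// and anything else (no value, expired, missing, or no expiry) is a miss.
function classifyCacheState(
  hasValue: boolean,
  ttlSeconds: number,
  refreshThresholdSeconds: number = 60
): CacheState {
  if (!hasValue || ttlSeconds <= 0) return 'miss';
  return ttlSeconds > refreshThresholdSeconds ? 'fresh' : 'stale';
}
```

Because Redis deletes a key once its TTL reaches zero, the stale branch only fires during the final 60 seconds of the TTL, which makes this pattern closer to refresh-ahead than to serving entries that have already expired.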

Query Optimization

MongoDB Indexes:

// Required indexes for optimal performance
db.nh_unified.createIndex({ 'media.images': 1 });
db.nh_unified.createIndex({ 'media.view360.status': 1 });
db.nh_unified.createIndex({ 'media.view360.frameCount': 1 });
db.nh_unified.createIndex({ 'qualityScore.total': -1 });
db.nh_unified.createIndex({
  'qualityScore.total': -1,
  'media.imagesCount': 1
}); // Compound index for gaps query

Aggregation Pipeline Optimization:

// Efficient coverage aggregation (single pass)
db.nh_unified.aggregate([
  {
    $facet: {
      summary: [
        {
          $group: {
            _id: null,
            totalParts: { $sum: 1 },
            withGallery: {
              $sum: {
                $cond: [{ $gt: ['$media.imagesCount', 0] }, 1, 0]
              }
            },
            with360: {
              $sum: {
                $cond: [
                  { $in: ['$media.view360.status', ['gcp', 'external']] },
                  1,
                  0
                ]
              }
            }
          }
        }
      ],
      imageTypes: [
        { $unwind: '$media.images' },
        {
          $group: {
            _id: '$media.images.type',
            count: { $sum: 1 }
          }
        }
      ]
    }
  }
]);

Projection Optimization (reduce data transfer):

// Only fetch fields needed for gaps endpoint
db.nh_unified.find(
  {
    'qualityScore.total': { $gte: 80 },
    'media.imagesCount': { $lt: 3 }
  },
  {
    // In the shell, the second argument to find() is the projection document itself
    _id: 0,
    partNumber: 1,
    sku: 1,
    title: 1,
    'qualityScore.total': 1,
    'media.imagesCount': 1,
    'media.images.type': 1,
    'media.view360.status': 1
  }
).limit(50);

Query Performance Benchmarks:

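| Endpoint | Documents Scanned | Execution Time | With Index | Notes |
|----------|-------------------|----------------|------------|-------|
| /coverage | 3,740 | ~250ms | ~80ms | Faceted aggregation |
| /distribution | 3,740 + unwind | ~400ms | ~150ms | Array unwinding overhead |
| /gaps | ~1,000 (filtered) | ~180ms | ~60ms | Quality score index critical |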

Environment Support

The API supports multi-environment data access for development, staging, and production workflows.

Environment Configuration

MongoDB Connection Strings:

# .env configuration
MONGODB_URI_PROD=mongodb+srv://user:pass@cluster.mongodb.net/crop?retryWrites=true
MONGODB_URI_DEV=mongodb+srv://user:pass@cluster-dev.mongodb.net/crop_dev?retryWrites=true
MONGODB_URI_STAGE=mongodb+srv://user:pass@cluster-stage.mongodb.net/crop_stage?retryWrites=true

# Default environment (if not specified in query)
DEFAULT_ENVIRONMENT=prod

Usage Examples

Query production data:

curl http://localhost:3005/api/health/media/coverage
# or explicitly
curl 'http://localhost:3005/api/health/media/coverage?environment=prod'

Query development data:

curl 'http://localhost:3005/api/health/media/coverage?environment=dev'

Compare environments:

# Fetch both in parallel
curl 'http://localhost:3005/api/health/media/coverage?environment=prod' > prod.json &
curl 'http://localhost:3005/api/health/media/coverage?environment=dev' > dev.json &
wait

# Compare coverage percentages
diff <(jq '.data.summary.coveragePercentage' prod.json) \
     <(jq '.data.summary.coveragePercentage' dev.json)

Environment Validation

The API validates environment parameters and returns clear errors:

# Invalid environment
curl 'http://localhost:3005/api/health/media/coverage?environment=invalid'

# Response: 400 Bad Request
{
  "success": false,
  "error": {
    "message": "Invalid environment parameter",
    "details": "Environment must be one of: prod, dev, stage",
    "code": "INVALID_ENVIRONMENT"
  }
}
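
A minimal sketch of the validation described above, assuming the handler receives the raw query value as a string or undefined. The function name parseEnvironment is hypothetical. The error payload copies the documented 400 response, and the fallback mirrors DEFAULT_ENVIRONMENT=prod.

```typescript
const ENVIRONMENTS = ['prod', 'dev', 'stage'] as const;
type Environment = (typeof ENVIRONMENTS)[number];

interface EnvironmentError {
  success: false;
  error: { message: string; details: string; code: string };
}

// Returns a valid environment, or the documented error body for a 400 response.
function parseEnvironment(
  raw: string | undefined,
  fallback: Environment = 'prod'
): Environment | EnvironmentError {
  if (raw === undefined || raw === '') return fallback;
  if ((ENVIRONMENTS as readonly string[]).includes(raw)) return raw as Environment;
  return {
    success: false,
    error: {
      message: 'Invalid environment parameter',
      details: `Environment must be one of: ${ENVIRONMENTS.join(', ')}`,
      code: 'INVALID_ENVIRONMENT',
    },
  };
}
```

Callers can distinguish the two outcomes with typeof result === 'string'.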

Migration Guide

Adding PDF Data When Available

See MEDIA_PDF_MIGRATION.md for complete migration guide.

Quick Overview:

Phase 1 (Current): API returns PDF fields with count = 0, schema ready.

Phase 2 (Future): When PDF data arrives:

  1. Scan GCS bucket for PDFs
  2. Enrich nh_unified collection with document metadata
  3. API automatically reflects non-zero counts (no code changes)

No Breaking Changes: Frontend code continues to work - counts simply populate from zero to actual values.


Troubleshooting

Common Issues

Issue: "Coverage endpoint returns 0 parts"

Cause: Wrong database/collection configuration.

Solution:

# Check connection
curl http://localhost:3005/health | jq

# Verify collection name
echo $MONGODB_COLLECTION # should be "nh_unified"

# Test MongoDB query directly (mongosh replaces the legacy mongo shell)
mongosh "$MONGODB_URI" --eval 'db.nh_unified.countDocuments({})'

Issue: "Gaps endpoint returns empty array"

Cause: Quality threshold too high or no matching parts.

Solution:

# Lower quality threshold
curl 'http://localhost:3005/api/health/media/gaps?minQuality=60'

# Check quality score distribution
mongosh "$MONGODB_URI" --eval '
  db.nh_unified.aggregate([
    {
      $bucket: {
        groupBy: "$qualityScore.total",
        // Upper bounds are exclusive, so 101 keeps a score of 100 in the top bucket
        boundaries: [0, 20, 40, 60, 80, 101],
        default: "other",
        output: { count: { $sum: 1 } }
      }
    }
  ])
'

Issue: "Image type counts don't match gallery count"

Cause: Parts can have multiple images of the same type (e.g., 2 FRONT angles).

Explanation:

  • images.gallery.count: Number of parts with gallery images (2,501)
  • imageTypes.front.count: Number of parts with FRONT-type images (2,234)
  • A part can have 0, 1, or multiple FRONT images

Example:

// Part with 2 FRONT images
{
  "partNumber": "87840296",
  "media": {
    "imagesCount": 3,
    "images": [
      { "type": "front", "url": "...FRONT.jpg" },
      { "type": "front", "url": "...CB_FRONT.jpg" }, // Clear background variant
      { "type": "back", "url": "...BACK.jpg" }
    ]
  }
}
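
The difference between "parts with a type" and "images of a type" is easy to see in code. This sketch counts both over a list of part documents shaped like the example above. The interface and function names are hypothetical.

```typescript
interface PartMedia {
  partNumber: string;
  images: { type: string }[];
}

// partsWithType counts each part at most once per type;
// imagesOfType counts every image, including duplicates of the same type.
function countByType(parts: PartMedia[]) {
  const partsWithType: Record<string, number> = {};
  const imagesOfType: Record<string, number> = {};
  for (const part of parts) {
    const seen = new Set<string>();
    for (const image of part.images) {
      imagesOfType[image.type] = (imagesOfType[image.type] ?? 0) + 1;
      seen.add(image.type);
    }
    seen.forEach((type) => {
      partsWithType[type] = (partsWithType[type] ?? 0) + 1;
    });
  }
  return { partsWithType, imagesOfType };
}
```

For the example part above, FRONT contributes 1 to partsWithType but 2 to imagesOfType, which is why per-type totals need not add up to the gallery count.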

Issue: "360° frame counts seem wrong"

Cause: Frame array length vs frameCount field mismatch.

Diagnosis:

# Find mismatches
mongosh "$MONGODB_URI" --eval '
  db.nh_unified.countDocuments({
    $expr: {
      $ne: [
        "$media.view360.frameCount",
        { $size: { $ifNull: ["$media.view360.frames", []] } }
      ]
    }
  })
'

Solution: Re-scan 360° manifests with bun scripts/scan-gcs-360s.ts (future script).


Issue: "API response time >1s (too slow)"

Cause: Missing MongoDB indexes.

Solution:

# Check existing indexes
mongosh "$MONGODB_URI" --eval 'db.nh_unified.getIndexes()'

# Create recommended indexes (see Query Optimization section)
mongosh "$MONGODB_URI" --file scripts/create-media-indexes.js

# Verify index usage
mongosh "$MONGODB_URI" --eval '
  db.nh_unified.find({
    "qualityScore.total": { $gte: 80 }
  }).explain("executionStats")
'
# Look for: "stage": "IXSCAN" (good) vs "COLLSCAN" (bad)

Error Codes Reference

| Code | Message | Cause | Solution |
|------|---------|-------|----------|
| 400 | Invalid environment parameter | Query param not in [prod, dev, stage] | Use valid environment |
| 400 | Invalid minQuality parameter | Quality score out of range [0-100] | Pass number 0-100 |
| 400 | Invalid limit parameter | Limit out of range [1-500] | Pass number 1-500 |
| 404 | Collection not found | MongoDB collection doesn't exist | Check DB/collection config |
| 500 | MongoDB query error | DB connection or query failed | Check logs, verify connectivity |
| 503 | Service unavailable | MongoDB connection down | Restart service, check DB status |

Debug Mode

Enable detailed logging for troubleshooting:

# Development
export LOG_LEVEL=debug
bun run dev

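# Production (Cloud Run)
# --update-env-vars changes only LOG_LEVEL; --set-env-vars would replace every existing variable
gcloud run services update health-analytics \
  --update-env-vars=LOG_LEVEL=debug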

Log Output:

[DEBUG] Media coverage query: { environment: 'prod', collection: 'nh_unified' }
[DEBUG] MongoDB query: { $facet: { ... } }
[DEBUG] Query execution time: 87ms
[DEBUG] Results: { totalParts: 3740, withGallery: 2501, ... }

API Versioning

Current version: v1 (stable)

Endpoint format: /api/health/media/*

Version headers:

X-API-Version: 1.0.0
X-Schema-Version: 2025.11

Backwards compatibility: The API follows semantic versioning. Non-breaking changes (new fields, optional parameters) ship without a major version bump. Breaking changes (field removal, type changes) require a major version increment.

Deprecation policy: 6-month notice before removing deprecated fields.


Rate Limiting

Development: No limits (localhost)

Production:

  • 100 requests/minute per IP (coverage/distribution endpoints)
  • 60 requests/minute per IP (gaps endpoint - more expensive)
  • 429 Too Many Requests response when exceeded

Headers:

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1700000000
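
Clients can use these headers to back off before hitting a 429. The sketch below parses them from a plain header map and computes how long to wait. The function names are hypothetical, and it assumes X-RateLimit-Reset is a Unix timestamp in seconds, which the sample value suggests but this document does not state.

```typescript
interface RateLimitInfo {
  limit: number;
  remaining: number;
  resetAtMs: number;
}

// Header names are matched case-insensitively; returns null if any header is missing or non-numeric.
function parseRateLimit(headers: Record<string, string>): RateLimitInfo | null {
  const lower: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = value;
  }
  const limit = Number(lower['x-ratelimit-limit']);
  const remaining = Number(lower['x-ratelimit-remaining']);
  const resetSeconds = Number(lower['x-ratelimit-reset']); // assumed: Unix seconds
  if ([limit, remaining, resetSeconds].some((n) => !Number.isFinite(n))) return null;
  return { limit, remaining, resetAtMs: resetSeconds * 1000 };
}

// Zero while requests remain; otherwise the time left until the window resets.
function msUntilRetry(info: RateLimitInfo, nowMs: number): number {
  return info.remaining > 0 ? 0 : Math.max(0, info.resetAtMs - nowMs);
}
```

With the fetch API, the header map can be built via Object.fromEntries(response.headers.entries()).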

Bypass: Contact backend team for API key with higher limits.


OpenAPI Specification

See openapi/media-coverage.yaml for machine-readable spec.

Generate TypeScript client:

npx @openapitools/openapi-generator-cli generate \
  -i docs/openapi/media-coverage.yaml \
  -g typescript-fetch \
  -o src/generated/media-api

Postman Collection

Import collection: postman/media-coverage.json

Pre-configured:

  • Local environment (http://localhost:3005)
  • Production environment (Cloud Run URL)
  • Example requests with test assertions
  • Environment variables for easy switching

Contributing

Adding new endpoints:

  1. Define route in src/routes/media.ts
  2. Add repository method in src/repositories/media-repository.ts
  3. Update types in src/types/media.ts
  4. Write tests in src/__tests__/media.test.ts
  5. Update OpenAPI spec docs/openapi/media-coverage.yaml
  6. Update this documentation

Testing checklist:

  • Unit tests pass (bun test)
  • Integration tests pass (bun test:integration)
  • Manual testing against dev database
  • OpenAPI spec validates (bun run validate:openapi)
  • Documentation updated
  • Postman collection updated


License

Proprietary - CROP Platform


Support

For issues or questions:


Last Updated: 2025-11-17 | API Version: 1.0.0 | Document Version: 1.0
