Equipment Fitment
Three data pipelines extracting parts compatibility, service manuals, and model images from vendor data into MongoDB.
Repository: CT-CROP/equipment-fitment
Last updated: 2026-02-26
Last synced to docs: 2026-03-10
Three pipelines from vendor data into MongoDB:
- Equipment Fitment: extracts parts compatibility data into the `equipment_fitment` collection
- Service Manuals: extracts full-text service manual content into `service_manual_documents` + `service_manual_pages`
- Model Image Scraper: scrapes product images from vendor websites, uploads to GCS
Supported Vendors (6)
| Code | Provider | Key Trait |
|---|---|---|
| VNT | Ventrac | Clean tables, OCR dehyphenation fixups |
| FER | Ferris | Groups with serial number ranges |
| KUH | Kuhn | Multilingual (FR/EN/DE/IT) |
| KNZ | Kinze | Planters/grain carts, filename-based model resolution |
| MCH | McHale | Balers, part descriptions as section names |
| HVT | HarvestTec | Preservative applicators |
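The vendor codes above lend themselves to a small typed lookup. The following is a minimal sketch, not code from the repository: the `VENDORS` map is copied from the table, while `isVendorCode` and `providerFor` are hypothetical helper names introduced only for illustration.

```typescript
// Vendor codes and provider names copied from the Supported Vendors table.
const VENDORS = {
  VNT: "Ventrac",
  FER: "Ferris",
  KUH: "Kuhn",
  KNZ: "Kinze",
  MCH: "McHale",
  HVT: "HarvestTec",
} as const;

type VendorCode = keyof typeof VENDORS;

// Hypothetical helper: narrows an arbitrary string to a supported vendor code.
function isVendorCode(value: string): value is VendorCode {
  return Object.prototype.hasOwnProperty.call(VENDORS, value);
}

// Hypothetical helper: case-insensitive lookup; undefined for unsupported codes.
function providerFor(code: string): string | undefined {
  const upper = code.toUpperCase();
  return isVendorCode(upper) ? VENDORS[upper] : undefined;
}
```

A guard like this could reject an unsupported vendor code before any extraction work starts, rather than failing partway through a run.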
Quick Start
bun install
# Equipment Fitment
bun run extract:ventrac # Single vendor
bun run extract:all # All 6 vendors
# Service Manuals
bun run extract:service-manuals:mchale
# Model Images
bun run scrape:images:ventrac
Data Pipeline
PDF (vendor) → CROP-pdf-parser-service (Python OCR) → GCS bucket
  → equipment-fitment (this repo, Bun/TS) → MongoDB
MongoDB Collections
| Collection | Unique Index |
|---|---|
| `equipment_fitment` | `{ vendorCode, modelNumber, partNumber, section, referenceNumber }` |
| `service_manual_documents` | `{ documentId }` |
| `service_manual_pages` | `{ documentId, pageNumber }` |
| `equipment_models` | `{ modelKey }` |
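The unique indexes suggest that each record can be identified by its index fields, which is what makes idempotent upserts possible. The sketch below illustrates that idea under stated assumptions: the field lists come from the table, but `uniqueFilter` and `indexSpec` are hypothetical helpers, and the ascending (`1`) index direction is an assumption, since the table lists only field names.

```typescript
// Field lists copied from the Unique Index column of the table.
const UNIQUE_INDEX_FIELDS = {
  equipment_fitment: ["vendorCode", "modelNumber", "partNumber", "section", "referenceNumber"],
  service_manual_documents: ["documentId"],
  service_manual_pages: ["documentId", "pageNumber"],
  equipment_models: ["modelKey"],
} as const;

type CollectionName = keyof typeof UNIQUE_INDEX_FIELDS;

// Hypothetical helper: builds an index key spec such as { documentId: 1, pageNumber: 1 }.
// Ascending order is an assumption, not something the table states.
function indexSpec(collection: CollectionName): Record<string, 1> {
  const spec: Record<string, 1> = {};
  for (const field of UNIQUE_INDEX_FIELDS[collection]) spec[field] = 1;
  return spec;
}

// Hypothetical helper: extracts the unique-index fields from a record so the
// result can serve as an upsert filter. Throws if any key field is missing.
function uniqueFilter(
  collection: CollectionName,
  record: Record<string, unknown>,
): Record<string, unknown> {
  const filter: Record<string, unknown> = {};
  for (const field of UNIQUE_INDEX_FIELDS[collection]) {
    if (record[field] === undefined) {
      throw new Error(`${collection}: missing unique-index field "${field}"`);
    }
    filter[field] = record[field];
  }
  return filter;
}
```

With the official `mongodb` driver, a spec like this could be passed to `createIndex(spec, { unique: true })`, and the filter to `updateOne(filter, { $set: record }, { upsert: true })`, so that re-running an extraction updates existing documents instead of duplicating them. Whether the repository writes records this way is not stated here.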
Environment
Requires MONGODB_URI. Uses GCS Application Default Credentials.
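Because `MONGODB_URI` is required, a fail-fast check at startup can give a clearer error than a failed connection later. This is a minimal sketch, assuming the variable is read from the process environment; `requireEnv` is a hypothetical helper, not a function from the repository.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the trimmed value of a required variable,
// or throws a descriptive error if it is unset or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

In a Bun script this might be called as `requireEnv(process.env, "MONGODB_URI")` before any pipeline work begins.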
Tech Stack
Bun, TypeScript, @google-cloud/storage, MongoDB, Biome.