DIS Integration Refactoring Plan
Executive Summary
Investigation of ~500 "not found" parts revealed several data quality and code architecture issues. This document offers constructive criticism of the current implementation and a phased refactoring plan.
COMPLETED (2026-01-14): BNS routing fix implemented!
- Before: 977/1314 (74.4%) found
- After: 1071/1314 (81.5%) found
- Improvement: +94 parts (+7.1 percentage points)
Part 1: Constructive Criticism
1.1 Data Quality Issues
Issue A: MCH Collection Contains Foreign Parts (CRITICAL)
Location: MongoDB parts_mch collection
Impact: 27 parts incorrectly categorized, 23 not found
The parts_mch collection contains Marcrest parts that should be in parts_mar:
- Numeric-only part numbers (40563264, 121052, 13377, 34134, etc.)
- These are NOT McHale format (McHale uses CEL00xxx, CBR00xxx prefixes)
Evidence from crop-john SSH:
/home/vova/prts/McHale/ - 66 files (CEL00xxx format)
/home/vova/prts/MARCREST/ - 69 files (numeric format)

Root Cause: The photo upload process may have allowed an incorrect vendor selection, or the initial data migration was flawed.
Issue B: FS Suffix Routes to Wrong Manufacturer (HIGH)
Location: lib/types/dis.ts:225
Impact: ~35 parts (91% found as FER vs 62% as B&S)
Current code:
case "FS_SUFFIX":
  return "B&S"; // WRONG - should return "FER"

Testing showed FS suffix parts have a significantly higher success rate with the FER manufacturer code.
Issue C: Test Data in Production Collections (LOW)
Location: MongoDB parts_rav collection
Impact: Cosmetic, affects analytics accuracy
The RAV vendor has a test part with partNumber "raven"; it should be removed.
1.2 Code Architecture Issues
Issue D: Code Duplication (HIGH)
Locations:
- crop-front-admin/lib/types/dis.ts
- CROP-parts-services/services/catalog/src/services/dis-service.ts

More than 1,000 lines of identical DIS logic are duplicated between frontend and backend:
- BNS routing patterns
- Manufacturer code mapping
- Part number detection logic
Risk: Logic drift when one is updated but not the other.
Issue E: Vendor Support Mismatch (MEDIUM)
Frontend: 10 vendors (KUH, NHL, KIN, MCH, BNS, VNT, HAR, HOT, MAR, RAV)
Backend: 6 vendors (missing HAR, HOT, MAR, RAV)
This causes inconsistent behavior:
- Frontend can display and query these vendors
- Backend cannot process them in certain operations
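The mismatch above can be checked mechanically rather than by inspection. The following is a minimal sketch, assuming the two vendor lists are available as plain arrays; the constant and function names are hypothetical, and the backend list is inferred from the ten frontend vendors minus the four listed as missing.

```typescript
// Vendor codes copied from the lists above. Names of constants and the helper are hypothetical.
const FRONTEND_VENDORS: readonly string[] = [
  "KUH", "NHL", "KIN", "MCH", "BNS", "VNT", "HAR", "HOT", "MAR", "RAV",
];
// Inferred: the frontend list without the four vendors reported missing.
const BACKEND_VENDORS: readonly string[] = ["KUH", "NHL", "KIN", "MCH", "BNS", "VNT"];

// Returns the codes present in `expected` but absent from `actual`.
function missingVendors(expected: readonly string[], actual: readonly string[]): string[] {
  const known = new Set(actual);
  return expected.filter((code) => !known.has(code));
}

const missingInBackend = missingVendors(FRONTEND_VENDORS, BACKEND_VENDORS);
console.log(missingInBackend); // ["HAR", "HOT", "MAR", "RAV"]
```

A check like this could run as a unit test so that adding a vendor on one side only fails the build.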
Issue F: Coverage Metric Inconsistency (LOW)
Frontend: coverage = inStock / totalChecked
Backend: coverage = found / totalChecked
Different definitions lead to confusing analytics.
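To illustrate how far the two definitions can diverge on the same data, here is a small sketch. The `LookupResult` shape and both function names are assumptions made for illustration, not the actual types in either service.

```typescript
// Hypothetical shape for one DIS lookup result; field names are assumptions.
interface LookupResult {
  found: boolean;   // DIS returned a record for the part
  inStock: boolean; // DIS reported available stock for the part
}

// Frontend-style definition: share of checked parts that are in stock.
function coverageByStock(results: LookupResult[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r.found && r.inStock).length / results.length;
}

// Backend-style definition: share of checked parts that were found at all.
function coverageByFound(results: LookupResult[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r.found).length / results.length;
}

const sample: LookupResult[] = [
  { found: true, inStock: true },
  { found: true, inStock: false },
  { found: true, inStock: false },
  { found: false, inStock: false },
];

console.log(coverageByStock(sample)); // 0.25
console.log(coverageByFound(sample)); // 0.75
```

With three of four parts found but only one in stock, the same batch reports 25% coverage under one definition and 75% under the other.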
1.3 Naming Inconsistencies
| Source | Vendor Name |
|---|---|
| crop-john filesystem | Harvest Tech |
| DIS_VENDOR_NAMES | Harvest Tec |
| crop-john filesystem | Raven GPS |
| DIS_VENDOR_NAMES | Raven |
Part 2: Root Cause Analysis
Why Parts Show "Not Found"
| Reason | Affected Parts | Percentage |
|---|---|---|
| Wrong ManufacturerCode routing (FS→B&S instead of FER) | ~35 | 7% |
| Parts in wrong collection (MCH→MAR) | 23 | 5% |
| Genuinely not in DIS system | ~350 | 70% |
| Obsolete/discontinued parts | ~50 | 10% |
| Formatting issues (spaces, case) | ~40 | 8% |
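The formatting row (spaces, case) is the one category that might be recoverable by normalizing input before lookup. A minimal sketch follows, assuming DIS part numbers never contain significant whitespace and are case-insensitive; both assumptions would need to be verified, and `normalizePartNumber` is a hypothetical helper name.

```typescript
// Hypothetical helper: removes all whitespace (leading, trailing, internal) and upper-cases.
// Assumes whitespace and letter case carry no meaning in DIS part numbers.
function normalizePartNumber(raw: string): string {
  return raw.replace(/\s+/g, "").toUpperCase();
}

// The input below is a made-up value in the CEL00xxx style, not a real part number.
console.log(normalizePartNumber("  cel00 123 ")); // "CEL00123"
```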
DIS Routing Success Rates by Pattern
| Pattern | Current Route | Success Rate | Better Route | Better Rate |
|---|---|---|---|---|
| 7-digit | FER | 94% | - | - |
| DOT format | VNT | 100% | - | - |
| SM suffix | B&S | 100% | - | - |
| YP suffix | B&S | 95% | - | - |
| FS suffix | B&S | 62% | FER | 91% |
| 6-digit | B&S | 85% | - | - |
| DASH format | B&S | 30% | - | (internal codes) |
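The table can also be treated as data, so that patterns with a better-performing alternative route are flagged automatically. The sketch below encodes the rows above; the `RouteStats` shape and `findBetterRoutes` are hypothetical names, and rates are written as fractions.

```typescript
// Hypothetical record for one row of the routing table; rates are fractions in [0, 1].
interface RouteStats {
  pattern: string;
  currentRoute: string;
  currentRate: number;
  alternativeRoute?: string;
  alternativeRate?: number;
}

// Values copied from the table above.
const ROUTE_STATS: RouteStats[] = [
  { pattern: "7-digit", currentRoute: "FER", currentRate: 0.94 },
  { pattern: "DOT format", currentRoute: "VNT", currentRate: 1.0 },
  { pattern: "SM suffix", currentRoute: "B&S", currentRate: 1.0 },
  { pattern: "YP suffix", currentRoute: "B&S", currentRate: 0.95 },
  { pattern: "FS suffix", currentRoute: "B&S", currentRate: 0.62, alternativeRoute: "FER", alternativeRate: 0.91 },
  { pattern: "6-digit", currentRoute: "B&S", currentRate: 0.85 },
  { pattern: "DASH format", currentRoute: "B&S", currentRate: 0.3 },
];

// Keeps only rows where a tested alternative route beats the current one.
function findBetterRoutes(stats: RouteStats[]): RouteStats[] {
  return stats.filter(
    (s) =>
      s.alternativeRoute !== undefined &&
      s.alternativeRate !== undefined &&
      s.alternativeRate > s.currentRate,
  );
}

const rerouteCandidates = findBetterRoutes(ROUTE_STATS).map(
  (s) => `${s.pattern}: ${s.currentRoute} -> ${s.alternativeRoute}`,
);
console.log(rerouteCandidates); // ["FS suffix: B&S -> FER"]
```

Only the FS suffix row qualifies. The DASH format row has a low rate but no tested alternative, so it is not flagged.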
Part 3: Multi-Phase Refactoring Plan
Phase 1: Quick Wins (Data Fixes)
Timeline: Immediate
Risk: Low
Impact: +54 parts found (31 from the FS routing fix, 23 from the MCH→MAR move)
1.1 Fix FS Suffix Routing
// In lib/types/dis.ts, line 225
case "FS_SUFFIX":
  return "FER"; // Changed from "B&S"

Expected Impact: +31 parts found
1.2 Move MAR Parts from MCH Collection
Create migration script to identify and move Marcrest parts:
// Identify by pattern: all-numeric, no letter prefix
db.parts_mch.find({ partNumber: /^\d+$/ }).forEach(doc => {
  db.parts_mar.insertOne(doc);
  db.parts_mch.deleteOne({ _id: doc._id });
});

Expected Impact: +23 parts found
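Because the move deletes documents from parts_mch, a dry run over exported documents could confirm the affected count before anything is changed. The sketch below is one way to do that, assuming documents expose at least `_id` and `partNumber`; `PartDoc`, `MigrationPlan`, and `planMchToMarMigration` are hypothetical names, and the CEL/CBR sample values are made up to fit the McHale prefix format.

```typescript
// Minimal document shape; real documents in parts_mch presumably carry more fields.
interface PartDoc {
  _id: string;
  partNumber: string;
}

interface MigrationPlan {
  toMove: PartDoc[]; // would go to parts_mar
  toKeep: PartDoc[]; // would stay in parts_mch
}

// Applies the same test as the migration script: all-numeric part numbers are moved.
function planMchToMarMigration(docs: PartDoc[]): MigrationPlan {
  const plan: MigrationPlan = { toMove: [], toKeep: [] };
  for (const doc of docs) {
    if (/^\d+$/.test(doc.partNumber)) {
      plan.toMove.push(doc);
    } else {
      plan.toKeep.push(doc);
    }
  }
  return plan;
}

// Numeric values are taken from Issue A; the lettered values are invented samples.
const sampleDocs: PartDoc[] = [
  { _id: "a", partNumber: "40563264" },
  { _id: "b", partNumber: "CEL00123" },
  { _id: "c", partNumber: "121052" },
  { _id: "d", partNumber: "CBR00456" },
];

const migrationPlan = planMchToMarMigration(sampleDocs);
console.log(migrationPlan.toMove.map((d) => d.partNumber)); // ["40563264", "121052"]
console.log(migrationPlan.toKeep.map((d) => d.partNumber)); // ["CEL00123", "CBR00456"]
```

If the `toMove` count on real data differs from the 27 miscategorized parts reported in Issue A, that would be worth investigating before running the destructive version.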
1.3 Remove Test Data
db.parts_rav.deleteOne({ partNumber: "raven" });
db.parts_123test.drop();

Phase 2: Consistency Fixes
Timeline: 1-2 sprints
Risk: Low-Medium
Impact: Improved maintainability
2.1 Normalize Vendor Names
Update DIS_VENDOR_NAMES to match source data:
export const DIS_VENDOR_NAMES: Record<string, string> = {
  // ...
  HAR: "Harvest Tech", // Changed from "Harvest Tec"
  RAV: "Raven GPS", // Changed from "Raven"
};

2.2 Align Backend Vendor Support
Add missing vendors to backend DIS service:
// In CROP-parts-services
const DIS_VENDOR_CODE_MAP = {
  // ...existing...
  HAR: "HAR",
  HOT: "HOT",
  MAR: "MAR",
  RAV: "RAV",
};

2.3 Unify Coverage Metric
Decide on a single definition and apply it consistently:
- Recommended: coverage = found / totalChecked (clearer meaning)
Phase 3: Architecture Improvements
Timeline: 2-3 sprints
Risk: Medium
Impact: Long-term maintainability
3.1 Extract Shared DIS Package
Create @crop/dis-common package:
packages/
  dis-common/
    src/
      types.ts       # DISPartResult, etc.
      constants.ts   # DIS_VENDOR_NAMES, DIS_VENDOR_CODE_MAP
      routing.ts     # getBNSManufacturerCode, detectBNSPartPattern
      index.ts
    package.json

Benefits:
- Single source of truth for DIS logic
- Both frontend and backend import from same package
- Changes automatically propagate
3.2 Add Smart MCH/MAR Detection
Since the MCH collection contains a mix of McHale and Marcrest parts, add runtime detection:
function getMCHManufacturerCode(partNumber: string): string {
  const pn = partNumber.toUpperCase().trim();
  // McHale format: 3-letter prefix + digits (CEL00xxx, CBR00xxx, CFA00xxx)
  if (/^[A-Z]{3}\d+$/.test(pn)) return "MCH";
  // Numeric-only are likely Marcrest
  if (/^\d+$/.test(pn)) return "MAR";
  // Default to MCH
  return "MCH";
}

3.3 Add Part Number Validation
Validate parts on upload to prevent wrong-collection issues:
interface PartValidation {
  vendorCode: string;
  partNumber: string;
  isValidFormat: boolean;
  suggestedVendor?: string;
  warning?: string;
}

function validatePartNumber(vendorCode: string, partNumber: string): PartValidation {
  // MCH should have letter prefix
  if (vendorCode === "MCH" && /^\d+$/.test(partNumber)) {
    return {
      vendorCode,
      partNumber,
      isValidFormat: false,
      suggestedVendor: "MAR",
      warning: "Numeric-only part numbers are typically Marcrest, not McHale"
    };
  }
  // ... more validations
  // No rule matched: treat the part number as valid for this vendor
  return { vendorCode, partNumber, isValidFormat: true };
}

Phase 4: Monitoring & Analytics
Timeline: 1 sprint (after Phase 3)
Risk: Low
Impact: Operational visibility
4.1 Add Routing Analytics
Track which patterns route to which manufacturers and their success rates:
interface RoutingAnalytics {
  pattern: BNSPartPattern;
  manufacturer: string;
  successRate: number;
  totalTested: number;
  lastUpdated: Date;
}

4.2 Add "Not Found" Investigation Queue
Create admin UI to review and correct not-found parts:
- Show part details and suggested corrections
- Allow manual manufacturer code override
- Track corrections for pattern learning
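The RoutingAnalytics records from 4.1 could be derived by aggregating individual lookup outcomes, including lookups retried after a manual override in the review queue. The following is a sketch under assumptions: `LookupOutcome` and `aggregateOutcomes` are hypothetical, and `BNSPartPattern` is stubbed as a plain string because its real definition is not shown in this document.

```typescript
// Stand-in for the real BNSPartPattern type, which is not shown here.
type BNSPartPattern = string;

interface RoutingAnalytics {
  pattern: BNSPartPattern;
  manufacturer: string;
  successRate: number;
  totalTested: number;
  lastUpdated: Date;
}

// Hypothetical record of one DIS lookup attempt.
interface LookupOutcome {
  pattern: BNSPartPattern;
  manufacturer: string;
  found: boolean;
}

// Groups outcomes by (pattern, manufacturer) and computes the found ratio per group.
function aggregateOutcomes(outcomes: LookupOutcome[], now: Date = new Date()): RoutingAnalytics[] {
  const tallies = new Map<string, { pattern: string; manufacturer: string; found: number; total: number }>();
  for (const o of outcomes) {
    const key = `${o.pattern}|${o.manufacturer}`;
    const tally = tallies.get(key) ?? { pattern: o.pattern, manufacturer: o.manufacturer, found: 0, total: 0 };
    tally.total += 1;
    if (o.found) tally.found += 1;
    tallies.set(key, tally);
  }
  return Array.from(tallies.values()).map((t) => ({
    pattern: t.pattern,
    manufacturer: t.manufacturer,
    successRate: t.found / t.total,
    totalTested: t.total,
    lastUpdated: now,
  }));
}

// Illustrative outcomes only; these are not measured results.
const analytics = aggregateOutcomes([
  { pattern: "FS_SUFFIX", manufacturer: "FER", found: true },
  { pattern: "FS_SUFFIX", manufacturer: "FER", found: true },
  { pattern: "FS_SUFFIX", manufacturer: "FER", found: false },
  { pattern: "FS_SUFFIX", manufacturer: "B&S", found: false },
]);
console.log(analytics.map((a) => `${a.pattern}/${a.manufacturer}: ${a.totalTested} tested`));
```

Keying on both pattern and manufacturer keeps a manual override (for example FS_SUFFIX routed to FER) separate from the default route, so the two success rates can be compared directly.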
Part 4: Implementation Checklist
Phase 1 Checklist
- Fix FS suffix routing in lib/types/dis.ts
- Create migration script for MCH→MAR parts
- Run migration in dev environment
- Verify +54 parts now found
- Run migration in prod
- Remove test data (RAV "raven", parts_123test)
Phase 2 Checklist
- Update vendor names in frontend
- Add missing vendors to backend
- Update coverage calculation (both services)
- Add tests for vendor consistency
Phase 3 Checklist
- Create @crop/dis-common package structure
- Move shared types to package
- Move routing logic to package
- Update frontend imports
- Update backend imports
- Add runtime MCH/MAR detection
- Add upload validation
Phase 4 Checklist
- Implement routing analytics collection
- Create admin dashboard for not-found review
- Add automated pattern success tracking
Part 5: Risk Assessment
| Phase | Risk Level | Mitigation |
|---|---|---|
| 1 | Low | Changes are data-level, easily reversible |
| 2 | Low-Medium | Thorough testing, feature flags if needed |
| 3 | Medium | Incremental migration, keep old code until verified |
| 4 | Low | New features, don't affect existing functionality |
Appendix: Test Scripts
Analysis scripts created during investigation:
- scripts/analyze-dis-parts.ts - MongoDB collection analysis
- scripts/analyze-dis-mapping.ts - ManufacturerCode mapping
- scripts/find-not-found-parts.ts - FER vs B&S testing
- scripts/analyze-bns-failures.ts - Suffix pattern testing
- scripts/test-dis-mfg-codes.ts - Manufacturer code variations
- scripts/list-dis-manufacturers.ts - DIS manufacturer listing
Run with: bun run scripts/<script-name>.ts
Conclusion
The "500 not found parts" issue is primarily caused by:
- Wrong routing (~35 parts) - FS suffix should go to FER
- Wrong collection (27 parts, 23 of them not found) - MAR parts in MCH collection
- Genuinely unavailable (~350 parts) - Not in DIS system
Implementing Phase 1 alone is expected to recover ~54 parts (31 from FS routing, 23 from the collection move) with minimal risk. The remaining not-found parts are likely genuinely unavailable in DIS or have formatting issues that require manual review.