Calibration Editor: Improvement Plan

Overview

R&D tool for calibrating coordinate positions on technical parts diagrams. This document outlines identified issues, recommendations, and improvement roadmap.

1. Critical Issue: Spurious Coordinates

Problem

The parser extracts ALL numeric text from diagrams, including:

✅ Valid item numbers — actual callouts pointing to parts
❌ Page numbers — "Page 5", "5" in footer/header
❌ Dimensions — "12.5", "100mm"
❌ Quantities — "Qty: 2"
❌ Part numbers — sometimes numeric SKUs appear on diagrams
❌ Other text — dates, revision numbers, etc.

Impact

Manual calibration wastes time on irrelevant coordinates
Statistics (deviation metrics) are skewed by spurious data
No way to track what's been "cleaned up"

Solution: Coordinate Validity Classification

Add validity field to ManualCoordinate:

export type CoordinateValidity =
  | "valid"      // Confirmed item number (callout to part)
  | "spurious"   // Page number, dimension, or other non-item text
  | "uncertain"; // Needs review

export interface ManualCoordinate extends ItemCoordinate {
  validity?: CoordinateValidity; // Default: "valid"
  calibratedBy?: string;
  calibratedAt?: string;
}

UI Changes Needed

Add validity toggle/dropdown for each coordinate in right sidebar
Color-code markers: green=valid, gray=spurious, yellow=uncertain
Add filter toggle: "Hide spurious" checkbox
Statistics should separate: "15 valid / 3 spurious / 2 uncertain"
Keyboard shortcut: select marker + press "S" to mark spurious
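
The counting, color-coding, and filtering described in this list could be sketched as follows. This is a minimal illustration, not the editor's actual code: `CoordinateLike`, `VALIDITY_COLORS`, `countByValidity`, `formatValiditySummary`, and `visibleCoordinates` are hypothetical names, and the concrete color strings are placeholders.

```typescript
// Hypothetical helpers; only the three validity values come from the plan.
type CoordinateValidity = "valid" | "spurious" | "uncertain";

interface CoordinateLike {
  validity?: CoordinateValidity; // a missing value is treated as "valid", the plan's default
}

// Marker colors from the plan: green=valid, gray=spurious, yellow=uncertain.
const VALIDITY_COLORS: Record<CoordinateValidity, string> = {
  valid: "green",
  spurious: "gray",
  uncertain: "yellow",
};

// Count coordinates per validity class.
function countByValidity(coords: CoordinateLike[]): Record<CoordinateValidity, number> {
  const counts: Record<CoordinateValidity, number> = { valid: 0, spurious: 0, uncertain: 0 };
  for (const coord of coords) {
    counts[coord.validity ?? "valid"] += 1;
  }
  return counts;
}

// Build the "15 valid / 3 spurious / 2 uncertain" style summary.
function formatValiditySummary(coords: CoordinateLike[]): string {
  const c = countByValidity(coords);
  return `${c.valid} valid / ${c.spurious} spurious / ${c.uncertain} uncertain`;
}

// Apply the "Hide spurious" checkbox to the list that gets rendered.
function visibleCoordinates<T extends CoordinateLike>(coords: T[], hideSpurious: boolean): T[] {
  return hideSpurious ? coords.filter((c) => c.validity !== "spurious") : coords;
}
```

Treating a missing `validity` as "valid" keeps existing calibration records usable without a migration.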

2. Missing Features Analysis

2.1 No Undo/Redo

Problem: Accidentally dragging a marker cannot be undone
Solution: Local history stack (10-20 states) with Ctrl+Z/Ctrl+Y
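
One possible shape for such a history stack is sketched below. `HistoryStack` is a hypothetical class, and storing whole-state snapshots (rather than diffs) is an assumption made for simplicity.

```typescript
// Hypothetical bounded undo/redo stack holding whole-state snapshots.
class HistoryStack<T> {
  private past: T[] = [];
  private future: T[] = [];
  private present: T;
  private readonly limit: number;

  constructor(initial: T, limit = 20) {
    this.present = initial;
    this.limit = Math.max(1, limit);
  }

  get current(): T {
    return this.present;
  }

  // Record a new state; the oldest entry is dropped once the limit is exceeded.
  push(next: T): void {
    this.past.push(this.present);
    if (this.past.length > this.limit) {
      this.past.shift();
    }
    this.present = next;
    this.future = []; // a fresh edit invalidates the redo branch
  }

  // Ctrl+Z: step back one state, or stay put if there is nothing to undo.
  undo(): T {
    if (this.past.length > 0) {
      this.future.push(this.present);
      this.present = this.past.pop() as T;
    }
    return this.present;
  }

  // Ctrl+Y: step forward one state, or stay put if there is nothing to redo.
  redo(): T {
    if (this.future.length > 0) {
      this.past.push(this.present);
      this.present = this.future.pop() as T;
    }
    return this.present;
  }
}
```

In the editor, `T` would presumably be the array of coordinates for the current page, pushed after each completed drag rather than on every pointer move.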

2.2 No Batch Operations

Problem: Can't select multiple markers to move/delete together
Solution:

Shift+click to multi-select
Drag selection box
"Select all spurious" action

2.3 No User Attribution

Problem: The calibratedBy field exists but is never set
Solution: Get current user from auth context, auto-populate on save
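
A sketch of the auto-populate step, assuming the user identifier has already been read from the auth context. `stampCalibration` is a hypothetical helper, and ISO 8601 for `calibratedAt` is an assumption about the stored format.

```typescript
// Hypothetical helper: returns a copy of the coordinate with attribution fields set.
function stampCalibration<T extends object>(
  coord: T,
  userId: string,
  now: Date = new Date()
): T & { calibratedBy: string; calibratedAt: string } {
  return {
    ...coord,
    calibratedBy: userId,
    calibratedAt: now.toISOString(), // assumed storage format
  };
}
```

Passing `now` as a parameter keeps the function deterministic in unit tests.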

2.4 No Change History

Problem: Each save overwrites previous calibration
Solution: Add history array with timestamped snapshots (last 5-10)
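
The snapshot history could look roughly like this. `Snapshot` and `appendSnapshot` are hypothetical names, and the default of 10 retained snapshots is taken from the upper end of the range above.

```typescript
// Hypothetical snapshot record: one entry per save.
interface Snapshot<T> {
  savedAt: string; // timestamp of the save
  coords: T[];     // copy of the coordinates at that time
}

// Append a snapshot and keep only the most recent `maxSnapshots` entries.
function appendSnapshot<T>(
  history: Snapshot<T>[],
  coords: T[],
  savedAt: string,
  maxSnapshots = 10
): Snapshot<T>[] {
  const keep = Math.max(1, maxSnapshots);
  // Shallow-copy the array so later edits to `coords` do not alter the snapshot's list.
  return [...history, { savedAt, coords: [...coords] }].slice(-keep);
}
```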

2.5 No Validation Thresholds

Problem: No warning when deviation seems wrong (>10% is suspicious)
Solution: Configurable thresholds with warnings
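
A threshold check might be sketched as below. The deviation formula here (Euclidean distance between parsed and calibrated positions, in percent units) is an assumption for illustration and may differ from the editor's own calculateDeviation(); all names in the block are hypothetical.

```typescript
interface PercentPoint {
  xPercent: number;
  yPercent: number;
}

interface DeviationThresholds {
  warnAbovePercent: number;
}

// 10% follows the "suspicious" figure quoted above; intended to be configurable.
const DEFAULT_THRESHOLDS: DeviationThresholds = { warnAbovePercent: 10 };

// Assumed metric: straight-line distance in percent units.
function deviationPercent(parsed: PercentPoint, calibrated: PercentPoint): number {
  return Math.hypot(
    calibrated.xPercent - parsed.xPercent,
    calibrated.yPercent - parsed.yPercent
  );
}

// Returns a warning message, or null when the deviation is within the threshold.
function deviationWarning(
  parsed: PercentPoint,
  calibrated: PercentPoint,
  thresholds: DeviationThresholds = DEFAULT_THRESHOLDS
): string | null {
  const d = deviationPercent(parsed, calibrated);
  if (d <= thresholds.warnAbovePercent) {
    return null;
  }
  return `Deviation of ${d.toFixed(1)}% exceeds the ${thresholds.warnAbovePercent}% threshold`;
}
```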

2.6 No Bulk Import/Export

Problem: No way to export calibration data for analysis
Solution: Add CSV/JSON export buttons
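
A CSV export could be sketched as follows. The column set and the `ExportRow` shape are assumptions; the quoting follows the usual CSV convention of doubling embedded quotes.

```typescript
// Hypothetical export row; the real ManualCoordinate may carry more fields.
interface ExportRow {
  itemNumber: string;
  xPercent: number;
  yPercent: number;
  validity?: string;
}

// Quote a field only when it contains a comma, a quote, or a line break.
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toCsv(rows: ExportRow[]): string {
  const header = ["itemNumber", "xPercent", "yPercent", "validity"];
  const lines = rows.map((row) =>
    [row.itemNumber, String(row.xPercent), String(row.yPercent), row.validity ?? "valid"]
      .map(escapeCsvField)
      .join(",")
  );
  return [header.join(","), ...lines].join("\r\n");
}
```

The JSON export needs no helper beyond `JSON.stringify(rows, null, 2)`.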

3. UX Issues

3.1 Small Click Targets

Problem: Markers are small at high zoom, hard to grab
Solution: Increase hit area, show larger handle on hover
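
One way to enlarge the hit area without changing the drawn marker is a minimum hit radius in screen pixels. The sketch below assumes pointer and marker positions are already in the same pixel space; `isMarkerHit` and the 12 px default are hypothetical.

```typescript
interface Point {
  x: number;
  y: number;
}

// A marker counts as hit when the pointer is within the larger of its drawn
// radius and a minimum grab radius.
function isMarkerHit(
  pointer: Point,
  markerCenter: Point,
  visualRadiusPx: number,
  minHitRadiusPx = 12
): boolean {
  const hitRadius = Math.max(visualRadiusPx, minHitRadiusPx);
  const distance = Math.hypot(pointer.x - markerCenter.x, pointer.y - markerCenter.y);
  return distance <= hitRadius;
}
```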

3.2 No Snap-to-Grid

Problem: Hard to align coordinates precisely
Solution: Optional grid overlay + snap mode
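
Snap mode reduces to rounding each axis to the nearest grid line. A minimal sketch, assuming coordinates are stored as percentages and the grid size is given in the same units; `snapToGrid` and `snapPoint` are hypothetical names.

```typescript
// Round a value to the nearest multiple of gridSize; a non-positive grid disables snapping.
function snapToGrid(value: number, gridSize: number): number {
  if (gridSize <= 0) {
    return value;
  }
  return Math.round(value / gridSize) * gridSize;
}

// Snap both axes when snap mode is on, otherwise return the point unchanged.
function snapPoint(
  point: { xPercent: number; yPercent: number },
  gridSize: number,
  snapEnabled: boolean
): { xPercent: number; yPercent: number } {
  if (!snapEnabled) {
    return point;
  }
  return {
    xPercent: snapToGrid(point.xPercent, gridSize),
    yPercent: snapToGrid(point.yPercent, gridSize),
  };
}
```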

3.3 No Zoom to Selection

Problem: Can't quickly zoom to a specific coordinate
Solution: Double-click coordinate in list → zoom & center

3.4 Confusing "Original" vs "Manual" Terminology

Problem: Users don't understand what "original" means
Recommendation:

"Original" → "Parsed" or "Auto-detected"
"Manual" → "Calibrated" or "Corrected"

4. Data Model Changes

4.1 Extend ManualCoordinate

export interface ManualCoordinate extends ItemCoordinate {
  // Existing
  calibratedBy?: string;
  calibratedAt?: string;

  // New
  validity?: "valid" | "spurious" | "uncertain";
  confidence?: number;        // 0-100% confidence score
  notes?: string;             // Per-coordinate notes
  sourceType?: "auto" | "manual"; // Was this auto-extracted or manually added?
}

4.2 Add Page-Level Metadata

export interface RndCoordinateCalibrationDocument {
  // ... existing fields

  // New quality metrics
  valid_items_count: number;
  spurious_items_count: number;
  uncertain_items_count: number;

  // Processing flags
  needs_review: boolean;
  review_notes?: string;
}

4.3 Separate "Excluded" Coordinates

Instead of deleting spurious coordinates, move them to a separate array:

export interface RndCoordinateCalibrationDocument {
  original_coords: OriginalCoordinate[];
  manual_coords: ManualCoordinate[];        // Valid + uncertain
  excluded_coords: ManualCoordinate[];      // Spurious (preserved for audit)
}
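
Splitting a working list into these two arrays at save time could be sketched as below. `partitionByValidity` is a hypothetical helper; it treats a missing validity as "valid".

```typescript
type CoordinateValidity = "valid" | "spurious" | "uncertain";

// Route spurious coordinates to excluded_coords and keep the rest in manual_coords.
function partitionByValidity<T extends { validity?: CoordinateValidity }>(
  coords: T[]
): { manual_coords: T[]; excluded_coords: T[] } {
  const manual_coords: T[] = [];
  const excluded_coords: T[] = [];
  for (const coord of coords) {
    if (coord.validity === "spurious") {
      excluded_coords.push(coord); // preserved for audit, not deleted
    } else {
      manual_coords.push(coord); // valid, uncertain, or unset
    }
  }
  return { manual_coords, excluded_coords };
}
```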

5. Performance Considerations

5.1 Large Documents

Problem: Documents with 100+ pages load slowly
Solution:

Virtual scrolling for thumbnail list
Lazy load page data only when selected
Cache parsed frontend_structure.json

5.2 SVG Rendering

Problem: Complex SVGs with thousands of elements are slow
Solution:

Use canvas rendering for preview
Lazy load full SVG only when editing
Consider SVG simplification for thumbnails

6. Implementation Priority

Phase 1: Critical (High Impact, Low Effort) ✅ COMPLETED

✅ Coordinate validity classification — valid/spurious/uncertain
✅ Better terminology — "Parsed" vs "Calibrated"
✅ Filter spurious coordinates — "Hide junk" checkbox
✅ Statistics separation — ✓15 ✗3 ?2 breakdown
✅ Context menu — right-click to classify
✅ Validity persistence — saved to DB, tracked in unsaved changes
✅ Smooth zoom — continuous zoom instead of step-based
✅ UX improvements — legend, hints, better tooltips

Phase 2: Quality of Life (Medium Impact)

Undo/redo (local history) — Ctrl+Z/Ctrl+Y
Zoom to coordinate — double-click in list to focus
Keyboard shortcuts for validity — S/V/U keys
User attribution — auto-set calibratedBy from auth
Export to CSV/JSON — for analysis

Phase 3: Advanced (Lower Priority)

Batch selection — Shift+click multi-select
Snap-to-grid — optional alignment
Change history — versioned snapshots
Validation thresholds — warn on high deviation
Bulk import — upload calibration data

Phase 4: Refactoring (Tech Debt)

Split component — 1500+ lines → smaller components
Move state to zustand — calibration-store.ts
Add unit tests — deviation calculations
Add E2E tests — Playwright workflow tests
Fix accessibility — keyboard navigation for markers

7. Code Quality Notes

Good Practices Found ✅

Clean separation of types in dedicated file
Consistent naming conventions (Rnd prefix)
Comprehensive TypeScript types
Deviation calculation is correct
Responsive layout works well
Keyboard shortcuts implemented

Areas for Improvement

Component is 1400+ lines — consider splitting
Some magic numbers (e.g., 0.001 threshold)
No unit tests for calibration logic
No E2E tests for editor workflow
Some state could move to zustand store

Suggested Refactoring

calibration-editor/
├── _components/
│   ├── calibration-editor.tsx      → Main orchestrator (reduced)
│   ├── coordinate-canvas.tsx       → SVG viewer + markers
│   ├── coordinate-list.tsx         → Right sidebar list
│   ├── page-thumbnails.tsx         → Left sidebar thumbnails
│   ├── toolbar.tsx                 → Top controls
│   └── dialogs/
│       ├── add-coordinate-dialog.tsx
│       └── delete-confirmation-dialog.tsx
├── _hooks/
│   ├── use-calibration-state.ts    → Zustand store
│   ├── use-keyboard-shortcuts.ts
│   └── use-coordinate-drag.ts
└── _lib/
    ├── deviation-utils.ts
    └── coordinate-validation.ts

8. Testing Strategy

Unit Tests Needed

calculateDeviation() — edge cases
calculateDeviationStats() — empty arrays, single item
Coordinate clamping (0-100%)
Item number validation (1-3 digits)
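
The last two items are small enough to sketch in full. `clampPercent` and `isValidItemNumber` are hypothetical names; the editor's real helpers may differ, for example in how they treat leading zeros or surrounding whitespace.

```typescript
// Clamp a coordinate to the 0-100% range; NaN falls back to 0.
function clampPercent(value: number): number {
  if (Number.isNaN(value)) {
    return 0;
  }
  return Math.min(100, Math.max(0, value));
}

// Accept item numbers made of 1 to 3 digits, ignoring surrounding whitespace.
function isValidItemNumber(text: string): boolean {
  return /^\d{1,3}$/.test(text.trim());
}
```

Cases worth covering include out-of-range and NaN inputs for the clamp, and decimals such as "12.5", four-digit values, and empty strings for the validator.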

Integration Tests

Save/load calibration round-trip
Filter toggling
Status transitions

E2E Tests (Playwright)

Full calibration workflow
Drag coordinate → verify position update
Add new coordinate via double-click
Delete coordinate

9. Metrics to Track

Quality Metrics

Average deviation per provider
Calibration completion rate
Time to complete calibration (per page)
Spurious coordinate ratio

Performance Metrics

Page load time
Save latency
SVG render time

Appendix A: Industry Terminology Reference

Term	Usage	Example
Item Number	Primary term in CROP	"Item 5"
Reference Number	Alternative	"Ref. No. 5"
Callout	Technical writing	"Callout 5"
Balloon	Mechanical engineering	Circled number
Index Number	Parts catalogs	"Index 5"
Key Number	Some manufacturers	"Key No. 5"

Recommendation: Stick with "Item Number" — it's what the codebase uses and matches Clinton Tractor's parts catalog terminology.

Appendix B: Coordinate Validity Heuristics

Potential automatic detection of spurious coordinates:

function detectSpuriousCoordinate(
  coord: OriginalCoordinate
): "likely_spurious" | "likely_valid" | "uncertain" {
  const { itemNumber, xPercent, yPercent } = coord;
  const num = parseInt(itemNumber, 10);

  // Non-numeric text cannot be classified by the rules below
  if (Number.isNaN(num)) {
    return "uncertain";
  }

  // Page numbers typically sit in corners or along page edges
  const isNearEdge = xPercent < 5 || xPercent > 95 || yPercent < 5 || yPercent > 95;

  // Page numbers are often small (1-20) and isolated
  const isSmallNumber = num >= 1 && num <= 20;

  // Not yet handled: dimensions that lose their decimals during parsing
  // (e.g. "12.5" read as "12") and numeric part numbers, which are often > 100

  if (isNearEdge && isSmallNumber) {
    return "likely_spurious";
  }

  // No rule produces "likely_valid" yet; everything else needs manual review
  return "uncertain";
}

This is R&D — start manual, learn patterns, then automate.

Document created: 2025-01-22
Author: Claude Code (R&D analysis)
