Calibration Editor: Improvement Plan

Overview

R&D tool for calibrating coordinate positions on technical parts diagrams. This document outlines identified issues, recommendations, and an improvement roadmap.


1. Critical Issue: Spurious Coordinates

Problem

The parser extracts ALL numeric text from diagrams, including:

  • Valid item numbers — actual callouts pointing to parts
  • Page numbers — "Page 5", "5" in footer/header
  • Dimensions — "12.5", "100mm"
  • Quantities — "Qty: 2"
  • Part numbers — sometimes numeric SKUs appear on diagrams
  • Other text — dates, revision numbers, etc.

Impact

  • Manual calibration wastes time on irrelevant coordinates
  • Statistics (deviation metrics) are skewed by spurious data
  • No way to track what's been "cleaned up"

Solution: Coordinate Validity Classification

Add a validity field to ManualCoordinate:

export type CoordinateValidity =
  | "valid"      // Confirmed item number (callout to part)
  | "spurious"   // Page number, dimension, or other non-item text
  | "uncertain"; // Needs review

export interface ManualCoordinate extends ItemCoordinate {
  validity?: CoordinateValidity; // Default: "valid"
  calibratedBy?: string;
  calibratedAt?: string;
}

UI Changes Needed

  1. Add validity toggle/dropdown for each coordinate in right sidebar
  2. Color-code markers: green=valid, gray=spurious, yellow=uncertain
  3. Add filter toggle: "Hide spurious" checkbox
  4. Statistics should separate: "15 valid / 3 spurious / 2 uncertain"
  5. Keyboard shortcut: select marker + press "S" to mark spurious
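
The statistics split and the "Hide spurious" filter from the list above can be sketched as follows. This is a minimal illustration, not the editor's actual code: ClassifiedCoord, summarizeValidity, visibleCoords, and formatSummary are hypothetical names, and a missing validity is assumed to mean "valid".

```typescript
// Hypothetical minimal shape; the real ManualCoordinate carries more fields.
type CoordinateValidity = "valid" | "spurious" | "uncertain";

interface ClassifiedCoord {
  itemNumber: string;
  validity?: CoordinateValidity; // missing is treated as "valid" (assumed default)
}

// Count coordinates per validity class for the statistics line.
function summarizeValidity(coords: ClassifiedCoord[]): Record<CoordinateValidity, number> {
  const counts: Record<CoordinateValidity, number> = { valid: 0, spurious: 0, uncertain: 0 };
  for (const coord of coords) {
    counts[coord.validity ?? "valid"] += 1;
  }
  return counts;
}

// Apply the "Hide spurious" checkbox to the sidebar list.
function visibleCoords(coords: ClassifiedCoord[], hideSpurious: boolean): ClassifiedCoord[] {
  return hideSpurious ? coords.filter((c) => c.validity !== "spurious") : coords;
}

// Render the summary in the "15 valid / 3 spurious / 2 uncertain" form.
function formatSummary(counts: Record<CoordinateValidity, number>): string {
  return `${counts.valid} valid / ${counts.spurious} spurious / ${counts.uncertain} uncertain`;
}
```

Keeping the counts derived from the coordinate list, rather than stored separately, avoids the two drifting apart while the user reclassifies markers.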

2. Missing Features Analysis

2.1 No Undo/Redo

Problem: Accidentally drag a marker → no way to undo
Solution: Local history stack (10-20 states) with Ctrl+Z/Ctrl+Y
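
A bounded local history stack could look roughly like the sketch below. HistoryStack is a hypothetical name, the state type is left generic, and the default limit of 20 is taken from the upper end of the range suggested above.

```typescript
// Bounded undo/redo stack. T would be whatever snapshot the editor keeps
// (for example, the array of calibrated coordinates for the current page).
class HistoryStack<T> {
  private past: T[] = [];
  private future: T[] = [];
  private present: T;
  private readonly limit: number;

  constructor(initial: T, limit = 20) {
    this.present = initial;
    this.limit = limit;
  }

  get current(): T {
    return this.present;
  }

  // Record a new state; any redo branch is discarded.
  push(next: T): void {
    this.past.push(this.present);
    if (this.past.length > this.limit) {
      this.past.shift(); // drop the oldest state once the limit is exceeded
    }
    this.present = next;
    this.future = [];
  }

  // Ctrl+Z: step back if possible, otherwise stay on the current state.
  undo(): T {
    if (this.past.length > 0) {
      this.future.push(this.present);
      this.present = this.past.pop() as T;
    }
    return this.present;
  }

  // Ctrl+Y: step forward if an undone state is available.
  redo(): T {
    if (this.future.length > 0) {
      this.past.push(this.present);
      this.present = this.future.pop() as T;
    }
    return this.present;
  }
}
```

Pushing one state per completed drag (on pointer release) rather than per pointer move keeps the stack small and makes each Ctrl+Z undo a whole gesture.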

2.2 No Batch Operations

Problem: Can't select multiple markers to move/delete together
Solution:

  • Shift+click to multi-select
  • Drag selection box
  • "Select all spurious" action

2.3 No User Attribution

Problem: calibratedBy field exists but is never set
Solution: Get current user from auth context, auto-populate on save
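
One way to populate the attribution fields on save is sketched below. withAttribution is a hypothetical helper; how the user id is obtained from the auth context is not shown in this document and is left to the caller.

```typescript
// Fields already present on ManualCoordinate.
interface Attributed {
  calibratedBy?: string;
  calibratedAt?: string;
}

// Return a copy of the coordinate stamped with the current user and time.
// userId is assumed to come from the auth context at save time.
function withAttribution<T extends Attributed>(coord: T, userId: string, now: Date = new Date()): T {
  return { ...coord, calibratedBy: userId, calibratedAt: now.toISOString() };
}
```

Accepting the timestamp as a parameter keeps the helper deterministic in unit tests.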

2.4 No Change History

Problem: Each save overwrites previous calibration
Solution: Add history array with timestamped snapshots (last 5-10)
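
A capped snapshot history might be maintained as in this sketch. Snapshot and appendSnapshot are hypothetical names, and the cap of 10 is the upper end of the range suggested above.

```typescript
// One saved version of a page's calibrated coordinates.
interface Snapshot<T> {
  savedAt: string; // ISO timestamp of the save
  coords: T[];
}

// Append a snapshot and keep only the most recent maxSnapshots entries.
// Assumes maxSnapshots >= 1. Coordinates are shallow-copied so later edits
// to the live objects do not alter stored history.
function appendSnapshot<T extends object>(
  history: Snapshot<T>[],
  coords: T[],
  savedAt: string,
  maxSnapshots = 10,
): Snapshot<T>[] {
  const snapshot: Snapshot<T> = { savedAt, coords: coords.map((c) => ({ ...c })) };
  return [...history, snapshot].slice(-maxSnapshots);
}
```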

2.5 No Validation Thresholds

Problem: No warning when deviation seems wrong (>10% is suspicious)
Solution: Configurable thresholds with warnings
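
A threshold check could be as small as the sketch below. It assumes deviation is the Euclidean distance between the parsed and calibrated positions in percent units; the editor's own calculateDeviation() may define it differently. PercentPoint, deviationPercent, and isSuspiciousDeviation are hypothetical names.

```typescript
// Position on the diagram, expressed as percentages of width and height.
interface PercentPoint {
  xPercent: number;
  yPercent: number;
}

// Assumed definition: straight-line distance in percent units.
function deviationPercent(parsed: PercentPoint, calibrated: PercentPoint): number {
  return Math.hypot(calibrated.xPercent - parsed.xPercent, calibrated.yPercent - parsed.yPercent);
}

// Flag a coordinate whose deviation exceeds the configurable threshold
// (default 10, matching the ">10% is suspicious" rule of thumb).
function isSuspiciousDeviation(
  parsed: PercentPoint,
  calibrated: PercentPoint,
  thresholdPercent = 10,
): boolean {
  return deviationPercent(parsed, calibrated) > thresholdPercent;
}
```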

2.6 No Bulk Import/Export

Problem: No way to export calibration data for analysis
Solution: Add CSV/JSON export buttons
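
A CSV export could be built along these lines. The column set and the ExportRow shape are assumptions chosen for illustration; toCsv and escapeCsv are hypothetical helpers.

```typescript
// Assumed export columns; the real export may include more fields.
interface ExportRow {
  itemNumber: string;
  xPercent: number;
  yPercent: number;
  validity?: string;
}

// Quote a field if it contains a comma, quote, or line break (RFC 4180 style).
function escapeCsv(value: string): string {
  return /[",\n\r]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Serialize rows to CSV text with a header line.
function toCsv(rows: ExportRow[]): string {
  const header = "itemNumber,xPercent,yPercent,validity";
  const lines = rows.map((row) =>
    [row.itemNumber, String(row.xPercent), String(row.yPercent), row.validity ?? "valid"]
      .map(escapeCsv)
      .join(","),
  );
  return [header, ...lines].join("\n");
}
```

For the JSON variant, JSON.stringify(rows, null, 2) on the same rows is usually sufficient.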


3. UX Issues

3.1 Small Click Targets

Problem: Markers are small at high zoom, hard to grab
Solution: Increase hit area, show larger handle on hover

3.2 No Snap-to-Grid

Problem: Hard to align coordinates precisely
Solution: Optional grid overlay + snap mode
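
Snap mode reduces to rounding each axis to the nearest grid step, as in this sketch. snapToGrid and snapPoint are hypothetical names, and the grid step is assumed to be expressed in the same percent units as the coordinates.

```typescript
// Round a value to the nearest multiple of step; a non-positive step disables snapping.
function snapToGrid(value: number, step: number): number {
  if (step <= 0) {
    return value;
  }
  return Math.round(value / step) * step;
}

// Snap both axes and keep the result inside the 0-100% range.
function snapPoint(
  point: { xPercent: number; yPercent: number },
  step: number,
): { xPercent: number; yPercent: number } {
  const clamp = (v: number): number => Math.min(100, Math.max(0, v));
  return {
    xPercent: clamp(snapToGrid(point.xPercent, step)),
    yPercent: clamp(snapToGrid(point.yPercent, step)),
  };
}
```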

3.3 No Zoom to Selection

Problem: Can't quickly zoom to a specific coordinate
Solution: Double-click coordinate in list → zoom & center
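
Centering on a coordinate is mostly arithmetic on the view transform. The sketch below assumes the diagram is drawn with a translate-then-scale transform anchored at the content's top-left corner; the editor's real pan/zoom model may differ. Size, ViewTransform, and centerOn are hypothetical names.

```typescript
interface Size {
  width: number;
  height: number;
}

interface ViewTransform {
  scale: number;
  translateX: number;
  translateY: number;
}

// Compute the translation that places a percent coordinate at the viewport center.
// content: unscaled diagram size in pixels; viewport: visible area in pixels.
function centerOn(
  point: { xPercent: number; yPercent: number },
  content: Size,
  viewport: Size,
  scale: number,
): ViewTransform {
  const scaledX = (point.xPercent / 100) * content.width * scale;
  const scaledY = (point.yPercent / 100) * content.height * scale;
  return {
    scale,
    translateX: viewport.width / 2 - scaledX,
    translateY: viewport.height / 2 - scaledY,
  };
}
```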

3.4 Confusing "Original" vs "Manual" Terminology

Problem: Users don't understand what "original" means
Recommendation:

  • "Original" → "Parsed" or "Auto-detected"
  • "Manual" → "Calibrated" or "Corrected"

3.5 Status Workflow Unclear

Problem: When to use draft/in_progress/completed/reviewed?
Solution: Add status help tooltip explaining workflow


4. Data Model Improvements

4.1 Add Coordinate Metadata

export interface ManualCoordinate extends ItemCoordinate {
  // Existing
  calibratedBy?: string;
  calibratedAt?: string;

  // New
  validity?: "valid" | "spurious" | "uncertain";
  confidence?: number;        // 0-100% confidence score
  notes?: string;             // Per-coordinate notes
  sourceType?: "auto" | "manual"; // Was this auto-extracted or manually added?
}

4.2 Add Page-Level Metadata

export interface RndCoordinateCalibrationDocument {
  // ... existing fields

  // New quality metrics
  valid_items_count: number;
  spurious_items_count: number;
  uncertain_items_count: number;

  // Processing flags
  needs_review: boolean;
  review_notes?: string;
}

4.3 Separate "Excluded" Coordinates

Instead of deleting spurious coordinates, move them to a separate array:

export interface RndCoordinateCalibrationDocument {
  original_coords: OriginalCoordinate[];
  manual_coords: ManualCoordinate[];        // Valid + uncertain
  excluded_coords: ManualCoordinate[];      // Spurious (preserved for audit)
}
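
Splitting a flat list into the two arrays could be done as sketched here. partitionCoords is a hypothetical helper, and the Classified shape is a reduced stand-in for ManualCoordinate.

```typescript
type CoordinateValidity = "valid" | "spurious" | "uncertain";

// Reduced stand-in for ManualCoordinate.
interface Classified {
  itemNumber: string;
  validity?: CoordinateValidity;
}

// Route spurious coordinates to excluded_coords; valid, uncertain, and
// unclassified coordinates stay in manual_coords.
function partitionCoords<T extends Classified>(
  coords: T[],
): { manual_coords: T[]; excluded_coords: T[] } {
  const manual_coords: T[] = [];
  const excluded_coords: T[] = [];
  for (const coord of coords) {
    if (coord.validity === "spurious") {
      excluded_coords.push(coord);
    } else {
      manual_coords.push(coord);
    }
  }
  return { manual_coords, excluded_coords };
}
```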

5. Performance Considerations

5.1 Large Documents

Problem: Documents with 100+ pages load slowly
Solution:

  • Virtual scrolling for thumbnail list
  • Lazy load page data only when selected
  • Cache parsed frontend_structure.json
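
Lazy loading and caching can share one small structure, sketched below. PageCache is a hypothetical class and the loader function is supplied by the caller; how page data is actually fetched is outside the scope of this sketch.

```typescript
// Loads page data on first request and reuses the same promise afterwards,
// so selecting a page twice (or twice in quick succession) triggers one load.
class PageCache<T> {
  private readonly entries = new Map<number, Promise<T>>();
  private readonly load: (page: number) => Promise<T>;

  constructor(load: (page: number) => Promise<T>) {
    this.load = load;
  }

  get(page: number): Promise<T> {
    let entry = this.entries.get(page);
    if (entry === undefined) {
      // Evict failed loads so a later selection can retry.
      entry = this.load(page).catch((err) => {
        this.entries.delete(page);
        throw err;
      });
      this.entries.set(page, entry);
    }
    return entry;
  }
}
```

Caching the promise rather than the resolved value also deduplicates requests that are still in flight.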

5.2 SVG Rendering

Problem: Complex SVGs with thousands of elements are slow
Solution:

  • Use canvas rendering for preview
  • Lazy load full SVG only when editing
  • Consider SVG simplification for thumbnails

6. Implementation Priority

Phase 1: Critical (High Impact, Low Effort) ✅ COMPLETED

  1. Coordinate validity classification — valid/spurious/uncertain
  2. Better terminology — "Parsed" vs "Calibrated"
  3. Filter spurious coordinates — "Hide junk" checkbox
  4. Statistics separation — ✓15 ✗3 ?2 breakdown
  5. Context menu — right-click to classify
  6. Validity persistence — saved to DB, tracked in unsaved changes
  7. Smooth zoom — continuous zoom instead of step-based
  8. UX improvements — legend, hints, better tooltips

Phase 2: Quality of Life (Medium Impact)

  1. Undo/redo (local history) — Ctrl+Z/Ctrl+Y
  2. Zoom to coordinate — double-click in list to focus
  3. Keyboard shortcuts for validity — S/V/U keys
  4. User attribution — auto-set calibratedBy from auth
  5. Export to CSV/JSON — for analysis

Phase 3: Advanced (Lower Priority)

  1. Batch selection — Shift+click multi-select
  2. Snap-to-grid — optional alignment
  3. Change history — versioned snapshots
  4. Validation thresholds — warn on high deviation
  5. Bulk import — upload calibration data

Phase 4: Refactoring (Tech Debt)

  1. Split component — 1500+ lines → smaller components
  2. Move state to zustand — calibration-store.ts
  3. Add unit tests — deviation calculations
  4. Add E2E tests — Playwright workflow tests
  5. Fix accessibility — keyboard navigation for markers

7. Code Quality Notes

Good Practices Found ✅

  • Clean separation of types in dedicated file
  • Consistent naming conventions (Rnd prefix)
  • Comprehensive TypeScript types
  • Deviation calculation is correct
  • Responsive layout works well
  • Keyboard shortcuts implemented

Areas for Improvement

  • Component is 1400+ lines — consider splitting
  • Some magic numbers (e.g., 0.001 threshold)
  • No unit tests for calibration logic
  • No E2E tests for editor workflow
  • Some state could move to zustand store

Suggested Refactoring

calibration-editor/
├── _components/
│   ├── calibration-editor.tsx      → Main orchestrator (reduced)
│   ├── coordinate-canvas.tsx       → SVG viewer + markers
│   ├── coordinate-list.tsx         → Right sidebar list
│   ├── page-thumbnails.tsx         → Left sidebar thumbnails
│   ├── toolbar.tsx                 → Top controls
│   └── dialogs/
│       ├── add-coordinate-dialog.tsx
│       └── delete-confirmation-dialog.tsx
├── _hooks/
│   ├── use-calibration-state.ts    → Zustand store
│   ├── use-keyboard-shortcuts.ts
│   └── use-coordinate-drag.ts
└── _lib/
    ├── deviation-utils.ts
    └── coordinate-validation.ts

8. Testing Strategy

Unit Tests Needed

  • calculateDeviation() — edge cases
  • calculateDeviationStats() — empty arrays, single item
  • Coordinate clamping (0-100%)
  • Item number validation (1-3 digits)
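
The last two items are small pure functions, which makes them easy first targets. The sketch below shows what they might look like; clampPercent and isValidItemNumber are hypothetical names, and the editor's actual helpers may behave differently at the edges (for example, on NaN input).

```typescript
// Keep a percent coordinate inside the 0-100 range; NaN is mapped to 0 (assumed policy).
function clampPercent(value: number): number {
  if (Number.isNaN(value)) {
    return 0;
  }
  return Math.min(100, Math.max(0, value));
}

// Item numbers are 1-3 digits with no sign, decimal point, or whitespace.
function isValidItemNumber(text: string): boolean {
  return /^\d{1,3}$/.test(text);
}
```

Edge cases worth covering include values exactly at 0 and 100, negative and over-range input, and item strings such as "", "1234", and "12.5".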

Integration Tests

  • Save/load calibration round-trip
  • Filter toggling
  • Status transitions

E2E Tests (Playwright)

  • Full calibration workflow
  • Drag coordinate → verify position update
  • Add new coordinate via double-click
  • Delete coordinate

9. Metrics to Track

Quality Metrics

  • Average deviation per provider
  • Calibration completion rate
  • Time to complete calibration (per page)
  • Spurious coordinate ratio

Performance Metrics

  • Page load time
  • Save latency
  • SVG render time

Appendix A: Industry Terminology Reference

Term              Usage                     Example
Item Number       Primary term in CROP      "Item 5"
Reference Number  Alternative               "Ref. No. 5"
Callout           Technical writing         "Callout 5"
Balloon           Mechanical engineering    Circled number
Index Number      Parts catalogs            "Index 5"
Key Number        Some manufacturers        "Key No. 5"

Recommendation: Stick with "Item Number" — it's what the codebase uses and matches Clinton Tractor's parts catalog terminology.


Appendix B: Coordinate Validity Heuristics

Potential automatic detection of spurious coordinates:

function detectSpuriousCoordinate(
  coord: OriginalCoordinate,
): "likely_spurious" | "likely_valid" | "uncertain" {
  const { itemNumber, xPercent, yPercent } = coord;
  const num = parseInt(itemNumber, 10);

  // Non-numeric text cannot be classified by the rules below
  if (Number.isNaN(num)) {
    return "uncertain";
  }

  // Page numbers typically sit in corners/edges
  const isNearEdge = xPercent < 5 || xPercent > 95 || yPercent < 5 || yPercent > 95;

  // Page numbers are often small (1-20) and isolated
  const isSmallNumber = num >= 1 && num <= 20;

  // Not yet covered by a rule:
  // - Dimensions often have decimals (handled by regex, but may arrive parsed as "12")
  // - Part numbers are often > 100

  if (isNearEdge && isSmallNumber) {
    return "likely_spurious";
  }

  // No rule returns "likely_valid" yet; everything else goes to manual review
  return "uncertain";
}

This is R&D — start manual, learn patterns, then automate.


Document created: 2025-01-22
Author: Claude Code (R&D analysis)