10-phase SEO roadmap: data standards, indexing, GA4, Semrush, realtime analytics, and BI dashboard.

SEO Strategy

[!NOTE] Discussion: Open an issue to comment on this plan. Attach supporting documents to the issue or link them here.

Date: 2026-03-10 · Author: Anton · Status: Draft

Executive Summary

The product is technically functional — catalog, search, part detail pages, cart, and authentication all work. However, three systemic issues block growth:

Inconsistent data quality. Ten manufacturers supply data in their own formats. An adapter layer masks this but does not solve it. Image alt tags are fallback-generated at the UI level rather than stored as data. JSON Schema is absent: TypeScript types exist only at compile time, and part data has no runtime validation.
Zero organic visibility. Indexing is disabled. Google cannot see the site. The URL architecture covers only one navigation axis (system/category) and has no brand → model levels, which are the foundation of long-tail traffic for any parts store.
No analytics. lib/analytics.ts is a stub. GA4 is absent. There is no understanding of user behavior, conversions, or drop-off points.

Plan principle: each phase is a prerequisite for the next. Breaking the order reduces the quality of the outcome.

Data + Schema → Images → JSON-LD → Brand URLs → Indexing
                                                   ↓
                              GA4 → Model URLs → Semrush → Realtime → BI

Baseline: Current State

Component	Status
JSON-LD: Product + BreadcrumbList + FAQPage	✅ exists on `/catalog/[id]`
JSON-LD: Organization, SearchAction, CollectionPage, ItemList	❌ missing
JSON Schema / Zod for Part	❌ missing (TypeScript types only)
Image alt in data	⚠️ optional, often empty, fallback on UI
XMP metadata standard	❌ not defined
URL: category/system pages	✅ `/catalog/category/[slug]/[subSlug]`
URL: brand pages	❌ missing
URL: model pages	❌ missing
Google Search Console	⚠️ env var ready, not activated
Indexing	❌ disabled (`ENABLE_INDEXING=false`)
GA4	❌ missing
Realtime Analytics	❌ missing
BI Dashboard	❌ missing
Cookie Consent	✅ exists, GDPR-ready
Vercel Analytics + SpeedInsights	✅ exists (Vercel env only)

Environment Isolation

Active throughout the entire project. Does not require a separate phase.

Environment	Indexing
`crop-dev.app`	`ENABLE_INDEXING=false` — development and testing, always
`clintontractor.net`	`ENABLE_INDEXING=true` — enabled in Phase 5

Every tool has separate keys for dev and prod:

Env var	Dev	Prod
`NEXT_PUBLIC_GA_MEASUREMENT_ID`	`G-DEV-XXX`	`G-PROD-YYY`
`NEXT_PUBLIC_POSTHOG_KEY`	`phc_dev_xxx`	`phc_prod_yyy`
`GOOGLE_SITE_VERIFICATION`	`dev-token`	`prod-token`

SEO verification on staging: GSC URL Inspection, Rich Results Test, Lighthouse — all available without real indexing. Analytics testing: GA4 DebugView (?debug_mode=1), PostHog Activity feed — live events visible instantly.

Phase 1 — Data Standard and JSON Schema

Weeks 1–3 · March 2026

Goal: Establish a single data contract as a formal specification. Everything that follows depends on this.

1.1 Quality Audit

Run a count across all parts in the catalog. Record:

Distribution by quality tier: Excellent / Good / Fair / Poor
% of parts without price · without image · without description · without category · without fitment
Top manufacturers by lowest average score
Count of parts with catalogReady: true

These results become the baseline for comparison at 3 and 6 months.
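
The audit above could be computed in a single pass over the catalog. The following is a minimal sketch, not the project's implementation: the `AuditPart` shape, the field names, and the tier cut-offs (90 / 70 / 50) are assumptions made for illustration and should be replaced with the real tier boundaries.

```typescript
// Hypothetical minimal shape: only the fields this audit reads.
interface AuditPart {
  qualityScore: number | null; // 0-100, null when the part was never scored
  price?: number | null;
  imageUrl?: string | null;
  description?: string | null;
  catalogReady?: boolean;
}

type Tier = "excellent" | "good" | "fair" | "poor";

// Assumed cut-offs; the plan does not define the tier boundaries.
function tierOf(score: number | null): Tier {
  if (score === null) return "poor";
  if (score >= 90) return "excellent";
  if (score >= 70) return "good";
  if (score >= 50) return "fair";
  return "poor";
}

interface AuditReport {
  total: number;
  tiers: Record<Tier, number>;
  missingPricePct: number;
  missingImagePct: number;
  missingDescriptionPct: number;
  catalogReadyCount: number;
}

// Percentage with one decimal place; an empty catalog reports 0.
function pct(count: number, total: number): number {
  return total === 0 ? 0 : Math.round((count / total) * 1000) / 10;
}

function auditCatalog(parts: AuditPart[]): AuditReport {
  const tiers: Record<Tier, number> = { excellent: 0, good: 0, fair: 0, poor: 0 };
  let noPrice = 0;
  let noImage = 0;
  let noDescription = 0;
  let ready = 0;
  for (const part of parts) {
    tiers[tierOf(part.qualityScore)] += 1;
    if (part.price === null || part.price === undefined) noPrice += 1;
    if (!part.imageUrl) noImage += 1;
    if (!part.description) noDescription += 1;
    if (part.catalogReady === true) ready += 1;
  }
  return {
    total: parts.length,
    tiers,
    missingPricePct: pct(noPrice, parts.length),
    missingImagePct: pct(noImage, parts.length),
    missingDescriptionPct: pct(noDescription, parts.length),
    catalogReadyCount: ready,
  };
}
```

Storing the returned report as JSON alongside its run date would give the 3- and 6-month comparisons a fixed reference point.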

1.2 Field Standardization

Field	Current state	Standard
`category`	`string | CategoryNode[]`	always `CategoryNode[]`, minimum 1 node
`specifications`	`Record<string, any>`	`{ key: string, value: string, unit?: string }[]`
`equipmentFitment`	raw strings	`PartFitmentRecord[]` with make, model, yearStart, yearEnd
`modelKey`	inconsistent format	always `{BRAND_CODE}::{MODEL_SLUG}`
`yearFrom` / `yearTo`	strings or numbers or null	always `number | null`
`quality.score`	optional	required number 0–100
`image.alt`	optional	required after Phase 2
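
Two of the rows above (`yearFrom` / `yearTo` and `modelKey`) are pure normalization and can be sketched without touching the rest of the adapter. The function names, the slug rule, and the brand code `BS` used in the usage note are illustrative assumptions, not existing project code.

```typescript
// Coerce mixed year inputs ("1998", 1998, "", null) to number | null.
function normalizeYear(value: string | number | null | undefined): number | null {
  if (value === null || value === undefined) return null;
  const n = typeof value === "number" ? value : Number(value.trim());
  // Reject NaN, fractions, and non-positive values (Number("") is 0).
  if (!isFinite(n) || Math.floor(n) !== n || n <= 0) return null;
  return n;
}

// Assumed slug rule: lowercase, runs of non-alphanumerics collapsed to "-".
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Produces the {BRAND_CODE}::{MODEL_SLUG} format from the table above.
function buildModelKey(brandCode: string, modelName: string): string {
  return brandCode.trim().toUpperCase() + "::" + slugify(modelName);
}
```

Under these assumptions, `buildModelKey("bs", "450E Series")` yields `BS::450e-series`, and `normalizeYear("")` yields `null` rather than `0`.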

1.3 Publication Minimum

catalogReady: true = all conditions met:

title — 30–80 characters
partNumber — normalized (uppercase, no spaces)
manufacturer — object with code, name, slug
status — active | discontinued | superseded
seo.title — populated
seo.description — 150–160 characters
quality.score ≥ 70
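
The checklist above can be expressed as one function that returns the failed conditions, so the same code both sets `catalogReady` and explains why a part was rejected. This is a sketch under assumptions: `PublishablePart` is a hypothetical subset of the real part type, and the nested `seo` / `quality` objects are inferred from the dotted field names.

```typescript
// Hypothetical subset of the part type; only publication-minimum fields.
interface PublishablePart {
  title: string;
  partNumber: string;
  manufacturer?: { code: string; name: string; slug: string };
  status?: string;
  seo?: { title?: string; description?: string };
  quality?: { score?: number };
}

const ALLOWED_STATUSES = ["active", "discontinued", "superseded"];

// "Normalized" per the checklist: uppercase, no whitespace.
function normalizePartNumber(raw: string): string {
  return raw.replace(/\s+/g, "").toUpperCase();
}

// Returns the failed conditions; an empty array means catalogReady: true.
function catalogReadyFailures(part: PublishablePart): string[] {
  const failures: string[] = [];
  const titleLength = part.title.trim().length;
  if (titleLength < 30 || titleLength > 80) failures.push("title");
  if (part.partNumber === "" || part.partNumber !== normalizePartNumber(part.partNumber)) {
    failures.push("partNumber");
  }
  const m = part.manufacturer;
  if (!m || !m.code || !m.name || !m.slug) failures.push("manufacturer");
  if (!part.status || ALLOWED_STATUSES.indexOf(part.status) === -1) failures.push("status");
  if (!part.seo || !part.seo.title) failures.push("seo.title");
  const descLength = part.seo && part.seo.description ? part.seo.description.length : 0;
  if (descLength < 150 || descLength > 160) failures.push("seo.description");
  const score = part.quality ? part.quality.score : undefined;
  if (typeof score !== "number" || score < 70) failures.push("quality.score");
  return failures;
}
```

Returning field names rather than a boolean lets the Phase 1 baseline report count which condition blocks publication most often.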

1.4 JSON Schema and Runtime Validation

Current state: Zod is used for lib/env.ts and only for the health-check response in lib/search-service/schemas.ts. The entire Part / UnifiedCatalogPart relies on TypeScript types with zero runtime validation.

What to do:

Write a Zod schema for UnifiedCatalogPart with three obligation tiers:

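import { z } from "zod";

// Tier 1: required (part is not published without these)
const partTier1Schema = z.object({
  id: z.string(),
  partNumber: z.string(),
  title: z.string().min(30).max(80), // 30–80 characters, per 1.3
  manufacturer: z.object({ code: z.string(), name: z.string(), slug: z.string() }),
  status: z.enum(["active", "discontinued", "superseded"]),
  slug: z.string(),
  // seo and quality are nested objects; a dotted key such as "seo.title"
  // would only match a literal property with that exact name.
  seo: z.object({
    title: z.string().min(1),
    description: z.string().min(150).max(160), // 150–160 characters, per 1.3
  }),
  quality: z.object({ score: z.number().min(0).max(100) }),
});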

// Tier 2 — desired (affect quality score)
// Tier 3 — enrichment (optional)

Connect validation at the Search Service → adapter boundary: if Tier 1 fields are missing or the wrong type — log as a data quality issue, do not break the UI.

Generate a formal JSON Schema via zod-to-json-schema — this is Data Contract v1.0, shareable with the backend team as the API specification.
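
The "log, do not break the UI" rule at the adapter boundary can be sketched as a non-throwing wrapper. To keep the example free of imports, it declares a small structural `SchemaLike` interface that mirrors the result shape of Zod's `safeParse`; that a real Zod schema can be passed in its place is an assumption to verify against the installed Zod version. `validateAtBoundary` and `QualityIssue` are hypothetical names.

```typescript
// Structural stand-in for the result of a Zod-style safeParse().
interface Issue {
  path: ReadonlyArray<PropertyKey>;
  message: string;
}
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: { issues: Issue[] } };
interface SchemaLike<T> {
  safeParse(input: unknown): ParseResult<T>;
}

interface QualityIssue {
  partId: string;
  field: string;
  message: string;
}

// Validates one raw part. Never throws: each failure is reported as a data
// quality issue and the caller keeps rendering the raw part.
function validateAtBoundary<T>(
  schema: SchemaLike<T>,
  raw: Record<string, unknown>,
  report: (issue: QualityIssue) => void,
): boolean {
  const result = schema.safeParse(raw);
  if (result.success === false) {
    const partId = typeof raw["id"] === "string" ? (raw["id"] as string) : "unknown";
    for (const issue of result.error.issues) {
      report({ partId, field: issue.path.map(String).join("."), message: issue.message });
    }
    return false;
  }
  return true;
}
```

The `report` callback is where the issue would be written to whatever log or metrics sink the project uses; the boolean return value can feed the `catalogReady` decision.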

Deliverable: Data Contract v1.0 (Zod schema + JSON Schema). Baseline quality report.

Phase 2 — Images: Alt Tags and XMP Standard

Weeks 4–5 · April 2026

Goal: Alt tags become part of the data, not a UI-level fallback. Define the XMP standard for the image pipeline.

2.1 The Problem

The image.alt field is optional in the schema. The UI falls back to image.alt ?? `${title} image ${index + 1}`. Google Image Search needs descriptive alt text stored in the data, not generated fresh on every render.

2.2 Alt Text Standard

Formula: {partNumber} {partName} {manufacturer} [{angle}]

Examples:

"AS9301B Carburetor Gasket Briggs & Stratton"
"AS9301B Carburetor Gasket Briggs & Stratton - front view"
"87546362 Bale Wrapper Belt New Holland - detail"

Rules: always includes partNumber + name + manufacturer. For multi-image galleries, add the angle/view if known. Maximum 125 characters (a common screen-reader convention; WCAG itself sets no length limit). Never starts with "image of" or "photo of."

2.3 Two Generation Levels

Level 1 — deterministic (100% coverage): generate alt at the adapter level (lib/search-service/adapters.ts) using the formula above. No AI, no delays, covers all parts immediately.

Level 2 — AI-enriched (later): for parts with quality.score ≥ 90 — AI generates a more descriptive alt that accounts for the part type and image angle. Stored in data, not regenerated on every request.
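
Level 1 is simple enough to sketch directly from the formula in 2.2. The function name `buildAltText` and the truncation strategy (cut at the last word boundary) are assumptions; the " - angle" suffix follows the examples above.

```typescript
const MAX_ALT_LENGTH = 125;

// Deterministic alt text: "{partNumber} {partName} {manufacturer}[ - {angle}]".
// Because the string starts with the part number, it can never begin with
// "image of" or "photo of".
function buildAltText(
  partNumber: string,
  partName: string,
  manufacturer: string,
  angle?: string,
): string {
  const base = [partNumber, partName, manufacturer]
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .join(" ");
  const view = angle ? angle.trim() : "";
  const alt = view.length > 0 ? base + " - " + view : base;
  if (alt.length <= MAX_ALT_LENGTH) return alt;
  // Too long: cut at the last space inside the limit, then drop any
  // dangling separator left at the end.
  const cut = alt.slice(0, MAX_ALT_LENGTH);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut).replace(/[\s-]+$/, "");
}
```

Calling `buildAltText("AS9301B", "Carburetor Gasket", "Briggs & Stratton", "front view")` reproduces the second example in 2.2.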

2.4 XMP Metadata Standard

XMP is written by the backend pipeline when processing images — not a frontend task. Our role is to define the standard and pass it as a requirement.

Fields to embed in XMP:

XMP field	Value
`dc:title`	`{partNumber} {partName}`
`dc:description`	`{seo.description}`
`dc:subject`	`[partNumber, manufacturer, category, keywords]`
`dc:creator`	"Clinton Tractor"
`dc:rights`	"© Clinton Tractor & Implement Co."
`xmp:Identifier`	`{partNumber}`

Add an xmp?: { title?, description?, keywords?, creator? } field to the image schema — for reading from existing images if that data is already present.

2.5 Open Graph and Image Sitemap

After Level 1, image.alt is always populated — the fallback in metadata generation is no longer needed. Add an image sitemap (<image:image> tags) for Google Image Search.
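
An image sitemap entry is a regular `<url>` element with one `<image:image>` child per image. The sketch below builds that XML as strings; the function names and the example.com URLs used in the usage note are placeholders, and only `<image:loc>` is emitted because it is the one required child element.

```typescript
// Escape the five XML special characters in URLs and text.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// One <url> element listing every image that appears on the page.
function imageSitemapEntry(pageUrl: string, imageUrls: string[]): string {
  const lines = ["  <url>", "    <loc>" + escapeXml(pageUrl) + "</loc>"];
  for (const imageUrl of imageUrls) {
    lines.push("    <image:image><image:loc>" + escapeXml(imageUrl) + "</image:loc></image:image>");
  }
  lines.push("  </url>");
  return lines.join("\n");
}

// Wrap the entries in a urlset that declares the image namespace.
function imageSitemap(entries: string[]): string {
  const header = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"',
    '        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">',
  ];
  return header.concat(entries, ["</urlset>"]).join("\n");
}
```

For example, `imageSitemap([imageSitemapEntry("https://example.com/catalog/123", ["https://example.com/img/123-front.jpg"])])` returns a complete single-entry document.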

Deliverable: 100% of images have alt text per standard. XMP specification delivered to backend. image.alt is a required field.

Phase 3 — JSON-LD Structured Data (Full Coverage)

Weeks 5–7 · April 2026

Goal: Every page type has appropriate structured data. Rich Results Test — zero errors.

3.1 Current State

product-json-ld.tsx already generates for /catalog/[id]:

✅ Product (name, url, description, image, brand, sku, mpn, gtin, offers, weight)
✅ BreadcrumbList
✅ FAQPage

3.2 Extending the Product Schema

Add missing fields:

{
  "category": "Engine Parts > Carburetor",
  "additionalProperty": [
    { "@type": "PropertyValue", "name": "Weight", "value": "0.5", "unitCode": "KGM" }
  ],
  "offers": {
    "itemCondition": "https://schema.org/NewCondition",
    "shippingDetails": { "@type": "OfferShippingDetails" },
    "hasMerchantReturnPolicy": { "@type": "MerchantReturnPolicy" }
  }
}

3.3 New Schemas by Page Type

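WebSite + SearchAction in app/layout.tsx. Google retired the Sitelinks Search Box feature in November 2024, so this markup no longer produces a search box in results; it still declares the site's search URL in a machine-readable way: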

{
  "@type": "WebSite",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://clintontractor.net/parts/catalog?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}

Organization + LocalBusiness — in app/layout.tsx:

{
  "@type": ["Organization", "LocalBusiness"],
  "name": "Clinton Tractor & Implement Co.",
  "url": "https://www.clintontractor.net"
}

CollectionPage + ItemList — on all listing pages (catalog, category, brand, model):

{
  "@type": "CollectionPage",
  "name": "Briggs & Stratton Engine Parts",
  "mainEntity": {
    "@type": "ItemList",
    "numberOfItems": 342,
    "itemListElement": [{ "@type": "ListItem", "position": 1, "url": "..." }]
  }
}
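
A listing page could generate this object from its data rather than hand-writing it. In the sketch below the `ItemList` is nested under `mainEntity`, because `numberOfItems` and `itemListElement` are defined on `ItemList` in schema.org rather than on `CollectionPage`. The function and type names are hypothetical.

```typescript
interface ListingInput {
  name: string;
  url: string;
  totalParts: number; // full result count, not just the rendered page
  partUrls: string[]; // canonical /catalog/[id] URLs of the first N parts
}

function buildCollectionPageJsonLd(input: ListingInput): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "CollectionPage",
    name: input.name,
    url: input.url,
    mainEntity: {
      "@type": "ItemList",
      numberOfItems: input.totalParts,
      itemListElement: input.partUrls.map((partUrl, index) => ({
        "@type": "ListItem",
        position: index + 1, // positions are 1-based
        url: partUrl,
      })),
    },
  };
}

// Serialize for a <script type="application/ld+json"> tag. Escaping "<"
// keeps a "</script>" inside the data from closing the tag early.
function toJsonLdScriptBody(data: Record<string, unknown>): string {
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```

The escaped output is still valid JSON, so parsers read `\u003c` back as `<`.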

3.4 Coverage by Page Type

Page	JSON-LD types
`/`	Organization + LocalBusiness + WebSite + SearchAction
`/catalog`	CollectionPage + ItemList
`/catalog/category/[slug]`	CollectionPage + BreadcrumbList + ItemList
`/catalog/category/[slug]/[subSlug]`	CollectionPage + BreadcrumbList + ItemList
`/catalog/brand/[brand]`	CollectionPage + BreadcrumbList + ItemList (Phase 4)
`/catalog/brand/[brand]/[system]`	CollectionPage + BreadcrumbList + ItemList (Phase 4)
`/catalog/brand/[brand]/model/[model]`	CollectionPage + BreadcrumbList + ItemList (Phase 7)
`/catalog/[id]`	Product + BreadcrumbList + FAQPage ✅ (extend)
`/inventory/[id]`	Product (used) + BreadcrumbList
`/tires/[slug]`	Product + BreadcrumbList

Deliverable: All page types covered. Rich Results Test and Schema Markup Validator — zero errors.

Phase 4 — URL Architecture: Brand → System

Weeks 7–9 · May 2026

Goal: Open the first brand navigation axis. Create indexable pages for brand-level and brand × system keywords.

4.1 The Problem and the Opportunity

The current URL structure covers one axis: "what kind of part is this" (/catalog/category/engine-parts).

Most searches follow a different axis: "I know my equipment" ("briggs stratton carburetor parts"). There are no indexable pages for these queries — that traffic is lost.

4.2 Two Navigation Axes

AXIS 1: System (exists)

/catalog/category/engine-parts
/catalog/category/engine-parts/carburetor

AXIS 2: Brand (Phase 4) → Model (Phase 7)

/catalog/brand/briggs-stratton
/catalog/brand/briggs-stratton/engine-parts
/catalog/brand/briggs-stratton/model/450e ← Phase 7
/catalog/brand/briggs-stratton/model/450e/engine-parts ← Phase 7

Both axes end at /catalog/[id] — the canonical part URL, which never changes.

Brand pages are filtered collection pages (lists of parts for that context), not duplicates.

4.3 SEO Value at Each Level

URL	Target keyword	Competition	Intent
`/catalog/brand/briggs-stratton`	"Briggs & Stratton parts"	high	broad
`/catalog/brand/briggs-stratton/engine-parts`	"Briggs & Stratton engine parts"	medium	category
`/catalog/brand/briggs-stratton/model/450e`	"Briggs & Stratton 450E parts"	low	model
`/catalog/brand/briggs-stratton/model/450e/engine-parts`	"Briggs & Stratton 450E engine parts"	very low	high intent, easiest to win

4.4 Implementation (Phase 4: brand + brand × system only)

Brand data (manufacturer.code, manufacturer.slug, manufacturer.name) is always populated and does not depend on fitment data quality. The brand level is built first for this reason.

New routes:

app/catalog/brand/page.tsx
app/catalog/brand/[brandSlug]/page.tsx
app/catalog/brand/[brandSlug]/[systemSlug]/page.tsx

Each page generates:

generateStaticParams to prerender known paths, plus a revalidate interval for ISR
Dynamic seo.title and seo.description
CollectionPage + BreadcrumbList JSON-LD
ItemList with the first N parts

Canonical and thin content: brand pages are filtered subsets of /catalog, so each one needs an explicit <link rel="canonical"> (self-referencing, so the page can rank for its own keywords), and pages with fewer than 5 parts are noindex.
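
The canonical and thin-content rule can live in one helper shared by every listing route. This sketch returns a plain object whose `alternates.canonical` and `robots` fields follow the shape of the Next.js Metadata object, declared locally here so the example has no imports. It assumes a self-referencing canonical, which the plan does not state explicitly; `listingSeo` is a hypothetical name.

```typescript
// Local mirror of the two Next.js Metadata fields this rule touches.
interface ListingSeo {
  alternates: { canonical: string };
  robots: { index: boolean; follow: boolean };
}

const MIN_PARTS_FOR_INDEXING = 5; // thin-content threshold from the plan

function listingSeo(
  canonicalUrl: string,
  partCount: number,
  minParts: number = MIN_PARTS_FOR_INDEXING,
): ListingSeo {
  return {
    alternates: { canonical: canonicalUrl },
    // Thin pages stay followable so their part links are still discovered,
    // but they are kept out of the index.
    robots: { index: partCount >= minParts, follow: true },
  };
}
```

A route's metadata function could spread this result into its own return value, passing `10` as `minParts` for the model pages planned in Phase 7.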

Deliverable: Brand landing pages and brand × system intersection pages live. Sitemap updated.

Phase 5 — Google Search Console and Indexing

Weeks 9–10 · May 2026

Goal: Open the site to indexing after data, structure, and JSON-LD are ready.

[!WARNING] Hard blocker: Phases 1–4 must be complete. Opening indexing before readiness means indexing low-quality content on a URL structure that is still changing.

5.1 Pre-launch Checklist

Phase 1: Data Contract v1.0, baseline recorded
Phase 2: image.alt required, 100% coverage
Phase 3: JSON-LD on all page types, zero errors in Rich Results Test
Phase 4: Brand pages deployed
Sitemap generates real URLs with lastmod
Canonical URL on every page
Thin content pages (< 5 parts) — noindex
GSC property verified (GOOGLE_SITE_VERIFICATION env var ready)

5.2 Two Separate GSC Properties

Property	Purpose
`crop-dev.app`	Staging — URL Inspection without indexing
`clintontractor.net`	Production — real monitoring

5.3 Sitemap Architecture

After enabling:

Sitemap index when > 50k parts (child sitemap files per brand)
Priorities: /catalog/brand/* → 0.9, /catalog/[id] → 0.8, /catalog/category/* → 0.7, rest → 0.4–0.6 (Google ignores <priority>, so an accurate lastmod matters more)
lastmod — part's last updated date from DB, not static
Image sitemap (<image:image> tags) for Google Image Search
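
Splitting into child sitemaps and building the index can be sketched as follows. For brevity the example chunks by count instead of grouping per brand, and the `sitemap-N.xml` file naming is an assumption; the 50,000 figure is the per-file URL limit of the sitemap protocol.

```typescript
const MAX_URLS_PER_SITEMAP = 50000;

interface SitemapUrl {
  loc: string;
  lastmod: Date; // the part's last updated date from the DB
}

// W3C date format (YYYY-MM-DD, UTC), which the sitemap protocol accepts.
function formatLastmod(date: Date): string {
  return date.toISOString().slice(0, 10);
}

function chunkUrls(urls: SitemapUrl[], size: number = MAX_URLS_PER_SITEMAP): SitemapUrl[][] {
  const chunks: SitemapUrl[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}

function buildSitemapIndex(baseUrl: string, chunks: SitemapUrl[][]): string {
  const entries = chunks.map((chunk, i) => {
    // A child sitemap's lastmod is the newest lastmod among its URLs.
    const newest = chunk.reduce(
      (max, u) => (u.lastmod.getTime() > max.getTime() ? u.lastmod : max),
      new Date(0),
    );
    return (
      "  <sitemap><loc>" + baseUrl + "/sitemap-" + (i + 1) + ".xml</loc>" +
      "<lastmod>" + formatLastmod(newest) + "</lastmod></sitemap>"
    );
  });
  const header = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
  ];
  return header.concat(entries, ["</sitemapindex>"]).join("\n");
}
```

Grouping by brand instead would only change how the chunks are formed; the index builder stays the same.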

5.4 Monitoring After Launch

Coverage report: indexed / errors / excluded
Core Web Vitals: LCP < 2.5s, CLS < 0.1, INP < 200ms
Rich Results: Products enhancement status
Record top-20 target keyword positions as the new baseline

Deliverable: Site indexed. GSC collecting data. First keyword position report.

Phase 6 — GA4

Weeks 10–12 · May–June 2026

Goal: Full e-commerce tracking. Understand the funnel from search to purchase.

6.1 Environment Isolation

GA4 Property	Measurement ID	Domain
CROP Dev	`G-XXXXXXX`	`crop-dev.app`
CROP Prod	`G-YYYYYYY`	`clintontractor.net`

Add NEXT_PUBLIC_GA_MEASUREMENT_ID to lib/env.ts and all 6 sync targets.

6.2 Technical Integration

Initialize via next/script with strategy="afterInteractive"
Do not initialize before Cookie Consent is given (banner already exists)
Cloudflare: via <script> tag (Vercel-specific approach does not apply)
Replace the lib/analytics.ts stub with a real GA4 sink — all existing analytics.track() calls in the search bar receive real tracking without code changes
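
One way to replace the stub without touching call sites is a sink that keeps the `track()` signature and forwards to `gtag` only after consent. The sketch below is an assumption about how lib/analytics.ts might be structured: `AnalyticsSink`, `createGa4Sink`, and the two injected getters are hypothetical, while `gtag("event", name, params)` is the standard GA4 call.

```typescript
type EventParams = Record<string, unknown>;
type GtagFn = (command: "event", eventName: string, params: EventParams) => void;

interface AnalyticsSink {
  track(eventName: string, params?: EventParams): void;
}

// getGtag and hasConsent are injected so the sink works during server
// rendering, before the script loads, and in tests.
function createGa4Sink(
  getGtag: () => GtagFn | undefined,
  hasConsent: () => boolean,
): AnalyticsSink {
  return {
    track(eventName: string, params: EventParams = {}): void {
      if (!hasConsent()) return; // nothing is sent before consent
      const gtag = getGtag();
      if (!gtag) return; // script blocked or not loaded yet
      gtag("event", eventName, params);
    },
  };
}
```

In the browser, `getGtag` would typically read `window.gtag` and `hasConsent` would read the existing cookie banner's state. Events fired before consent are dropped here rather than queued, which is the more conservative reading of the consent rule.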

6.3 E-commerce Events

Event	Trigger	Key parameters
`view_item_list`	Catalog / brand / model page	`item_list_name`, `items[]`
`select_item`	Click on a part card	`item_id`, `item_brand`, `item_category`
`view_item`	Part detail page opened	`item_id`, `price`, `currency`
`search`	Search query executed	`search_term`, `results_count`
`add_to_cart`	Added to cart	`item_id`, `quantity`, `price`
`begin_checkout`	Moved to checkout	`value`, `currency`
`purchase`	Order confirmed	`transaction_id`, `value`, `items[]`
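
Typed payload builders keep event parameters consistent across pages. The sketch below covers `add_to_cart` and `purchase` using GA4's recommended parameter names (`currency`, `value`, `transaction_id`, `items[]` with `item_id`, `item_name`, `item_brand`, `item_category`, `price`, `quantity`). The `CartLine` shape and the builder names are assumptions.

```typescript
interface Ga4Item {
  item_id: string;
  item_name: string;
  item_brand: string;
  item_category: string;
  price: number;
  quantity: number;
}

// Hypothetical cart line; the real cart model may differ.
interface CartLine {
  partNumber: string;
  title: string;
  brand: string;
  category: string;
  unitPrice: number;
  quantity: number;
}

interface Ga4Event {
  name: string;
  params: Record<string, unknown>;
}

function toGa4Item(line: CartLine): Ga4Item {
  return {
    item_id: line.partNumber,
    item_name: line.title,
    item_brand: line.brand,
    item_category: line.category,
    price: line.unitPrice,
    quantity: line.quantity,
  };
}

// Round to cents to avoid floating-point noise in reported values.
function roundMoney(amount: number): number {
  return Math.round(amount * 100) / 100;
}

function addToCartEvent(line: CartLine, currency: string): Ga4Event {
  return {
    name: "add_to_cart",
    params: { currency, value: roundMoney(line.unitPrice * line.quantity), items: [toGa4Item(line)] },
  };
}

function purchaseEvent(transactionId: string, lines: CartLine[], currency: string): Ga4Event {
  const total = lines.reduce((sum, l) => sum + l.unitPrice * l.quantity, 0);
  return {
    name: "purchase",
    params: { transaction_id: transactionId, currency, value: roundMoney(total), items: lines.map(toGa4Item) },
  };
}
```

Each returned object maps directly onto a `track(name, params)` call, so the builders stay independent of how events are sent.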

6.4 Custom Dimensions

Dimension	Description
`part_manufacturer`	Part brand
`part_category`	System category
`quality_level`	excellent / good / fair / poor
`navigation_axis`	category or brand (which axis the user came from)
`search_mode`	parts / equipment / auto
`has_price`	boolean

Deliverable: GA4 collecting the full e-commerce funnel. Custom dimensions configured. DebugView confirms all events.

Phase 7 — URL Architecture: Brand → Model

Weeks 12–14 · June–July 2026

Goal: Add the model level after fitment data is clean and GA4 reveals which brands to prioritize.

7.1 Why Model Pages Come After GA4

There are hundreds of models in the catalog. Building all of them at once risks thin content. GA4 data from Phase 6 reveals:

Which brands drive the most traffic
Which search queries contain model names
Where bounce rate is highest (user searched for a specific model, didn't find a dedicated page)

The first 20–30 model pages are built based on this data, not guesswork.

7.2 Prerequisites

equipmentFitment parsed into PartFitmentRecord[] (Phase 1)
modelKey in format {BRAND_CODE}::{MODEL_SLUG} (Phase 1)
Minimum 10 parts per model (thin content threshold)
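
The third prerequisite can be checked mechanically once `modelKey` is standardized. The sketch below counts distinct parts per model key and returns the keys that clear the threshold; `FitmentPart` is a hypothetical reduced shape, and the sample keys in the usage note use an assumed brand code.

```typescript
// Hypothetical reduced shape: a part and the model keys it fits.
interface FitmentPart {
  id: string;
  modelKeys: string[];
}

// Splits "{BRAND_CODE}::{MODEL_SLUG}"; returns null for malformed keys.
function parseModelKey(key: string): { brandCode: string; modelSlug: string } | null {
  const pieces = key.split("::");
  const brandCode = pieces[0];
  const modelSlug = pieces[1];
  if (pieces.length !== 2 || !brandCode || !modelSlug) return null;
  return { brandCode, modelSlug };
}

// Model keys with at least minParts distinct parts, sorted alphabetically.
function eligibleModelKeys(parts: FitmentPart[], minParts: number = 10): string[] {
  const counts: Record<string, number> = {};
  for (const part of parts) {
    // A part that lists the same model twice still counts once.
    const seen: Record<string, boolean> = {};
    for (const key of part.modelKeys) {
      if (seen[key] || parseModelKey(key) === null) continue;
      seen[key] = true;
      counts[key] = (counts[key] || 0) + 1;
    }
  }
  return Object.keys(counts)
    .filter((key) => (counts[key] || 0) >= minParts)
    .sort();
}
```

Intersecting this list with the brands GA4 shows as high-traffic would give the initial 20–30 candidates; a key such as `BS::450e` only appears in the result once ten distinct parts reference it.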

7.3 New Routes

app/catalog/brand/[brandSlug]/model/page.tsx
app/catalog/brand/[brandSlug]/model/[modelSlug]/page.tsx
app/catalog/brand/[brandSlug]/model/[modelSlug]/[systemSlug]/page.tsx

7.4 Canonical Resolution

A part fits multiple models → canonical is always /catalog/[id]. Model pages are collection pages that link to the canonical. Breadcrumb reflects the navigation context:

Clinton Tractor → Briggs & Stratton → 450E Series → Engine Parts → AS9301B
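
The trail above maps directly onto a `BreadcrumbList`. In this sketch the trail is passed in by the page, so the same part can render a brand-axis or a category-axis breadcrumb while its canonical URL stays `/catalog/[id]`. The function name and the example.com host in the usage note are placeholders.

```typescript
interface Crumb {
  name: string;
  url: string;
}

function buildBreadcrumbJsonLd(trail: Crumb[]): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: trail.map((crumb, index) => ({
      "@type": "ListItem",
      position: index + 1, // 1-based, in navigation order
      name: crumb.name,
      item: crumb.url,
    })),
  };
}
```

For the trail shown, the final crumb would be `{ name: "AS9301B", url: "https://example.com/catalog/<id>" }`, pointing at the canonical part URL rather than a model-scoped one.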

7.5 Internal Linking

Part detail page: "Compatible models" section → links to model pages
Brand page: top models with part count → links to model pages
Category page: "Shop by brand" → links to brand × category pages

Deliverable: Top 20–30 model pages by GA4 priority. Full 4-level URL hierarchy for key brands.

Phase 8 — Semrush

Weeks 13–15 · June–July 2026

Goal: Define the keyword universe and set up systematic position monitoring.

Partially parallel with Phase 7 — does not depend on model pages.

8.1 Keyword Research

Build the semantic core across the matrix: brands × systems × models.

Query types:

Transactional: "briggs stratton 450e carburetor parts", "new holland baler belt buy"
Navigational: "AS9301B", "87546362 part" — direct part number searches
Informational: "how to replace kubota hydraulic filter"

Cluster by priority: search volume × difficulty × current GSC position.

8.2 Position Tracking

Top-50 target keywords in Semrush Position Tracking. Weekly updates. Comparison against competitors.

8.3 Content Gap Analysis

Competitors in the top 10 where we are absent → list of 10–15 quick wins (low difficulty, real traffic). For each: recommend either a new page or optimization of an existing one.

8.4 Site Audit

Semrush Site Audit on clintontractor.net/parts:

Orphan pages (pages with no internal links)
Duplicate content and canonical issues
Redirect chains
Core Web Vitals cross-check with GSC

Deliverable: Keyword universe defined. Position Tracking live. First gap analysis with quick wins.

Phase 9 — Realtime Analytics

Weeks 15–16 · July 2026

Goal: Understand user behavior in real time.

9.1 Tool: PostHog Cloud

Free up to 1M events/month
Session replay (after explicit consent)
Funnel analysis and heatmaps
Feature flags for A/B tests
User identification via Clerk userId

9.2 Integration

PostHog SDK in app/layout.tsx, gated by Cookie Consent
posthog.identify(clerkUserId) on sign-in
Session replay — explicit consent only (GDPR)
Two separate projects: dev / prod

9.3 Key Metrics

Funnel: Catalog → Detail → Cart → Purchase with exact drop-off % at each step
Navigation axis analysis: which axis (category vs brand) drives higher conversion
Search not found: % of queries with no results + the queries themselves
Exit pages: where users leave the site
Heatmaps: catalog page and product detail page

9.4 Alerts

purchase events fall below threshold → Slack/Email
404 rate spikes sharply → alert (broken sitemap or redirect)
Search not found rate > 30% → alert
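
The not-found alert reduces to a ratio check. The sketch below adds a minimum sample size so that a quiet hour with three failed searches does not page anyone; that guard and its default of 50 are assumptions, not part of the plan.

```typescript
interface SearchStats {
  totalSearches: number;
  zeroResultSearches: number;
}

// Share of searches that returned no results, as a fraction from 0 to 1.
function notFoundRate(stats: SearchStats): number {
  return stats.totalSearches === 0 ? 0 : stats.zeroResultSearches / stats.totalSearches;
}

function shouldAlertNotFound(
  stats: SearchStats,
  threshold: number = 0.3, // 30% from the plan
  minSearches: number = 50, // assumed guard against tiny samples
): boolean {
  if (stats.totalSearches < minSearches) return false;
  return notFoundRate(stats) > threshold;
}
```

With 200 searches in the window, 70 empty results (35%) trigger the alert and 60 (exactly 30%) do not, since the rule is strictly "greater than".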

Deliverable: PostHog collecting sessions. First heatmaps. Conversion funnel visible in real time.

Phase 10 — BI Dashboard

Weeks 17–20 · August 2026

Goal: The client receives clear data every Monday without having to ask.

10.1 Tool: Looker Studio

Free. Native GA4 + GSC integration. Scheduled email reports. Familiar Google interface for the client.

10.2 Data Sources

Source	Data
GA4	e-commerce funnel, sessions, conversion, navigation axis
GSC	rankings, CTR, impressions, Core Web Vitals
PostHog	funnel analysis, session replay links
MongoDB	quality tier distribution, catalogReady % by manufacturer
Semrush	competitor positions (manual export or API)

10.3 Four Dashboards

SEO Overview (weekly, for the client)

Organic sessions trend
Impressions + CTR + avg. position (GSC)
Top-10 keywords and weekly movement
New indexed pages vs errors
Core Web Vitals trend

Catalog Performance (daily, internal)

Top brands and categories by traffic
Top search queries + not-found rate
Navigation axis comparison: category vs brand → conversion rate
Parts with high views but no conversion

Sales Funnel (real-time via PostHog)

Catalog → Detail → Cart → Purchase %
Cart abandonment rate
Revenue by brand and category

Data Quality Report (monthly)

Part distribution by quality tier (month-over-month)
% of parts with images / price / description / fitment
catalogReady % by manufacturer
Progress vs the Phase 1 baseline

10.4 Automation

Looker Studio scheduled email: every Monday to client (SEO Overview + Catalog Performance)
GSC Weekly digest: automatic from Google
Alert: organic CTR drops > 15% week-over-week
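
The CTR alert compares two weekly GSC totals. This sketch interprets "drops > 15%" as a relative drop (from 4.0% to below 3.4%), not a drop of 15 percentage points; that interpretation is an assumption, and the type and function names are hypothetical.

```typescript
interface WeeklyTotals {
  clicks: number;
  impressions: number;
}

function ctr(week: WeeklyTotals): number {
  return week.impressions === 0 ? 0 : week.clicks / week.impressions;
}

// Relative change from previous to current; -0.2 means a 20% drop.
function relativeChange(previous: number, current: number): number {
  return previous === 0 ? 0 : (current - previous) / previous;
}

function ctrDropAlert(
  previousWeek: WeeklyTotals,
  currentWeek: WeeklyTotals,
  threshold: number = 0.15,
): boolean {
  return relativeChange(ctr(previousWeek), ctr(currentWeek)) < -threshold;
}
```

For example, 400 clicks on 10,000 impressions followed by 330 clicks on 10,000 impressions is a 17.5% relative drop and triggers the alert; 350 clicks (12.5%) does not.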

Deliverable: 4 dashboards live. Automated report delivered to client every Monday.

Timeline

March       Phase 1   ████████████  Data Standard + JSON Schema
April       Phase 2   ████████      Images: alt tags + XMP
            Phase 3     ████████    JSON-LD: full coverage
May         Phase 4   ██████        URL: Brand → System pages
            Phase 5       ████      GSC + Indexing
            Phase 6         ██████  GA4
June        Phase 6   ████          GA4 (completion)
            Phase 7       ██████    URL: Brand → Model pages
            Phase 8     ████████    Semrush
July        Phase 8   ████          Semrush (completion)
            Phase 9       ████████  Realtime Analytics
August      Phase 10  ████████████  BI Dashboard

Sequence without exceptions:

Data + Schema  →  Images  →  JSON-LD  →  Brand URLs  →  Indexing
                                                          ↓
                                         GA4  →  Model URLs  →  Semrush  →  Realtime  →  BI

KPIs — Realistic Expectations

Metric	Baseline (March)	3 months (June)	6 months (August)
catalogReady %	❓ (Phase 1)	baseline + 20%	baseline + 40%
Image alt coverage	~30% (fallback)	100%	100%
JSON-LD page type coverage	3 of 10	10 of 10	10 of 10
Brand + model URL pages	0	30–50 brand × system	+ 20–30 model pages
Organic sessions/month	0	500–2,000	4,000–10,000
GSC Impressions/month	0	15,000–40,000	60,000–120,000
Avg. position (target keywords)	—	40–60	20–40
Core Web Vitals (LCP)	❓	< 3.0s	< 2.5s
GA4 e-commerce coverage	0%	100%	100%
Conversion rate	❓	baseline	baseline + 10–15%

Organic traffic from Google typically appears 3–5 months after indexing — this is normal for a site that Google has not seen before. Rankings in the first month will be low. Growth on model and brand × system pages will be gradual and will accelerate in months 5–6.

Dependencies and Risks

Risk	Impact	Mitigation
Indexing before data is ready	Google records low-quality content — hard to recover from	Phases 1–4 are a hard blocker for Phase 5
Model pages with < 10 parts	Thin content penalty	Enforce minimum threshold, noindex small pages
Cloudflare Workers blocks GA4	Data loss on production	Measurement Protocol as server-side fallback
Cookie Consent blocks analytics	Incomplete data	PostHog anonymous mode before consent
Fitment data not clean by Phase 7	Model pages cannot be built	Phase 1 standardization is a prerequisite
XMP standard not implemented by backend	Images without embedded metadata	Frontend reads XMP if present, does not block release

Technical Dependencies

New env vars (add to lib/env.ts and all 6 sync targets per project convention):

NEXT_PUBLIC_GA_MEASUREMENT_ID
NEXT_PUBLIC_POSTHOG_KEY
ENABLE_INDEXING ← already exists, false everywhere except CF production

ENABLE_INDEXING=true — production Cloudflare only, never on staging.

SEO Infrastructure — current SEO implementation details
crop-front Overview — frontend architecture
