AI Integration Audit & Refactoring Plan

Executive Summary

Audit of Vercel AI Gateway integration in CROP-front (customer-facing site).


Current State Analysis

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                        CROP-front                                │
├─────────────────────────────────────────────────────────────────┤
│  app/page.tsx                                                    │
│  └── useAiChat (hooks/use-ai-chat.ts)                           │
│       └── sendMessageStream (services/ai-chat/client.ts)        │
│            └── /api/ai/query (Next.js proxy)                    │
│                 └── Cloud Run vLLM (LLaMA 3.1 8B)               │
├─────────────────────────────────────────────────────────────────┤
│  NEW: app/api/chat/route.ts (Vercel AI SDK)                     │
│       └── lib/ai/provider.ts                                     │
│            └── @ai-sdk/gateway → Claude/GPT-4                   │
└─────────────────────────────────────────────────────────────────┘

Files Inventory

File                       | Purpose                        | Status
lib/ai/provider.ts         | AI provider configuration      | ✅ Created
app/api/chat/route.ts      | Vercel AI streaming endpoint   | ✅ Created
hooks/use-ai-chat.ts       | Custom chat hook (legacy)      | ✅ Compatible
hooks/use-vercel-chat.ts   | Vercel AI SDK hook             | ✅ Created
hooks/use-chat-provider.ts | Unified hook with feature flag | ✅ Created
services/ai-chat/client.ts | vLLM client                    | ✅ Legacy (fallback)
app/page.tsx               | Main chat UI                   | ✅ Uses useChatProvider

Critical Issues

1. Dual Implementation Without Integration

Problem: New Vercel AI SDK files created but not connected to frontend.

// app/page.tsx - still uses legacy hook
const { messages, isLoading, error, sendMessage } = useAiChat({
  streaming: true,
  skipAuth: true,
});

Impact: Users still hitting vLLM backend, not Vercel AI Gateway.

2. Fake Streaming in Legacy Client

Problem: sendMessageStream simulates streaming by chunking the complete response.

// services/ai-chat/client.ts:190-206
export async function* sendMessageStream(...) {
  const response = await sendMessage(...); // Wait for FULL response
  const content = response.content;
  for (let i = 0; i < content.length; i += chunkSize) {
    yield { type: "content", content: content.slice(i, i + chunkSize) };
    await new Promise((resolve) => setTimeout(resolve, 10)); // Fake delay!
  }
}

Impact: Poor UX - user waits for full response before seeing anything.
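
For contrast, the new app/api/chat/route.ts path streams tokens as the model produces them. A minimal sketch of that pattern, assuming a getModel() export from lib/ai/provider.ts (the actual export name is not shown in this audit):

// Sketch only - real token streaming via the Vercel AI SDK
import { streamText } from "ai";
import { getModel } from "@/lib/ai/provider"; // assumed export

async function demoRealStreaming() {
  const result = streamText({
    model: getModel(),
    prompt: "Find oil filters for New Holland",
  });
  for await (const delta of result.textStream) {
    // Each delta arrives as it is generated - nothing waits for the full response
    process.stdout.write(delta);
  }
}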

3. Gateway Fallback Not Working

Problem: getGatewayProviderOptions() returns incompatible type.

// lib/ai/provider.ts:84
export function getGatewayProviderOptions(): Record<string, unknown> | undefined
// But streamText expects: SharedV3ProviderOptions | undefined

Impact: No automatic fallback when Anthropic fails.

4. Inconsistent Message Types

Problem: Legacy ChatMessage type differs from Vercel AI SDK UIMessage.

// services/ai-chat/types.ts
interface ChatMessage {
  id: string;
  role: "user" | "assistant" | "system";
  content: string;
  createdAt: string;
  metadata?: ChatMessageMetadata;
}

// @ai-sdk/react UIMessage
interface UIMessage {
  id: string;
  role: "user" | "assistant" | "system";
  parts: MessagePart[];
  // Different structure!
}
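
During migration, a small adapter can bridge the two shapes. A minimal sketch, assuming Vercel AI SDK text parts have the shape { type: "text", text: string }; the helper below is hypothetical, not an existing file:

// Hypothetical adapter - maps the legacy message shape onto a UIMessage-like structure
import type { ChatMessage } from "@/services/ai-chat/types";

export function toUiMessage(msg: ChatMessage) {
  return {
    id: msg.id,
    role: msg.role,
    parts: [{ type: "text" as const, text: msg.content }],
  };
}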

Recommendations

Option A: Migrate to Vercel AI SDK

Pros:

  • Real streaming from Claude/GPT-4
  • Unified codebase
  • Better UX
  • Automatic fallback support

Cons:

  • Cost increase (cloud AI vs self-hosted vLLM)
  • Migration effort

Implementation:

  1. Create hooks/use-vercel-chat.ts wrapper around useChat
  2. Update app/page.tsx to use new hook when AI_PROVIDER !== "vllm"
  3. Keep legacy path for cost optimization scenarios

Option B: Feature Flag Approach (Conservative)

Pros:

  • Gradual rollout
  • Easy rollback
  • A/B testing possible

Cons:

  • Dual maintenance
  • Code complexity

Implementation:

  1. Add NEXT_PUBLIC_USE_VERCEL_AI env variable
  2. Create unified hook that switches implementation (sketch below)
  3. Both paths coexist
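
The files inventory above already lists hooks/use-chat-provider.ts as this unified hook. A minimal sketch of what it could look like (the actual file may differ); both existing hooks are assumed to expose a compatible messages/isLoading/error/sendMessage surface:

// hooks/use-chat-provider.ts - sketch only
"use client";

import { useAiChat } from "@/hooks/use-ai-chat";
import { useVercelChat } from "@/hooks/use-vercel-chat";

// Resolved at build time, so every render takes the same branch
const useVercelAi = process.env.NEXT_PUBLIC_USE_VERCEL_AI === "true";

export function useChatProvider() {
  // Both hooks are called unconditionally to keep the hook order stable
  const legacy = useAiChat({ streaming: true, skipAuth: true });
  const vercel = useVercelChat();
  return useVercelAi ? vercel : legacy;
}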

Option C: Hybrid Architecture (Cost-Optimized)

Pros:

  • Best of both worlds
  • Cost control

Cons:

  • Most complex

Implementation:

  1. Simple queries → vLLM (LLaMA, cheap)
  2. Complex queries → Vercel AI Gateway (Claude, quality)
  3. Automatic routing based on query complexity (see the heuristic sketch below)
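
The routing itself can start as a plain heuristic. A sketch with illustrative rules only; the keyword list and length threshold are assumptions, not measured values:

// Hypothetical complexity-based routing heuristic
type Provider = "vllm" | "gateway";

export function pickProvider(query: string): Provider {
  const needsTools = /\b(stock|availability|part number|compare)\b/i.test(query);
  const isLong = query.length > 200;
  // Cheap lookups stay on self-hosted LLaMA; tool-heavy or long queries go to the gateway
  return needsTools || isLong ? "gateway" : "vllm";
}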

Refactoring Plan

Phase 1: Fix Type Issues (Day 1)

// lib/ai/provider.ts - Remove unused function
// getGatewayProviderOptions() is incompatible; the Gateway handles fallback internally

Phase 2: Create Vercel Chat Hook (Day 1)

// hooks/use-vercel-chat.ts
"use client";

import { useChat } from "@ai-sdk/react";

export function useVercelChat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading, error, stop, reload } =
    useChat({
      api: "/api/chat",
      onError: (err) => console.error("Chat error:", err),
    });

  return {
    messages,
    input,
    setInput: (value: string) => handleInputChange({ target: { value } } as any),
    sendMessage: handleSubmit,
    isLoading,
    error,
    stop,
    regenerate: reload,
  };
}

Phase 3: Update Frontend (Day 2)

// app/page.tsx
import { useVercelChat } from "@/hooks/use-vercel-chat";
import { useAiChat } from "@/hooks/use-ai-chat";

const useChat = process.env.NEXT_PUBLIC_USE_VERCEL_AI === "true"
  ? useVercelChat
  : useAiChat;

Phase 4: Add Stop/Regenerate UI (Day 2)

Update the message display to support (component sketch after the list):

  • Stop streaming button
  • Regenerate last response button
  • Better loading states
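
A minimal component sketch for the stop/regenerate controls, assuming the unified hook exposes isLoading, stop, and regenerate as planned in Phase 2; the component name and file path are hypothetical:

// components/chat-controls.tsx - sketch only
export function ChatControls({
  isLoading,
  stop,
  regenerate,
}: {
  isLoading: boolean;
  stop: () => void;
  regenerate: () => void;
}) {
  // While streaming, offer Stop; once finished, offer Regenerate for the last response
  return isLoading ? (
    <button onClick={stop}>Stop</button>
  ) : (
    <button onClick={regenerate}>Regenerate</button>
  );
}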

Phase 5: Testing (Day 3)

  1. Unit tests for hooks
  2. E2E tests with Playwright (sketch below)
  3. A/B testing in staging
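
A starting point for the E2E check, as a sketch; the test id, prompt, and file path are placeholders:

// e2e/chat.spec.ts - sketch only
import { expect, test } from "@playwright/test";

test("assistant reply streams into the chat", async ({ page }) => {
  await page.goto("/");
  await page.getByRole("textbox").fill("Do you have oil filters for New Holland?");
  await page.keyboard.press("Enter");
  // Assert on the first visible assistant message rather than waiting for the full response
  await expect(page.getByTestId("assistant-message").first()).toBeVisible({ timeout: 10_000 });
});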

Environment Variables

Required for Vercel AI

# Provider selection
AI_PROVIDER="gateway"  # or "openai", "anthropic", "vllm"

# Feature flag for frontend
NEXT_PUBLIC_USE_VERCEL_AI="true"

# API keys (for non-gateway providers)
OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="sk-ant-..."
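
For reference, a sketch of how lib/ai/provider.ts could branch on AI_PROVIDER; the getModel name and model IDs are placeholders, the real file may differ, and the "vllm" value keeps the legacy proxy path rather than going through this function:

// lib/ai/provider.ts - sketch only
import { anthropic } from "@ai-sdk/anthropic";
import { gateway } from "@ai-sdk/gateway";
import { openai } from "@ai-sdk/openai";

export function getModel() {
  switch (process.env.AI_PROVIDER) {
    case "openai":
      return openai("gpt-4o"); // direct provider, uses OPENAI_API_KEY
    case "anthropic":
      return anthropic("claude-sonnet-4-20250514"); // direct provider, uses ANTHROPIC_API_KEY
    default:
      // "gateway": model IDs use the "creator/model" form; fallback is handled by the gateway
      return gateway("anthropic/claude-sonnet-4");
  }
}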

Vercel Dashboard Setup

  1. Enable AI Gateway in Vercel project settings
  2. Connect OpenAI and Anthropic providers
  3. Set up spend limits and alerts

File Structure After Refactoring

lib/
├── ai/
│   ├── provider.ts          # Server-side provider config
│   └── index.ts             # Re-exports
hooks/
├── use-ai-chat.ts           # Legacy vLLM hook
├── use-vercel-chat.ts       # NEW: Vercel AI SDK hook
└── use-chat-provider.ts     # NEW: Unified hook with feature flag
app/
├── api/
│   ├── ai/query/route.ts    # Legacy vLLM proxy
│   └── chat/route.ts        # Vercel AI streaming
└── page.tsx                  # Uses unified hook

Success Metrics

  1. Latency: Time to first token < 500ms (rough measurement sketch below)
  2. Streaming: Real progressive display (not fake chunking)
  3. Fallback: Automatic provider switch on errors
  4. Cost: Track per-provider spend in Vercel dashboard
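
The first metric can be spot-checked from the browser. A rough sketch, where the request body shape is an assumption about what /api/chat accepts:

// Rough time-to-first-token measurement against /api/chat (sketch only)
export async function measureTtft(prompt: string): Promise<number> {
  const started = performance.now();
  const res = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
  });
  const reader = res.body!.getReader();
  await reader.read(); // first streamed chunk
  return performance.now() - started;
}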

Rollback Plan

If issues arise:

  1. Set NEXT_PUBLIC_USE_VERCEL_AI="false"
  2. Users automatically switch to vLLM backend
  3. No code changes needed

Next Steps

  1. Review and approve this plan
  2. Implement Phase 1-2 (hooks)
  3. Implement Phase 3 (update page.tsx to use useChatProvider)
  4. Implement AI Tools for Elasticsearch integration
  5. Test in development with NEXT_PUBLIC_USE_VERCEL_AI=true
  6. Deploy to staging with feature flag OFF
  7. Gradual rollout (10% → 50% → 100%)

Phase 4: AI Tools Integration (Completed)

New Files Created

File            | Purpose
lib/ai/tools.ts | AI tools for Elasticsearch catalog queries

AI Tools Implemented

// lib/ai/tools.ts exports:
export const cropAiTools = {
  searchParts,      // Search parts catalog with filters
  getPartDetails,   // Get detailed part information
  autocomplete,     // Get search suggestions
  checkAvailability // Check stock status for part numbers
};
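
For illustration, one tool spelled out, assuming the AI SDK tool() helper with a zod schema (in AI SDK 5 the schema field is inputSchema); the searchCatalog call stands in for the real Elasticsearch service, and the field names are assumptions:

// Sketch of a single tool definition
import { tool } from "ai";
import { z } from "zod";

// Hypothetical Elasticsearch-backed search living in the services layer
declare function searchCatalog(params: {
  query: string;
  manufacturer?: string;
  inStockOnly?: boolean;
}): Promise<unknown>;

export const searchParts = tool({
  description: "Search the parts catalog by free-text query with optional filters.",
  inputSchema: z.object({
    query: z.string().describe("Free-text search, e.g. 'oil filter New Holland'"),
    manufacturer: z.string().optional(),
    inStockOnly: z.boolean().optional(),
  }),
  execute: async ({ query, manufacturer, inStockOnly }) => {
    try {
      return await searchCatalog({ query, manufacturer, inStockOnly });
    } catch (err) {
      console.error("searchParts failed:", err);
      return { error: "Search is temporarily unavailable, please try again." };
    }
  },
});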

Tool Definitions

Tool              | Description                                                   | Use Case
searchParts       | Search parts catalog by query, manufacturer, category, stock | "Find oil filters for New Holland"
getPartDetails    | Get complete details for a specific part                     | "Tell me more about part ABC123"
autocomplete      | Get search suggestions for partial queries                   | Help refine search terms
checkAvailability | Check stock status for multiple parts                        | "Do you have these parts in stock?"

Route Integration

// app/api/chat/route.ts
import { convertToModelMessages, stepCountIs, streamText } from "ai";
import { cropAiTools } from "@/lib/ai/tools";

const result = streamText({
  model,
  system: getSystemPrompt(),
  messages: await convertToModelMessages(messages),
  tools: cropAiTools,
  stopWhen: stepCountIs(5), // Allow up to 5 sequential tool calls
});

System Prompt Updates

The system prompt now includes the following; a sketch follows the list:

  • Available tools description
  • Usage guidelines for each tool
  • Examples of when to use each tool
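
A sketch of the shape getSystemPrompt() might take; the wording is illustrative, not the actual prompt:

// Sketch only - the real prompt text is not reproduced in this audit
export function getSystemPrompt(): string {
  return [
    "You are the CROP parts assistant for the customer-facing catalog.",
    "Use searchParts to find catalog items and getPartDetails for a specific part.",
    "Use autocomplete to refine vague queries and checkAvailability before quoting stock.",
    "Only state stock levels returned by the tools.",
  ].join("\n");
}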

Technical Decisions

  1. Multi-step Tool Calls: Using stopWhen: stepCountIs(5) allows AI to:

    • Search for parts
    • Get details for specific results
    • Chain multiple tool calls in one conversation turn
  2. Type Safety:

    • Used buildSearchParams() helper to handle exactOptionalPropertyTypes (see the sketch after this list)
    • Used isAutocompleteSectionsResponse() type guard for union types
  3. Error Handling:

    • Each tool has try/catch with user-friendly error messages
    • Logging for debugging server-side issues
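
The idea behind the buildSearchParams() helper, as a sketch (the real helper may differ): with exactOptionalPropertyTypes enabled, an optional property must be omitted rather than assigned undefined, so undefined values are dropped before building the search request.

// Sketch - drop undefined values so optional properties are omitted, not set to undefined
export function buildSearchParams<T extends Record<string, unknown>>(input: T): Partial<T> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    if (value !== undefined) {
      out[key] = value;
    }
  }
  return out as Partial<T>;
}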
