# AI Integration Audit & Refactoring Plan

## Executive Summary

Audit of the Vercel AI Gateway integration in CROP-front (the customer-facing site).
## Current State Analysis

### Architecture Overview

```text
┌─────────────────────────────────────────────────────────────────┐
│ CROP-front │
├─────────────────────────────────────────────────────────────────┤
│ app/page.tsx │
│ └── useAiChat (hooks/use-ai-chat.ts) │
│ └── sendMessageStream (services/ai-chat/client.ts) │
│ └── /api/ai/query (Next.js proxy) │
│ └── Cloud Run vLLM (LLaMA 3.1 8B) │
├─────────────────────────────────────────────────────────────────┤
│ NEW: app/api/chat/route.ts (Vercel AI SDK) │
│ └── lib/ai/provider.ts │
│ └── @ai-sdk/gateway → Claude/GPT-4 │
└─────────────────────────────────────────────────────────────────┘
```

### Files Inventory
| File | Purpose | Status |
|---|---|---|
| `lib/ai/provider.ts` | AI provider configuration | ✅ Created |
| `app/api/chat/route.ts` | Vercel AI streaming endpoint | ✅ Created |
| `hooks/use-ai-chat.ts` | Custom chat hook (legacy) | ✅ Compatible |
| `hooks/use-vercel-chat.ts` | Vercel AI SDK hook | ✅ Created |
| `hooks/use-chat-provider.ts` | Unified hook with feature flag | ✅ Created |
| `services/ai-chat/client.ts` | vLLM client | ✅ Legacy (fallback) |
| `app/page.tsx` | Main chat UI | ✅ Uses `useChatProvider` |
## Critical Issues

### 1. Dual Implementation Without Integration

**Problem**: The new Vercel AI SDK files were created but are not connected to the frontend.

```tsx
// app/page.tsx - still uses legacy hook
const { messages, isLoading, error, sendMessage } = useAiChat({
streaming: true,
skipAuth: true,
});
```

**Impact**: Users are still hitting the vLLM backend, not the Vercel AI Gateway.
### 2. Fake Streaming in Legacy Client

**Problem**: `sendMessageStream` simulates streaming by chunking an already-complete response.

```ts
// services/ai-chat/client.ts:190-206
export async function* sendMessageStream(...) {
const response = await sendMessage(...); // Wait for FULL response
const content = response.content;
for (let i = 0; i < content.length; i += chunkSize) {
yield { type: "content", content: content.slice(i, i + chunkSize) };
await new Promise((resolve) => setTimeout(resolve, 10)); // Fake delay!
}
}
```

**Impact**: Poor UX; the user waits for the full response before seeing anything.
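For contrast, a minimal sketch of genuine client-side streaming, assuming a hypothetical endpoint that writes tokens to the response body as they are generated (the function name and endpoint are illustrative, not the current client code):

```ts
// Hypothetical replacement: yield tokens as the server produces them,
// instead of buffering the whole reply and re-chunking it with fake delays.
export async function* sendMessageStreamReal(
  prompt: string,
): AsyncGenerator<{ type: "content"; content: string }> {
  const response = await fetch("/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
  });
  if (!response.body) throw new Error("Response has no body to stream");

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each chunk is yielded the moment it arrives - no artificial delay.
    yield { type: "content", content: decoder.decode(value, { stream: true }) };
  }
}
```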
### 3. Gateway Fallback Not Working

**Problem**: `getGatewayProviderOptions()` returns an incompatible type.

```ts
// lib/ai/provider.ts:84
export function getGatewayProviderOptions(): Record<string, unknown> | undefined
// But streamText expects: SharedV3ProviderOptions | undefined
```

**Impact**: No automatic fallback when Anthropic fails.
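If explicit fallback ordering is still desired, one option is to pass provider options inline, where `streamText` can type-check them, rather than through a loosely typed helper. A sketch only; the exact `gateway` options shape (e.g. `order`) should be verified against the installed SDK version:

```ts
import { streamText } from "ai";
import { gateway } from "@ai-sdk/gateway";

// Sketch: inline provider options are checked against streamText's expected
// shape instead of being widened to Record<string, unknown>.
const result = streamText({
  model: gateway("anthropic/claude-3-5-sonnet"),
  prompt: "Hello",
  providerOptions: {
    gateway: {
      // Assumed option: try Anthropic first, then fall back to OpenAI.
      order: ["anthropic", "openai"],
    },
  },
});
```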
### 4. Inconsistent Message Types

**Problem**: The legacy `ChatMessage` type differs from the Vercel AI SDK's `UIMessage`.

```ts
// services/ai-chat/types.ts
interface ChatMessage {
id: string;
role: "user" | "assistant" | "system";
content: string;
createdAt: string;
metadata?: ChatMessageMetadata;
}
// @ai-sdk/react UIMessage
interface UIMessage {
id: string;
role: "user" | "assistant" | "system";
parts: MessagePart[];
// Different structure!
}
```
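During migration, a small adapter can bridge the two shapes. A sketch, assuming the AI SDK v5 `UIMessage` export from `ai` and handling text parts only:

```ts
import type { UIMessage } from "ai";
import type { ChatMessage } from "@/services/ai-chat/types";

// Sketch: flatten a UIMessage's text parts into the legacy single-string
// shape so existing rendering code keeps working during the migration.
export function toLegacyMessage(message: UIMessage): ChatMessage {
  const content = message.parts
    .map((part) => (part.type === "text" ? part.text : ""))
    .join("");
  return {
    id: message.id,
    role: message.role,
    content,
    createdAt: new Date().toISOString(), // UIMessage carries no timestamp here
  };
}
```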
## Recommendations

### Option A: Full Migration to Vercel AI SDK (Recommended)
**Pros:**
- Real streaming from Claude/GPT-4
- Unified codebase
- Better UX
- Automatic fallback support

**Cons:**
- Cost increase (cloud AI vs. self-hosted vLLM)
- Migration effort

**Implementation:**
- Create a `hooks/use-vercel-chat.ts` wrapper around `useChat`
- Update `app/page.tsx` to use the new hook when `AI_PROVIDER !== "vllm"`
- Keep the legacy path for cost-optimization scenarios
### Option B: Feature Flag Approach (Conservative)

**Pros:**
- Gradual rollout
- Easy rollback
- A/B testing possible

**Cons:**
- Dual maintenance
- Code complexity

**Implementation:**
- Add a `NEXT_PUBLIC_USE_VERCEL_AI` env variable
- Create a unified hook that switches implementations (see the sketch below)
- Both paths coexist
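A minimal sketch of such a unified hook; normalizing the two hooks' different return shapes is elided:

```ts
// hooks/use-chat-provider.ts - sketch of a flag-switched unified hook.
"use client";

import { useAiChat } from "./use-ai-chat";
import { useVercelChat } from "./use-vercel-chat";

// NEXT_PUBLIC_* vars are inlined at build time, so this condition is constant
// for the lifetime of the bundle: every render calls the same hook, and the
// Rules of Hooks are not violated in practice.
const USE_VERCEL_AI = process.env.NEXT_PUBLIC_USE_VERCEL_AI === "true";

export function useChatProvider() {
  if (USE_VERCEL_AI) {
    return useVercelChat();
  }
  return useAiChat({ streaming: true, skipAuth: true });
}
```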
### Option C: Hybrid Architecture (Cost-Optimized)

**Pros:**
- Best of both worlds
- Cost control

**Cons:**
- Most complex

**Implementation:**
- Simple queries → vLLM (LLaMA, cheap)
- Complex queries → Vercel AI Gateway (Claude, quality)
- Automatic routing based on query complexity (see the sketch below)
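A sketch of what the router might look like; the heuristic and threshold are illustrative assumptions, not measured values:

```ts
// Sketch: route cheap/simple queries to vLLM, everything else to the Gateway.
type ChatBackend = "vllm" | "gateway";

export function routeQuery(query: string): ChatBackend {
  const words = query.trim().split(/\s+/).length;
  // Illustrative heuristic: short lookups stay on the self-hosted model;
  // long or multi-step questions go to the higher-quality cloud model.
  const looksComplex =
    words > 25 || /\b(compare|explain|why|recommend)\b/i.test(query);
  return looksComplex ? "gateway" : "vllm";
}
```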
## Refactoring Plan

### Phase 1: Fix Type Issues (Day 1)

```ts
// lib/ai/provider.ts - Remove unused function
// getGatewayProviderOptions() is incompatible; the Gateway handles fallback internally.
```

### Phase 2: Create Vercel Chat Hook (Day 1)

```ts
// hooks/use-vercel-chat.ts
"use client";
import { useChat } from "@ai-sdk/react";
export function useVercelChat() {
const { messages, input, handleInputChange, handleSubmit, isLoading, error, stop, reload } =
useChat({
api: "/api/chat",
onError: (err) => console.error("Chat error:", err),
});
return {
messages,
input,
setInput: (value: string) => handleInputChange({ target: { value } } as any),
sendMessage: handleSubmit,
isLoading,
error,
stop,
regenerate: reload,
};
}
```

### Phase 3: Update Frontend (Day 2)

```tsx
// app/page.tsx
import { useVercelChat } from "@/hooks/use-vercel-chat";
import { useAiChat } from "@/hooks/use-ai-chat";
const useChat = process.env.NEXT_PUBLIC_USE_VERCEL_AI === "true"
  ? useVercelChat
  : useAiChat;
```

### Phase 4: Add Stop/Regenerate UI (Day 2)
Update the message display to support the following (see the sketch after this list):
- Stop streaming button
- Regenerate last response button
- Better loading states
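A sketch of how the controls could be wired to the hook's return values; the component and markup are illustrative, not the actual CROP-front UI:

```tsx
// Sketch: stop/regenerate controls driven by the Vercel chat hook.
"use client";

import { useVercelChat } from "@/hooks/use-vercel-chat";

export function ChatControls() {
  const { isLoading, stop, regenerate } = useVercelChat();

  return (
    <div className="chat-controls">
      {isLoading ? (
        // While streaming, offer to abort the in-flight response.
        <button onClick={stop}>Stop</button>
      ) : (
        // Otherwise, offer to regenerate the last assistant message.
        <button onClick={() => regenerate()}>Regenerate</button>
      )}
    </div>
  );
}
```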
### Phase 5: Testing (Day 3)

- Unit tests for hooks
- E2E tests with Playwright (see the sketch below)
- A/B testing in staging
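A sketch of one E2E case; the selectors and test id are assumptions about the page markup:

```ts
// e2e/chat.spec.ts - hypothetical Playwright sketch.
import { test, expect } from "@playwright/test";

test("chat streams an assistant reply", async ({ page }) => {
  await page.goto("/");
  await page.getByRole("textbox").fill("Find oil filters for New Holland");
  await page.getByRole("textbox").press("Enter");
  // The assistant message should appear well before the full reply completes,
  // which distinguishes real streaming from the legacy buffered behavior.
  await expect(page.getByTestId("assistant-message")).toBeVisible({ timeout: 5_000 });
});
```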
## Environment Variables

### Required for Vercel AI

```bash
# Provider selection
AI_PROVIDER="gateway" # or "openai", "anthropic", "vllm"
# Feature flag for frontend
NEXT_PUBLIC_USE_VERCEL_AI="true"
# API keys (for non-gateway providers)
OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="sk-ant-..."
```
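For reference, a minimal sketch of how `lib/ai/provider.ts` might map `AI_PROVIDER` to a model. The model IDs and the `VLLM_BASE_URL` variable are illustrative assumptions, not the actual configuration:

```ts
// lib/ai/provider.ts - illustrative sketch; model IDs are assumptions.
import { gateway } from "@ai-sdk/gateway";
import { anthropic } from "@ai-sdk/anthropic";
import { createOpenAI, openai } from "@ai-sdk/openai";

export function getModel() {
  switch (process.env.AI_PROVIDER) {
    case "openai":
      return openai("gpt-4o");
    case "anthropic":
      return anthropic("claude-3-5-sonnet-latest");
    case "vllm":
      // vLLM exposes an OpenAI-compatible API, so it can be reached through
      // a custom-base-URL OpenAI provider (VLLM_BASE_URL is hypothetical).
      return createOpenAI({ baseURL: process.env.VLLM_BASE_URL })("llama-3.1-8b");
    default:
      // Gateway slugs are "creator/model"; fallback is handled by the Gateway.
      return gateway("anthropic/claude-3-5-sonnet");
  }
}
```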
### Vercel Dashboard Setup

- Enable AI Gateway in Vercel project settings
- Connect OpenAI and Anthropic providers
- Set up spend limits and alerts
## File Structure After Refactoring

```text
lib/
├── ai/
│ ├── provider.ts # Server-side provider config
│ └── index.ts # Re-exports
hooks/
├── use-ai-chat.ts # Legacy vLLM hook
├── use-vercel-chat.ts # NEW: Vercel AI SDK hook
└── use-chat-provider.ts   # NEW: Unified hook with feature flag
app/
├── api/
│ ├── ai/query/route.ts # Legacy vLLM proxy
│ └── chat/route.ts # Vercel AI streaming
└── page.tsx              # Uses unified hook
```

## Success Metrics
- **Latency**: Time to first token < 500 ms
- **Streaming**: Real progressive display (not fake chunking)
- **Fallback**: Automatic provider switch on errors
- **Cost**: Track per-provider spend in the Vercel dashboard
## Rollback Plan

If issues arise:
- Set `NEXT_PUBLIC_USE_VERCEL_AI="false"`
- Users automatically switch back to the vLLM backend
- No code changes are needed
## Next Steps

- Review and approve this plan
- Implement Phases 1-2 (hooks)
- Implement Phase 3 (update `app/page.tsx` to use `useChatProvider`)
- Implement AI tools for the Elasticsearch integration
- Test in development with `NEXT_PUBLIC_USE_VERCEL_AI=true`
- Deploy to staging with the feature flag OFF
- Gradual rollout (10% → 50% → 100%)
## Phase 4: AI Tools Integration (Completed)

### New Files Created
| File | Purpose |
|---|---|
| `lib/ai/tools.ts` | AI tools for Elasticsearch catalog queries |
### AI Tools Implemented

```ts
// lib/ai/tools.ts exports:
export const cropAiTools = {
searchParts, // Search parts catalog with filters
getPartDetails, // Get detailed part information
autocomplete, // Get search suggestions
checkAvailability // Check stock status for part numbers
};
```

### Tool Definitions
| Tool | Description | Use Case |
|---|---|---|
| `searchParts` | Search parts catalog by query, manufacturer, category, stock | "Find oil filters for New Holland" |
| `getPartDetails` | Get complete details for a specific part | "Tell me more about part ABC123" |
| `autocomplete` | Get search suggestions for partial queries | Help refine search terms |
| `checkAvailability` | Check stock status for multiple parts | "Do you have these parts in stock?" |
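As an illustration, a sketch of how one of these tools could be defined with the AI SDK's `tool()` helper and `zod`. The `searchCatalog` service import and the exact input fields are assumptions, not the actual contents of `lib/ai/tools.ts` (note: AI SDK v5 calls the schema field `inputSchema`; v4 called it `parameters`):

```ts
// lib/ai/tools.ts - illustrative sketch of one tool definition.
import { tool } from "ai";
import { z } from "zod";
import { searchCatalog } from "@/services/catalog"; // hypothetical service

export const searchParts = tool({
  description: "Search the parts catalog by free-text query with optional filters.",
  inputSchema: z.object({
    query: z.string().describe("Free-text search, e.g. 'oil filter'"),
    manufacturer: z.string().optional(),
    inStockOnly: z.boolean().optional(),
  }),
  execute: async ({ query, manufacturer, inStockOnly }) => {
    try {
      return await searchCatalog({ query, manufacturer, inStockOnly });
    } catch (err) {
      // User-friendly error surfaced to the model instead of a thrown exception.
      console.error("searchParts failed:", err);
      return { error: "Part search is temporarily unavailable." };
    }
  },
});
```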
### Route Integration

```ts
// app/api/chat/route.ts
import { convertToModelMessages, stepCountIs, streamText } from "ai";
import { cropAiTools } from "@/lib/ai/tools";
const result = streamText({
model,
system: getSystemPrompt(),
  messages: convertToModelMessages(messages), // convertToModelMessages is synchronous
tools: cropAiTools,
stopWhen: stepCountIs(5), // Allow up to 5 sequential tool calls
});
```

### System Prompt Updates
The system prompt now includes the following (see the sketch after this list):
- Available tools description
- Usage guidelines for each tool
- Examples of when to use each tool
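A sketch of the shape such a prompt builder might take; the wording is illustrative, not the production prompt:

```ts
// Sketch of a getSystemPrompt() shape; the actual prompt text is an assumption.
export function getSystemPrompt(): string {
  return [
    "You are the CROP parts-catalog assistant.",
    "Tools available: searchParts, getPartDetails, autocomplete, checkAvailability.",
    "Use searchParts for free-text catalog queries, e.g. 'oil filters for New Holland'.",
    "Use getPartDetails when the user asks about one specific part number.",
    "Use autocomplete to suggest refinements for short or ambiguous queries.",
    "Use checkAvailability when the user asks about stock for known part numbers.",
    "Never invent part numbers, prices, or stock levels; rely on tool results.",
  ].join("\n");
}
```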
### Technical Decisions

1. **Multi-step Tool Calls**: Using `stopWhen: stepCountIs(5)` allows the AI to:
   - Search for parts
   - Get details for specific results
   - Chain multiple tool calls in one conversation turn

2. **Type Safety**:
   - Used a `buildSearchParams()` helper to handle `exactOptionalPropertyTypes`
   - Used an `isAutocompleteSectionsResponse()` type guard for union types

3. **Error Handling**:
   - Each tool has try/catch with user-friendly error messages
   - Logging for debugging server-side issues