AI integration
This boilerplate includes integrations for multiple AI providers with support for streaming responses, making it easy to build AI-powered features.
Overview
The AI integration provides:
- Multiple providers - OpenAI (GPT-4), Anthropic (Claude), and Grok (xAI).
- Streaming responses - Real-time token-by-token output.
- Type-safe API - Zod validation for requests.
- Server-side processing - Secure API key management.
- Flexible configuration - Easy to add more providers.
Configuration
Environment variables
Add your AI provider API keys to .env:
# OpenAI (for GPT models)
OPENAI_API_KEY="sk-..."
# Anthropic (for Claude models)
ANTHROPIC_API_KEY="sk-ant-..."
# Grok / xAI (for Grok models)
GROK_API_KEY="xai-..."
API endpoint
The streaming API endpoint is located in server/api/ai/stream.ts:
import { OpenAI } from 'openai'
import Anthropic from '@anthropic-ai/sdk'
import { z } from 'zod'

const StreamRequestSchema = z.object({
  model: z.enum(['chatgpt', 'claude', 'grok']),
  prompt: z.string().min(1).max(4000),
  temperature: z.number().min(0).max(2).default(0.3),
  max_tokens: z.number().int().min(1).max(4000).default(2000),
  top_p: z.number().min(0).max(1).default(0.95),
  stream: z.boolean().default(true),
})

export default defineEventHandler(async event => {
  const body = await readBody(event)
  const { model, prompt, temperature, max_tokens, top_p } = StreamRequestSchema.parse(body)

  // Handle streaming based on provider
  // ... (see implementation in the file)
})
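The elided provider logic wraps each SDK's streaming call in a web ReadableStream. Here is a rough sketch of the chatgpt branch, not the file's exact code; the claude and grok branches (omitted here) follow the same pattern with their own SDKs:

// Inside defineEventHandler, after validation:
const encoder = new TextEncoder()
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

return new ReadableStream({
  async start(controller) {
    switch (model) {
      case 'chatgpt': {
        const completion = await openai.chat.completions.create({
          model: 'gpt-4o-mini',
          messages: [{ role: 'user', content: prompt }],
          temperature,
          max_tokens,
          top_p,
          stream: true,
        })
        // Forward each token to the client as it arrives
        for await (const chunk of completion) {
          const text = chunk.choices[0]?.delta?.content
          if (text) controller.enqueue(encoder.encode(text))
        }
        break
      }
      // case 'claude' and case 'grok' go here
    }
    controller.close()
  },
})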
Supported models
The boilerplate is configured with these models by default:
const models = {
  chatgpt: 'gpt-4o-mini', // Fast, cost-effective GPT-4
  claude: 'claude-3-5-haiku-latest', // Fast Claude model
  grok: 'grok-4', // xAI's Grok model
}
You can easily change these to other models:
const models = {
  chatgpt: 'gpt-4o', // More capable GPT-4
  claude: 'claude-3-5-sonnet-latest', // More capable Claude
  grok: 'grok-vision-beta', // Grok with vision
}
Using the AI API
Client-side example
Here's how to use the streaming API in your components:
<script setup lang="ts">
const prompt = ref('')
const response = ref('')
const isStreaming = ref(false)
const selectedModel = ref<'chatgpt' | 'claude' | 'grok'>('chatgpt')

async function handleSubmit() {
  if (!prompt.value.trim()) return

  isStreaming.value = true
  response.value = ''

  try {
    const res = await fetch('/api/ai/stream', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: selectedModel.value,
        prompt: prompt.value,
        temperature: 0.7,
        max_tokens: 2000,
      }),
    })
    if (!res.ok) throw new Error('Failed to get response')

    const reader = res.body?.getReader()
    if (!reader) throw new Error('Response has no body')
    const decoder = new TextDecoder()

    while (true) {
      const { done, value } = await reader.read()
      if (done) break
      const chunk = decoder.decode(value, { stream: true })
      response.value += chunk
    }
  } catch (error) {
    console.error('AI error:', error)
    toast.error('Failed to get AI response') // toast from your notification library
  } finally {
    isStreaming.value = false
  }
}
</script>
<template>
  <div class="space-y-4">
    <div class="space-y-2">
      <Label>Select AI Model</Label>
      <Select v-model="selectedModel">
        <SelectTrigger>
          <SelectValue />
        </SelectTrigger>
        <SelectContent>
          <SelectItem value="chatgpt">ChatGPT (GPT-4)</SelectItem>
          <SelectItem value="claude">Claude (Anthropic)</SelectItem>
          <SelectItem value="grok">Grok (xAI)</SelectItem>
        </SelectContent>
      </Select>
    </div>

    <div class="space-y-2">
      <Label for="prompt">Your prompt</Label>
      <Textarea id="prompt" v-model="prompt" placeholder="Ask me anything..." rows="4" />
    </div>

    <Button :disabled="isStreaming" @click="handleSubmit">
      <span v-if="isStreaming">
        <Icon name="lucide:loader-2" class="animate-spin mr-2" />
        Generating...
      </span>
      <span v-else>Send</span>
    </Button>

    <Card v-if="response" class="mt-4">
      <CardHeader>
        <CardTitle>Response</CardTitle>
      </CardHeader>
      <CardContent>
        <div class="prose dark:prose-invert max-w-none">
          {{ response }}
        </div>
      </CardContent>
    </Card>
  </div>
</template>
Protecting AI endpoints
Require authentication
Only allow authenticated users to access AI features:
export default defineEventHandler(async event => {
  // Require authentication
  const userId = await requireAuth(event)

  // ... rest of the code
})
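For reference, a minimal requireAuth helper could look like the following. This is a sketch assuming nuxt-auth-utils' getUserSession and a session shape with user.id; the boilerplate's actual helper may differ:

import type { H3Event } from 'h3'

export async function requireAuth(event: H3Event): Promise<string> {
  // getUserSession is from nuxt-auth-utils (assumption); swap in your auth layer
  const session = await getUserSession(event)
  if (!session?.user?.id) {
    throw createError({ statusCode: 401, statusMessage: 'Unauthorized' })
  }
  return session.user.id
}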
Require subscription
Only allow paying subscribers to use AI:
import { requireSubscription } from '@@/server/utils/require-subscription'

export default defineEventHandler(async event => {
  // Require pro or enterprise subscription
  await requireSubscription(event, { plans: ['pro', 'enterprise'] })

  // ... rest of the code
})
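Internally, such a guard only needs to resolve the caller's plan and compare it against the allowed list. A sketch, where getActiveSubscription is a hypothetical helper standing in for your own billing query:

import type { H3Event } from 'h3'

// Hypothetical billing lookup; replace with your own query
declare function getActiveSubscription(userId: string): Promise<{ plan: string } | null>

export async function requireSubscription(
  event: H3Event,
  options: { plans: string[] },
) {
  const userId = await requireAuth(event)
  const subscription = await getActiveSubscription(userId)
  if (!subscription || !options.plans.includes(subscription.plan)) {
    throw createError({ statusCode: 403, statusMessage: 'Subscription required' })
  }
  return subscription
}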
Rate limiting
The AI endpoint already includes rate limiting to prevent abuse (5 requests per 5 minutes):
import { rateLimit } from '@@/server/utils/rate-limit'

export default defineEventHandler(async event => {
  await rateLimit(event, {
    max: 5,
    window: '5m',
    prefix: 'ai-stream',
  })

  // ... rest of the code
})
You can adjust the limits by changing max (number of requests) and window (time window: '1m', '5m', '1h', etc.).
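For example, to allow 20 requests per hour instead:

await rateLimit(event, {
  max: 20,
  window: '1h',
  prefix: 'ai-stream',
})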
Model parameters
You can fine-tune AI responses using these optional parameters:
await fetch('/api/ai/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'chatgpt',
    prompt: 'Your prompt here',
    temperature: 0.7, // 0-2: Lower = focused, higher = creative (default: 0.3)
    max_tokens: 1000, // Max response length in tokens (default: 2000)
    top_p: 0.95, // Alternative to temperature (default: 0.95)
  }),
})
Common temperature values:
- 0.3 - Factual responses, code generation
- 0.7 - Balanced creativity (recommended starting point)
- 1.2 - Creative writing, brainstorming
Adding more providers
To add a new AI provider:
- Install the SDK:
pnpm add @google/generative-ai
- Add to the stream handler:
import { GoogleGenerativeAI } from '@google/generative-ai'

const models = {
  // ... existing models
  gemini: 'gemini-1.5-flash',
}

// The constructor takes the API key string directly
const gemini = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!)

// In the handler
case 'gemini': {
  const model = gemini.getGenerativeModel({ model: models.gemini })
  const result = await model.generateContentStream(prompt)
  for await (const chunk of result.stream) {
    const text = chunk.text()
    if (text) controller.enqueue(encoder.encode(text))
  }
  break
}
- Update the schema:
const StreamRequestSchema = z.object({
  model: z.enum(['chatgpt', 'claude', 'grok', 'gemini']),
  // ...
})
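- Add the API key to .env alongside the others:
# Google (for Gemini models)
GEMINI_API_KEY="..."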
Example components
The template includes two ready-to-use AI interface components:
Shadcn version
app/components/ai/AiInterfaceShadcn.vue - Direct streaming interface:
<template>
  <AiInterfaceShadcn />
</template>
Features:
- Model selection (ChatGPT, Claude, Grok)
- Prompt input with validation
- Real-time streaming response display
- Error handling and markdown rendering
Nuxt UI version
app/components/ai/AiInterfaceNuxtUi.vue - Chat-based interface:
<template>
  <AiInterfaceNuxtUi />
</template>
Features:
- Creates persistent chat sessions via /api/chats
- Quick prompt suggestions
- Model selection component
- Integrates with the chat system (see Chat and Message models in the database)
Common use cases
Building a chat with conversation history
To maintain conversation context, concatenate previous messages in your prompt:
const messages = ref<Array<{ role: 'user' | 'assistant'; content: string }>>([])

// When sending a message
const conversationPrompt = messages.value
  .map(m => `${m.role}: ${m.content}`)
  .join('\n') + '\nassistant:'

await fetch('/api/ai/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ model: 'chatgpt', prompt: conversationPrompt }),
})
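Since the request schema caps prompts at 4000 characters, drop the oldest messages once the history grows too long. A sketch, where buildPrompt is a hypothetical helper:

function buildPrompt(history: Array<{ role: string; content: string }>) {
  const trimmed = [...history]
  let prompt = trimmed.map(m => `${m.role}: ${m.content}`).join('\n') + '\nassistant:'
  // Drop the oldest messages until the prompt fits the schema's 4000-character cap
  while (prompt.length > 4000 && trimmed.length > 1) {
    trimmed.shift()
    prompt = trimmed.map(m => `${m.role}: ${m.content}`).join('\n') + '\nassistant:'
  }
  return prompt
}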
Adjusting creativity for different tasks
Different tasks need different temperature settings:
// Factual/code tasks - use low temperature
await fetch('/api/ai/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'chatgpt',
    prompt: 'Explain how async/await works in JavaScript',
    temperature: 0.3,
  }),
})

// Creative tasks - use higher temperature
await fetch('/api/ai/stream', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'claude',
    prompt: 'Write a creative tagline for a sustainable tech startup',
    temperature: 1.2,
  }),
})
Production considerations
Rate limiting
The API endpoint includes built-in rate limiting (5 requests per 5 minutes). Adjust in server/api/ai/stream.ts:
await rateLimit(event, {
  max: 5, // Number of requests
  window: '5m', // Time window
  prefix: 'ai-stream',
})
Cost monitoring
AI APIs can be expensive. Track usage by adding logging:
// Add after validation (logger is your server-side logger, e.g. consola)
logger.info('AI request', {
  model,
  userId: event.context.user?.id,
  promptLength: prompt.length,
})
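Streaming responses don't report token usage by default, so a cheap proxy is to count streamed characters. Adapting the chatgpt branch sketched earlier (an assumption, not the file's exact code):

let outputChars = 0
for await (const chunk of completion) {
  const text = chunk.choices[0]?.delta?.content
  if (text) {
    outputChars += text.length
    controller.enqueue(encoder.encode(text))
  }
}
// Log the response size alongside the request metadata
logger.info('AI response complete', { model, outputChars })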
Also monitor your spending in each provider's dashboard; OpenAI, Anthropic, and xAI all expose usage and billing pages.
Input validation
All requests are validated with Zod before reaching the AI providers. The schema enforces:
- Valid model selection
- Prompt length: 1-4000 characters
- Temperature: 0-2
- Max tokens: 1-4000
To customize validation, edit StreamRequestSchema in server/api/ai/stream.ts.
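For example, to accept longer prompts, raise the cap:

const StreamRequestSchema = z.object({
  model: z.enum(['chatgpt', 'claude', 'grok']),
  prompt: z.string().min(1).max(8000), // raised from 4000
  temperature: z.number().min(0).max(2).default(0.3),
  max_tokens: z.number().int().min(1).max(4000).default(2000),
  top_p: z.number().min(0).max(1).default(0.95),
  stream: z.boolean().default(true),
})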
Troubleshooting
API returns 429 (Rate Limit)
- The built-in rate limiter allows 5 requests per 5 minutes
- Adjust the limits in the API route or implement user-based quotas; you can also surface the error to users, as shown below
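A minimal client-side check after the fetch call in the earlier example (a sketch; toast is whatever notification utility your app uses):

if (res.status === 429) {
  // Tell the user to back off instead of showing a generic failure
  toast.error('Too many requests. Please wait a few minutes and try again.')
  return
}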
Stream not working
- Check that your environment variables are set correctly
- Verify API keys have the necessary permissions
- Check the browser console for specific error messages
For provider-specific issues, refer to the official documentation for OpenAI, Anthropic, and xAI.