Embeddable RAG FAQ-Based AI Chatbot Widget - Next.js, Vectorize, Redis, Gemini, Hugging Face, OpenRouter Full-Stack Project
A production-ready, self-hosted RAG (Retrieval Augmented Generation) chatbot widget built with Next.js, Redis vector storage, and multiple AI model fallbacks. Perfect for embedding into portfolio websites or any web application.
- Live Demo: https://portfolio-chatbot-widget.vercel.app/
- Production Live: https://arnob-mahmud.vercel.app/
- Overview
- Features
- Technology Stack
- Project Structure
- How It Works
- Installation & Setup
- Environment Variables
- Deployment
- Usage
- API Endpoints
- Components & Architecture
- Reusing Components
- Code Examples
- Keywords
- Conclusion
This project is a fully functional, embeddable AI chatbot widget that can be integrated into any website. It leverages Next.js Edge Runtime for fast API responses, Redis for vector storage and session management, and implements a robust RAG system with multiple AI model and embedding fallbacks for reliability.
- RAG Implementation: Semantic search through FAQ database using vector embeddings
- Multiple AI Fallbacks: Gemini → OpenRouter GPT for reliable responses
- Multiple Embedding Fallbacks: Gemini → Hugging Face → OpenRouter → OpenAI
- Edge Runtime: Fast API responses using Next.js Edge Runtime
- React Components: Modern React with TanStack Query for state management
- Session Persistence: 30-day conversation history stored in Redis
- Zero Flash: Instant theme loading prevents FOUC (Flash of Unstyled Content)
- Mobile Responsive: Optimized for all screen sizes
- RAG (Retrieval Augmented Generation): Semantic search through FAQ database using cosine similarity
- Streaming AI Responses: Real-time token streaming using Server-Sent Events (SSE)
- AI Model Fallback Chain: Gemini (primary) → OpenRouter GPT (fallback)
- Embedding Fallback Chain: Gemini → Hugging Face → OpenRouter → OpenAI
- Session Persistence: 30-day conversation history stored in Upstash Redis
- Edge Runtime: Fast API responses with Next.js Edge Runtime
- CORS Support: Cross-origin requests enabled for embedding
- Rate Limiting: Batch processing for embeddings to avoid API limits
- React Components: Modern React 19 with TypeScript
- TanStack Query: Efficient data fetching and caching
- Dark/Light Mode: System preference detection with manual toggle, zero flash
- Mobile Optimized: Responsive design with keyboard-aware positioning
- Progressive Rendering: Messages appear as they stream
- Optimistic UI: Instant feedback with optimistic updates
- Menu System: Theme toggle, font size, widget position, and more
- Accessibility: Proper ARIA labels and keyboard navigation
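The batch processing mentioned under Rate Limiting boils down to splitting the FAQ list into fixed-size chunks before requesting embeddings. A sketch of that helper (the batch size of 10 is an assumption, not necessarily the repo's actual value):

```typescript
// Splits a list of FAQ texts into fixed-size batches so embedding requests
// stay under provider rate limits. The batch size of 10 is an assumption.
function chunk<T>(items: T[], size = 10): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Each batch can then be embedded with a short delay between requests to stay under per-minute API quotas.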
- Next.js 16.1.4: React framework with App Router
- React 19.2.3: Latest React with concurrent features
- TypeScript 5: Type-safe development
- Tailwind CSS 3.4.17: Utility-first CSS framework
- TanStack Query 5.90.19: Data fetching and state management
- Radix UI: Accessible component primitives
- Lucide React: Icon library
- Next.js API Routes: Serverless API endpoints
- Edge Runtime: Fast edge computing for API routes
- Upstash Redis: Serverless Redis for vector storage and sessions
- Google Gemini API: Primary AI model and embeddings
- Hugging Face Inference API: Embedding fallback
- OpenRouter API: AI model and embedding fallback
- OpenAI API: Embedding fallback (optional)
- ESLint: Code linting
- TypeScript: Type checking
- PostCSS: CSS processing
- Autoprefixer: CSS vendor prefixing
```
portfolio-chatbot-widget/
├── app/
│   ├── api/
│   │   ├── chat/
│   │   │   └── route.ts          # Chat API endpoint (Edge Runtime)
│   │   ├── feedback/
│   │   │   └── route.ts          # Feedback submission endpoint
│   │   ├── history/
│   │   │   └── route.ts          # Chat history retrieval
│   │   └── seed/
│   │       └── route.ts          # FAQ seeding endpoint (Node.js Runtime)
│   ├── layout.tsx                # Root layout with widget injection
│   ├── page.tsx                  # Demo page
│   ├── providers.tsx             # TanStack Query provider
│   └── globals.css               # Global styles
├── components/
│   ├── chatbot/
│   │   ├── chatbot-widget.tsx    # Main widget component
│   │   ├── widget-menu.tsx       # Menu dropdown component
│   │   └── message-skeleton.tsx  # Loading skeleton component
│   └── ui/
│       ├── button.tsx            # Button component (shadcn)
│       ├── dialog.tsx            # Dialog component (shadcn)
│       ├── skeleton.tsx          # Skeleton component (shadcn)
│       └── toast.tsx             # Toast notification component
├── hooks/
│   ├── use-chat.ts               # Chat functionality hook
│   └── use-widget-settings.ts    # Widget settings hook
├── lib/
│   ├── ai.ts                     # AI model integration (fallback chain)
│   ├── embeddings.ts             # Embedding generation (fallback chain)
│   ├── rag.ts                    # RAG search implementation
│   ├── redis.ts                  # Redis client and vector operations
│   ├── faqs.ts                   # FAQ knowledge base
│   ├── constants.ts              # Application constants
│   ├── types.ts                  # TypeScript type definitions
│   ├── utils.ts                  # Utility functions
│   └── export-utils.ts           # Chat export utilities
├── public/
│   ├── widget.js                 # Vanilla JS embeddable widget
│   └── styles.css                # Widget stylesheet
├── types/
│   └── window.d.ts               # Window type definitions
├── next.config.ts                # Next.js configuration
├── tailwind.config.js            # Tailwind CSS configuration
├── tsconfig.json                 # TypeScript configuration
└── package.json                  # Dependencies and scripts
```

Key files:

- `app/api/chat/route.ts`: Main chat API endpoint with SSE streaming
- `app/api/seed/route.ts`: Seeds FAQ embeddings into Redis
- `components/chatbot/chatbot-widget.tsx`: Main React widget component
- `hooks/use-chat.ts`: TanStack Query hook for chat functionality
- `lib/ai.ts`: AI model integration with fallback chain
- `lib/embeddings.ts`: Embedding generation with multiple fallbacks
- `lib/rag.ts`: RAG search implementation using cosine similarity
- `lib/redis.ts`: Redis client, vector storage, and session management
- `public/widget.js`: Vanilla JavaScript embeddable widget (for external sites)
```
User Input → React Component (chatbot-widget.tsx)
        ↓
useChat Hook → POST /api/chat
        ↓
1. Extract/Generate Session ID (from cookie)
2. Retrieve FAQ Context (RAG)
   - Generate embedding vector (Gemini → Hugging Face → OpenRouter → OpenAI)
   - Search Redis vectors (cosine similarity)
   - Get top 3 relevant FAQs
3. Build AI Message Array
   - System prompt + FAQ context
   - Last 6 conversation messages
4. Stream AI Response
   - Try Gemini models (gemini-2.5-flash, gemini-2.5-pro)
   - Fallback to OpenRouter GPT-4o-mini
   - Stream via SSE
5. Save to Redis
        ↓
SSE Stream → React Component
        ↓
TanStack Query Cache Update → UI Update
```
1. Question Embedding: The user's question is converted to a 768-dimensional vector
   - Primary: Gemini Embeddings API (`gemini-embedding-001`)
   - Fallback 1: Hugging Face (`sentence-transformers/all-MiniLM-L6-v2`)
   - Fallback 2: OpenRouter (OpenAI `text-embedding-ada-002`)
   - Fallback 3: OpenAI (direct, if API key available)
2. Vector Search: Redis is queried for similar vectors using cosine similarity
   - All FAQ vectors stored in Redis with metadata
   - Cosine similarity calculated for each vector
   - Top 3 most similar FAQs retrieved
3. Context Retrieval: The top 3 most relevant FAQ entries are selected
4. Context Injection: The selected FAQs are formatted and injected into the AI system prompt
5. AI Generation: The AI model generates a response using the FAQ context plus conversation history
   - Primary: Google Gemini (`gemini-2.5-flash`, `gemini-2.5-pro`)
   - Fallback: OpenRouter GPT-4o-mini
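The cosine-similarity ranking in step 2 can be sketched as a small, self-contained TypeScript function. This is a simplified in-memory stand-in for `searchVectors()` in `lib/redis.ts` (which reads the stored vectors from Redis); the `StoredVector` shape here is illustrative:

```typescript
// Simplified, in-memory version of the vector search described above.
// The real implementation reads stored FAQ vectors from Redis; here
// the vectors are passed in directly.
interface StoredVector {
  embedding: number[];
  metadata: { question: string; answer: string };
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topKSimilar(
  query: number[],
  vectors: StoredVector[],
  topK = 3,
): StoredVector[] {
  return vectors
    .map((v) => ({ v, score: cosineSimilarity(query, v.embedding) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, topK)
    .map((x) => x.v);
}
```

Because cosine similarity only compares vector directions, it works regardless of which embedding provider in the fallback chain produced the vectors, as long as query and stored vectors come from the same model.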
- Session Creation: New sessions get a timestamp-based session ID
- Cookie Storage: Session ID stored in HttpOnly cookie (30-day expiration)
- Redis Storage: Full conversation history stored in Upstash Redis
- Session Retrieval: Existing sessions load conversation history on widget initialization
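The session flow above can be sketched with an in-memory store. The real app persists sessions to Upstash Redis with a 30-day TTL; the class and function names below are illustrative, not the actual API of `lib/redis.ts`:

```typescript
// In-memory sketch of the session persistence described above.
// The real app stores sessions in Upstash Redis with a 30-day TTL.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
  timestamp: number;
}

const SESSION_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

class SessionStore {
  private sessions = new Map<
    string,
    { messages: ChatMessage[]; expiresAt: number }
  >();

  // Clock is injectable so expiry can be tested deterministically
  constructor(private now: () => number = Date.now) {}

  save(sessionId: string, messages: ChatMessage[]): void {
    this.sessions.set(sessionId, {
      messages,
      expiresAt: this.now() + SESSION_TTL_MS,
    });
  }

  get(sessionId: string): ChatMessage[] | null {
    const entry = this.sessions.get(sessionId);
    if (!entry || entry.expiresAt <= this.now()) return null; // missing or expired
    return entry.messages;
  }
}

// Timestamp-based session IDs, as described above
function newSessionId(now: () => number = Date.now): string {
  return `session_${now()}_${Math.random().toString(36).slice(2, 10)}`;
}
```

With Redis, the same TTL behavior comes for free via key expiration, so no manual expiry check is needed.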
- Node.js 18+ installed
- npm or yarn package manager
- Upstash Redis account (free tier available)
- Google Gemini API key (free tier available)
- Hugging Face API key (free tier available)
- OpenRouter API key (optional, for fallback)
```
git clone <repository-url>
cd portfolio-chatbot-widget
```

```
npm install
```

This installs all required dependencies, including Next.js, React, TanStack Query, and the Redis client.
Create a `.env.local` file in the root directory:

```
cp .env.example .env.local
```

See the Environment Variables section for detailed configuration.
After setting up environment variables, populate Redis with FAQ embeddings:

```
curl -X POST http://localhost:3000/api/seed
```

This generates embeddings for all FAQs and stores them in Redis.
```
npm run dev
```

Visit http://localhost:3000 to see the widget in action.
Create a `.env.local` file in the root directory with the following variables:

```
# Redis Configuration (Upstash)
UPSTASH_REDIS_URL=https://your-redis-instance.upstash.io
UPSTASH_REDIS_TOKEN=your-redis-token

# AI Model API Keys
GOOGLE_GEMINI_API_KEY=your-gemini-api-key
OPENROUTER_API_KEY=your-openrouter-api-key  # Optional, for fallback

# Embedding API Keys
HUGGING_FACE_API_KEY=your-huggingface-api-key
OPENAI_API_KEY=your-openai-api-key  # Optional, for embedding fallback

# Application Configuration
NEXT_PUBLIC_CHATBOT_URL=http://localhost:3000  # Your deployment URL
CHATBOT_TITLE=Chat Assistant  # Widget title
CHATBOT_GREETING=👋 How can I help you today?  # Initial greeting
CHATBOT_PLACEHOLDER=Message...  # Input placeholder

# Session Configuration (optional)
SESSION_TTL=2592000  # 30 days in seconds
```

- Visit https://upstash.com/
- Create a free account
- Create a new Redis database
- Copy the `UPSTASH_REDIS_URL` and `UPSTASH_REDIS_TOKEN`
- Visit https://makersuite.google.com/app/apikey
- Create a new API key
- Copy the `GOOGLE_GEMINI_API_KEY`
- Visit https://huggingface.co/settings/tokens
- Create a new access token
- Copy the `HUGGING_FACE_API_KEY`
- Visit https://openrouter.ai/
- Create an account and add credits
- Generate an API key
- Copy the `OPENROUTER_API_KEY`
- Visit https://platform.openai.com/api-keys
- Create a new API key
- Copy the `OPENAI_API_KEY`
```
npm run dev
```

Starts the development server at http://localhost:3000 with hot reload.
```
npm run build
npm start
```

Builds an optimized production bundle and starts the production server.
1. Push to GitHub:

   ```
   git add .
   git commit -m "Deploy to Vercel"
   git push origin main
   ```

2. Connect to Vercel:
   - Visit https://vercel.com/
   - Import your GitHub repository
   - Add environment variables in the Vercel dashboard
   - Deploy!

3. Post-Deployment:
   - Update `NEXT_PUBLIC_CHATBOT_URL` to your Vercel URL
   - Run the seed endpoint: `curl -X POST https://your-app.vercel.app/api/seed`
1. Push to GitHub:

   ```
   git add .
   git commit -m "Deploy to VPS"
   git push origin main
   ```

2. Deploy via Coolify:
   - Log in to the Coolify dashboard
   - Create a new application
   - Connect your GitHub repository
   - Set environment variables
   - Deploy!

3. Configure Domain:
   - Point DNS to your VPS IP
   - Configure the domain in Coolify
   - Update `NEXT_PUBLIC_CHATBOT_URL`
Add to your `app/layout.tsx`:

```tsx
import { ChatbotWidget } from "@/components/chatbot/chatbot-widget";

export default function RootLayout({ children }) {
  return (
    <html>
      <body>
        {children}
        <ChatbotWidget />
      </body>
    </html>
  );
}
```

Add to your HTML:
```html
<!-- Configure widget -->
<script>
  window.CHATBOT_BASE_URL = "https://your-domain.com";
  window.CHATBOT_TITLE = "Support Assistant";
  window.CHATBOT_GREETING = "Hi! 👋 How can I help you today?";
  window.CHATBOT_PLACEHOLDER = "Type your message...";
</script>

<!-- Load widget script -->
<script src="https://your-domain.com/widget.js" async></script>
```

| Variable | Default | Description |
|---|---|---|
| `CHATBOT_BASE_URL` | `window.location.origin` | API endpoint URL |
| `CHATBOT_TITLE` | `'Chat Assistant'` | Widget header title |
| `CHATBOT_GREETING` | `'👋 How can I help you today?'` | Initial greeting message |
| `CHATBOT_PLACEHOLDER` | `'Message...'` | Input field placeholder |
Sends a message and receives a streaming AI response.
Request:

```json
{
  "message": "Tell me about your services"
}
```

Response: Server-Sent Events (SSE) stream

```
data: {"response": "Hello"}
data: {"response": "! "}
data: {"response": "I can"}
...
data: [DONE]
```

Headers:

- `Content-Type: application/json` (request)
- `Content-Type: text/event-stream` (response)
- `Set-Cookie: chatbot_session=...` (new sessions)
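A client consuming this stream has to split the SSE frames, JSON-decode each `data:` payload, and watch for the `[DONE]` sentinel. Here is a minimal parser for the frame format shown above (a simplified stand-in for what `hooks/use-chat.ts` does; the function name is illustrative):

```typescript
// Parses the `data: ...` lines of the SSE stream shown above and
// collects the streamed tokens plus the end-of-stream flag.
function parseSSEChunk(chunk: string): { tokens: string[]; done: boolean } {
  const tokens: string[] = [];
  let done = false;
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length);
    if (payload.trim() === "[DONE]") {
      done = true;
      continue;
    }
    try {
      const parsed = JSON.parse(payload) as { response?: string };
      if (typeof parsed.response === "string") tokens.push(parsed.response);
    } catch {
      // Ignore malformed frames rather than breaking the stream
    }
  }
  return { tokens, done };
}
```

Concatenating `tokens` as chunks arrive is what makes messages appear progressively in the UI.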
Retrieves conversation history for the current session.
Request: Cookie-based (no body needed)
Response:

```json
{
  "messages": [
    {
      "role": "user",
      "content": "Hello",
      "timestamp": 1234567890
    },
    {
      "role": "assistant",
      "content": "Hi! How can I help?",
      "timestamp": 1234567891
    }
  ]
}
```

Headers:

- Cookie: `chatbot_session=<session-id>`
Populates Redis with FAQ embeddings.
Request: No body required
Response:

```json
{
  "success": true,
  "count": 20
}
```

Note: Run this once after deployment to populate the knowledge base.
Submits user feedback.
Request:

```json
{
  "rating": 5,
  "comment": "Great chatbot!"
}
```

Response:

```json
{
  "success": true
}
```

Purpose: Implements Retrieval Augmented Generation
Process:
```ts
export async function searchFAQ(
  query: string,
  topK: number = 3,
): Promise<string> {
  // 1. Generate embedding
  const queryEmbedding = await generateEmbedding(query);

  // 2. Search Redis vectors
  const results = await searchVectors(queryEmbedding, topK);

  // 3. Format context
  return results
    .map((result) => {
      const { question, answer } = result.metadata;
      return `Q: ${question}\nA: ${answer}`;
    })
    .join("\n\n");
}
```

Reusability: Can be extracted to a separate module for use in other projects.
Purpose: Handles AI model calls with fallback chain
Key Features:
- Gemini models (primary): `gemini-2.5-flash`, `gemini-2.5-pro`
- OpenRouter GPT (fallback): `openai/gpt-4o-mini`
- Message normalization for different formats
- Streaming support
Reusability: The fallback pattern can be adapted for other AI models.
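The same try-in-order pattern appears in both `lib/ai.ts` and `lib/embeddings.ts`, so when adapting it, it can help to factor it into a generic helper. A hypothetical sketch (not code from this repo):

```typescript
// Generic "try providers in order" helper, generalizing the fallback
// chains described above. Provider names are illustrative.
type Provider<T> = { name: string; call: () => Promise<T> };

async function firstSuccessful<T>(providers: Provider<T>[]): Promise<T> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      return await p.call(); // first provider that resolves wins
    } catch (err) {
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  // Only thrown when every provider in the chain failed
  throw new Error(`All providers failed:\n${errors.join("\n")}`);
}
```

Usage would look like `firstSuccessful([{ name: "gemini", call: callGemini }, { name: "openrouter", call: callOpenRouter }])`, where the `call` functions wrap the individual provider APIs.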
Purpose: Generates embeddings with multiple fallbacks
Fallback Chain:
1. Gemini Embeddings (`gemini-embedding-001`)
2. Hugging Face (`sentence-transformers/all-MiniLM-L6-v2`)
3. OpenRouter (OpenAI `text-embedding-ada-002`)
4. OpenAI (direct, if API key available)
Reusability: Embedding logic can be extracted to a standalone utility.
Purpose: Manages Redis connections, sessions, and vector storage
Key Functions:
- `getSession()`: Retrieve a session from Redis
- `saveSession()`: Save a session to Redis with TTL
- `storeVector()`: Store FAQ embeddings
- `searchVectors()`: Cosine similarity search
Reusability: Redis operations can be adapted for other use cases.
Purpose: Main React component for the chatbot widget
Key Features:
- Toggle open/close state
- Message rendering
- Input handling
- Auto-scroll
- Responsive design
Reusability: Component can be customized for different use cases.
Purpose: TanStack Query hook for chat functionality
Key Features:
- Chat history fetching
- Message sending with streaming
- Optimistic UI updates
- Error handling
Reusability: Hook can be adapted for other chat applications.
Purpose: Manages widget settings (theme, font size, position)
Key Features:
- localStorage persistence
- Theme management
- Font size control
- Position control
Reusability: Settings logic can be extracted to a standalone utility.
```ts
// Copy searchFAQ() from lib/rag.ts
// Adapt to your embedding model and vector database
import { generateEmbedding } from "./embeddings";
import { searchVectors } from "./redis";

export async function customRAG(query: string, topK: number = 5) {
  const queryEmbedding = await generateEmbedding(query);
  const results = await searchVectors(queryEmbedding, topK);
  return results.map((r) => r.metadata);
}
```

```ts
// Copy useChat() from hooks/use-chat.ts
// Adapt to your API endpoint
import { useMutation, useQuery } from "@tanstack/react-query";

export function useCustomChat() {
  const { data: messages } = useQuery({
    queryKey: ["chat-history"],
    queryFn: fetchHistory,
  });

  const sendMessage = useMutation({
    mutationFn: async (message: string) => {
      // Your API call
    },
  });

  return { messages, sendMessage };
}
```

```ts
// Copy getAIResponse() from lib/ai.ts
// Adapt to your AI models
import { GoogleGenerativeAI } from "@google/generative-ai";

export async function getCustomAIResponse(messages: Message[]) {
  const genAI = new GoogleGenerativeAI(process.env.GOOGLE_GEMINI_API_KEY!);
  const model = genAI.getGenerativeModel({ model: "gemini-2.5-flash" });
  // Flatten the chat messages into a single prompt for streaming
  const prompt = messages.map((m) => `${m.role}: ${m.content}`).join("\n");
  const result = await model.generateContentStream(prompt);
  return result;
}
```

Modify `lib/faqs.ts` to use your own FAQs:

```ts
export const faqs = [
  ["What is your return policy?", "We offer 30-day returns..."],
  ["How do I track my order?", "You can track your order..."],
  // Add more FAQs
];
```

Add a new model to the fallback chain in `lib/ai.ts`:
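Given FAQ tuples in this `[question, answer]` shape, building the `Q:/A:` context block that `searchFAQ()` injects into the system prompt is straightforward. A sketch (the helper name is illustrative):

```typescript
// Formats [question, answer] FAQ tuples into the Q:/A: context block
// that gets injected into the AI system prompt, matching the format
// produced by searchFAQ() in lib/rag.ts.
function formatFAQContext(faqs: [string, string][]): string {
  return faqs
    .map(([question, answer]) => `Q: ${question}\nA: ${answer}`)
    .join("\n\n");
}
```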
```ts
// Add before the OpenRouter fallback
try {
  const customModel = new CustomAIClient(process.env.CUSTOM_API_KEY!);
  const response = await customModel.generate(messages);
  return { text: response };
} catch (error) {
  console.log("Custom model failed, trying next...");
}
```

Add a new embedding provider in `lib/embeddings.ts`:
```ts
// Add before the Hugging Face fallback
try {
  const response = await fetch("https://api.custom-embeddings.com/embed", {
    method: "POST",
    headers: { Authorization: `Bearer ${process.env.CUSTOM_EMBEDDING_KEY}` },
    body: JSON.stringify({ text }),
  });
  const data = await response.json();
  return data.embedding;
} catch (error) {
  console.log("Custom embedding failed, trying next...");
}
```

Modify `components/chatbot/chatbot-widget.tsx`:
```tsx
// Update className for custom styling
<button
  className={cn(
    "fixed w-14 h-14 bg-blue-600 rounded-full", // Custom color
    "shadow-2xl flex items-center justify-center",
    // ... more styles
  )}
>
```

- RAG (Retrieval Augmented Generation)
- Next.js
- React
- TypeScript
- Redis
- Vector Database
- Semantic Search
- Embeddings
- AI Chatbot
- Streaming Responses
- Server-Sent Events (SSE)
- TanStack Query
- Edge Runtime
- Upstash Redis
- Google Gemini
- Hugging Face
- OpenRouter
- OpenAI
- Cosine Similarity
- Session Management
- Dark Mode
- Mobile Responsive
- Embeddable Widget
- FAQ-Based Chatbot
This project demonstrates a production-ready implementation of an AI chatbot widget with RAG capabilities, built with Next.js and modern React patterns. It showcases:
- Modern Architecture: Next.js App Router, Edge Runtime, React Server Components
- Best Practices: TypeScript, TanStack Query, proper error handling
- Performance: Edge Runtime, streaming responses, optimistic UI
- Reliability: Multiple fallback chains for AI models and embeddings
- Scalability: Redis for vector storage, efficient cosine similarity search
- Developer Experience: Well-documented, type-safe, reusable components
The codebase is well-documented and structured for easy understanding and extension. Each component can be reused independently in other projects.
Feel free to use this repository and extend the project further!

If you have any questions or want to share your work, reach out via GitHub or my portfolio at https://arnob-mahmud.vercel.app/.

Enjoy building and learning!

Thank you!