An AI-powered hybrid search and labeling assistant that fuses Elastic Cloud BM25 + kNN retrieval with Google Vertex AI Gemini 2.0 reasoning โ built to revolutionize enterprise knowledge search and evaluation.
- Overview
- Features
- Architecture
- Tech Stack
- Installation
- Environment Variables
- Running Locally
- Deployment
- Usage
- Project Structure
- Evaluation Metrics
- Future Upgrades
- Contributors
- License
SearchSphere Agent is a full-stack AI search platform that integrates Elastic Cloud hybrid retrieval (BM25 + vector search) with Google Vertex AI Gemini 2.0 for contextual reasoning, evaluation, and dataset labeling.
It helps teams and enterprises find smarter, label faster, and evaluate efficiently โ a complete foundation for AI-powered RAG systems.
- ๐ Hybrid Search โ Combines Elastic BM25 (lexical) + kNN (semantic) for deep understanding.
- ๐ค Gemini Reasoning โ Uses Vertex AI Gemini-2.0-Flash for summaries and responses.
- ๐งฉ Label Assist โ Create ground-truth JSONs interactively for model evaluation.
- ๐ Metrics Dashboard โ Live precision@K and latency stats (p50/p95).
- ๐ฌ Conversational Refinement โ Natural chat-style interface for query reasoning.
- ๐ Cloud Ready โ Dockerized backend, deployed via Google Cloud Run + Netlify.
Frontend (Next.js 14, TypeScript, Tailwind)
โ
โผ
Next.js API routes (proxy)
โ
โผ
Backend (FastAPI)
โโโ Elastic Cloud (BM25 + kNN)
โโโ Vertex AI (Gemini + Embeddings)
โโโ Evaluation Engine
โโโ Label Assist Service
| Layer | Technology |
|---|---|
| Frontend | Next.js 14, React 18, Tailwind CSS, Recharts |
| Backend | FastAPI (Python 3.11), Elastic Cloud, Vertex AI |
| Deployment | Google Cloud Run (backend), Netlify (frontend) |
| Database | Elastic Cloud index (searchsphere_docs) |
| CI/CD | GitHub Actions, Docker |
git clone https://github.com/MrigankJaiswal-hub/SearchSphere-Agent.git
cd SearchSphere-Agentcd backend
python -m venv .venv
.venv\Scripts\activate # (Windows)
pip install -r requirements.txtcd ../web
npm installELASTIC_CLOUD_ID=your_elastic_cloud_id
ELASTIC_API_KEY=your_elastic_api_key
ELASTIC_INDEX=searchsphere_docs
VERTEX_LOCATION=us-central1
GCP_PROJECT_ID=searchsphere-ai
VERTEX_EMBED_MODEL=text-embedding-005
VERTEX_CHAT_MODEL=gemini-2.0-flash-001
ES_KNN_NUM_CANDIDATES=120
CORS_ORIGIN=*NEXT_PUBLIC_API_BASE=http://127.0.0.1:8080cd backend
uvicorn app:app --reload --port 8080cd web
npm run devVisit ๐ http://localhost:3000
gcloud run deploy searchsphere-backend \
--source . \
--region us-central1 \
--set-env-vars "CORS_ORIGIN=https://your-frontend-url.netlify.app"-
Import
/webdirectory from GitHub. -
Add Environment Variable:
NEXT_PUBLIC_API_BASE=https://<your-cloudrun-url> -
Deploy site.
-
Visit your deployed frontend.
-
Enter a natural language query (e.g., โWhat is hybrid search?โ).
-
Observe semantic + lexical search fusion results.
-
View real-time precision and latency metrics.
-
Go to Label Assist, input a query, and export groundtruth.json.
-
Upload groundtruth.json in Run Evaluation to compute P@K.
| Folder/File | Purpose | Status |
|---|---|---|
/backend |
FastAPI backend | โ |
/web |
Next.js frontend | โ |
/docs |
Documentation (README, credits, architecture, eval_matrix.xlsx) | โ |
/docs/credits.txt |
Acknowledgments | โ |
/docs/evaluation_matrix.xlsx |
Metrics template | โ |
/docs/diagram.png |
Architecture diagram (export from draw.io / Lucidchart) | โ |
.github/workflows/ |
Optional CI/CD | Optional โ |
.env.example |
Example env vars | โ |
LICENSE |
MIT License | โ |
README.md |
Overview | โ |
/โ Main chat & hybrid search page/metricsโ Live latency & precision dashboard/labelโ Label Assist tool for dataset generation
To test evaluation manually:
curl -X POST "$URL/api/eval/precision" \
-H "Content-Type: application/json" \
-d '{"query":"hybrid search","k":10}'SearchSphere-Agent/
โโโ backend/
โ โโโ app.py
โ โโโ routes/
โ โโโ utils/
โ โโโ tests/
โ โโโ requirements.txt
โ
โโโ web/
โ โโโ app/
โ โโโ components/
โ โโโ lib/
โ โโโ public/
โ โโโ package.json
โ โโโ tailwind.config.ts
โ
โโโ scripts/
โโโ assets/
โโโ README.md
- Precision@K: Evaluates top-K relevance.
- Latency Tracking: p50/p95 in milliseconds.
- Label Assist: Exports
groundtruth.jsonfor retraining.
Example output:
p50: 730 ms | p95: 1100 ms | Precision@10: 0.86
- Multi-modal retrieval (text, images, audio, video).
- Auth & multi-tenant support (Firebase/Cognito).
- Feedback-driven fine-tuning of hybrid fusion weights.
- Enhanced real-time dashboards and analytics.
Mrigank Jaiswal B.Tech
๐ฅ๏ธ Built full-stack architecture, Elastic-Vertex integration, frontend UI, and deployment automation.
This project is licensed under the MIT License. ยฉ 2025 Mrigank Jaiswal