feat: auto-detect embedding dimension during validation #10993

roomote · 2026-01-27T05:46:55Z

Related GitHub Issue

Roo Code Task Context (Optional)

N/A

Description

This PR implements auto-detection of embedding dimensions during embedder validation to address the issue where users can configure incorrect embedding dimensions, causing Qdrant to reject vector upserts with dimension mismatches.

Key Implementation Details:

- IEmbedder Interface Update: Added an optional detectedDimension field to the validation result type.
- Embedder Updates: All 8 embedders (ollama, openai-compatible, openai, gemini, mistral, bedrock, openrouter, vercel-ai-gateway) now parse the test embedding response during validation and return the actual dimension.
- Service Factory Changes:
  - validateEmbedder() now returns the detected dimension.
  - createVectorStore() accepts an optional detectedDimension parameter with the following priority order:
    1. Auto-detected from test embedding (most reliable)
    2. Profile-based from getModelDimension()
    3. Manual configuration from modelDimension setting
- Manager Integration: _recreateServices() validates the embedder first, captures the detected dimension, and passes it to createServices().
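
The validate-then-create flow described above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: the `ValidationResult` and `Embedder` shapes, the `embed()` method, and the `recreateServices()` helper are hypothetical stand-ins for the real IEmbedder interface and manager, and only the `detectedDimension` field name comes from the description.

```typescript
// Hypothetical, simplified shapes; the real IEmbedder interface has more members.
interface ValidationResult {
  valid: boolean;
  error?: string;
  detectedDimension?: number; // optional field, as described in this PR
}

interface Embedder {
  // Assumed method name and signature, for illustration only.
  embed(texts: string[]): Promise<number[][]>;
}

// Validate by embedding a short test string and measuring the returned vector.
async function validateEmbedder(embedder: Embedder): Promise<ValidationResult> {
  try {
    const vectors = await embedder.embed(["test"]);
    const vector = vectors[0];
    if (!vector || vector.length === 0) {
      return { valid: false, error: "Embedder returned no embedding" };
    }
    return { valid: true, detectedDimension: vector.length };
  } catch (err) {
    return { valid: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Stand-in for the manager step: validate first, then pass the dimension on.
async function recreateServices(
  embedder: Embedder,
  createVectorStore: (detectedDimension?: number) => void,
): Promise<ValidationResult> {
  const result = await validateEmbedder(embedder);
  if (result.valid) {
    createVectorStore(result.detectedDimension);
  }
  return result;
}
```

Validating before the vector store is created means the collection size can follow what the model actually returns rather than what was typed into settings.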

Design Choices:

- The auto-detected dimension takes highest priority since it comes directly from the model's actual output.
- Existing fallback mechanisms (profile-based and manual) are preserved for compatibility.
- Zero or negative detected dimensions are ignored as invalid.
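
As a minimal sketch of the priority order and the invalid-value rule, assuming all three candidate dimensions are already known to the caller: `resolveVectorSize` and `isUsable` are hypothetical names, and applying the same positive-value check to the profile and manual values is an assumption (the description only states it for the detected dimension).

```typescript
// A dimension is usable only if it is defined and strictly positive.
function isUsable(value: number | undefined): value is number {
  return value !== undefined && value > 0;
}

// Hypothetical helper mirroring the described order:
// auto-detected > profile-based > manual configuration.
function resolveVectorSize(
  detectedDimension: number | undefined,
  profileDimension: number | undefined,
  manualDimension: number | undefined,
): number {
  if (isUsable(detectedDimension)) return detectedDimension;
  if (isUsable(profileDimension)) return profileDimension;
  if (isUsable(manualDimension)) return manualDimension;
  throw new Error("Could not determine embedding dimension");
}
```

For the scenario in this PR, `resolveVectorSize(4096, undefined, 1536)` would pick the detected value even though the manual setting says 1536.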

Test Procedure

Automated Tests: Added new test cases covering:
- Ollama embedder returning detected dimension (including realistic 4096-dimension embedding)
- OpenAI-compatible embedder returning dimension from both array and base64-encoded embeddings
- Service factory prioritizing detected dimension over other sources
- Service factory falling back correctly when detected dimension is not available
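
For the array and base64 cases covered by these tests, the dimension could be derived along these lines. This is a sketch under stated assumptions rather than the PR's code: `detectDimension` and `base64ByteLength` are hypothetical helpers, and the base64 payload is assumed to hold packed 32-bit floats (4 bytes per component).

```typescript
// Number of bytes a base64 string decodes to, computed from its length
// so that no decoding is needed.
function base64ByteLength(b64: string): number {
  const unpadded = b64.replace(/=+$/, "");
  return Math.floor((unpadded.length * 3) / 4);
}

// Returns the embedding dimension, or undefined if it cannot be determined.
function detectDimension(embedding: number[] | string): number | undefined {
  if (typeof embedding === "string") {
    // Assumption: base64 payload of float32 values, 4 bytes each.
    const byteLength = base64ByteLength(embedding);
    return byteLength > 0 && byteLength % 4 === 0 ? byteLength / 4 : undefined;
  }
  return embedding.length > 0 ? embedding.length : undefined;
}
```

Returning undefined for empty or malformed input lets the caller fall back to the profile-based or manual dimension.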

Test Commands:

cd src && npx vitest run services/code-index/__tests__/service-factory.spec.ts services/code-index/embedders/__tests__/ollama.spec.ts services/code-index/embedders/__tests__/openai-compatible.spec.ts

Manual Testing Scenario:
- Configure Ollama with qwen3-embedding model
- Set an incorrect manual dimension (e.g., 1536)
- The system should now auto-detect the correct dimension (4096) and use it

Pre-Submission Checklist

- Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
- Scope: My changes are focused on the linked issue (one major feature/fix per PR).
- Self-Review: I have performed a thorough self-review of my code.
- Testing: New and/or updated tests have been added to cover my changes.
- Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
- Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

N/A - This is a backend-only change with no UI impact.

Documentation Updates

Documentation updates may be needed. The codebase indexing documentation could note that embedding dimensions are now auto-detected, making manual configuration less critical.

Additional Notes

This addresses the specific scenario described in issue #10991 where a user configured 1536 dimensions but their Ollama qwen3-embedding model actually produces 4096-dimension embeddings, causing Qdrant to reject upserts with the error: "Wrong input: Vector inserting dimension is expected to be 1536."

Get in Touch

N/A - Automated PR

This change addresses issue #10991 where users can configure incorrect embedding dimensions, causing Qdrant to reject vector upserts with dimension mismatches.

Changes:
- Updated IEmbedder interface to include optional detectedDimension in validation result
- All 8 embedders now return the detected dimension from their test embedding during validation
- Updated CodeIndexServiceFactory.createVectorStore() to accept and prioritize auto-detected dimension over profile-based and manual configuration
- Updated CodeIndexManager._recreateServices() to pass detected dimension from validation to vector store creation
- Added comprehensive tests for the new functionality

Priority order for dimension selection:
1. Auto-detected from test embedding (most reliable)
2. Profile-based from getModelDimension()
3. Manual configuration from modelDimension setting

Fixes #10991

roomote · 2026-01-27T05:47:24Z

Review complete. No issues found.

- Interface changes: IEmbedder.validateConfiguration() now returns detectedDimension
- All 8 embedders updated to detect dimensions from test embeddings
- Dimension priority correctly implemented (auto-detected > profile > manual)
- Manager integration properly passes detected dimension to vector store creation
- Comprehensive test coverage for new functionality

roomote bot requested review from cte, jr and mrubens as code owners January 27, 2026 05:46

github-project-automation bot added this to Roo Code Roadmap and Roo Code Roadmap Jan 27, 2026

github-project-automation bot moved this to Triage in Roo Code Roadmap Jan 27, 2026

github-project-automation bot moved this to New in Roo Code Roadmap Jan 27, 2026

dosubot bot added the size:L (this PR changes 100-499 lines, ignoring generated files) and Enhancement (new feature or request) labels Jan 27, 2026

roomote bot mentioned this pull request Jan 27, 2026

[BUG] - Codebase Indexing sends mixed-dimension embeddings to Qdrant (1536 then 4096) despite Ollama embedder config → Qdrant rejects upserts; indexing stuck at 0 #10991

Open
