Fix/conversation view mongo #269
base: fix/audio-pipe-3
Conversation
- Updated Docker Compose files to mount the entire config directory, allowing for better management of configuration files.
- Introduced a new `WaveformData` model to store pre-computed waveform visualization data, improving UI performance by enabling waveform display without real-time decoding.
- Enhanced the `app_factory` and `job` models to include the new `WaveformData` model, ensuring proper initialization and data handling.
- Implemented waveform generation logic in a new worker module, allowing for on-demand waveform creation from audio chunks.
- Added API endpoints for retrieving and generating waveform data, improving the overall audio processing capabilities.
- Updated tests to cover new functionality and ensure robustness in waveform data handling.
…nd audio upload
- Introduced three new test scripts: `sdk_test_auth.py`, `sdk_test_conversations.py`, and `sdk_test_upload.py`.
- Each script tests different functionalities of the SDK, including authentication, conversation retrieval, and audio file uploads.
- The scripts utilize the `ChronicleClient` to perform operations and print results for verification.
- Enhanced testing capabilities for the SDK, ensuring robust validation of core features.
📝 Walkthrough

This PR introduces waveform-based audio visualization with on-demand generation, soft-delete conversation archival with restoration capability, configuration management refactoring to merge defaults with user config, enhanced speaker recognition with chunked audio processing for large files, memory system updates with `updated_at` field tracking, HTTP Range request support for audio streaming, and a comprehensive backend state cleanup utility with optional backup and export.
Sequence Diagrams

```mermaid
sequenceDiagram
    participant User as User/Client
    participant API as Conversation API
    participant DB as MongoDB
    participant AudioChunks as AudioChunks
    participant Decoder as Decoder
    participant WaveformWorker as Waveform Worker
    participant Storage as WaveformData
    User->>API: GET /conversations/{id}/waveform
    API->>DB: Query WaveformData by conversation_id
    alt Waveform cached
        DB-->>API: Return WaveformData
        API-->>User: Return samples, duration, sample_rate
    else Waveform not cached
        API->>WaveformWorker: Queue generate_waveform_data
        WaveformWorker->>AudioChunks: Query Opus chunks
        AudioChunks-->>WaveformWorker: Return audio chunks
        WaveformWorker->>Decoder: Decode Opus → PCM samples
        Decoder-->>WaveformWorker: Return raw PCM
        WaveformWorker->>WaveformWorker: Downsample to target rate
        WaveformWorker->>WaveformWorker: Normalize amplitude [-1.0, 1.0]
        WaveformWorker->>Storage: Create WaveformData document
        Storage-->>WaveformWorker: Persisted waveform
        WaveformWorker-->>API: Return generated waveform
        API-->>User: Return samples, duration, sample_rate
    end
```
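The "Downsample to target rate" and "Normalize amplitude" steps in the diagram above can be sketched roughly as follows. This is an illustrative, plain-Python sketch: the function name, the peak-per-window strategy, and the default of 3 samples per second are assumptions, not the PR's actual worker code.

```python
def downsample_waveform(pcm, src_rate, target_rate=3):
    """Reduce raw PCM samples to `target_rate` peak samples per second,
    normalized to the range [-1.0, 1.0]."""
    if not pcm:
        return []
    window = max(1, src_rate // target_rate)
    # Peak absolute amplitude per window preserves visual transient detail
    peaks = [
        max(abs(s) for s in pcm[i:i + window])
        for i in range(0, len(pcm), window)
    ]
    scale = max(peaks) or 1.0  # avoid division by zero on silent audio
    return [p / scale for p in peaks]
```

A sketch like this is cheap to cache: the resulting list is small (a handful of floats per second of audio), which is why persisting it as a `WaveformData` document lets the UI skip real-time Opus decoding.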
```mermaid
sequenceDiagram
    participant User as User/Client
    participant Archive as Archive UI
    participant API as Conversation API
    participant DB as MongoDB
    participant Cleanup as Cleanup Service
    User->>Archive: Open Archive page
    Archive->>API: GET /conversations?include_deleted=true
    API->>DB: Query conversations where deleted=true
    DB-->>API: Return deleted conversations
    API-->>Archive: Populate archive list
    User->>Archive: Click "Restore" button
    Archive->>API: POST /conversations/{id}/restore
    API->>DB: Unset deleted flag on conversation
    DB-->>API: Update successful
    API-->>Archive: Refresh list
    Archive->>API: GET /conversations?include_deleted=true
    API->>DB: Query (updated conversations)
    DB-->>Archive: Restored conversation removed from view
    User->>Archive: Click "Permanent Delete" (admin)
    Archive->>API: DELETE /conversations/{id}?permanent=true
    API->>DB: Delete conversation document
    API->>Cleanup: Trigger cleanup_jobs
    Cleanup->>DB: Delete associated AudioChunks
    Cleanup->>DB: Delete associated WaveformData
    Cleanup-->>API: Cleanup complete
    API-->>Archive: Refresh list
```
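The archive flow in the diagram (soft delete, restore, permanent delete with cascading cleanup) can be sketched as a small in-memory model. The class and store names here are illustrative, not the backend's actual Beanie models:

```python
class ConversationStore:
    """Toy model of the soft-delete / restore / permanent-delete flow."""

    def __init__(self):
        self.conversations = {}  # conversation_id -> {"deleted": bool}
        self.audio_chunks = {}   # conversation_id -> [chunks]
        self.waveforms = {}      # conversation_id -> waveform data

    def soft_delete(self, cid):
        # Soft delete only flips a flag; data stays recoverable
        self.conversations[cid]["deleted"] = True

    def restore(self, cid):
        # Restore simply unsets the flag, as in POST /conversations/{id}/restore
        self.conversations[cid]["deleted"] = False

    def list(self, include_deleted=False):
        return [cid for cid, doc in self.conversations.items()
                if include_deleted or not doc["deleted"]]

    def permanent_delete(self, cid):
        # Mirrors the diagram: drop the document, then cleanup removes
        # associated AudioChunks and WaveformData
        self.conversations.pop(cid, None)
        self.audio_chunks.pop(cid, None)
        self.waveforms.pop(cid, None)
```

The key design point the diagram encodes: only the permanent path touches associated collections; the soft path is a single flag update, which is what makes restore trivial.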
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
| Metric | Count |
|---|---|
| ✅ Passed | 79 |
| ❌ Failed | 18 |
| 📊 Total | 97 |
📊 View Reports
GitHub Pages (Live Reports):
Download Artifacts:
- robot-test-reports-html-no-api - HTML reports
- robot-test-results-xml-no-api - XML output
- Added configuration options for speaker recognition chunking in `.env.template`, allowing for better management of large audio files.
- Updated `get_conversations` function to include an `include_deleted` parameter for filtering conversations based on their deletion status.
- Enhanced `finalize_session` method in `AudioStreamProducer` to send an end marker to Redis, ensuring proper session closure.
- Introduced `reconstruct_audio_segments` function to yield audio segments with overlap for efficient processing of lengthy conversations.
- Implemented merging of overlapping speaker segments to improve accuracy in speaker recognition.
- Added integration tests for WebSocket streaming transcription to validate the end_marker functionality and overall transcription flow.
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backends/advanced/src/advanced_omi_backend/models/job.py (1)
49-54: Bug: `raise` statement makes the `ConfigurationError` fallback ineffective, and duplicate flag assignment.

The `except` block catches `ConfigurationError` and sets a fallback database, but then immediately re-raises the exception, making the fallback useless. Additionally, `_beanie_initialized` is set to `True` on line 54 before `init_beanie` is actually called, and then redundantly set again on line 61.

🐛 Proposed fix

```diff
     try:
         database = client.get_default_database(mongodb_database)
     except ConfigurationError:
         database = client[mongodb_database]
-        raise
-    _beanie_initialized = True
+    # Initialize Beanie
     await init_beanie(
         database=database,
         document_models=[User, Conversation, AudioChunkDocument, WaveformData],
     )
     _beanie_initialized = True
     logger.info("✅ Beanie initialized in RQ worker process")
```

Also applies to: 61-61
🤖 Fix all issues with AI agents
In `@backends/advanced/Docs/memories.md`:
- Line 502: Search the repo for the string "omi_memories" and replace it with
"chronicle_memories" in all configuration and test YAML files (the defaults and
test configs referenced in the review), update any docs examples (e.g.,
curl/verification lines) to use "chronicle_memories", and run tests to ensure no
remaining references to "omi_memories" remain.
In `@backends/advanced/src/advanced_omi_backend/models/waveform.py`:
- Around line 43-47: The waveform model's conversation_id isn't enforced unique
despite the comment; update the field definition from Indexed(str) to
Indexed(str, unique=True) (the same pattern used in Conversation) so inserts
cannot create duplicate conversation_id values, and keep the Settings.indexes
entry as-is; modify the conversation_id declaration in the Waveform model to use
Indexed(str, unique=True).
In `@backends/advanced/src/advanced_omi_backend/routers/modules/audio_routes.py`:
- Around line 128-163: Validate the parsed byte range after splitting
range_header: after computing range_start and range_end (and clamping to
0..file_size-1) ensure range_start <= range_end; if not, return a 416 Range Not
Satisfiable with "Content-Range: bytes */{file_size}". Also guard against
negative content_length by computing content_length = range_end - range_start +
1 only after the validation, and use the slice wav_data[range_start:range_end +
1] only when valid. Remove the unused exception variable `e` in the except
clause so the handler catches (ValueError, IndexError) and returns 416 without
referencing `e`.
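The validation this instruction describes can be sketched as a plain helper, separate from the actual FastAPI handler. Everything here is a hypothetical illustration: the function name and the `(start, end)` return contract are assumptions; a `None` return is where the handler would respond 416 with `Content-Range: bytes */{file_size}`.

```python
def parse_range(range_header, file_size):
    """Parse an HTTP 'bytes=start-end' header against file_size.

    Returns (start, end) inclusive, or None if the range is malformed
    or unsatisfiable (caller should answer 416 Range Not Satisfiable).
    """
    try:
        spec = range_header.split("=", 1)[1]
        start_s, end_s = spec.split("-", 1)
        start = int(start_s) if start_s else 0
        end = int(end_s) if end_s else file_size - 1
    except (ValueError, IndexError):
        return None
    end = min(end, file_size - 1)      # clamp to 0..file_size-1
    if start < 0 or start > end:
        return None                    # e.g. "bytes=900-100" -> 416
    return start, end
```

With a valid result, `content_length = end - start + 1` is guaranteed positive, and the slice `wav_data[start:end + 1]` is the exact byte span to return.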
In `@backends/advanced/src/advanced_omi_backend/services/memory/config.py`:
- Line 143: The code default for collection_name (variable collection_name in
config.py) was changed to "chronicle_memories" while existing defaults.yml and
tests still reference "omi_memories", causing a breaking mismatch; fix by
reverting the config.py default back to "omi_memories" (restore collection_name:
str = "omi_memories") so existing deployments keep working, update any unit
tests that assert the default to match, and add a short migration note in
project docs explaining the planned rename and how to opt into
"chronicle_memories" via explicit config so future changes are non-breaking.
In `@backends/advanced/src/advanced_omi_backend/utils/audio_chunk_utils.py`:
- Around line 430-434: The code incorrectly uses
Conversation.get(conversation_id) which looks up by MongoDB _id; replace it with
the indexed-field lookup used elsewhere (pattern at lines 669-671) by awaiting
Conversation.find_one with the filter on the conversation_id field (e.g., await
Conversation.find_one({"conversation_id": conversation_id}) or equivalent
keyword-arg form) and keep the existing not-found ValueError branch.
In `@backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py`:
- Around line 570-598: The first logger call uses an unnecessary f-string:
change logger.info(f"📝 No segments in streaming results, creating from
word-level data") to a plain string (remove the leading f) in the branch where
segments_data is empty (the block that calls
aggregator.get_session_results(session_id) and appends to segments_data); leave
other f-strings (e.g., the following logger.info that includes
{len(segments_data)}) unchanged.
In `@backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py`:
- Around line 18-21: The function _merge_overlapping_speaker_segments declares
an unused overlap parameter; either remove it from the signature (and all
callers) if not needed, or incorporate it into the merge logic by comparing the
actual overlap against the provided overlap threshold (e.g., compute overlap =
min(current["end"], next_seg["end"]) - max(current["start"], next_seg["start"])
or check next_seg["start"] - current["end"] <= overlap) and only merge when the
overlap condition is met; update the function signature, any callers/tests, and
the docstring to reflect the chosen behavior and ensure segment merging uses
segments, next_seg["start"], and current["end"] consistently.
- Around line 319-323: The generator loop over reconstruct_audio_segments
currently unpacks a variable named start_time which shadows the earlier
job_start_time used for profiling; rename the unpacked variable (e.g.,
segment_start_time or wav_start_time) in the async for unpack (async for
wav_data, segment_start_time, end_time in reconstruct_audio_segments(...)) and
update all in-loop references to that new name, and ensure all
processing_time_seconds calculations use time.time() - job_start_time (not the
unpacked name) so timing metrics (processing_time_seconds / processing_time)
remain correct.
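The overlap-aware merge the first instruction describes could look roughly like this. The function name echoes the review's `_merge_overlapping_speaker_segments`, but the exact semantics (merge consecutive same-speaker segments whose gap is at most `overlap` seconds) are an assumption, not the PR's actual logic:

```python
def merge_overlapping_speaker_segments(segments, overlap=0.0):
    """Merge consecutive segments from the same speaker whose gap
    (or overlap) is within `overlap` seconds."""
    merged = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        if (merged
                and merged[-1]["speaker"] == seg["speaker"]
                and seg["start"] - merged[-1]["end"] <= overlap):
            # Extend the previous segment instead of appending a new one
            merged[-1]["end"] = max(merged[-1]["end"], seg["end"])
        else:
            merged.append(dict(seg))
    return merged
```

This makes the `overlap` parameter meaningful (segments produced by overlapping reconstruction windows collapse into one), addressing the unused-parameter complaint without changing the call sites.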
In `@backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py`:
- Around line 19-34: The docstring for the function that accepts parameter
sample_rate incorrectly states the default is 10 while the function signature
sets sample_rate: int = 3; update the docstring to state the correct default (3)
so the signature and documentation match, referencing the sample_rate parameter
in the function (waveform generation function in waveform_jobs.py) and adjust
the "Args" section to list "sample_rate: Samples per second for waveform
(default: 3)".
In `@backends/advanced/webui/src/components/audio/WaveformDisplay.tsx`:
- Around line 77-95: The drawWaveform function can divide by zero when samples
is empty; add an early guard at the start of drawWaveform that returns
immediately if !samples || samples.length === 0 to avoid computing barWidth =
width / samples.length and any downstream issues; update the function
(drawWaveform) to perform this check before computing barWidth/centerY and
before using samples.forEach.
In `@tests/integration/sdk_tests.robot`:
- Around line 59-71: The test "SDK Upload Respects Backend File Size Limit"
currently only re-runs the authentication script (Run Process ...
${CURDIR}/../scripts/sdk_test_auth.py) and doesn't exercise upload/oversize
behavior; either mark it as skipped until implemented or rename it to reflect
it's a placeholder: add a skip tag (e.g., add "skip" to the [Tags] line) and
update the test name or [Documentation] to indicate it's a placeholder that
re-runs sdk_test_auth.py, or alternatively change the test name to something
like "Placeholder: Re-run SDK auth script" so the intent and behavior match.
In `@tests/resources/queue_keywords.robot`:
- Around line 330-342: The Robot keyword "Get Job Result" is calling the wrong
endpoint path "/api/queue/job/${job_id}" which should be plural; update the GET
call in that keyword to use "/api/queue/jobs/${job_id}" so it matches the
backend route and will return 200; ensure the GET On Session invocation and any
related variables (${response}, ${job_data}, ${result}) remain unchanged.
🧹 Nitpick comments (29)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (2)
151-181: Consider adding a defensive type check for each segment.

The normalization logic correctly handles Deepgram's nested response structure. However, if a malformed response contains a non-dict element in the segments list, `seg.get()` will raise an `AttributeError`.

♻️ Suggested defensive check

```diff
 if provider == "deepgram" and segments:
     normalized_segments = []
     for seg in segments:
+        if not isinstance(seg, dict):
+            logger.debug(f"Skipping non-dict Deepgram segment: {seg}")
+            continue
         # Deepgram segments may have nested structure
         # Extract text from either 'text' or 'transcript' or 'sentences'
         text_content = seg.get("text") or seg.get("transcript") or ""
```
171-177: Default confidence of 1.0 may overstate certainty.

When confidence data is missing from a segment, defaulting to `1.0` (100% confidence) could be misleading to downstream consumers. Consider using `0.0` or `None` to indicate unknown confidence, unless downstream code explicitly requires a numeric value.

backends/advanced/src/advanced_omi_backend/utils/audio_chunk_utils.py (1)
462-466: Inconsistent sort syntax.

Line 466 uses `.sort(+AudioChunkDocument.chunk_index)` while line 281 uses `.sort("+chunk_index")`. The unary `+` operator on a Beanie field object is non-standard and may not behave as intended.

♻️ Suggested fix for consistency

```diff
-    ).sort(+AudioChunkDocument.chunk_index).to_list()
+    ).sort("+chunk_index").to_list()
```

backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py (1)
386-386: Remove extraneous f-string prefix.

This f-string has no placeholders.

```diff
-    logger.info(f"🎤 Calling speaker recognition service...")
+    logger.info("🎤 Calling speaker recognition service...")
```

backends/advanced/webui/src/pages/ConversationsTimeline.tsx (1)
274-276: Consider removing dead conditional styling for deleted conversations.

Since `getAll(false)` now excludes deleted conversations, `conv.deleted` will always be `false` for all items in the list. The conditional `conv.deleted ? '#fee2e2' : '#fff'` is effectively dead code.

♻️ Suggested cleanup

```diff
 contentStyle={{
-    background: conv.deleted ? '#fee2e2' : '#fff',
+    background: '#fff',
     color: '#1f2937',
     boxShadow: '0 3px 0 #ddd'
 }}
```

tests/scripts/sdk_test_conversations.py (1)
12-14: Consider adding argument validation for clearer error messages.

Accessing `sys.argv[1:4]` without validation will raise an `IndexError` if arguments are missing, which may be confusing for users running the script incorrectly.

♻️ Suggested improvement

```diff
+if len(sys.argv) < 4:
+    print("Usage: sdk_test_conversations.py <backend_url> <email> <password>", file=sys.stderr)
+    sys.exit(1)
+
 backend_url = sys.argv[1]
 email = sys.argv[2]
 password = sys.argv[3]
```

tests/scripts/sdk_test_auth.py (1)
12-18: Consider adding argument validation and error handling.

The script will crash with an `IndexError` if fewer than 3 arguments are provided. For test scripts used in CI/automation, clearer error messages help debugging.

🔧 Suggested improvement

```diff
-backend_url = sys.argv[1]
-email = sys.argv[2]
-password = sys.argv[3]
-
-client = ChronicleClient(backend_url)
-client.login(email, password)
-print("SUCCESS")
+if len(sys.argv) < 4:
+    print("Usage: sdk_test_auth.py <backend_url> <email> <password>", file=sys.stderr)
+    sys.exit(1)
+
+backend_url = sys.argv[1]
+email = sys.argv[2]
+password = sys.argv[3]
+
+try:
+    client = ChronicleClient(backend_url)
+    client.login(email, password)
+    print("SUCCESS")
+except Exception as e:
+    print(f"FAILED: {e}", file=sys.stderr)
+    sys.exit(1)
```

backends/advanced/src/advanced_omi_backend/services/audio_stream/aggregator.py (1)
110-154: LGTM with a suggestion: consider sorting segments by timestamp.

The aggregation logic correctly combines text, words, and segments from all results. The average confidence calculation and early return for empty results are properly handled.

However, `all_segments` are collected in result order but not sorted by their internal timestamps. If segments from different chunks have overlapping or out-of-order timings, the final segment list may not be chronologically sorted.

🔧 Optional: Sort segments by start time

```diff
 # Join all text segments with spaces
 combined_text = " ".join(all_text)
+# Sort segments by start time if present
+if all_segments and "start" in all_segments[0]:
+    all_segments.sort(key=lambda s: s.get("start", 0))
+
 combined = {
     "text": combined_text,
     "words": all_words,
     "segments": all_segments,
```

backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py (1)
147-157: Use `logging.exception` for better traceability.

The static analysis tool correctly flags that `logging.exception` would be more appropriate here, as it automatically includes the stack trace, which aids debugging validation failures.

Proposed fix

```diff
 except ValueError as val_error:
     # Handle validation errors (e.g., file too long)
-    audio_logger.error(f"Audio validation failed: {val_error}")
+    audio_logger.exception(f"Audio validation failed: {val_error}")
     processed_files.append({
         "filename": file.filename,
         "status": "error",
         "error": str(val_error),
     })
     # Delete the conversation since it won't have audio chunks
     await conversation.delete()
     continue
```

tests/scripts/sdk_test_upload.py (2)
12-15: Add argument validation for clearer error messages.

Direct `sys.argv` access will raise an unclear `IndexError` if arguments are missing. Consider using `argparse` or at least validating argument count for better test failure diagnostics.

Proposed fix with basic validation

```diff
-backend_url = sys.argv[1]
-email = sys.argv[2]
-password = sys.argv[3]
-audio_file = sys.argv[4]
+if len(sys.argv) != 5:
+    print(f"Usage: {sys.argv[0]} <backend_url> <email> <password> <audio_file>", file=sys.stderr)
+    sys.exit(1)
+
+backend_url = sys.argv[1]
+email = sys.argv[2]
+password = sys.argv[3]
+audio_file = sys.argv[4]
```
21-23: Guard against an empty files list.

Accessing `result.files[0]` will raise `IndexError` if the upload returns an empty files list. A guard would make test failures clearer.

Proposed fix

```diff
+if not result.files:
+    print("ERROR:No files in response")
+    sys.exit(1)
+
 print(f"STATUS:{result.files[0].status}")
 if result.files[0].conversation_id:
     print(f"CONVERSATION_ID:{result.files[0].conversation_id}")
```

backends/advanced/src/advanced_omi_backend/models/waveform.py (1)
34-37: Consider using timezone-aware datetime.

`datetime.utcnow()` is deprecated since Python 3.12. While functional, consider using `datetime.now(timezone.utc)` for timezone-aware timestamps.

Proposed fix

```diff
-from datetime import datetime
+from datetime import datetime, timezone
 ...
 created_at: datetime = Field(
-    default_factory=datetime.utcnow,
+    default_factory=lambda: datetime.now(timezone.utc),
     description="When this waveform was generated"
 )
```

tests/resources/redis_keywords.robot (1)
78-99: LGTM with an optional modernization suggestion.

The `Redis Command` keyword correctly handles command execution, error checking, and numeric coercion. The logic is sound.

Minor note: `Run Keyword If` (lines 95-97) is considered legacy syntax in Robot Framework 4.0+. The modern `IF`/`ELSE` syntax is preferred, though this is purely stylistic and the current implementation works correctly.

♻️ Optional: Use modern IF/ELSE syntax

```diff
     # Try to convert to integer if it's a number (for commands like XLEN)
     ${is_digit}=    Run Keyword And Return Status    Should Match Regexp    ${output}    ^\\d+$
-    ${return_value}=    Run Keyword If    ${is_digit}
-    ...    Convert To Integer    ${output}
-    ...    ELSE    Set Variable    ${output}
+    IF    ${is_digit}
+        ${return_value}=    Convert To Integer    ${output}
+    ELSE
+        ${return_value}=    Set Variable    ${output}
+    END
     RETURN    ${return_value}
```

config/defaults.yml (1)
29-40: Consider clarifying the API key environment variable for local models.

Using `${OPENAI_API_KEY:-ollama}` for the local Ollama LLM (and similarly for local-embed at line 76) may cause confusion since Ollama doesn't typically require OpenAI credentials. This works because the default fallback is `ollama`, but a dedicated variable like `${OLLAMA_API_KEY:-ollama}` or a comment clarifying this pattern would improve clarity.

backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py (1)
159-168: Consider upsert to prevent duplicate waveforms on concurrent requests.

If two requests race to generate a waveform for the same conversation, both will insert separate documents. Consider using `find_one_and_update` with `upsert=True` or adding a unique index on `conversation_id` in the `WaveformData` model to ensure idempotency.

♻️ Example using find_one with upsert

```diff
-    await waveform_doc.insert()
+    # Use upsert to prevent duplicates from concurrent requests
+    await WaveformData.find_one(
+        WaveformData.conversation_id == conversation_id
+    ).upsert(
+        Set(waveform_doc.model_dump(exclude={"id"})),
+        on_insert=waveform_doc
+    )
```

Or add a unique index in the model:

```python
class Settings:
    name = "waveforms"
    indexes = [
        IndexModel([("conversation_id", 1)], unique=True),
    ]
```

backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (1)
117-120: Move imports to module level.

`HTTPException` is a standard FastAPI import and should be at the top of the file with other FastAPI imports for consistency. The model imports can remain deferred if there are circular import concerns, but `HTTPException` has no such issues.

♻️ Suggested change

At line 10, update the import:

```diff
-from fastapi import APIRouter, Depends, Query
+from fastapi import APIRouter, Depends, HTTPException, Query
```

Then at lines 117-120:

```diff
-    from fastapi import HTTPException
     from advanced_omi_backend.models.conversation import Conversation
     from advanced_omi_backend.models.waveform import WaveformData
     from advanced_omi_backend.workers.waveform_jobs import generate_waveform_data
```

backends/advanced/webui/src/components/audio/WaveformDisplay.tsx (2)
116-137: Remove debug console.log statements before production.

The `console.log` and `console.warn` statements at lines 117, 120, 125, and 134 appear to be debug artifacts. Consider removing them or converting to a debug-only logger to reduce console noise in production.

51-75: Consider handling window resize.

The canvas dimensions are calculated once when the effect runs. If the user resizes the window, the waveform may display incorrectly until a re-render is triggered. Consider adding a `ResizeObserver` or window resize listener.

♻️ Example resize handling

```typescript
useEffect(() => {
  if (!canvasRef.current) return;
  const resizeObserver = new ResizeObserver(() => {
    // Trigger redraw by updating a state or calling draw directly
  });
  resizeObserver.observe(canvasRef.current);
  return () => resizeObserver.disconnect();
}, []);
```

tests/integration/websocket_transcription_e2e_test.robot (1)
55-66: Consider replacing fixed Sleep with polling.

A fixed `Sleep 5s` can cause flakiness if the system is slow, or waste time if it's fast. Consider using `Wait Until Keyword Succeeds` with a polling approach to check the Redis stream.

♻️ Example polling approach

```robotframework
# Instead of:
Sleep    5s

# Consider:
${stream_length}=    Wait Until Keyword Succeeds    10s    1s
...    Get Redis Stream Length Above Zero    ${stream_name}
```

Where `Get Redis Stream Length Above Zero` is a keyword that returns the length only if > 0, otherwise fails.

backends/advanced/src/scripts/cleanup_state.py (2)
552-557: Hardcoded vector dimension may not match the original collection.

The recreated collection uses a hardcoded 1536 dimensions (OpenAI ada-002). If a different embedding model is used, this will create a mismatched collection. Consider retrieving the original collection's vector config before deletion.

♻️ Suggested approach

```diff
 # Get count and config before deletion
 collection_info = await self.qdrant_client.get_collection(collection_name)
 stats.memories_count = collection_info.points_count
+original_vector_size = collection_info.config.params.vectors.size
+original_distance = collection_info.config.params.vectors.distance

 # Delete and recreate collection
 await self.qdrant_client.delete_collection(collection_name)
 logger.info(f"Deleted memories collection ({stats.memories_count} vectors)")

 # Recreate with original configuration
 await self.qdrant_client.create_collection(
     collection_name=collection_name,
-    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
+    vectors_config=VectorParams(size=original_vector_size, distance=original_distance)
 )
```

501-524: Unused `result` assignments from delete operations.

The `result` variables are assigned but never used. While harmless, you could either remove the assignments or use them to verify that the actual deleted count matches expectations.

♻️ Option 1: Remove assignments

```diff
-result = await Conversation.find_all().delete()
+await Conversation.find_all().delete()
```

♻️ Option 2: Verify counts

```diff
-result = await Conversation.find_all().delete()
-logger.info(f"Deleted {stats.conversations_count} conversations")
+result = await Conversation.find_all().delete()
+logger.info(f"Deleted {result.deleted_count} conversations (expected {stats.conversations_count})")
```

backends/advanced/src/advanced_omi_backend/model_registry.py (1)
289-300: Consider using `logging.exception` for better stack trace visibility in the fallback path.

The fallback logic is reasonable for resilience, but `logging.error` loses the stack trace. Using `logging.exception` would preserve diagnostic information when the merged config load fails.

♻️ Suggested improvement

```diff
 # Try to get merged configuration (defaults + user config)
 try:
     raw = get_config(force_reload=force_reload)
 except Exception as e:
-    logging.error(f"Failed to load merged configuration: {e}")
+    logging.exception(f"Failed to load merged configuration: {e}")
     # Fallback to direct config.yml loading
     cfg_path = _find_config_path()
     if not cfg_path.exists():
         return None
     with cfg_path.open("r") as f:
         raw = yaml.safe_load(f) or {}
     raw = _deep_resolve_env(raw)
```

backends/advanced/src/advanced_omi_backend/services/memory/providers/vector_stores.py (1)
386-446: New `get_memory` method looks correct, but consider extracting the ID conversion logic.

The method correctly handles UUID/integer ID formats and validates user ownership when `user_id` is provided.

However, the ID conversion logic (lines 398-409) is duplicated across `delete_memory`, `update_memory`, and now `get_memory`. Consider extracting this to a helper method to reduce duplication.

Also, the `import uuid` on line 398 is unnecessary since `uuid` is already imported at the module level (line 11).

♻️ Suggested refactor: Extract ID conversion helper and remove duplicate import

```diff
+    def _normalize_point_id(self, memory_id: str):
+        """Convert memory_id to Qdrant-compatible format (UUID string or integer)."""
+        try:
+            uuid.UUID(memory_id)
+            return memory_id
+        except ValueError:
+            try:
+                return int(memory_id)
+            except ValueError:
+                return memory_id

     async def get_memory(self, memory_id: str, user_id: Optional[str] = None) -> Optional[MemoryEntry]:
         """Get a specific memory by ID from Qdrant. ... """
         try:
-            # Convert memory_id to proper format for Qdrant
-            import uuid
-            try:
-                # Try to parse as UUID first
-                uuid.UUID(memory_id)
-                point_id = memory_id
-            except ValueError:
-                # If not a UUID, try as integer
-                try:
-                    point_id = int(memory_id)
-                except ValueError:
-                    # If neither UUID nor integer, use it as-is
-                    point_id = memory_id
+            point_id = self._normalize_point_id(memory_id)
             # Retrieve the point by ID
             ...
```

The same refactor can be applied to `delete_memory` and `update_memory`.

backends/advanced/webui/src/pages/Archive.tsx (3)
68-79: Consider simplifying date format detection.

The condition on line 70 is complex and may not cover all edge cases (e.g., timestamps with timezone offsets like `+05:30`). Consider a more robust approach:

♻️ Suggested simplification

```diff
 const formatDate = (timestamp: number | string) => {
   if (typeof timestamp === 'string') {
-    const isoString = timestamp.endsWith('Z') || timestamp.includes('+') || timestamp.includes('T') && timestamp.split('T')[1].includes('-')
-      ? timestamp
-      : timestamp + 'Z'
-    return new Date(isoString).toLocaleString()
+    // If it's a numeric string (Unix timestamp), convert to number
+    if (/^\d+$/.test(timestamp)) {
+      return new Date(parseInt(timestamp, 10) * 1000).toLocaleString()
+    }
+    // Otherwise treat as ISO string
+    return new Date(timestamp).toLocaleString()
   }
   if (timestamp === 0) {
     return 'Unknown date'
   }
   return new Date(timestamp * 1000).toLocaleString()
 }
```

352-369: Consider using a stable key for transcript segments.

Using `index` as the React key works here since segments are static after fetch, but a composite key would be more robust if segments are ever modified:

♻️ Suggested improvement

```diff
 {segments.map((segment, index) => {
+  const segmentKey = `${segment.start}-${segment.end}-${index}`
   const speaker = segment.speaker || 'Unknown'
   return (
     <div
-      key={index}
+      key={segmentKey}
       className="text-sm leading-relaxed flex items-start space-x-2 py-1 px-2 rounded hover:bg-gray-50 dark:hover:bg-gray-700"
     >
```

3-4: Combine imports from the same module.

♻️ Suggested fix

```diff
-import { conversationsApi } from '../services/api'
-import { authApi } from '../services/api'
+import { conversationsApi, authApi } from '../services/api'
```

backends/advanced/src/advanced_omi_backend/config.py (2)
101-109: Consider emitting a deprecation warning.The function is marked deprecated in the docstring but callers won't see runtime feedback. Consider adding a warning log to encourage migration.
♻️ Optional: Add deprecation warning
def get_defaults_config_path(): """ Get the path to the defaults config file. DEPRECATED: Use get_defaults_yml_path() instead. Kept for backward compatibility. """ + logger.warning("get_defaults_config_path() is deprecated, use get_defaults_yml_path() instead") defaults_path = get_defaults_yml_path() return defaults_path if defaults_path.exists() else None
164-180: Use `logger.exception` for better error diagnostics.

The broad `Exception` catches are acceptable for config loading robustness. However, using `logger.exception` instead of `logger.warning`/`logger.error` would include stack traces for easier debugging.

♻️ Proposed improvement

  try:
      with open(defaults_path, 'r') as f:
          defaults = yaml.safe_load(f) or {}
      logger.info(f"Loaded defaults from {defaults_path}")
  except Exception as e:
-     logger.warning(f"Could not load defaults from {defaults_path}: {e}")
+     logger.exception(f"Could not load defaults from {defaults_path}")

  # Load user config
  config_path = get_config_yml_path()
  user_config = {}
  if config_path.exists():
      try:
          with open(config_path, 'r') as f:
              user_config = yaml.safe_load(f) or {}
          logger.info(f"Loaded config from {config_path}")
      except Exception as e:
-         logger.error(f"Error loading config from {config_path}: {e}")
+         logger.exception(f"Error loading config from {config_path}")

tests/unit/test_config_loading.py (1)
124-132: Improve test to verify cache invalidation.

The test assigns `config1` to establish a cache but doesn't use it to verify the reload behavior. The comment suggests checking for different instances, but the assertion only verifies the return type.

♻️ Proposed improvement

  def test_reload_config():
      """Test config reload invalidates cache."""
      config1 = get_config()
      config2 = reload_config()
-     # Should be different instances after reload
-     # (Note: Content might be the same, but object should be different)
-     # We check that reload returns a config object
-     assert isinstance(config2, dict)
+     # After reload, should be a different object instance
+     # (content might be identical, but the dict object should be new)
+     assert isinstance(config2, dict)
+     assert config1 is not config2
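The cache-invalidation contract being tested can be sketched with a module-level cache — a simplified stand-in for the real `get_config`/`reload_config`, assuming they follow the usual load-once pattern:

```python
_config_cache = None

def _load_config():
    # Stand-in for reading defaults.yml + config.yml and merging them
    return {"memory": {"collection": "chronicle_memories"}}

def get_config():
    """Return the cached config dict, loading it on first use."""
    global _config_cache
    if _config_cache is None:
        _config_cache = _load_config()
    return _config_cache

def reload_config():
    """Drop the cache and reload, returning a fresh dict instance."""
    global _config_cache
    _config_cache = None
    return get_config()
```

Under this pattern, `config1 is not config2` is exactly the property that distinguishes a real reload from a cache hit, which is why the suggested extra assertion is meaningful.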
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (49)
backends/advanced/.env.template
backends/advanced/Docs/memories.md
backends/advanced/docker-compose-test.yml
backends/advanced/docker-compose.yml
backends/advanced/src/advanced_omi_backend/app_factory.py
backends/advanced/src/advanced_omi_backend/config.py
backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py
backends/advanced/src/advanced_omi_backend/model_registry.py
backends/advanced/src/advanced_omi_backend/models/job.py
backends/advanced/src/advanced_omi_backend/models/waveform.py
backends/advanced/src/advanced_omi_backend/routers/modules/audio_routes.py
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py
backends/advanced/src/advanced_omi_backend/services/audio_stream/aggregator.py
backends/advanced/src/advanced_omi_backend/services/audio_stream/producer.py
backends/advanced/src/advanced_omi_backend/services/memory/README.md
backends/advanced/src/advanced_omi_backend/services/memory/base.py
backends/advanced/src/advanced_omi_backend/services/memory/config.py
backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py
backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py
backends/advanced/src/advanced_omi_backend/services/memory/providers/openmemory_mcp.py
backends/advanced/src/advanced_omi_backend/services/memory/providers/vector_stores.py
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py
backends/advanced/src/advanced_omi_backend/utils/audio_chunk_utils.py
backends/advanced/src/advanced_omi_backend/workers/cleanup_jobs.py
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py
backends/advanced/src/advanced_omi_backend/workers/transcription_jobs.py
backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py
backends/advanced/src/scripts/cleanup_state.py
backends/advanced/tests/test_memory_entry.py
backends/advanced/tests/test_memory_providers.py
backends/advanced/webui/src/components/audio/WaveformDisplay.tsx
backends/advanced/webui/src/pages/Archive.tsx
backends/advanced/webui/src/pages/Conversations.tsx
backends/advanced/webui/src/pages/ConversationsRouter.tsx
backends/advanced/webui/src/pages/ConversationsTimeline.tsx
backends/advanced/webui/src/pages/Memories.tsx
backends/advanced/webui/src/services/api.ts
config/defaults.yml
tests/endpoints/memory_tests.robot
tests/integration/sdk_tests.robot
tests/integration/websocket_transcription_e2e_test.robot
tests/resources/queue_keywords.robot
tests/resources/redis_keywords.robot
tests/scripts/sdk_test_auth.py
tests/scripts/sdk_test_conversations.py
tests/scripts/sdk_test_upload.py
tests/unit/test_config_loading.py
🧰 Additional context used
🧬 Code graph analysis (22)
backends/advanced/src/advanced_omi_backend/services/transcription/__init__.py (1)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
segments(211-213)
backends/advanced/webui/src/pages/ConversationsTimeline.tsx (1)
backends/advanced/webui/src/services/api.ts (1)
conversationsApi(109-130)
backends/advanced/webui/src/components/audio/WaveformDisplay.tsx (2)
backends/advanced/src/advanced_omi_backend/models/waveform.py (1)
WaveformData(16-47)
backends/advanced/webui/src/services/api.ts (1)
api(55-58)
tests/unit/test_config_loading.py (1)
backends/advanced/src/advanced_omi_backend/config.py (2)
merge_configs(112-136)
reload_config(200-204)
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py (3)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
segments(211-213)
backends/advanced/src/advanced_omi_backend/utils/conversation_utils.py (1)
extract_speakers_from_segments(407-423)
backends/advanced/src/advanced_omi_backend/services/audio_stream/aggregator.py (1)
get_session_results(29-79)
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (3)
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (1)
get_conversations(133-187)
backends/advanced/src/advanced_omi_backend/models/waveform.py (1)
WaveformData(16-47)
backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py (1)
generate_waveform_data(17-190)
backends/advanced/src/advanced_omi_backend/models/job.py (3)
backends/advanced/src/advanced_omi_backend/models/waveform.py (1)
WaveformData(16-47)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
Conversation(17-328)
backends/advanced/src/advanced_omi_backend/models/audio_chunk.py (1)
AudioChunkDocument(16-158)
backends/advanced/src/advanced_omi_backend/services/memory/providers/openmemory_mcp.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (1)
MemoryEntry(26-69)
backends/advanced/webui/src/pages/ConversationsRouter.tsx (1)
backends/advanced/webui/src/pages/Archive.tsx (1)
Archive(28-389)
backends/advanced/src/advanced_omi_backend/services/memory/config.py (1)
backends/advanced/src/advanced_omi_backend/config.py (1)
get_config_yml_path(91-93)
backends/advanced/src/advanced_omi_backend/services/audio_stream/aggregator.py (1)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
segments(211-213)
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py (3)
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py (1)
get_conversations(31-36)
backends/advanced/src/advanced_omi_backend/models/user.py (2)
User(51-98)
user_id(71-73)
Conversation(17-328)
backends/advanced/src/advanced_omi_backend/services/memory/providers/vector_stores.py (6)
backends/advanced/src/advanced_omi_backend/models/user.py (1)
user_id(71-73)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (2)
get_memory(171-185)
MemoryEntry(26-69)
backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py (1)
get_memory(300-322)
backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py (1)
get_memory(613-648)
backends/advanced/src/advanced_omi_backend/services/memory/providers/openmemory_mcp.py (1)
get_memory(285-320)
backends/advanced/src/advanced_omi_backend/services/memory/providers/mcp_client.py (1)
get_memory(364-405)
backends/advanced/tests/test_memory_entry.py (1)
backends/advanced/src/advanced_omi_backend/services/memory/base.py (2)
MemoryEntry(26-69)
to_dict(57-69)
backends/advanced/src/advanced_omi_backend/config.py (1)
backends/advanced/src/advanced_omi_backend/model_registry.py (1)
_deep_resolve_env(61-90)
backends/advanced/src/advanced_omi_backend/model_registry.py (1)
backends/advanced/src/advanced_omi_backend/config.py (2)
get_config(143-197)
get_config_yml_path(91-93)
backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py (2)
backends/advanced/src/advanced_omi_backend/models/waveform.py (1)
WaveformData(16-47)
backends/advanced/src/advanced_omi_backend/utils/audio_chunk_utils.py (2)
retrieve_audio_chunks(246-288)
decode_opus_to_pcm(117-195)
backends/advanced/src/advanced_omi_backend/workers/cleanup_jobs.py (1)
backends/advanced/src/advanced_omi_backend/models/waveform.py (1)
WaveformData(16-47)
backends/advanced/src/advanced_omi_backend/utils/audio_chunk_utils.py (2)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
Conversation(17-328)
backends/advanced/src/advanced_omi_backend/models/audio_chunk.py (2)
AudioChunkDocument(16-158)
compression_ratio(140-144)
backends/advanced/src/scripts/cleanup_state.py (3)
backends/advanced/src/advanced_omi_backend/models/conversation.py (1)
Conversation(17-328)
backends/advanced/src/advanced_omi_backend/models/audio_chunk.py (1)
AudioChunkDocument(16-158)
backends/advanced/src/advanced_omi_backend/models/waveform.py (1)
WaveformData(16-47)
backends/advanced/src/advanced_omi_backend/app_factory.py (1)
backends/advanced/src/advanced_omi_backend/models/waveform.py (1)
WaveformData(16-47)
backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py (4)
backends/advanced/src/advanced_omi_backend/services/memory/providers/mycelia.py (2)
get_memory(613-648)
initialize(82-105)
backends/advanced/src/advanced_omi_backend/services/memory/providers/openmemory_mcp.py (2)
get_memory(285-320)
initialize(71-104)
backends/advanced/src/advanced_omi_backend/services/memory/providers/vector_stores.py (2)
get_memory(386-446)
initialize(52-118)
backends/advanced/src/advanced_omi_backend/models/user.py (1)
user_id(71-73)
🪛 dotenv-linter (4.0.0)
backends/advanced/.env.template
[warning] 171-171: [ValueWithoutQuotes] This value needs to be surrounded in quotes
(ValueWithoutQuotes)
[warning] 172-172: [UnorderedKey] The SPEAKER_CHUNK_SIZE key should go before the SPEAKER_CHUNK_THRESHOLD key
(UnorderedKey)
[warning] 172-172: [ValueWithoutQuotes] This value needs to be surrounded in quotes
(ValueWithoutQuotes)
[warning] 173-173: [UnorderedKey] The SPEAKER_CHUNK_OVERLAP key should go before the SPEAKER_CHUNK_SIZE key
(UnorderedKey)
[warning] 173-173: [ValueWithoutQuotes] This value needs to be surrounded in quotes
(ValueWithoutQuotes)
🪛 Ruff (0.14.11)
backends/advanced/src/advanced_omi_backend/models/waveform.py
45-47: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
tests/unit/test_config_loading.py
126-126: Local variable config1 is assigned to but never used
Remove assignment to unused variable config1
(F841)
backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
573-573: f-string without any placeholders
Remove extraneous f prefix
(F541)
backends/advanced/src/advanced_omi_backend/routers/modules/conversation_routes.py
33-33: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
98-98: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
backends/advanced/src/advanced_omi_backend/controllers/audio_controller.py
149-149: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
166-166: Use explicit conversion flag
Replace with conversion flag
(RUF010)
backends/advanced/src/advanced_omi_backend/routers/modules/audio_routes.py
156-156: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
backends/advanced/src/advanced_omi_backend/services/memory/config.py
82-85: Avoid specifying long messages outside the exception class
(TRY003)
backends/advanced/src/advanced_omi_backend/controllers/conversation_controller.py
142-142: Avoid equality comparisons to False; use not Conversation.deleted: for false checks
Replace with not Conversation.deleted
(E712)
149-149: Avoid equality comparisons to False; use not Conversation.deleted: for false checks
Replace with not Conversation.deleted
(E712)
backends/advanced/src/advanced_omi_backend/services/memory/providers/vector_stores.py
146-146: Consider moving this statement to an else block
(TRY300)
380-380: Consider moving this statement to an else block
(TRY300)
382-382: Do not catch blind exception: Exception
(BLE001)
383-383: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
442-442: Consider moving this statement to an else block
(TRY300)
444-444: Do not catch blind exception: Exception
(BLE001)
445-445: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/config.py
168-168: Do not catch blind exception: Exception
(BLE001)
179-179: Do not catch blind exception: Exception
(BLE001)
180-180: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/model_registry.py
292-292: Do not catch blind exception: Exception
(BLE001)
293-293: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
backends/advanced/src/advanced_omi_backend/workers/speaker_jobs.py
20-20: Unused function argument: overlap
(ARG001)
351-351: Abstract raise to an inner function
(TRY301)
351-351: Avoid specifying long messages outside the exception class
(TRY003)
386-386: f-string without any placeholders
Remove extraneous f prefix
(F541)
backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py
172-178: Consider moving this statement to an else block
(TRY300)
backends/advanced/src/advanced_omi_backend/utils/audio_chunk_utils.py
434-434: Avoid specifying long messages outside the exception class
(TRY003)
448-448: Avoid specifying long messages outside the exception class
(TRY003)
596-599: Avoid specifying long messages outside the exception class
(TRY003)
762-765: Avoid specifying long messages outside the exception class
(TRY003)
backends/advanced/src/scripts/cleanup_state.py
66-66: Do not catch blind exception: Exception
(BLE001)
135-135: Consider moving this statement to an else block
(TRY300)
188-188: Do not catch blind exception: Exception
(BLE001)
278-278: Unused method argument: stats
(ARG002)
289-289: Do not catch blind exception: Exception
(BLE001)
328-328: Do not catch blind exception: Exception
(BLE001)
420-420: Do not catch blind exception: Exception
(BLE001)
480-480: Consider moving this statement to an else block
(TRY300)
523-523: Local variable result is assigned to but never used
Remove assignment to unused variable result
(F841)
559-559: Do not catch blind exception: Exception
(BLE001)
608-608: Do not catch blind exception: Exception
(BLE001)
630-630: Do not catch blind exception: Exception
(BLE001)
657-657: Do not catch blind exception: Exception
(BLE001)
676-676: Do not catch blind exception: Exception
(BLE001)
804-804: Do not catch blind exception: Exception
(BLE001)
backends/advanced/src/advanced_omi_backend/services/memory/providers/chronicle.py
319-319: Consider moving this statement to an else block
(TRY300)
320-320: Do not catch blind exception: Exception
(BLE001)
321-321: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
  - Ensure embedding model is available in Ollama
  - Check vector dimensions match between embedder and Qdrant
- - Verify collection has vectors: `curl http://localhost:6333/collections/omi_memories`
+ - Verify collection has vectors: `curl http://localhost:6333/collections/chronicle_memories`
🧩 Analysis chain
🏁 Script executed:
rg -n "omi_memories" --type md --type py --type yaml --type json
Repository: SimpleOpenSoftware/chronicle
Length of output: 417
Update collection name references in configuration files.
The old omi_memories collection name still appears in multiple configuration files that need to be updated to match the new chronicle_memories name:
config/defaults.yml:198
tests/configs/deepgram-openai.yml:50
tests/configs/parakeet-openai.yml:44
tests/configs/parakeet-ollama.yml:44
tests/configs/mock-services.yml:40
These configuration files will cause failures if the old collection doesn't exist in the system.
🤖 Prompt for AI Agents
In `@backends/advanced/Docs/memories.md` at line 502, Search the repo for the
string "omi_memories" and replace it with "chronicle_memories" in all
configuration and test YAML files (the defaults and test configs referenced in
the review), update any docs examples (e.g., curl/verification lines) to use
"chronicle_memories", and run tests to ensure no remaining references to
"omi_memories" remain.
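A sweep like the one the prompt describes can be sketched in Python (demonstrated here on a throwaway file; against the real tree you would walk config/ and tests/configs/ and review the matches before rewriting):

```python
import tempfile
from pathlib import Path

def rename_collection(path, old="omi_memories", new="chronicle_memories"):
    """Rewrite one config file in place, returning how many references changed."""
    text = path.read_text()
    count = text.count(old)
    if count:
        path.write_text(text.replace(old, new))
    return count

# Demonstrate on a scratch copy instead of the real config/defaults.yml
scratch = Path(tempfile.mkdtemp()) / "defaults.yml"
scratch.write_text("memory:\n  qdrant:\n    collection_name: omi_memories\n")
changed = rename_collection(scratch)
```

Counting before rewriting makes the sweep idempotent and gives a per-file report, which helps confirm that exactly the five flagged configuration entries were touched.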
class Settings:
    name = "waveforms"
    indexes = [
        "conversation_id",  # Unique lookup by conversation
    ]
🧩 Analysis chain
🏁 Script executed:
# First, let's find the WaveformData model and examine it
find . -type f -name "*.py" | xargs grep -l "class WaveformData" 2>/dev/null
Repository: SimpleOpenSoftware/chronicle
Length of output: 135
🏁 Script executed:
# Also search for the waveform.py file to understand its full structure
find . -type f -name "waveform.py" -path "*/advanced_omi_backend/models/*"
Repository: SimpleOpenSoftware/chronicle
Length of output: 135
🏁 Script executed:
# Check how conversation_id is used in the model and elsewhere
rg -n "conversation_id" --type=py backends/advanced/src/advanced_omi_backend/models/ -B 2 -A 2
Repository: SimpleOpenSoftware/chronicle
Length of output: 6481
🏁 Script executed:
# Look for any constraints or validation on WaveformData
rg -n "WaveformData\|conversation_id" --type=py backends/advanced/src/advanced_omi_backend/ | head -50
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
cat -n backends/advanced/src/advanced_omi_backend/models/waveform.py
Repository: SimpleOpenSoftware/chronicle
Length of output: 1766
🏁 Script executed:
# Also check if there's any application logic validating uniqueness of waveform per conversation
rg -n "WaveformData\|waveform" --type=py backends/advanced/src/advanced_omi_backend/ | grep -i "create\|insert\|unique\|validation" | head -30
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Check for any repository/database query patterns that assume one waveform per conversation
rg -n "find_one.*waveform\|waveform.*find_one" --type=py backends/advanced/src/advanced_omi_backend/ -B 2 -A 2
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Search for waveform creation and update patterns
rg -n "WaveformData\(" --type=py backends/advanced/src/advanced_omi_backend/ -B 3 -A 5
Repository: SimpleOpenSoftware/chronicle
Length of output: 1872
🏁 Script executed:
# Check for any tests that show waveform creation patterns
find backends/advanced -type f -name "*test*.py" | xargs grep -l "WaveformData\|waveform" 2>/dev/null
Repository: SimpleOpenSoftware/chronicle
Length of output: 56
🏁 Script executed:
# Look for service/repository methods that interact with waveforms
rg -n "class.*Repository\|def.*waveform\|def.*create.*waveform" --type=py backends/advanced/src/advanced_omi_backend/ -B 1 -A 3
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Check the waveform_jobs.py file to see how waveforms are stored
cat -n backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py | sed -n '155,175p'
Repository: SimpleOpenSoftware/chronicle
Length of output: 869
🏁 Script executed:
# Search for any insert/update logic for waveforms
rg -n "insert\|insert_one\|replace\|update" --type=py backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py -B 2 -A 2
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Check if there's logic to handle duplicate waveforms or find existing ones
rg -n "find.*conversation_id\|replace_one\|update_one" --type=py backends/advanced/src/advanced_omi_backend/ -B 1 -A 3 | grep -i waveform -A 5 -B 5
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Check if Beanie's unique index syntax uses a list wrapper for uniqueness
rg -n "indexes.*\[.*unique\|unique.*index" --type=py backends/advanced/src/advanced_omi_backend/models/ -B 1 -A 3
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Look at how Beanie is configured and check if there's a way to define unique indexes
rg -n "class Settings\|indexes = " --type=py backends/advanced/src/advanced_omi_backend/models/ -A 5 | head -60
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Get all Settings classes to see index patterns
rg -n "class Settings" --type=py backends/advanced/src/advanced_omi_backend/models/ -A 10
Repository: SimpleOpenSoftware/chronicle
Length of output: 3865
🏁 Script executed:
# Check Beanie documentation on unique indexes in the code comments
rg -n "unique.*index\|\[.*\(.*unique" --type=py backends/advanced/src/advanced_omi_backend/models/
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Let's examine conversation.py Settings section to see how unique constraints are defined
sed -n '320,335p' backends/advanced/src/advanced_omi_backend/models/conversation.py
Repository: SimpleOpenSoftware/chronicle
Length of output: 449
🏁 Script executed:
# Verify how Beanie defines unique indexes by looking at documentation comments or other unique field examples
rg -n "unique=True" --type=py backends/advanced/src/advanced_omi_backend/models/
Repository: SimpleOpenSoftware/chronicle
Length of output: 284
🏁 Script executed:
# Check if there's any code that queries for existing waveforms before inserting (which would suggest one-per-conversation intent)
rg -n "find.*waveform\|WaveformData.*find" --type=py backends/advanced/src/advanced_omi_backend/ -B 2 -A 3
Repository: SimpleOpenSoftware/chronicle
Length of output: 54
🏁 Script executed:
# Check what happens when waveform job runs - does it replace or just insert?
sed -n '140,180p' backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py
Repository: SimpleOpenSoftware/chronicle
Length of output: 1490
Add unique=True to conversation_id field if one waveform per conversation is required.
The field description states "unique per conversation" and the index comment claims "Unique lookup by conversation", but the implementation doesn't enforce this constraint. Following the pattern used in Conversation model (line 84 of conversation.py), change Indexed(str) to Indexed(str, unique=True) on line 20. Currently, await waveform_doc.insert() allows duplicate conversation_ids, which contradicts the stated intent.
🧰 Tools
🪛 Ruff (0.14.11)
45-47: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/models/waveform.py` around lines
43 - 47, The waveform model's conversation_id isn't enforced unique despite the
comment; update the field definition from Indexed(str) to Indexed(str,
unique=True) (the same pattern used in Conversation) so inserts cannot create
duplicate conversation_id values, and keep the Settings.indexes entry as-is;
modify the conversation_id declaration in the Waveform model to use Indexed(str,
unique=True).
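Independent of the database index, the "one waveform per conversation" invariant can also be enforced at write time with replace-instead-of-insert semantics. A minimal in-memory sketch of that upsert pattern (plain Python standing in for the Beanie/Mongo calls, so names here are illustrative only):

```python
class WaveformStore:
    """Toy store enforcing one waveform document per conversation_id —
    the invariant a unique index on conversation_id would guarantee."""

    def __init__(self):
        self._by_conversation = {}

    def upsert(self, conversation_id, samples):
        """Replace any existing waveform for this conversation."""
        doc = {"conversation_id": conversation_id, "samples": samples}
        self._by_conversation[conversation_id] = doc
        return doc

    def get(self, conversation_id):
        return self._by_conversation.get(conversation_id)

    def count(self):
        return len(self._by_conversation)
```

In practice you would want both layers: the unique index rejects races between concurrent workers, while upsert semantics make regeneration idempotent instead of raising a duplicate-key error.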
# Parse Range header (e.g., "bytes=0-1023")
try:
    range_str = range_header.replace("bytes=", "")
    range_start, range_end = range_str.split("-")
    range_start = int(range_start) if range_start else 0
    range_end = int(range_end) if range_end else file_size - 1

    # Ensure valid range
    range_start = max(0, range_start)
    range_end = min(file_size - 1, range_end)
    content_length = range_end - range_start + 1

    # Extract requested byte range
    range_data = wav_data[range_start:range_end + 1]

    # Return 206 Partial Content with Range headers
    return Response(
        content=range_data,
        status_code=206,
        media_type="audio/wav",
        headers={
            "Content-Range": f"bytes {range_start}-{range_end}/{file_size}",
            "Content-Length": str(content_length),
            "Accept-Ranges": "bytes",
            "Content-Disposition": f"inline; filename={conversation_id}.wav",
            "X-Audio-Source": "mongodb-chunks",
        }
    )
except (ValueError, IndexError) as e:
    # Invalid Range header, return 416 Range Not Satisfiable
    return Response(
        status_code=416,
        headers={
            "Content-Range": f"bytes */{file_size}"
        }
    )
Invalid Range requests can produce malformed responses.
The Range parsing doesn't validate that range_start <= range_end after parsing. For requests like bytes=100-50, this produces:
- content_length = 50 - 100 + 1 = -49 (negative)
- Empty content from wav_data[100:51]
- A response with invalid Content-Length: -49
Also, the static analysis correctly notes the unused variable e on line 156.
🐛 Proposed fix to validate range and remove unused variable
  # Ensure valid range
  range_start = max(0, range_start)
  range_end = min(file_size - 1, range_end)
+
+ # Validate range_start <= range_end
+ if range_start > range_end:
+     return Response(
+         status_code=416,
+         headers={
+             "Content-Range": f"bytes */{file_size}"
+         }
+     )
+
  content_length = range_end - range_start + 1

  # Extract requested byte range
  range_data = wav_data[range_start:range_end + 1]

- except (ValueError, IndexError) as e:
+ except (ValueError, IndexError):
      # Invalid Range header, return 416 Range Not Satisfiable
      return Response(
          status_code=416,

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Parse Range header (e.g., "bytes=0-1023")
try:
    range_str = range_header.replace("bytes=", "")
    range_start, range_end = range_str.split("-")
    range_start = int(range_start) if range_start else 0
    range_end = int(range_end) if range_end else file_size - 1

    # Ensure valid range
    range_start = max(0, range_start)
    range_end = min(file_size - 1, range_end)

    # Validate range_start <= range_end
    if range_start > range_end:
        return Response(
            status_code=416,
            headers={
                "Content-Range": f"bytes */{file_size}"
            }
        )

    content_length = range_end - range_start + 1

    # Extract requested byte range
    range_data = wav_data[range_start:range_end + 1]

    # Return 206 Partial Content with Range headers
    return Response(
        content=range_data,
        status_code=206,
        media_type="audio/wav",
        headers={
            "Content-Range": f"bytes {range_start}-{range_end}/{file_size}",
            "Content-Length": str(content_length),
            "Accept-Ranges": "bytes",
            "Content-Disposition": f"inline; filename={conversation_id}.wav",
            "X-Audio-Source": "mongodb-chunks",
        }
    )
except (ValueError, IndexError):
    # Invalid Range header, return 416 Range Not Satisfiable
    return Response(
        status_code=416,
        headers={
            "Content-Range": f"bytes */{file_size}"
        }
    )
🧰 Tools
🪛 Ruff (0.14.11)
156-156: Local variable e is assigned to but never used
Remove assignment to unused variable e
(F841)
🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/routers/modules/audio_routes.py`
around lines 128 - 163, Validate the parsed byte range after splitting
range_header: after computing range_start and range_end (and clamping to
0..file_size-1) ensure range_start <= range_end; if not, return a 416 Range Not
Satisfiable with "Content-Range: bytes */{file_size}". Also guard against
negative content_length by computing content_length = range_end - range_start +
1 only after the validation, and use the slice wav_data[range_start:range_end +
1] only when valid. Remove the unused exception variable `e` in the except
clause so the handler catches (ValueError, IndexError) and returns 416 without
referencing `e`.
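The validation this comment asks for can be isolated in a small pure function — a sketch with hypothetical naming, not the router code itself — that either yields a clamped (start, end) pair or None, which the caller maps to a 416 response:

```python
def parse_byte_range(range_header, file_size):
    """Parse an HTTP Range header like 'bytes=0-1023'.

    Mirrors the route's parsing (a missing start is treated as 0, not an
    RFC 7233 suffix range). Returns a clamped (start, end) tuple, or None
    when the header is malformed or unsatisfiable (caller answers 416).
    """
    try:
        range_str = range_header.replace("bytes=", "")
        start_str, end_str = range_str.split("-")
        start = int(start_str) if start_str else 0
        end = int(end_str) if end_str else file_size - 1
    except ValueError:  # non-numeric parts, or not exactly one '-'
        return None
    start = max(0, start)
    end = min(file_size - 1, end)
    if start > end:  # e.g. 'bytes=100-50', or start beyond end of file
        return None
    return start, end
```

Keeping the parsing pure also makes the `bytes=100-50` regression trivially unit-testable without spinning up the FastAPI app.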
  host: str = "localhost",
  port: int = 6333,
- collection_name: str = "omi_memories",
+ collection_name: str = "chronicle_memories",
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for references to "omi_memories" across the codebase
echo "=== References to omi_memories ==="
rg -n "omi_memories" --type py --type md --type yaml --type json --type txt 2>/dev/null | head -50
echo ""
echo "=== Migration documentation or scripts ==="
fd -t f \( -name "*migrat*" -o -name "*MIGRAT*" \) 2>/dev/null | head -20
echo ""
echo "=== Checking for environment variable configuration ==="
rg -n "COLLECTION|collection" backends/advanced/src/advanced_omi_backend/services/memory/config.py -A 2 -B 2 2>/dev/null | head -80Repository: SimpleOpenSoftware/chronicle
Length of output: 1336
Address inconsistency between code defaults and configuration file defaults for collection name migration.
The code default for collection_name changed to "chronicle_memories" (lines 143, 266), but existing configuration files (config/defaults.yml, test configs) still reference "omi_memories". This creates an inconsistent upgrade experience:
- Deployments using config files will retain the old collection name (preserving existing data)
- Deployments without explicit configuration may switch to the new name, causing data access issues
Since existing configurations still specify "omi_memories", consider:
- Update configuration files to use the new default name, OR
- Revert the code default to match existing configs, OR
- Add migration documentation explaining the change and how to explicitly configure the collection name
No migration scripts or documentation currently exist for this breaking change.
🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/services/memory/config.py` at line
143, The code default for collection_name (variable collection_name in
config.py) was changed to "chronicle_memories" while existing defaults.yml and
tests still reference "omi_memories", causing a breaking mismatch; fix by
reverting the config.py default back to "omi_memories" (restore collection_name:
str = "omi_memories") so existing deployments keep working, update any unit
tests that assert the default to match, and add a short migration note in
project docs explaining the planned rename and how to opt into
"chronicle_memories" via explicit config so future changes are non-breaking.
    # Get conversation metadata
    conversation = await Conversation.get(conversation_id)

    if not conversation:
        raise ValueError(f"Conversation {conversation_id} not found")
Incorrect document lookup method.
Conversation.get(conversation_id) retrieves a document by its MongoDB _id. However, conversation_id is a separate indexed field on the Conversation model, not the document's _id. This will fail to find conversations.
Use the same pattern as elsewhere in this file (lines 669-671):
🐛 Proposed fix
- conversation = await Conversation.get(conversation_id)
+ conversation = await Conversation.find_one(
+     Conversation.conversation_id == conversation_id
+ )

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
  # Get conversation metadata
- conversation = await Conversation.get(conversation_id)
+ conversation = await Conversation.find_one(
+     Conversation.conversation_id == conversation_id
+ )
  if not conversation:
      raise ValueError(f"Conversation {conversation_id} not found")
🧰 Tools
🪛 Ruff (0.14.11)
434-434: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/utils/audio_chunk_utils.py` around
lines 430 - 434, The code incorrectly uses Conversation.get(conversation_id)
which looks up by MongoDB _id; replace it with the indexed-field lookup used
elsewhere (pattern at lines 669-671) by awaiting Conversation.find_one with the
filter on the conversation_id field (e.g., await
Conversation.find_one({"conversation_id": conversation_id}) or equivalent
keyword-arg form) and keep the existing not-found ValueError branch.
    sample_rate: int = 3,
) -> Dict[str, Any]:
    """
    Generate waveform visualization data from conversation audio chunks.

    This function:
    1. Retrieves Opus-compressed audio chunks from MongoDB
    2. Decodes each chunk to PCM
    3. Downsamples PCM to target sample rate (e.g., 10 samples/sec)
    4. Calculates amplitude peaks for each sample window
    5. Normalizes to [-1.0, 1.0] range
    6. Stores in WaveformData collection

    Args:
        conversation_id: Conversation ID to generate waveform for
        sample_rate: Samples per second for waveform (default: 10)
Docstring/signature mismatch for sample_rate default.
The docstring states the default is 10, but the actual default in the signature is 3. Update the docstring to reflect the correct default.
📝 Suggested fix
- sample_rate: Samples per second for waveform (default: 10)
+ sample_rate: Samples per second for waveform (default: 3)

📝 Committable suggestion
      sample_rate: int = 3,
  ) -> Dict[str, Any]:
      """
      Generate waveform visualization data from conversation audio chunks.
      This function:
      1. Retrieves Opus-compressed audio chunks from MongoDB
      2. Decodes each chunk to PCM
      3. Downsamples PCM to target sample rate (e.g., 10 samples/sec)
      4. Calculates amplitude peaks for each sample window
      5. Normalizes to [-1.0, 1.0] range
      6. Stores in WaveformData collection
      Args:
          conversation_id: Conversation ID to generate waveform for
-         sample_rate: Samples per second for waveform (default: 10)
+         sample_rate: Samples per second for waveform (default: 3)
🤖 Prompt for AI Agents
In `@backends/advanced/src/advanced_omi_backend/workers/waveform_jobs.py` around
lines 19 - 34, The docstring for the function that accepts parameter sample_rate
incorrectly states the default is 10 while the function signature sets
sample_rate: int = 3; update the docstring to state the correct default (3) so
the signature and documentation match, referencing the sample_rate parameter in
the function (waveform generation function in waveform_jobs.py) and adjust the
"Args" section to list "sample_rate: Samples per second for waveform (default:
3)".
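Steps 3-5 of the docstring can be sketched in pure Python. `pcm`, `audio_sample_rate`, and the use of absolute peaks (giving a [0, 1] range rather than the signed [-1.0, 1.0] the PR stores) are illustrative assumptions:

```python
# Illustrative sketch of steps 3-5: one normalized peak per waveform window.
def waveform_peaks(pcm: list[int], audio_sample_rate: int, sample_rate: int = 3) -> list[float]:
    if not pcm:
        return []
    window = max(1, audio_sample_rate // sample_rate)  # PCM frames per waveform sample
    peaks = [
        max(abs(s) for s in pcm[i : i + window])  # amplitude peak of each window
        for i in range(0, len(pcm), window)
    ]
    top = max(peaks) or 1  # avoid dividing by zero on silent audio
    return [p / top for p in peaks]
```

For example, 16 kHz PCM with `sample_rate=3` yields one peak per ~5333 frames, i.e. three waveform samples per second of audio.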
async def _export_audio_wav(self, stats: CleanupStats):
    """Export audio as WAV files (1-minute chunks)"""
    logger.info("Exporting audio as WAV files (this may take a while)...")

    # Get all unique conversation IDs
    conversations = await Conversation.find_all().to_list()
    audio_dir = self.backup_path / "audio"

    for conv in conversations:
        try:
            await self._export_conversation_audio(conv.conversation_id, audio_dir)
        except Exception as e:
            logger.warning(f"Failed to export audio for {conv.conversation_id}: {e}")
            continue

    logger.info("Audio export completed")
Unused stats parameter in _export_audio_wav.
The stats parameter is passed to this method but never used. Either remove it from the signature or use it to track export statistics (e.g., count of exported WAV files).
🐛 Suggested fix - remove unused parameter
- async def _export_audio_wav(self, stats: CleanupStats):
+ async def _export_audio_wav(self):

And update the call at line 120:

- await self._export_audio_wav(stats)
+ await self._export_audio_wav()

🧰 Tools
🪛 Ruff (0.14.11)
278-278: Unused method argument: stats
(ARG002)
289-289: Do not catch blind exception: Exception
(BLE001)
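If the second option (tracking statistics) is chosen instead, it could look roughly like this; the `CleanupStats` fields shown are assumptions, not the PR's actual model:

```python
from dataclasses import dataclass

# Hypothetical fields -- the real CleanupStats model may differ.
@dataclass
class CleanupStats:
    wav_exported: int = 0
    wav_failed: int = 0

def record_export(stats: CleanupStats, ok: bool) -> None:
    # Called once per conversation inside the export loop's try/except.
    if ok:
        stats.wav_exported += 1
    else:
        stats.wav_failed += 1
```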
const drawWaveform = (
  ctx: CanvasRenderingContext2D,
  samples: number[],
  width: number,
  height: number
) => {
  const barWidth = width / samples.length;
  const centerY = height / 2;

  ctx.fillStyle = '#3b82f6'; // Blue bars (Tailwind blue-500)

  samples.forEach((amplitude, i) => {
    const x = i * barWidth;
    const barHeight = Math.max(1, amplitude * centerY); // Ensure minimum 1px height

    // Draw bar centered vertically
    ctx.fillRect(x, centerY - barHeight, barWidth - 1, barHeight * 2);
  });
};
Guard against empty samples array.
If samples is empty, barWidth becomes Infinity (division by zero), and the forEach will be a no-op anyway. Add an early return for clarity and to prevent potential issues.
🐛 Suggested fix
  const drawWaveform = (
    ctx: CanvasRenderingContext2D,
    samples: number[],
    width: number,
    height: number
  ) => {
+   if (samples.length === 0) return;
+
    const barWidth = width / samples.length;
    const centerY = height / 2;

🤖 Prompt for AI Agents
In `@backends/advanced/webui/src/components/audio/WaveformDisplay.tsx` around
lines 77 - 95, The drawWaveform function can divide by zero when samples is
empty; add an early guard at the start of drawWaveform that returns immediately
if !samples || samples.length === 0 to avoid computing barWidth = width /
samples.length and any downstream issues; update the function (drawWaveform) to
perform this check before computing barWidth/centerY and before using
samples.forEach.
SDK Upload Respects Backend File Size Limit
    [Documentation]    Verify SDK properly reports backend errors for oversized files
    [Tags]    audio-upload

    # Note: This tests that SDK handles backend rejection gracefully
    # The 30-minute limit is enforced by the backend, not the SDK
    # Full test would require a 30+ minute audio file

    ${result}=    Run Process    uv    run    python
    ...    ${CURDIR}/../scripts/sdk_test_auth.py
    ...    ${BACKEND_URL}    ${ADMIN_EMAIL}    ${ADMIN_PASSWORD}
    Should Be Equal As Integers    ${result.rc}    0    SDK should handle backend errors gracefully
Test case doesn't match its documented purpose.
This test claims to verify "SDK properly reports backend errors for oversized files" but actually just re-runs the authentication script. Consider either:
- Skipping this test like other unimplemented features
- Renaming it to reflect what it actually tests
Proposed fix - skip until properly implemented
SDK Upload Respects Backend File Size Limit
[Documentation] Verify SDK properly reports backend errors for oversized files
[Tags] audio-upload
-
- # Note: This tests that SDK handles backend rejection gracefully
- # The 30-minute limit is enforced by the backend, not the SDK
- # Full test would require a 30+ minute audio file
-
- ${result}= Run Process uv run python
- ... ${CURDIR}/../scripts/sdk_test_auth.py
- ... ${BACKEND_URL} ${ADMIN_EMAIL} ${ADMIN_PASSWORD}
- Should Be Equal As Integers ${result.rc} 0 SDK should handle backend errors gracefully
+     Skip    Test requires 30+ minute audio file - not implemented

📝 Committable suggestion
  SDK Upload Respects Backend File Size Limit
      [Documentation]    Verify SDK properly reports backend errors for oversized files
      [Tags]    audio-upload
-     # Note: This tests that SDK handles backend rejection gracefully
-     # The 30-minute limit is enforced by the backend, not the SDK
-     # Full test would require a 30+ minute audio file
-     ${result}=    Run Process    uv    run    python
-     ...    ${CURDIR}/../scripts/sdk_test_auth.py
-     ...    ${BACKEND_URL}    ${ADMIN_EMAIL}    ${ADMIN_PASSWORD}
-     Should Be Equal As Integers    ${result.rc}    0    SDK should handle backend errors gracefully
+     Skip    Test requires 30+ minute audio file - not implemented
🤖 Prompt for AI Agents
In `@tests/integration/sdk_tests.robot` around lines 59 - 71, The test "SDK Upload
Respects Backend File Size Limit" currently only re-runs the authentication
script (Run Process ... ${CURDIR}/../scripts/sdk_test_auth.py) and doesn't
exercise upload/oversize behavior; either mark it as skipped until implemented
or rename it to reflect it's a placeholder: add a skip tag (e.g., add "skip" to
the [Tags] line) and update the test name or [Documentation] to indicate it's a
placeholder that re-runs sdk_test_auth.py, or alternatively change the test name
to something like "Placeholder: Re-run SDK auth script" so the intent and
behavior match.
Get Job Result
    [Documentation]    Get the result field of a completed job
    ...    Useful for checking job output/return values
    [Arguments]    ${job_id}

    # Get full job details
    ${response}=    GET On Session    api    /api/queue/job/${job_id}
    ...    expected_status=200

    ${job_data}=    Set Variable    ${response.json()}
    ${result}=    Set Variable    ${job_data}[result]

    RETURN    ${result}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for queue API route definitions
# First, let's find the structure of the repository
echo "=== Repository structure ==="
find . -type f -name "*.py" | head -20
echo ""
echo "=== Searching for API route definitions with 'queue' and 'job' ==="
rg -n --type py '/api/queue/job' | head -20
echo ""
echo "=== Searching for route decorators with queue and job endpoints ==="
rg -n --type py '@.*route.*queue.*job|@.*get.*queue.*job|@.*post.*queue.*job' -C3
echo ""
echo "=== Searching for 'queue' in route definitions ==="
rg -n --type py '/queue/jobs?/' | head -30

Repository: SimpleOpenSoftware/chronicle
Length of output: 2106
🏁 Script executed:
#!/bin/bash
# Search for queue controller and router definitions
echo "=== Queue Controller File ==="
cat -n backends/advanced/src/advanced_omi_backend/controllers/queue_controller.py | head -100
echo ""
echo "=== Queue-related files in routers ==="
find . -type f -name "*.py" -path "*/routers/*" | xargs grep -l "queue" 2>/dev/null
echo ""
echo "=== Main router setup ==="
find . -type f -name "*router*.py" -o -name "routes.py" | head -10

Repository: SimpleOpenSoftware/chronicle
Length of output: 4783
🏁 Script executed:
#!/bin/bash
# Examine queue routes for endpoint definitions
echo "=== Queue Routes File ==="
wc -l backends/advanced/src/advanced_omi_backend/routers/modules/queue_routes.py
echo ""
echo "=== Complete Queue Routes File ==="
cat -n backends/advanced/src/advanced_omi_backend/routers/modules/queue_routes.py

Repository: SimpleOpenSoftware/chronicle
Length of output: 50384
🏁 Script executed:
#!/bin/bash
# Check the actual endpoint in the test file
echo "=== Checking lines 330-342 in test file ==="
sed -n '330,342p' tests/resources/queue_keywords.robot

Repository: SimpleOpenSoftware/chronicle
Length of output: 554
Fix endpoint path: change /api/queue/job/ to /api/queue/jobs/
The endpoint uses the singular form /api/queue/job/${job_id}, but the API backend only defines endpoints with the plural form /api/queue/jobs/{job_id}. This mismatch will cause a 404 error at runtime. All other job-specific endpoints in this file correctly use the plural form.
Verified Fix
# Get full job details
- ${response}= GET On Session api /api/queue/job/${job_id}
+ ${response}= GET On Session api /api/queue/jobs/${job_id}
  ...    expected_status=200

🤖 Prompt for AI Agents
In `@tests/resources/queue_keywords.robot` around lines 330 - 342, The Robot
keyword "Get Job Result" is calling the wrong endpoint path
"/api/queue/job/${job_id}" which should be plural; update the GET call in that
keyword to use "/api/queue/jobs/${job_id}" so it matches the backend route and
will return 200; ensure the GET On Session invocation and any related variables
(${response}, ${job_data}, ${result}) remain unchanged.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation