
Conversation

@devin-ai-integration
Contributor

fix: track token usage in litellm non-streaming and async calls

Summary

Fixes GitHub issue #4170 where get_token_usage_summary() was not returning accurate metrics when using litellm with non-streaming responses and async calls.

The root cause was that _track_token_usage_internal() was only being called in the sync streaming code path. This PR adds token tracking to:

  • _handle_non_streaming_response (sync non-streaming)
  • _ahandle_non_streaming_response (async non-streaming)
  • _ahandle_streaming_response (async streaming)
  • Both code paths in sync streaming (with and without tool calls)

Additionally, litellm returns usage as an object with attributes (e.g., usage.prompt_tokens), but _track_token_usage_internal() expects a dict. Added conversion logic to handle this.
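
A minimal sketch of that conversion, assuming litellm's usage object keeps its fields in `__dict__` (the helper name `_usage_to_dict` is illustrative; per the checklist below, the diff inlines this logic rather than naming a helper):

```python
def _usage_to_dict(usage_info):
    """Normalize litellm usage for _track_token_usage_internal.

    litellm returns an object with attributes (e.g. usage.prompt_tokens),
    while the tracker expects a mapping like {"prompt_tokens": ...}.
    """
    if hasattr(usage_info, "__dict__"):
        # Attribute-style object: unpack its instance dict into a plain dict.
        return dict(vars(usage_info))
    # Already a plain dict: pass through unchanged.
    return usage_info
```

Each handler can then track usage with something along the lines of `self._track_token_usage_internal(_usage_to_dict(response.usage))` once a response (or final stream chunk) carrying usage is available.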

Review & Testing Checklist for Human

  • Verify no double-counting: Check that token usage isn't tracked twice in any code path. The sync streaming method has two tracking calls (lines 941 and 977); verify these are mutually exclusive paths
  • Test with real litellm calls: The unit tests use mocks. Manually test with actual API calls to verify token metrics are populated correctly for streaming and async scenarios
  • Consider refactoring: The usage object-to-dict conversion is duplicated five times. Extracting it into a helper method (along the lines of the sketch above) would remove the duplication
  • Verify the hasattr(usage_info, "__dict__") check: this is what distinguishes objects from plain dicts; confirm it behaves correctly with all litellm response formats (a quick demonstration follows this list)
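
On the last point, a quick self-contained demonstration of what `hasattr(usage_info, "__dict__")` actually distinguishes (the `Usage` class below is a hypothetical stand-in, not litellm's):

```python
class Usage:
    """Hypothetical stand-in for litellm's attribute-style usage object."""
    def __init__(self):
        self.prompt_tokens = 10
        self.completion_tokens = 5

print(hasattr({"prompt_tokens": 10}, "__dict__"))  # False: plain dict instances carry no __dict__
print(hasattr(Usage(), "__dict__"))                # True: ordinary objects do

# Caveat for review: a dict *subclass* instance also has a __dict__ and would
# be mis-detected as an object, so checking against real litellm response
# formats (as the checklist asks) is worthwhile.
```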

Recommended Test Plan

  1. Create a simple script using LLM(model="gpt-4o-mini", is_litellm=True) with:
    • stream=False + call()
    • stream=False + acall()
    • stream=True + acall()
  2. After each call, verify llm.get_token_usage_summary() returns non-zero values (a minimal script sketch follows below)
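
A minimal version of that script, assuming crewai's `LLM` accepts a message list via `call()`/`acall()` and takes `stream` as a constructor argument; exact signatures may differ:

```python
import asyncio

from crewai import LLM

MESSAGES = [{"role": "user", "content": "Say hello in one word."}]


def report(label: str, llm: LLM) -> None:
    # After the fix, every scenario should report non-zero token counts.
    print(label, llm.get_token_usage_summary())


# stream=False + call()
llm = LLM(model="gpt-4o-mini", is_litellm=True, stream=False)
llm.call(MESSAGES)
report("sync non-streaming:", llm)

# stream=False + acall()
llm = LLM(model="gpt-4o-mini", is_litellm=True, stream=False)
asyncio.run(llm.acall(MESSAGES))
report("async non-streaming:", llm)

# stream=True + acall()
llm = LLM(model="gpt-4o-mini", is_litellm=True, stream=True)
asyncio.run(llm.acall(MESSAGES))
report("async streaming:", llm)
```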

Notes

This fixes GitHub issue #4170 where token usage metrics were not being
updated when using litellm with non-streaming responses and async calls.

Changes:
- Add token usage tracking to _handle_non_streaming_response
- Add token usage tracking to _ahandle_non_streaming_response
- Add token usage tracking to _ahandle_streaming_response
- Fix sync streaming to track usage in both code paths
- Convert usage objects to dicts before passing to _track_token_usage_internal
- Add comprehensive tests for token usage tracking in all scenarios

Co-Authored-By: João <joao@crewai.com>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring
