
feat(gemini): plumb through cache tokens in metadata events #2287

Open
yatszhash wants to merge 1 commit into strands-agents:main from yatszhash:feat/gemini-cache-tokens

Conversation


@yatszhash yatszhash commented May 13, 2026

Motivation

When the Gemini model returns usage_metadata.cached_content_token_count, GeminiModel currently discards it. Users have no way to see whether their requests benefit from Gemini's implicit (or explicit) caching, visibility that is valuable for cost optimization and debugging.

The bidi Gemini provider (experimental/bidi/models/gemini_live.py) already plumbs this through; the production GeminiModel is the missing piece. The OpenAI provider received the same treatment in #2116.

Public API Changes

No public API changes. The metadata event emitted by GeminiModel._format_chunk now includes cacheReadInputTokens in the usage data when Gemini reports cached prompt tokens:

# Before: metadata event usage
{"inputTokens": 18625, "outputTokens": 188, "totalTokens": 18813}

# After: metadata event usage (when cache hit occurs)
{"inputTokens": 18625, "outputTokens": 188, "totalTokens": 18813, "cacheReadInputTokens": 18010}

When cached_content_token_count is None or 0, the field is omitted — preserving backward compatibility and matching the convention established by the OpenAI provider (#2116). The existing telemetry pipeline (tracer and metrics) already handles cacheReadInputTokens, so cache data flows through automatically.

Only cacheReadInputTokens is set because Gemini's usage_metadata does not expose a cache write token equivalent (consistent with the OpenAI provider).
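For reference, the mapping is conceptually as follows. This is a minimal sketch, not the literal diff: the helper name and overall shape are illustrative, the field names on usage_metadata are Gemini's, and only the cacheReadInputTokens key and the cached_content_token_count source field are defined by this PR.

# Sketch only: illustrative helper, not GeminiModel's actual internals
def _usage_from_gemini_metadata(usage_metadata) -> dict:
    usage = {
        "inputTokens": usage_metadata.prompt_token_count,
        "outputTokens": usage_metadata.candidates_token_count,
        "totalTokens": usage_metadata.total_token_count,
    }
    cached = getattr(usage_metadata, "cached_content_token_count", None)
    if cached:  # omit the key when None or 0, keeping the pre-patch payload unchanged
        usage["cacheReadInputTokens"] = cached
    return usage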

Related Issues

Relates to #1060 (Add explicit context caching support for Gemini models) — this PR addresses only the visibility portion of that issue. The explicit cache lifecycle APIs proposed there (enable_caching, cache_ttl, create_cache(), delete_cache()) are a much larger surface and remain open under #1060 for a follow-up.

Also relates to #1140 (Caching support for all models).

Documentation PR

Not required — no public API changes.

Type of Change

New feature

Testing

Unit tests added in tests/strands/models/test_gemini.py:

  • test_format_chunk_metadata_with_cache_tokens: cached_content_token_count=25 → metadata exposes cacheReadInputTokens=25.
  • test_format_chunk_metadata_with_zero_cached_tokens: cached_content_token_count=0 → cacheReadInputTokens is omitted.

The "field unset" case (Gemini's default response when no cache) is implicitly covered by every existing test_stream_response_* test that does not set cached_content_token_count.

Manually verified against Vertex AI Gemini 2.5 Flash: a second request with an identical ~18k-token prefix reports cacheReadInputTokens=18010 where the pre-patch code reported no cache field. Accumulation through EventLoopMetrics was also confirmed end-to-end (cumulative cacheReadInputTokens matches the sum of per-call values).

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Surface cached_content_token_count from usage_metadata as
cacheReadInputTokens on the metadata event emitted by GeminiModel.
The existing telemetry pipeline picks it up automatically.

Relates to strands-agents#1060, strands-agents#1140.