
Bedrock Invoke class for invoking models without Converse API (e.g. Bedrock Custom Model Import)#1217

Open
dgallitelli wants to merge 1 commit into strands-agents:main from dgallitelli:main

Conversation

@dgallitelli
Contributor

@dgallitelli dgallitelli commented Nov 19, 2025

Description

This PR adds a new BedrockModelInvoke class that uses AWS Bedrock's native InvokeModel and InvokeModelWithResponseStream APIs instead of the Converse/ConverseStream APIs used by the existing BedrockModel class. This is particularly useful for supporting Bedrock Custom Model Import.

Key Features:

  • Broader Model Support: Works with Bedrock models that don't support Converse APIs, including imported models
  • Multiple Format Support: Handles both Anthropic Messages API format and OpenAI ChatCompletion format
  • Streaming Support: Processes streaming responses from InvokeModelWithResponseStream
  • Tool Integration: Converts Strands tool specs to model-specific formats (OpenAI functions, Anthropic tools)
  • Error Handling: Maps Bedrock-specific errors to appropriate Strands exceptions

Use Case:
This implementation is particularly useful for imported models (via Bedrock's Custom Model Import feature) that may only support native InvokeModel APIs and expect specific request formats like OpenAI ChatCompletion.

Related Issues

N/A

Documentation PR

N/A

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dgallitelli
Contributor Author

Note: hatch run prepare fails with the error: T-strings are only supported in Python 3.14 and greater [syntax]. However, my code does not use t-strings. Please verify whether someone else introduced this dependency.

@dgallitelli dgallitelli changed the title Bedrock Invoke class for invoking models without Converse API Bedrock Invoke class for invoking models without Converse API (e.g. Bedrock Custom Model Import) Nov 19, 2025
@westonbrown
Contributor

Do we have any update on when this PR will be merged?

This will unlock the new Opus 4.5 features in Bedrock, like tool search and lazy loading of tools.

Currently that's not possible, since those features are only supported via the Invoke API.

@sirianni

+1 any update from the Strands team on this?

@dgallitelli
Contributor Author

Updates since original submission

Rebased onto current main and fixed several issues found during review. Here's a summary:

Rebase / merge conflict (src/strands/models/__init__.py)

main refactored the models package to use lazy loading via __getattr__ and added CacheConfig. The conflict was resolved by keeping BedrockModelInvoke as an eager import alongside BedrockModel (both depend only on boto3, which is already a required dependency), while preserving all lazy-loaded providers and CacheConfig from main.

Bug fixes (src/strands/models/bedrock_invoke.py)

1. _format_request always used OpenAI format (functional bug)
The method called _format_openai_request unconditionally despite _get_model_family() existing. Native Anthropic model IDs (e.g. anthropic.claude-*) would have received an OpenAI-formatted request body, causing API errors. Fixed to route to _format_anthropic_request when the model family is "anthropic".

2. _extract_usage_from_response had an unreachable branch (functional bug)
The OpenAI-format usage block (prompt_tokens / completion_tokens) was under elif "usage" in response_body — the same key as the if above it, so it was never reachable. OpenAI-format token counts were silently dropped. Fixed by branching on the field names inside usage instead (input_tokens vs prompt_tokens).

3. _stream variable shadowed the streaming module import
streaming = self.config.get("streaming", True) shadowed the from ..event_loop import streaming used later in structured_output. Renamed to use_streaming.

4. Image bytes not base64-encoded (functional bug)
_format_anthropic_request was passing raw bytes from image_data["source"]["bytes"] directly as the data field. The Anthropic InvokeModel API requires a base64-encoded string. Fixed to base64.b64encode(...).decode("utf-8").
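
The field-name dispatch described in fix 2 can be sketched standalone as follows. This is an illustrative sketch, not the exact method from the PR: the helper name and the Converse-style return shape are assumptions; only the branching rule (dispatch on input_tokens vs prompt_tokens inside usage) comes from the fix itself.

```python
from typing import Any


def extract_usage(response_body: dict[str, Any]) -> dict[str, int]:
    """Normalize token usage from either wire format into Converse-style keys.

    Branches on the field names inside ``usage`` rather than on the presence
    of the ``usage`` key itself, since both formats share that outer key.
    """
    usage = response_body.get("usage", {})
    if "input_tokens" in usage:  # Anthropic Messages format
        input_tokens = usage["input_tokens"]
        output_tokens = usage["output_tokens"]
    elif "prompt_tokens" in usage:  # OpenAI ChatCompletion format
        input_tokens = usage["prompt_tokens"]
        output_tokens = usage["completion_tokens"]
    else:  # no usage reported
        input_tokens = output_tokens = 0
    return {
        "inputTokens": input_tokens,
        "outputTokens": output_tokens,
        "totalTokens": input_tokens + output_tokens,
    }
```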

mypy fixes

hatch run prepare (format + lint + mypy + pytest) now passes cleanly across Python 3.10–3.14 with 2386 tests passing.

  • Added # type: ignore[override] on update_config (same pattern as BedrockModel)
  • Annotated request: dict[str, Any] and content: list[dict[str, Any]] to allow mixed-value nested structures
  • Annotated events: list[StreamEvent] in _parse_anthropic_response
  • Fixed implicit Optional on response_body parameter
  • Added explicit str() cast in _extract_text_from_response

@dgallitelli
Contributor Author

Pushed a major rework of this PR (force-push, since the head ref is main on the fork). Single squashed commit: feat(bedrock): add BedrockModelInvoke for InvokeModel-only models.

What changed since the last review state

Rebased onto current main (the previous head was 91 commits behind / 4 ahead of merge-base; merge state was DIRTY).

Aligned with current SDK contracts:

  • BedrockInvokeConfig now subclasses BaseModelConfig so context_window_limit flows through (feat: add context_window_limit to model configs #2176).
  • Default model_id bumped to global.anthropic.claude-sonnet-4-6 (imported from bedrock.py) instead of the dated 3.5-sonnet-20241022 id (fix(bedrock): upgrade default model to Claude Sonnet 4.5 #2193).
  • __init__.py switched from eager to lazy __getattr__ import to match the lazy-load pattern used by every other provider.
  • Reuses BEDROCK_CONTEXT_WINDOW_OVERFLOW_MESSAGES, DEFAULT_BEDROCK_MODEL_ID, DEFAULT_BEDROCK_REGION, DEFAULT_READ_TIMEOUT from bedrock.py instead of redefining.

Streaming protocol (this is the bulk of the diff):

  • Anthropic Messages stream is fully translated: message_start → messageStart, content_block_start (text + tool_use) → contentBlockStart, content_block_delta (text and input_json_delta.partial_json) → contentBlockDelta, content_block_stop → contentBlockStop, message_delta.stop_reason/usage → messageStop + metadata.
  • OpenAI Chat Completions stream is translated symmetrically, including delta.tool_calls[] keyed by index (lazy-emit contentBlockStart once id/name is observed; concatenate arguments deltas as toolUse.input).
  • Stop reasons are mapped from each backend's vocabulary to the Strands set (tool_use, max_tokens, stop_sequence, end_turn) instead of being hard-coded to end_turn. This mirrors the fix from fix: override end_turn stop reason when streaming response contains toolUse blocks #1827 — without it, structured_output() and tool execution were unreachable on the streaming path.
  • Non-streaming responses go through the same translation, so tool use works in both modes.
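
The stop-reason normalization described above can be sketched as two lookup tables keyed by model family. The exact tables in the PR may cover more values; this shows the mapping rule, with unknown reasons falling back to end_turn:

```python
# Sketch of stop-reason translation into the Strands vocabulary
# (tool_use, max_tokens, stop_sequence, end_turn).
_ANTHROPIC_STOP_REASONS = {
    "end_turn": "end_turn",
    "max_tokens": "max_tokens",
    "stop_sequence": "stop_sequence",
    "tool_use": "tool_use",
}
_OPENAI_STOP_REASONS = {
    "stop": "end_turn",
    "length": "max_tokens",
    "tool_calls": "tool_use",
}


def map_stop_reason(raw: str, family: str) -> str:
    """Translate a backend finish/stop reason into the Strands set."""
    table = _ANTHROPIC_STOP_REASONS if family == "anthropic" else _OPENAI_STOP_REASONS
    return table.get(raw, "end_turn")
```

Without this mapping, every streamed response would report end_turn, which is why tool execution and structured_output() were unreachable on the streaming path.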

Tool use & ToolChoice:

  • ToolChoice is now translated and forwarded to both backends ({"type": "auto"|"any"|"tool"} for Anthropic, "auto"/"required"/{"type": "function", ...} for OpenAI). Previously the parameter was accepted and silently dropped, which made structured_output non-functional.
  • OpenAI request formatter now passes toolUse and toolResult blocks through (tool_calls on assistant turns, separate role: "tool" messages for results) instead of flattening multi-block messages to a single string.
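
The per-backend ToolChoice translation can be sketched as a single function (mirroring the merged _to_tool_choice helper; the function signature here is illustrative). Input is the Converse-style ToolChoice, which is one of {"auto": {}}, {"any": {}}, or {"tool": {"name": ...}}:

```python
from typing import Any, Optional


def to_tool_choice(tool_choice: Optional[dict[str, Any]], family: str) -> Any:
    """Translate a Converse-style ToolChoice into each backend's shape."""
    if tool_choice is None:
        return None
    if "auto" in tool_choice:
        return {"type": "auto"} if family == "anthropic" else "auto"
    if "any" in tool_choice:
        # "any"/"required" forces the model to call some tool.
        return {"type": "any"} if family == "anthropic" else "required"
    if "tool" in tool_choice:
        # Force a specific named tool.
        name = tool_choice["tool"]["name"]
        if family == "anthropic":
            return {"type": "tool", "name": name}
        return {"type": "function", "function": {"name": name}}
    return None
```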

Other correctness fixes:

  • Image media type now produced as image/<format> (Anthropic API requires the prefixed form).
  • Unknown model ids default to the OpenAI/imported-model format rather than Anthropic — the headline use case for this provider per the description. The behavior can be overridden with the new model_family config key.
  • _extract_text_from_response no longer falls back to json.dumps(response) (could leak full response bodies into assistant text).
  • tool_result.is_error set when status == "error" so error semantics survive the round trip.
  • Dropped the unused guardrail_id/guardrail_version config keys; they were declared but never wired into the Bedrock call.
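
The default-to-OpenAI behavior with a model_family override can be sketched roughly as below. The detection heuristic shown (substring match on the model id) is an assumption for illustration; only the default direction and the override key come from the description above:

```python
from typing import Optional


def get_model_family(model_id: str, override: Optional[str] = None) -> str:
    """Pick the wire format for a model id.

    Unknown ids (e.g. Custom Model Import ARNs) default to the
    OpenAI/imported-model format; the ``model_family`` config key wins.
    """
    if override is not None:
        return override
    if "anthropic" in model_id:
        return "anthropic"
    return "openai"
```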

Tests:

  • 24 unit tests (up from 8), now covering: init/default model id, family detection (parametrised) and override, request formatting for both families with image/tool_use/toolResult/tool_choice, full streaming traces for Anthropic text, Anthropic tool use, OpenAI text + tool use (with usage), non-streaming Anthropic, throttling and context-window-overflow error mapping, and end-to-end structured_output returning a Pydantic model.
  • Integration tests converted from @pytest.mark.skip to runnable. The imported-model test is now gated on STRANDS_BEDROCK_INVOKE_IMPORTED_MODEL_ARN since ARNs are account-specific.

Local verification

hatch run test-lint          → ruff check ✓ / mypy ✓
hatch test tests/strands     → 2974 passed
hatch test tests/strands/models/test_bedrock_invoke.py → 24 passed

About the still-failing checks

The three failing checks (Auto Strands Review / Trigger Strands Review, Secure Integration test / check-access-and-checkout, Secure Integration test / upload-metrics) are all gated on the pull_request_target protected-environment approval flow — they hit GitHub's 30-day waiting-job timeout (720h0m0s) without a maintainer clicking Approve, and the upload-metrics job then fails because the artifact from the unrun integ-tests step doesn't exist (if: always()). Could one of you kick those off?

cc @pgrayy @Unshure @zastrowm — would appreciate another look. Happy to split this into smaller commits if that's easier to review.

Adds a new ``BedrockModelInvoke`` provider that talks to Bedrock's native
``InvokeModel``/``InvokeModelWithResponseStream`` APIs instead of
``Converse``/``ConverseStream``. This makes Strands usable with models that
do not support Converse, most notably Bedrock Custom Model Import (Llama,
Mistral, Qwen, ...) and Anthropic models accessed via the Messages API.

Supports both Anthropic Messages and OpenAI Chat Completions request/response
formats; the wire format is auto-detected from the model id and can be
overridden via the ``model_family`` config key.

Streaming is fully wired through the Bedrock Converse-shaped event contract:
``messageStart``, ``contentBlockStart``/``contentBlockDelta``/``contentBlockStop``
for both text and tool-use blocks, ``messageStop`` with the mapped stop
reason, and a ``metadata`` event carrying token usage. Non-streaming responses
go through the same translation. Tool use, structured output, image inputs
(Anthropic family), tool results, ``ToolChoice``, and the standard Bedrock
error paths (throttling, context-window overflow, access-denied) are covered.

The provider is exposed via lazy ``__getattr__`` import to keep package
import time unchanged.

Tests: 24 unit tests covering init, family detection, request formatting for
both families, Anthropic and OpenAI streaming paths (text and tool use),
non-streaming path, error mapping, and structured output. The integration
test suite is converted from ``@pytest.mark.skip`` to runnable tests; the
imported-model test is gated on the ``STRANDS_BEDROCK_INVOKE_IMPORTED_MODEL_ARN``
environment variable since ARNs are account-specific.
@dgallitelli
Contributor Author

Trimmed and force-pushed (commit ac804f9) to pass the PR Size Labeler size/xl gate. Diff is now 989 lines (was 1466), all under the 1000-line threshold; size label is back on size/m.

What I cut to get there (no behavior or test-coverage loss):

  • Squashed the per-callback inline event dicts in the streaming/non-streaming emitters behind small _text_start/_tool_use_start/_tool_use_delta/_metadata/_BLOCK_STOP/_TEXT_START helpers, so the wire shape lives in one place.
  • Merged the two tool-choice translators into a single _to_tool_choice(tool_choice, family) (was _to_anthropic_tool_choice + _to_openai_tool_choice).
  • Tightened the BedrockInvokeConfig and class-level docstrings (kept the Attributes elsewhere; the field names + types speak for themselves).
  • Module-scoped pytestmark = pytest.mark.usefixtures("bedrock_client") removed the boilerplate _ = bedrock_client lines from format-only tests.
  • Inlined a few obvious local helpers, collapsed multi-line dict literals where they fit at 120 chars.
  • Dropped the test_bedrock_invoke_configuration integ test — it was a duplicate of unit coverage.

Verification: hatch run test-lint ✓ (ruff + mypy), hatch test tests/strands ✓ (2974 passed), and the same 24-test bedrock_invoke suite still passes (streaming text + tool use for both families, non-streaming, structured output, error mapping).

The three remaining red checks are still the protected-environment ones — same situation as before, only a maintainer can release them. cc @pgrayy @Unshure @zastrowm.

