
Conversation

@DenysMoskalenko
Contributor

Summary

Implements AWS Bedrock prompt caching support (see #3418) by fixing how cache points are sent, documenting the workflow, and extending test coverage to assert cache writes and reads.
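
For context, a hedged usage sketch of what this enables, assuming the same CachePoint marker already used for Anthropic prompt caching; the Bedrock model ID is illustrative:

```python
from pydantic_ai import Agent
from pydantic_ai.messages import CachePoint

agent = Agent('bedrock:us.anthropic.claude-sonnet-4-20250514-v1:0')  # illustrative model ID

result = agent.run_sync([
    'A long, stable document worth caching across runs...',
    CachePoint(),  # content before this marker is written to / read from the prompt cache
    'The actual question for this run.',
])
print(result.output)
```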

Testing

  • uv run pytest tests/models/test_bedrock.py
  • uv run coverage run -m pytest tests/models/test_bedrock.py

Contributor Author


@DouweM It's mostly a duplication of the same documentation we have for the Anthropic CachePoint. What do you think? Maybe we need to move it somewhere shared?

@DenysMoskalenko force-pushed the feature/add_anthropick_prompt_caching_on_bedrock branch from 5263d8a to 6612939 on November 15, 2025 15:55
@DenysMoskalenko
Contributor Author

@DouweM Is there any chance to continue with this PR? We need this feature a lot 🙏.

I read #3453, but I think we can add Bedrock support in the same way and make changes in both places later if needed, instead of just ignoring Bedrock users. What do you think?

@DouweM
Collaborator

DouweM commented Nov 18, 2025

@DenysMoskalenko Thanks for working on this, Denys!

I think we can add Bedrock support in the same way and make changes in both places later if needed, instead of just ignoring Bedrock users. What do you think?

Agreed. Can you please have a look at these issues and address them in case they affect this implementation as well?

@DenysMoskalenko
Contributor Author

DenysMoskalenko commented Nov 19, 2025

Sure:

  1. About the TTL (https://github.com/pydantic/pydantic-ai/pull/3450): we cannot set the TTL for AWS Bedrock's prompt cache; it's fixed at a 5-minute sliding window that resets with each successful cache hit (see https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html#prompt-caching-overview).
  2. About CachePoint stripping. @DouweM Do you think we really need to do this work? The limit of 4 doesn't seem stable (the docs currently show 4 everywhere, but it's in a table, so it might change). I thought the initial idea was to rely on the user: if they hit the "maximum 4 CachePoints" error, it's on them to use fewer than 4 in their code (the same goes for the minimum token count per cache point, which depends on the model and won't work for small inputs). An explicit error is better than implicit magic; in my opinion we should show the user that they are doing something wrong. Anyway, if you require this change, do you think it should be part of this PR? (A sketch of the stripping idea follows this list.)
  3. I don’t think there are any special considerations. I’ll update the tests and re-record the cassette for AWS Nova to confirm everything works correctly.
  4. Same as above.
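
For context, a minimal sketch of the stripping idea from point 2, assuming the CachePoint marker and UserContent alias from pydantic_ai.messages; the helper name and the hard-coded limit are illustrative, not the actual implementation landing in #3442:

```python
from pydantic_ai.messages import CachePoint, UserContent

MAX_CACHE_POINTS = 4  # Bedrock's currently documented limit

def strip_excess_cache_points(content: list[UserContent]) -> list[UserContent]:
    """Drop the oldest CachePoint markers so at most MAX_CACHE_POINTS remain."""
    positions = [i for i, part in enumerate(content) if isinstance(part, CachePoint)]
    excess = set(positions[:-MAX_CACHE_POINTS]) if len(positions) > MAX_CACHE_POINTS else set()
    return [part for i, part in enumerate(content) if i not in excess]
```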

  - Emit cache-point tool entries so Bedrock accepts cached tool definitions
  - Document and test prompt caching (writes + reads) with cassette-body checks
  - Refresh Bedrock cassettes and type annotations to align with the new flow
@DenysMoskalenko force-pushed the feature/add_anthropick_prompt_caching_on_bedrock branch from 783607c to 900d542 on November 19, 2025 10:59
@DenysMoskalenko
Contributor Author

DenysMoskalenko commented Nov 19, 2025

@DouweM
I’ve run tests using AWS Nova models, and prompt caching works well for both user and system prompts. I’ve recorded the cassette for that.

The limitation: it doesn't support tool caching, as specified in the documentation (https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html#prompt-caching-models). In my opinion, it's fine to raise an error when a user tries to do something Bedrock doesn't allow.
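
For reference, a hedged sketch of the raw Converse API payload this maps to, based on the AWS docs linked above; the model ID is illustrative, and boto3 is assumed:

```python
import boto3

client = boto3.client('bedrock-runtime')
response = client.converse(
    modelId='us.amazon.nova-pro-v1:0',  # illustrative Nova model ID
    system=[
        {'text': 'Long, stable system instructions...'},
        {'cachePoint': {'type': 'default'}},  # caches the system prompt prefix
    ],
    messages=[{
        'role': 'user',
        'content': [
            {'text': 'Large document reused across requests...'},
            {'cachePoint': {'type': 'default'}},  # caches the user content prefix
            {'text': 'The actual question.'},
        ],
    }],
)
# A cachePoint entry inside toolConfig.tools is the part Nova rejects per the docs above.
```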

@DouweM
Collaborator

DouweM commented Nov 19, 2025

  • About CachePoint stripping. @DouweM Do you think we really need to do this work? The limit of 4 doesn't seem stable (the docs currently show 4 everywhere, but it's in a table, so it might change). I thought the initial idea was to rely on the user: if they hit the "maximum 4 CachePoints" error, it's on them to use fewer than 4 in their code (the same goes for the minimum token count per cache point, which depends on the model and won't work for small inputs). An explicit error is better than implicit magic; in my opinion we should show the user that they are doing something wrong.

@DenysMoskalenko If the number 4 changes or becomes model-specific we can add it to the model profile.

But I do think we should take care of staying under the limit, because it's not so easy for the user to do so themselves if there are CachePoints in the message history, for example. They could use a history processor to remove older ones, but that still wouldn't be able to know whether there were cache points on the tool defs or instructions. So it's not really "the user is doing something wrong and should fix their code", but "the way we implemented cache points makes it easier to run into this issue than to fix it yourself", so we should fix it.

In this case, "implicit magic" is a bit intentional, because as I wrote on #3453, the goal is for this to be useful to people who don't want to become experts on prompt caching and the limitations Anthropic enforces, not the more advanced users and use cases that need fine-grained control.

In any case, the CachePoint stripping is being implemented in #3442 which I'd expect to merge today or tomorrow, so I'd recommend we wait for that one to merge first, and then adopt the relevant changes here as well. (Likely with a new prompt caching doc, so we don't end up with a ton of duplication)

In my opinion, it’s fine to raise an error when a user tries to do something Bedrock doesn’t allow.

Usually with model settings, we silently ignore them if they're not supported (that's why most of them say "Supported by: ..." in the docstring), so I might prefer to say "Supported by: Anthropic on Bedrock", and then silently ignore it for Nova.

I agree raising errors when the user does something unsupported is usually good, but with model settings we typically do a "best effort" so that as many requests as possible succeed.
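
A minimal sketch of that best-effort behavior, assuming a hypothetical supports_prompt_caching profile flag (not an actual pydantic-ai field) and an illustrative mapping helper:

```python
from pydantic_ai.messages import CachePoint

def map_content_part(part, profile):
    """Best-effort mapping: drop CachePoint for models that don't support it."""
    if isinstance(part, CachePoint):
        if not getattr(profile, 'supports_prompt_caching', False):  # hypothetical flag
            return None  # silently ignore, like other unsupported model settings
        return {'cachePoint': {'type': 'default'}}  # Bedrock Converse cache block
    return {'text': part} if isinstance(part, str) else part
```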
