-
Notifications
You must be signed in to change notification settings - Fork 427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(llmobs): trace text-based bedrock converse api #12560
base: main
Are you sure you want to change the base?
Conversation
|
BenchmarksBenchmark execution time: 2025-03-12 00:02:27 Comparing candidate commit ef243ab in PR branch Found 0 performance improvements and 0 performance regressions! Performance is the same for 468 metrics, 2 unstable metrics. |
Datadog ReportBranch report: ✅ 0 Failed, 43 Passed, 290 Skipped, 49.39s Total duration (5m 5.68s time saved) |
…aude-code-converse-api
continue | ||
role = str(p.get("role", "")) | ||
content = p.get("content", "") | ||
if isinstance(content, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if content is a string? Doesn't seem like we append it anywhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the bedrock converse spec states content
is always a list of ContentBlock
objects (what we parse inside this if block)
https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Message.html
i'll add a comment to clarify this
message = response.get("output", {}).get("message", {}) | ||
role = message.get("role", "assistant") | ||
tool_calls_info = [] | ||
if message.get("content") and isinstance(message["content"], list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can content be a string or a non-list type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i do not think so; bedrock docs state content is always a list of "ContentBlock" types: https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ContentBlock.html
and this seems to be consistent with the earliest version of botocore that supports bedrock converse
"name": content_block.get("toolUse", {}).get("name", ""), | ||
"arguments": content_block.get("toolUse", {}).get("input", ""), | ||
"tool_id": content_block.get("toolUse", {}).get("toolUseId", ""), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these values always guaranteed to be a string? should we be defensive/cast to string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name
and tool_id
are (i'll cast to string), but arguments
isn't - thanks for the catch, ill default arguments to {}
""" | ||
Sets LLM usage metrics in the context for LLM Observability. | ||
""" | ||
llmobs_usage = {} | ||
if input_tokens: | ||
if input_tokens is not None and input_tokens != "": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is checking against None and empty string not covered by if input_tokens
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we want to differentiate token counts being 0 vs not present, which i guess is rare but valid. this is differentiate is needed since converse returns tokens in a usage
field with integer values
for content_block in content: | ||
if content_block.get("text") and isinstance(content_block.get("text"), str): | ||
content_blocks.append(content_block.get("text", "")) | ||
elif content_block.get("toolUse") and isinstance(content_block.get("toolUse"), dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a content block contain multiple objects? i.e. should we always check each condition instead of if/elif/else
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tool_calls_info.append( | ||
{ | ||
"name": str(toolUse.get("name", "")), | ||
"arguments": toolUse.get("input", {}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should json dump this value if it's always json serializable, unless we need to pass it in as an actual dict (non-string) to be properly visualized in our UI. Correct me if I'm wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe if we dump it here, and then also call safe_json
when we encode, the arguments will show up as a raw json string in the UI instead of the whole thing being having nice json formatting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm yea actually we'll drop the output if this isn't a dict since that's how we decode tool calls in the backend
This PR supports instrumenting LLM spans for bedrock's
Converse
method. This PR does not touchConverseStream
, but we document it’s behavior intest_llmobs_converse_stream
.Its helpful to review the bedrock request syntax, and the response syntax
Example bedrock code snippet:
Manual QA
Example with tool calls
Example without tool calls
Data this PR traces
system
roleuser
roleassistant
rolemax_tokens
andtemperature
stop_reason
Implementation details:
We register a separate trace handler for processing bedrock converse responses.
core.on("botocore.bedrock.process_response_converse", _on_botocore_bedrock_process_response_converse)
This is to avoid the code-path that does extra post-processing of invoke model responses before it's ready for
llmobs_set_tags
.Converse still relies on the same trace handler for processing 1) request input 2) bedrock exceptions.
Cassettes
I chose to use cassettes since there were some difficulties with mocking out the bedrock calls with respx. There are some authentication steps that happen within the botocore library before the mocked LLM call, leading me to run into errors like:
This means we needed to mock out or find a way to skip the internal authentication steps, which would cause the test to be dependent on non-bedrock parts of the botocore library which may be subject to change. In my opinion, this makes cassettes the better option.
To Do
Checklist
Reviewer Checklist