Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AI] FunctionCallContent is not included in StreamingChatCompletionUpdate when KeepFunctionCallingMessages enabled #5909

Open
Gigabyte0x1337 opened this issue Feb 15, 2025 · 8 comments
Labels

Comments

@Gigabyte0x1337
Copy link

Description

When using Microsoft.Extensions.AI with function calling and streaming, the FunctionCallContent is not included, even with KeepFunctionCallingMessages enabled.

The reason I need this is to notify my front-end that a function call is busy. Of course, I could implement the whole thing myself, but what's the point of having it if it doesn't serve this purpose?

Reproduction Steps

Minimal example code to repreduce this.

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
	  <OutputType>Exe</OutputType>
	  <TargetFramework>net9.0</TargetFramework>
	  <ImplicitUsings>enable</ImplicitUsings>
	  <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.Extensions.AI" Version="9.1.0-preview.1.25064.3" />
    <PackageReference Include="Microsoft.Extensions.AI.Abstractions" Version="9.1.0-preview.1.25064.3" />
    <PackageReference Include="Microsoft.Extensions.AI.OpenAI" Version="9.1.0-preview.1.25064.3" />
    <PackageReference Include="Microsoft.Extensions.Options" Version="9.0.2" />
  </ItemGroup>

</Project>
var chatClient = new ChatClientBuilder(new OpenAIChatClient(new OpenAIClient(openAiKey), "gpt-4o"))
    .UseFunctionInvocation(null, client =>
    {
        client.KeepFunctionCallingMessages = true;
        client.RetryOnError = true;
        client.ConcurrentInvocation = true;
    })
    .Build();


var messageStream = chatClient.CompleteStreamingAsync([
    new ChatMessage(ChatRole.System, [new TextContent("You are a helpful assistant")]),
    new ChatMessage(ChatRole.User,
    [
        new TextContent(
            "Hello, can you call the test function, try `test` as input. Follow the instruction solve the riddle! No need for confirmation do it yourself!")
    ])
], new ChatOptions()
{
    Tools = [AIFunctionFactory.Create((string? password) => password != "The Goat" ? 
        "Call me again with `The Goat` and see what happens next!" : 
        "Pass this along will you, 'The goat has been milked'!", "test_function", "This is function to test function calling.")]
});


await foreach (var messageUpdate in messageStream)
{
    foreach (var messageUpdateContent in messageUpdate.Contents)
    {
        if (messageUpdateContent is TextContent textContent)
        {
            Console.Write(textContent.Text);
        }
        else if (messageUpdateContent is UsageContent usageContent)
        {
            // Ignore usage
        }
        else
        {
            if (Console.GetCursorPosition().Left != 0)
            {
                Console.WriteLine();
            }

            Console.WriteLine(
                $"MessageUpdate {messageUpdateContent.GetType().Name} {messageUpdate.GetMessageId()}, {messageUpdate.GetParentMessageId()} Text: {messageUpdate.Text}");
        }
    }
}

// Output: "The goat has been milked"!

Expected behavior

MessageUpdate with FunctionCallContent when KeepFunctionCallingMessages is enabled.

Actual behavior

FunctionCallContent, FunctionResultContent is not included in the updates

Regression?

No response

Known Workarounds

No response

Configuration

.net 9

Other information

No response

@Gigabyte0x1337 Gigabyte0x1337 added bug This issue describes a behavior which is not expected - a bug. untriaged labels Feb 15, 2025
@Gigabyte0x1337 Gigabyte0x1337 changed the title FunctionCallContent is not included in StreamingChatCompletionUpdate [AI] FunctionCallContent is not included in StreamingChatCompletionUpdate when KeepFunctionCallingMessages enabled Feb 15, 2025
@stephentoub stephentoub added area-ai and removed untriaged bug This issue describes a behavior which is not expected - a bug. labels Feb 15, 2025
@stephentoub
Copy link
Member

stephentoub commented Feb 15, 2025

This is by design, though of course we could revisit the design. The reason for it is called out here:

// We're going to emit all StreamingChatMessage items upstream, even ones that represent
// function calls, because a given StreamingChatMessage can contain other content, too.
// And if we yield the function calls, and the consumer adds all the content into a message
// that's then added into history, they'll end up with function call contents that aren't
// directly paired with function result contents, which may cause issues for some models
// when the history is later sent again.

Essentially, a client that takes all the content and just adds it back into the history will end up with an erroneous history that includes FunctionCallContent without corresponding FunctionResultContent, or possibly duplicated FunctionCallContents. So the implementation is removing them from the stream but including them in the history (when KeepFunctionCallingMessages is true, which is the default).

If we change that to stream the function call messages, we'd need to come up with some other workaround, and that can't just be telling developers not to include FRCs in the history. We have helpers like ToChatCompletion that will convert updates into complete messages, and such helpers shouldn't have baked in policy that excludes FunctionCallContent, because they have no knowledge of whether a particular FunctionInvokingChatClient is in use.

Off the top of my head, one fallback could be to add this as another knob on FunctionInvokingChatClient, e.g. a StreamFunctionCallContent property, that defaults to false but that you could set to true if you want that behavior.

Another option would be to yield the FunctionCallContent and then also yield the manufactured FunctionResultContent, either instead of or in addition to having it in the chat history. That way at least they'd still be paired up. I suspect some services would still balk, though, at ending up with the same tool call ID multiple times in the same history. We could explore other combinations, as well.

cc: @SteveSandersonMS in case he has other ideas

@stephentoub
Copy link
Member

Of course, I could implement the whole thing myself, but what's the point of having it if it doesn't serve this purpose?

For this part specifically, though, it sounds like you actually want much more control over the invocation. Presumably you want to know not just that a function has been requested, but you'd want to display in your UI details of the invocation, that it's currently in-flight, when it's ended, maybe even details about what was computed, and I don't see FunctionInvokingChatClient evolving to serve all possible such cases. At some point, for specific scenarios where you want more control, you'll likely want to process the FunctionCallContent yourself.

@Gigabyte0x1337
Copy link
Author

I appreciate your quick response.
In my view, the FunctionInvokingChatClient simply invokes the actions like a proxy. I expected it to still retain the full stream without removing the function calls because, without the FunctionInvokingChatClient, it just passes them through. Currently, I have everything custom—the entire stack from API call to frontend. I was hoping to refactor it using some existing libraries to improve maintainability.

Thank you anyway. I will see if I can implement my own solution then.

@SteveSandersonMS
Copy link
Member

In terms of performing UI updates, the way I prefer to approach it is having the UI state computed from the message history (i.e., the IList<ChatMessage> passed as the first arg to CompleteStreamingAsync).

As long as the UI knows to refresh when function calls are performed, it will have access to the FunctionCallContent and FunctionResultContent instances that are added into this message history, and can show them in the right place relative to regular user/assistant messages.

The reason for it is called out here: We're going to emit all StreamingChatMessage items upstream, even ones that represent function calls

I find this code comment really confusing. Is there a typo in here making it say the opposite of what it means, or am I just failing the parse the sentence properly? It seems to say "we are going to emit the FCCs, because doing so can lead to a problem" (which is weird), and then the code proceeds to strip out and hence not emit the FCCs. Should the comment say "we are not going to emit the FCCs..."? I'm not even sure which direction is "upstream" (perhaps phrasing it as "back to the caller" or "to the inner client" would be clearer).

Another option would be to yield the FunctionCallContent and then also yield the manufactured FunctionResultContent, either instead of or in addition to having it in the chat history. That way at least they'd still be paired up. I suspect some services would still balk, though, at ending up with the same tool call ID multiple times in the same history.

That sounds good. But why would we have the same tool call IDs multiple times? Wouldn't we also remove the existing code that adds FCCs/FRCs into history when streaming, so they only get added via the new code path?

@SteveSandersonMS
Copy link
Member

I find this code comment really confusing.

Oh wait. It seems you are already rephrasing that comment in #5910. Your updated phrasing is clearer.

@stephentoub
Copy link
Member

But why would we have the same tool call IDs multiple times? Wouldn't we also remove the existing code that adds FCCs/FRCs into history when streaming, so they only get added via the new code path?

Interesting. So change streaming to never add the FCC/FRC content to the history, and KeepFunctionCallingMessages for streaming means whether they're yielded or not rather than whether they're stored or not?

@SteveSandersonMS
Copy link
Member

That is what I was thinking but haven't gone as far as to validate how the resulting consumer code would look.

Thinking one step further, it might not work. We'd expect people to add the streamed response to history using ToChatResponseAsync, which returns exactly one ChatResponse, which would contain only one ChatMessage (assuming we don't change the meaning of Choices), and that can have only one ChatRole. So in effect the entire streamed response can have only one ChatRole. That wouldn't suffice to model FCCs and FRCs because they need different roles (assistant and tool, respectively).

We'd need to change ToChatResponseAsync to ToChatResponsesAsync or possibly change ChatResponse so it can contain multiple messages per choice, both of which are pretty unpleasant.

Or perhaps replace ToChatResponseAsync with some other API like:

void AppendToHistory(this IEnumerable<ChatResponseUpdate> updates, IList<ChatMessage> history);

... which could create as many ChatMessages in the history as it needs. Whether the payoff in terms of clarifying how FCCs/FRCs work inside streaming is worth it, I'm not convinced it is.

I know for sure that doing this change would break some apps and invalidate some code that otherwise looks fine. For example the new chat template concatenates streamed updates manually because it was easier and shorter to do so, and that would start discarding the FCCs/FRCs unless the logic was updated. So there is a case that having FunctionInvokingChatClient append them to history itself is cleaner, and it's certainly more consistent with the non-streaming case.

@stephentoub
Copy link
Member

Thanks. And yeah, those all relate to why I think the current behavior might be the safer default but have an additional knob to let you opt-into alternate behavior, such as what you suggest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants