Closed
Description
What happened?
With #2842 assistant support was just added by @krrishdholakia 🙏
I just can't get streaming to work with the current integration, and I'm not sure whether that is due to the missing documentation, a mistake on my part, or a bug.
The following is the code I tried to get the streaming feedback with. This code is built upon the example of the official documentation of custom callbacks, extended by the example code of the Pull Request to call an OpenAI assistant (#3455 (comment)).
I do get the final, complete, un-streamed answer from the LLM, but none of the handler calls are logged.
from litellm import get_assistants, create_thread, add_message, run_thread, get_messages, MessageData
import os, litellm
from litellm.integrations.custom_logger import CustomLogger
class MyCustomHandler(CustomLogger):
    """Custom litellm callback handler that logs every lifecycle event.

    Each hook simply prints a marker so the caller can see which callbacks
    fire. Sync hooks cover blocking calls; the async hooks cover
    acompletion/aembeddings-style calls.
    """

    def log_pre_api_call(self, model, messages, kwargs):
        print("Pre-API Call")

    def log_post_api_call(self, kwargs, response_obj, start_time, end_time):
        print("Post-API Call")

    def log_stream_event(self, kwargs, response_obj, start_time, end_time):
        print("On Stream")

    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Success")

    def log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print("On Failure")

    #### ASYNC #### - for acompletion/aembeddings

    async def async_log_stream_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Streaming")

    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success")

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        # Bug fix: the original printed "On Async Success" here, making
        # async failures indistinguishable from async successes in the log.
        print("On Async Failure")
# Register the custom callback handler with litellm so every hook fires.
handler = MyCustomHandler()
litellm.callbacks = [handler]

# Fetch all assistants for the provider and pick the first one.
assistant_list = get_assistants(custom_llm_provider="openai")
assistant_id = assistant_list.data[0].id

# Create a fresh thread and remember its id.
thread = create_thread(
    custom_llm_provider="openai",
)
thread_id = thread.id

# Post the user's question onto the thread.
user_message: MessageData = {"role": "user", "content": "Who are you?"}  # type: ignore
added_message = add_message(
    thread_id=thread_id, custom_llm_provider="openai", **user_message
)

# Kick off the run with streaming enabled.
# NOTE(review): the returned stream is never iterated here — presumably the
# stream events (and the stream callbacks) only fire once it is consumed;
# confirm against the litellm assistants docs.
run = run_thread(
    custom_llm_provider="openai", thread_id=thread_id, assistant_id=assistant_id, stream=True
)

# Print the final, complete response from the thread.
print(get_messages(custom_llm_provider="openai", thread_id=thread_id, assistant_id=assistant_id))
Relevant log output
No response
Twitter / LinkedIn details
No response