Open
Labels
feature - work on feature (add new or improve existing)
Description
The current RAG pipeline has a high response latency (3-5 minutes) due to the slow inference speed of the local LLM. This leads to client timeouts and a poor user experience. This story aims to solve the UX problem by implementing streaming, allowing the user to see the response as it's being generated.
Acceptance Criteria
- The API has a new `/chat/stream` endpoint that returns a `StreamingResponse`.
- The `RAGEngine` can generate and yield tokens in real time.
- The `tg-gateway` client can consume the stream and progressively edit the message in Telegram.
- The user sees the first token of the response within 2-3 seconds of sending a message.
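The core of the second criterion is turning the engine's single blocking call into an async generator that yields tokens as they are produced. A minimal sketch of that shape, assuming the `RAGEngine` class and a `generate_stream` method name (both are illustrative, not confirmed project code):

```python
import asyncio

# Hypothetical sketch: RAGEngine and generate_stream are assumed names
# based on the issue text, not actual project code.
class RAGEngine:
    async def generate_stream(self, question: str):
        """Yield answer tokens as the LLM produces them, instead of
        returning one complete string after minutes of generation."""
        for token in ["The ", "answer ", "is ", "42."]:
            await asyncio.sleep(0)  # stand-in for per-token LLM latency
            yield token

async def collect(question: str) -> str:
    # A consumer (e.g. the /chat/stream endpoint) can iterate the
    # generator and forward each token to the client as it arrives.
    engine = RAGEngine()
    return "".join([tok async for tok in engine.generate_stream(question)])

print(asyncio.run(collect("test")))  # -> The answer is 42.
```

In the API layer this generator would be passed to a streaming response type (e.g. FastAPI's `StreamingResponse`), which forwards each yielded chunk immediately, so the first token reaches the client in seconds.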
Tasks
- [Task] ID: RAG-10 - feat(api): Refactor the RAG API endpoint to support streaming responses.
- [Task] ID: RAG-11 - feat(agent): Modify `RAGEngine` to return an async generator.
- [Task] ID: RAG-12 - refactor(tg-gateway): Update `RagApiClient` to handle streaming HTTP responses.
- [Task] ID: RAG-13 - feat(tg-gateway): Implement a message handler that edits a Telegram message progressively.
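For RAG-13, editing the Telegram message on every single token would quickly hit API rate limits, so the handler typically buffers tokens and flushes edits in batches. A sketch of that pattern with a stubbed Telegram client (the stub and all names here are illustrative; the real handler would call the bot API's message-edit method):

```python
import asyncio

# Hypothetical sketch: FakeTelegram stands in for the real bot client
# used by tg-gateway; names are assumptions, not project code.
class FakeTelegram:
    def __init__(self):
        self.edits = []

    async def edit_message(self, text: str):
        self.edits.append(text)  # real client: edit the sent message in place

async def stream_to_message(tokens, tg, flush_every: int = 2):
    """Accumulate streamed tokens and edit the message in batches,
    rather than once per token, to stay under edit rate limits."""
    buf = ""
    pending = 0
    for tok in tokens:  # in production: `async for tok in response_stream`
        buf += tok
        pending += 1
        if pending >= flush_every:
            await tg.edit_message(buf)
            pending = 0
    if pending:
        await tg.edit_message(buf)  # final flush with the complete text
    return buf

tg = FakeTelegram()
final = asyncio.run(stream_to_message(["Hi", " ", "there", "!"], tg))
```

A time-based throttle (e.g. at most one edit per second) is a common alternative to the count-based `flush_every` used here.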
Metadata
Projects
Status
Backlog