perf: Parallelize LLM calls in extract_keywords_and_rewrite to reduce latency

## Problem
`extract_keywords_and_rewrite` makes 4 sequential Gemini calls per request:
1. `detect_intents` (raw query)
2. `rewrite_with_history`
3. `call_gemini_for_keywords`
4. `detect_intents` (rewritten query)

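Calls 1 and 2 are independent: each needs only the raw query and the conversation history, and neither consumes the other's result. Running them concurrently with `asyncio.gather` saves roughly 1-2s per request, because the pair then costs about as much as the slower call instead of the sum of both.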
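To illustrate where the saving comes from, here is a minimal timing sketch. The two coroutines are hypothetical stand-ins (not the real Gemini wrappers), and the `asyncio.sleep` delays are invented values that only simulate network latency:

```python
import asyncio
import time


# Hypothetical stand-ins for two independent LLM calls.
# The sleep durations are made up; they only simulate round-trip latency.
async def fake_detect_intents(query: str) -> list[str]:
    await asyncio.sleep(0.2)
    return ["search"]


async def fake_rewrite(query: str) -> str:
    await asyncio.sleep(0.3)
    return query.strip().lower()


async def run_sequential(query: str):
    """Await the calls one after the other: cost is roughly the sum of both."""
    start = time.perf_counter()
    intents = await fake_detect_intents(query)
    rewritten = await fake_rewrite(query)
    return intents, rewritten, time.perf_counter() - start


async def run_concurrent(query: str):
    """Await both calls together: cost is roughly that of the slower call."""
    start = time.perf_counter()
    intents, rewritten = await asyncio.gather(
        fake_detect_intents(query),
        fake_rewrite(query),
    )
    return intents, rewritten, time.perf_counter() - start
```

Both helpers return the same results; only the elapsed time differs. The actual saving in production depends on the real latency of each Gemini call.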

## Fix
```python
intents0, effective = await asyncio.gather(
    call_gemini_detect_intents(state["query"], history),
    call_gemini_rewrite_with_history(state["query"], history),
)
```
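Then run `call_gemini_for_keywords` and the second `detect_intents` call together in a second `asyncio.gather`, once the rewritten query from the first gather is available. This turns four sequential round trips into two concurrent stages.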
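A sketch of the full two-stage flow might look like the following. The three `call_gemini_*` functions here are stubs that stand in for the real Gemini wrappers. The signature of `call_gemini_for_keywords`, the assumption that it takes the rewritten query, and the keys of the returned dict are all assumptions made for illustration, not taken from the codebase:

```python
import asyncio


# Stubs standing in for the real Gemini wrappers (assumed signatures).
async def call_gemini_detect_intents(query, history):
    await asyncio.sleep(0.05)  # simulated latency
    return ["intent:" + query]


async def call_gemini_rewrite_with_history(query, history):
    await asyncio.sleep(0.05)  # simulated latency
    return " ".join(history + [query])


async def call_gemini_for_keywords(query):
    await asyncio.sleep(0.05)  # simulated latency
    return query.split()


async def extract_keywords_and_rewrite(state, history):
    # Stage 1: both calls depend only on the raw query and history.
    intents0, effective = await asyncio.gather(
        call_gemini_detect_intents(state["query"], history),
        call_gemini_rewrite_with_history(state["query"], history),
    )
    # Stage 2: both calls depend on the rewritten query, not on each other.
    keywords, intents1 = await asyncio.gather(
        call_gemini_for_keywords(effective),
        call_gemini_detect_intents(effective, history),
    )
    # Hypothetical result shape, chosen for this example only.
    return {
        "intents_raw": intents0,
        "effective_query": effective,
        "keywords": keywords,
        "intents_rewritten": intents1,
    }
```

Note that `asyncio.gather` returns results in the order the awaitables were passed, so the tuple unpacking above is safe. By default the first exception raised by either call propagates to the caller; passing `return_exceptions=True` is an option if each failure should be handled separately.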

perf: Parallelize LLM calls in extract_keywords_and_rewrite to reduce latency #78
