Problem
extract_keywords_and_rewrite makes 4 sequential Gemini calls per request:
1. detect_intents (raw query)
2. rewrite_with_history
3. call_gemini_for_keywords
4. detect_intents (rewritten query)
Calls 1 and 2 are independent of each other: both need only the raw query. Running them concurrently with asyncio.gather saves ~1-2s per request.
Fix
intents0, effective = await asyncio.gather(
    call_gemini_detect_intents(state["query"], history),
    call_gemini_rewrite_with_history(state["query"], history),
)
Then run call_gemini_for_keywords and the second detect_intents call together in a second asyncio.gather.
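A minimal sketch of the two-stage version, for illustration only. The three call_gemini_* functions here are hypothetical stubs that sleep instead of calling Gemini. The signature of call_gemini_for_keywords, the assumption that it takes the rewritten query, and the shape of the returned dict are all guesses, not the real code.

```python
import asyncio

# Hypothetical stand-ins for the real Gemini calls; each sleeps to mimic latency.
async def call_gemini_detect_intents(query, history):
    await asyncio.sleep(0.05)
    return [f"intent:{query}"]

async def call_gemini_rewrite_with_history(query, history):
    await asyncio.sleep(0.05)
    return f"{query} (rewritten)"

async def call_gemini_for_keywords(query):  # assumed signature
    await asyncio.sleep(0.05)
    return query.split()

async def extract_keywords_and_rewrite(state, history):
    # Stage 1: both calls need only the raw query, so they run concurrently.
    intents0, effective = await asyncio.gather(
        call_gemini_detect_intents(state["query"], history),
        call_gemini_rewrite_with_history(state["query"], history),
    )
    # Stage 2: both calls are assumed to need only the rewritten query.
    keywords, intents1 = await asyncio.gather(
        call_gemini_for_keywords(effective),
        call_gemini_detect_intents(effective, history),
    )
    # Return shape is illustrative, not the real function's contract.
    return {
        "intents_raw": intents0,
        "effective_query": effective,
        "keywords": keywords,
        "intents_rewritten": intents1,
    }

result = asyncio.run(extract_keywords_and_rewrite({"query": "reset password"}, []))
```

asyncio.gather returns results in the order the awaitables were passed, so the tuple unpacking stays valid regardless of which call finishes first. With two stages of two concurrent calls, the request waits on two round trips instead of four.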