LiteLLM + Gemini 2.5 Pro: cached_tokens=None crashes Agents SDK with Pydantic int-validation error #758
Comments
I'm also getting this error:
I met a similar problem.

```python
import asyncio
import os

from agents import Agent, Runner
from agents.extensions.models.litellm_model import LitellmModel

gemini_api_key = os.getenv("GEMINI_API_KEY")

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant.",
    model=LitellmModel(
        model="gemini/gemini-2.5-flash-preview-04-17",
        api_key=gemini_api_key,
    ),
)

async def main():
    result = await Runner.run(agent, "write code for saying hi from LiteLLM")
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())
```
This might help! #760 (comment)

Never mind, I see you're on the latest version. Don't know what it could be.

Oh wait, I think #735 was just released in 0.0.17. Might be your issue?
cached_tokens=None → Pydantic error with LiteLLM + Gemini 2.5

Describe the bug

When Runner.run() executes an agent whose model is a Gemini 2.5 Pro instance wrapped by LiteLLM, the pipeline dies during cost calculation with a Pydantic validation error. The log shows LiteLLM repeatedly inserting cached_tokens=None into the request metadata; Pydantic 2.11's InputTokensDetails model rejects None because the field is typed as int. The fallback code then silently downgrades the model to o3-2025-04-16, masking the issue in production.
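For illustration, here is a minimal sketch of the failure mode. The class below is a stand-in with the same field typing described above (cached_tokens: int), not the SDK's actual InputTokensDetails:

```python
from pydantic import BaseModel, ValidationError

class InputTokensDetails(BaseModel):
    # Plain int: neither Optional nor defaulted, so None is rejected.
    cached_tokens: int

try:
    # Mirrors what happens when LiteLLM reports cached_tokens=None.
    InputTokensDetails(cached_tokens=None)
except ValidationError as exc:
    print(exc)  # "Input should be a valid integer" for cached_tokens
```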
- Agents SDK version: v0.0.16
- LiteLLM version: v1.71.1
- Model: gemini/gemini-2.5-pro-preview-05-06
Repro steps
Observed output
Commenting out the Gemini/LiteLLM model and falling back to any OpenAI model (o3-2025-04-16, gpt-4o) makes the script succeed, confirming the issue is isolated to the Gemini + LiteLLM path.

Expected behavior
One of the following:

- LiteLLM sends a valid integer (0) for cached_tokens instead of None, or
- the Agents SDK coerces None → 0 before instantiating InputTokensDetails.

Either fix would allow Gemini 2.5 to run without crashing and would eliminate silent model downgrades.
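A sketch of the second option, assuming a hypothetical helper at the point where usage details are built (the SDK's actual call site may differ, and the model shown is again a stand-in):

```python
from pydantic import BaseModel

class InputTokensDetails(BaseModel):
    cached_tokens: int

def build_input_tokens_details(raw_usage: dict) -> InputTokensDetails:
    # Treat a missing or None cached_tokens as 0 so providers that omit
    # the field (e.g. Gemini via LiteLLM) validate cleanly.
    return InputTokensDetails(cached_tokens=raw_usage.get("cached_tokens") or 0)

print(build_input_tokens_details({"cached_tokens": None}))  # cached_tokens=0
```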