Description
Your current environment
(abbreviated)
🐛 Describe the bug
According to vLLM's source code (vllm/entrypoints/llm.py:1746), there is tqdm (progress-bar library) related code that computes a throughput value (see lines 1743–1749 in the traceback below).
That code divides total_in_toks by pbar.format_dict["elapsed"] to compute in_spd (the input-token throughput). However, according to the tqdm documentation, the elapsed field is the number of seconds elapsed since the bar was started. If that value is zero (or effectively zero) at the time of the division, the division raises a ZeroDivisionError.
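For illustration, here is a minimal sketch of the failure mode (not vLLM's actual code), showing how dividing by the elapsed field can blow up:

```python
from tqdm import tqdm

# Minimal sketch of the failure mode, not vLLM's actual code: if the
# bar's "elapsed" value is still 0 when the throughput is computed,
# the division raises ZeroDivisionError.
pbar = tqdm(total=1)
elapsed = pbar.format_dict["elapsed"]  # seconds since the bar started; can be 0
total_in_toks = 1024
in_spd = total_in_toks / elapsed  # ZeroDivisionError whenever elapsed == 0
pbar.close()
```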
In my situation (using the DeepSeek-OCR model via vLLM), all vLLM output is suppressed (via quiet_stdio()), and when I call llm.generate(..., use_tqdm=True) (use_tqdm currently defaults to True) I consistently get a ZeroDivisionError: integer division or modulo by zero with the following traceback:
╭────────────────────────────────── Traceback (most recent call last) ───────────────────────────────────╮
│ /home/grlee/ghrepo/pdfscribe2ds/ocr_pipeline/pipeline.py:72 in run_pdf_pipeline │
│ │
│ 69 │ for img_path in image_paths: │
│ 70 │ │ try: │
│ 71 │ │ │ with quiet.quiet_stdio(): │
│ ❱ 72 │ │ │ │ raw_md = ocr.image_to_markdown(img_path) │
│ 73 │ │ │ │
│ 74 │ │ │ page_stem = img_path.stem # e.g. "page_001" │
│ 75 │ │ │ assets_dir = md_out_dir / f"{page_stem}_assets" │
│ │
│ /home/grlee/ghrepo/pdfscribe2ds/ocr_pipeline/ocr_engine.py:67 in image_to_markdown │
│ │
│ 64 │ │ │ skip_special_tokens=False, │
│ 65 │ │ ) │
│ 66 │ │ │
│ ❱ 67 │ │ outputs = self.llm.generate(model_input, sampling_param) # type: ignore │
│ 68 │ │ text_output = outputs[0].outputs[0].text │
│ 69 │ │ return text_output │
│ 70 │
│ │
│ /home/grlee/ghrepo/pdfscribe2ds/.venv/lib/python3.12/site-packages/vllm/entrypoints/llm.py:446 in │
│ generate │
│ │
│ 443 │ │ │ priority=priority, │
│ 444 │ │ ) │
│ 445 │ │ │
│ ❱ 446 │ │ outputs = self._run_engine(use_tqdm=use_tqdm) │
│ 447 │ │ return self.engine_class.validate_outputs(outputs, RequestOutput) │
│ 448 │ │
│ 449 │ def _get_modality_specific_lora_reqs( │
│ │
│ /home/grlee/ghrepo/pdfscribe2ds/.venv/lib/python3.12/site-packages/vllm/entrypoints/llm.py:1746 in │
│ _run_engine │
│ │
│ 1743 │ │ │ │ │ │ │ n = len(output.outputs) │
│ 1744 │ │ │ │ │ │ │ assert output.prompt_token_ids is not None │
│ 1745 │ │ │ │ │ │ │ total_in_toks += len(output.prompt_token_ids) * n │
│ ❱ 1746 │ │ │ │ │ │ │ in_spd = total_in_toks / pbar.format_dict["elapsed"] │
│ 1747 │ │ │ │ │ │ │ total_out_toks += sum( │
│ 1748 │ │ │ │ │ │ │ │ len(stp.token_ids) for stp in output.outputs │
│ 1749 │ │ │ │ │ │ │ ) │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
The dump above was generated from the code in my own project. For now I work around the error by setting use_tqdm=False.
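For reference, the workaround in my ocr_engine.py looks roughly like this (variable names as in the traceback above):

```python
# Disabling the progress bar skips the throughput computation that
# divides by pbar.format_dict["elapsed"], so the error no longer occurs.
outputs = self.llm.generate(model_input, sampling_param, use_tqdm=False)
text_output = outputs[0].outputs[0].text
```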
I recommend guarding that division in llm.py by using max(pbar.format_dict["elapsed"], 1e-6) instead of pbar.format_dict["elapsed"] as the divisor, to safely protect against zero.
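A minimal sketch of the proposed guard in _run_engine (variable names taken from the traceback above; the exact placement may differ on the current main branch):

```python
# Clamp elapsed to a small epsilon so the throughput computation can
# never divide by zero; 1e-6 seconds is an arbitrary lower bound.
elapsed = max(pbar.format_dict["elapsed"], 1e-6)
in_spd = total_in_toks / elapsed
```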
Note: I'm currently using vllm==0.11.1rc6.dev107+g878fd5a16.cu129 because there is an ongoing version issue with using the DeepSeek-OCR model through the vLLM framework (#28030). So I'm not sure whether this bug has already been addressed in current development, but at least I couldn't find related information in the issue tab, so I'm filing this one.
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.