Fix vllm sampling params #2871

malte-aws · 2025-01-06T13:58:29Z

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support

PR information

Updates how the generation config is loaded from file.
Previously, the _load_generation_config function did not load every setting from a generation_config.json file.

With this pull request, _load_generation_config loads all settings from a generation_config.json file. The _prepare_generation_config function now uses the sampling parameters loaded from file and overwrites the ones provided in the request config.
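The load-then-merge behavior can be sketched in a few lines. This is a minimal, standalone illustration and not the code from this pull request: the helper names load_generation_config and merge_generation_config are hypothetical, and the precedence shown (file values act as defaults, explicit request values win) is an assumption taken from the last commit message.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_generation_config(model_dir: str) -> Dict[str, Any]:
    """Return every key found in generation_config.json, or {} if the file is absent."""
    path = Path(model_dir) / "generation_config.json"
    if not path.is_file():
        return {}
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def merge_generation_config(
    file_config: Dict[str, Any], request_config: Dict[str, Any]
) -> Dict[str, Any]:
    """Combine the two configs (assumed precedence: request over file).

    Keys set to None in the request are treated as "not provided" and
    fall back to the value loaded from file.
    """
    merged = dict(file_config)  # copy so the loaded config is not mutated
    for key, value in request_config.items():
        if value is not None:
            merged[key] = value
    return merged
```

If the opposite precedence is wanted, so that the file always wins, swapping the two arguments of merge_generation_config gives that behavior.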

This makes it possible to configure guided decoding in a generation_config.json file. Here is an example generation_config.json file:

{
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": 151645,
  "max_new_tokens": 2048,
  "pad_token_id": 151643,
  "temperature": 0.01,
  "top_k": 1,
  "top_p": 0.001,
  "transformers_version": "4.46.0.dev0",
  "guided_decoding": {
    "backend": "lm-format-enforcer",
    "json": {
      "type": "object",
      "properties": {
        "foo": { "type": "string" }
      },
      "required": ["foo"]
    }
  }
}
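To show how such a file might be consumed, here is a hedged sketch that separates the nested guided_decoding section from the flat sampling values and checks a candidate model output against the schema's required keys. Both helper names are hypothetical, and the key check is a deliberately small stand-in, not a JSON Schema validator and not what the lm-format-enforcer backend does internally.

```python
import json
from typing import Any, Dict, Optional, Tuple


def split_guided_decoding(
    config: Dict[str, Any],
) -> Tuple[Dict[str, Any], Optional[Dict[str, Any]]]:
    """Return (flat settings, guided_decoding section or None)."""
    flat = {k: v for k, v in config.items() if k != "guided_decoding"}
    return flat, config.get("guided_decoding")


def has_required_keys(output_text: str, schema: Dict[str, Any]) -> bool:
    """Check only that the output is a JSON object containing every required key."""
    try:
        obj = json.loads(output_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict):
        return False
    return all(key in obj for key in schema.get("required", []))
```

With the example file above, an output such as {"foo": "bar"} passes the check, while {"baz": 1} or free-form text does not.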


malte-aws added 4 commits January 6, 2025 11:11

Fix vllm sampling parameters filter

5418f62

Fixes loading generation config from nested dicts using from_optional

7db491e

Fixes formatting and removes print statements

f5e8984

Updates preparation of generation config to use loaded generation config as default values.

5bfafeb
