Incorrect answer with openai compatible penalty parameters #238

Spycsh · 2024-10-17T08:20:10Z

System Info

Hi there, I hit a bug when using TGI Gaudi 2.0.5 with both meta-llama/Meta-Llama-3-8B-Instruct and Intel/neural-chat-7b-v3-3. When I set the frequency/repetition/presence penalty parameters to their default values, following the OpenAI format (https://platform.openai.com/docs/api-reference/completions/create), I get wrong answers: words in the generated text run together without spaces. See the reproduction below.

I then ran the same check on TGI CPU and did not encounter the bug, so I suspect something is wrong in TGI Gaudi. Could you please look into this issue?

Information

Docker
The CLI directly

Tasks

An officially supported command
My own modifications

Reproduction

Here is a minimal reproduction:

model=Intel/neural-chat-7b-v3-3
hf_token=xxxx
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

docker run -p 8080:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all \
 -e PT_HPU_LAZY_MODE=0 -e OMPI_MCA_btl_vader_single_copy_mechanism=none \
 -e HF_TOKEN=$hf_token --cap-add=sys_nice --ipc=host -e http_proxy=${http_proxy} -e https_proxy=${https_proxy} \
 ghcr.io/huggingface/tgi-gaudi:2.0.5 --model-id $model --max-input-tokens 1024 --max-total-tokens 2048

http_proxy= curl http://${host_ip}:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "tgi",
    "messages": [
      {
        "role": "user",
        "content": "What is deep Learning!"
      }
    ], "max_tokens":128,"temperature":0.01, "top_p":0.95, "frequency_penalty":0.0, "repetition_penalty":1.03, "presence_penalty":0.0 }'

The answer (spaces between words go missing toward the end):

{"id":"","object":"text_completion","created":1729153043,"model":"Intel/neural-chat-7b-v3-3","system_fingerprint":"2.0.4-native","choices":[{"index":0,"message":{"role":"assistant","content":"Deep learning refers to a subset of machinelearning techniques that use artificial neural networks (ANNs) with multiple layers for feature extraction and transformation. These algorithms are designed based on the structure, functioningsimilarityto human brain's neuronsand their connections in order tomimethe processof how humans learn from data by recognizing patterns within it without explicit programming rules or instructions being given beforehand; this makes them highly effective at handling complex tasks like image recognitionor natural language processing(NLP). The deeper these network structures get - meaning more hiddenlayers-themorecomplexpatternsthatcanbelearnedfromdataarepossible"},"logprobs":null,"finish_reason":"length"}],"usage":{"prompt_tokens":5,"completion_tokens":128,"total_tokens":133}}
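The symptom in the response above is that words fuse into very long whitespace-free runs. A minimal sketch for flagging this automatically, assuming a POSIX shell with awk available; the function name longest_token_len is hypothetical and not part of TGI:

```shell
# Hypothetical helper (not part of TGI): print the length of the longest
# whitespace-delimited token in the text passed as the first argument.
longest_token_len() {
  # awk splits each input line on runs of whitespace into fields;
  # track the longest field seen across all lines.
  printf '%s\n' "$1" | awk '
    { for (i = 1; i <= NF; i++) if (length($i) > max) max = length($i) }
    END { print max + 0 }'
}

# Example on a fragment copied from the response above:
# longest_token_len "meaning more hiddenlayers-themorecomplexpatternsthatcanbelearnedfromdataarepossible"
```

Ordinary English prose rarely contains tokens longer than a few dozen characters, so an unusually large value is a rough signal of fused words. The threshold is a judgment call, not something defined by TGI. If jq happens to be installed (an assumption), the message text can be pulled out of a saved response with `jq -r '.choices[0].message.content'` and passed to the function.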

Then I removed repetition_penalty and kept only the OpenAI-compatible frequency_penalty and presence_penalty:

http_proxy= curl http://${host_ip}:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "tgi",
    "messages": [
      {
        "role": "user",
        "content": "What is deep Learning!"
      }
    ], "max_tokens":128,"temperature":0.01, "top_p":0.95, "frequency_penalty":0.0, "presence_penalty":0.0 }'

The output is still wrong:

{"id":"","object":"text_completion","created":1729153206,"model":"Intel/neural-chat-7b-v3-3","system_fingerprint":"2.0.4-native","choices":[{"index":0,"message":{"role":"assistant","content":"Deep learning refers to a subset of machinelearning techniques that use artificial neural networks (ANNs) with multiple layers for feature extraction and transformation. These algorithms are designed based on the structure, functioningsimilarityto human brain's neuronsand their connections in order tomimethe processof how humans learn from data or information through experience by recognizing patterns within large datasets without explicit programming rules defined beforehand; this allows them tounderstandcomplexrelationshipsbetween variables more effectively than traditionalmachine-learninglearningalgorithmswhich relyonlinearmodelsorrulebasedapproachesforpatternrecognition tasks such as image classification"},"logprobs":null,"finish_reason":"length"}],"usage":{"prompt_tokens":5,"completion_tokens":128,"total_tokens":133}}
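Both requests above still carry at least two penalty fields. One possible next step, which was not run as part of this report, is to send the same prompt with no penalty fields and then with one field at a time, to see which field triggers the fused words. A minimal sketch, assuming a POSIX shell; the function name build_payload and the TGI_URL variable are hypothetical:

```shell
# Hypothetical helper: print the chat request body used in this report,
# optionally appending extra JSON fields given as the first argument,
# e.g. '"frequency_penalty":0.0'.
build_payload() {
  # ${1:+, $1} expands to ", <fields>" only when an argument is supplied.
  printf '{"model":"tgi","messages":[{"role":"user","content":"What is deep Learning!"}],"max_tokens":128,"temperature":0.01,"top_p":0.95%s}\n' "${1:+, $1}"
}

# Example usage (TGI_URL is an assumed variable holding the server base URL):
# curl "$TGI_URL/v1/chat/completions" -H "Content-Type: application/json" -d "$(build_payload)"
# curl "$TGI_URL/v1/chat/completions" -H "Content-Type: application/json" -d "$(build_payload '"frequency_penalty":0.0')"
# curl "$TGI_URL/v1/chat/completions" -H "Content-Type: application/json" -d "$(build_payload '"presence_penalty":0.0')"
```

If the request without any penalty fields returns normally spaced text, the results of the single-field requests would help narrow down where the problem lies.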

Expected behavior

The answer should be correct and well-formatted, with normal spacing between words, as it is on TGI CPU.
