Eval bug: Embedding output differs significantly between b4712 and b4713 #14848

@lingchingteng

Name and Version

version: 5713 (4c9fdfb)
built with clang version 18.1.8 for x86_64-pc-windows-msvc

Operating systems

Windows

GGML backends

CUDA

Hardware

CPU Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
GPU NVIDIA Quadro RTX 5000 with Max-Q Design

Models

bge-m3

Problem description & steps to reproduce

The embeddings returned for the same input differ significantly between builds b4712 and b4713.

Server command used:

.\llama-server.exe --hf-repo gpustack/bge-m3-GGUF --hf-file bge-m3-Q4_K_M.gguf --embedding -ngl 99

POST request:

curl.exe -d "{\"input\": \"Hello\"}" http://127.0.0.1:8080/v1/embeddings
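
To put a number on the difference, the same request can be sent to both builds and the two vectors compared by cosine similarity. The following is a minimal sketch, not part of the original report. It assumes the two builds run side by side on ports 8080 and 8081 (the second port is hypothetical) and that the response follows the OpenAI-style shape with the vector at `data[0].embedding`. The helper names `fetch_embedding` and `cosine_similarity` are illustrative only.

```python
import json
import math
import urllib.request


def fetch_embedding(base_url, text):
    """POST `text` to an OpenAI-style /v1/embeddings endpoint and return the vector.

    Assumes the response has the shape {"data": [{"embedding": [...]}]}.
    """
    req = urllib.request.Request(
        base_url + "/v1/embeddings",
        data=json.dumps({"input": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["data"][0]["embedding"]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical setup: older build on port 8080, newer build on port 8081.
    old = fetch_embedding("http://127.0.0.1:8080", "Hello")
    new = fetch_embedding("http://127.0.0.1:8081", "Hello")
    print("dimensions:", len(old), len(new))
    print("cosine similarity:", cosine_similarity(old, new))
```

A similarity close to 1.0 would point to ordinary numerical noise, while a much lower value would suggest a real change in how the embedding is computed or pooled.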

Please let me know if this behavior is expected or if there was a change in the embedding logic between these versions.

First Bad Commit

#14217

Relevant log output

no relevant log
