Support for Xinference Chat in LangChain #1510
Xinference exposes an OpenAI-compatible API, so you can access Xinference from LangChain through the OpenAI API.
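For reference, a minimal sketch of that approach. The server URL is a placeholder, and it assumes an Xinference server whose OpenAI-compatible endpoint is served under /v1; the langchain_openai import applies to newer LangChain releases (older ones use langchain.llms.OpenAI):

from langchain_openai import OpenAI

# Point LangChain's OpenAI wrapper at the Xinference server's
# OpenAI-compatible endpoint instead of api.openai.com.
llm = OpenAI(
    base_url="http://localhost:9997/v1",  # assumed Xinference endpoint
    api_key="not-used",                   # Xinference does not validate the key by default
    model="qwen1.5-chat-14B",
)
print(llm.invoke("Hello"))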
Thanks. Following your hint, I got the text-generation and chat interfaces working, but I ran into a new problem and am not sure whether it is a bug. Client:

llm = OpenAI(model="qwen1.5-chat-14B", temperature=0.9, max_tokens=30 * 1024, streaming=True)
query_result = llm.invoke(input="你的名字", temperature=0.9, max_tokens=30 * 1024, logit_bias=None, stream=True)

Here I have to pass logit_bias=None explicitly (this may also be a LangChain issue). The server checks for None, so if I do not explicitly set logit_bias to None, it returns 501 to the client.
logit_bias is not implemented yet, so passing any value for it triggers a 501 error.
Thanks for the reply.
@codingl2k1 Hello:
llm = Xinference(
server_url="server_url",
model_uid="qwen1.5-chat-14B", # replace model_uid with the model UID return from launching the model,
temperature=0.1, max_tokens=30 * 1024, stream=False, verbose=True
)
This is the source code:
def _call(
self,
prompt: str,
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> str:
"""Call the xinference model and return the output.
There is no support for model.chat in here, so right now chat can only be used through the native xinference.Client?
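For completeness, a minimal sketch of the native-client route, assuming an Xinference server at server_url and a model that has already been launched. The messages-style chat signature below is an assumption; older client versions take a prompt plus chat_history instead:

from xinference.client import Client

client = Client("http://localhost:9997")  # assumed server_url
model = client.get_model("qwen1.5-chat-14B")  # the model UID returned when the model was launched

# Chat via the native client; generate_config keys mirror the OpenAI-style options.
response = model.chat(
    messages=[{"role": "user", "content": "你的名字"}],
    generate_config={"temperature": 0.9, "max_tokens": 1024},
)
print(response["choices"][0]["message"]["content"])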