
Support for Xinference Chat in langchain #1510

Closed
buptzyf opened this issue May 17, 2024 · 4 comments
Labels
question Further information is requested
Milestone

Comments

@buptzyf
Contributor

buptzyf commented May 17, 2024

@codingl2k1 Hello:

  1. I see that this PR was closed: https://github.com/langchain-ai/langchain/pull/12702. Will a follow-up PR still be submitted?
  2. Does this not use chat?
     llm = Xinference(
         server_url="server_url",
         model_uid="qwen1.5-chat-14B",  # replace model_uid with the model UID returned from launching the model
         temperature=0.1, max_tokens=30 * 1024, stream=False, verbose=True,
     )

Here is the source code:
def _call(
    self,
    prompt: str,
    stop: Optional[List[str]] = None,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> str:
    """Call the xinference model and return the output.

    Args:
        prompt: The prompt to use for generation.
        stop: Optional list of stop words to use when generating.
        generate_config: Optional dictionary for the configuration used for
            generation.

    Returns:
        The generated string by the model.
    """
    model = self.client.get_model(self.model_uid)

    generate_config: "LlamaCppGenerateConfig" = kwargs.get("generate_config", {})

    generate_config = {**self.model_kwargs, **generate_config}

    if stop:
        generate_config["stop"] = stop

    if generate_config and generate_config.get("stream"):
        combined_text_output = ""
        for token in self._stream_generate(
            model=model,
            prompt=prompt,
            run_manager=run_manager,
            generate_config=generate_config,
        ):
            combined_text_output += token
        return combined_text_output

    else:
        completion = model.generate(prompt=prompt, generate_config=generate_config)
        return completion["choices"][0]["text"]

def _stream_generate(
    self,
    model: Union["RESTfulGenerateModelHandle", "RESTfulChatModelHandle"],
    prompt: str,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    generate_config: Optional["LlamaCppGenerateConfig"] = None,
) -> Generator[str, None, None]:
    """
    Args:
        prompt: The prompt to use for generation.
        model: The model used for generation.
        stop: Optional list of stop words to use when generating.
        generate_config: Optional dictionary for the configuration used for
            generation.

    Yields:
        A string token.
    """
    streaming_response = model.generate(
        prompt=prompt, generate_config=generate_config
    )
    for chunk in streaming_response:
        if isinstance(chunk, dict):
            choices = chunk.get("choices", [])
            if choices:
                choice = choices[0]
                if isinstance(choice, dict):
                    token = choice.get("text", "")
                    log_probs = choice.get("logprobs")
                    if run_manager:
                        run_manager.on_llm_new_token(
                            token=token, verbose=self.verbose, log_probs=log_probs
                        )
                    yield token

There is no support for model.chat in there, so for chat the only option right now is the native xinference.Client?
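For reference, a minimal sketch of calling chat through the native client (the server URL and model UID are placeholders, and the exact chat signature has changed across Xinference releases, so check it against the version you run):

from xinference.client import Client

client = Client("http://127.0.0.1:9997")      # placeholder server URL
model = client.get_model("qwen1.5-chat-14B")  # placeholder model UID

# Around the v0.11 releases the chat handle took a prompt plus optional
# chat_history / generate_config; newer releases take an OpenAI-style
# messages list instead.
completion = model.chat(
    prompt="你的名字",
    generate_config={"max_tokens": 512},
)
print(completion["choices"][0]["message"]["content"])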

@buptzyf buptzyf added the question Further information is requested label May 17, 2024
@XprobeBot XprobeBot added this to the v0.11.1 milestone May 17, 2024
@buptzyf buptzyf changed the title from Support for ChatModel in langchain to Support for Xinference Chat in langchain May 17, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.1, v0.11.2 May 17, 2024
@codingl2k1
Contributor

Xinference is compatible with the OpenAI API, so you can use the OpenAI API in LangChain to access Xinference.
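For example, a minimal sketch of pointing LangChain's OpenAI client at Xinference (the endpoint URL, API key placeholder, and model UID below are assumptions for a local deployment; the OpenAI-compatible endpoint is served under /v1):

from langchain_openai import ChatOpenAI

# Placeholder values for a local Xinference deployment.
llm = ChatOpenAI(
    base_url="http://127.0.0.1:9997/v1",  # Xinference's OpenAI-compatible endpoint
    api_key="not-used",                   # no real key is needed unless auth is enabled
    model="qwen1.5-chat-14B",             # the UID of the launched model
    temperature=0.1,
)

print(llm.invoke("你的名字").content)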

@buptzyf
Contributor Author

buptzyf commented May 21, 2024

Xinference is compatible with the OpenAI API, so you can use the OpenAI API in LangChain to access Xinference.

Thanks. Following your hint, I got both the text-generation and chat endpoints working. There is a new issue, and I am not sure whether it is a bug.

Client:

llm = OpenAI(model="qwen1.5-chat-14B", temperature=0.9, max_tokens=30 * 1024, streaming=True)
query_result = llm.invoke(input="你的名字", temperature=0.9, max_tokens=30 * 1024, logit_bias=None, stream=True)

Here logit_bias=None has to be specified explicitly (this may also be a LangChain issue).

Xinference server side:
[screenshot of the server-side handler, which only accepts logit_bias when it is None]

The server side checks whether logit_bias is None; if I do not explicitly set it to None, the server returns a 501 to the client.

@codingl2k1
Contributor

logit_bias is not implemented yet, so passing a value for it triggers a 501 error.

@buptzyf
Contributor Author

buptzyf commented May 22, 2024

logit_bias is not implemented yet, so passing a value for it triggers a 501 error.

Thanks for the reply.

@buptzyf buptzyf closed this as completed May 22, 2024