Local LLM Functionary-7b-v2.1-GGUF and Extended OpenAI Conversation #174

Open
FutureProofHomes opened this issue Mar 16, 2024 · 15 comments

@FutureProofHomes

FutureProofHomes commented Mar 16, 2024

@jekalmin I am running the Functionary-7b-v2.1-GGUF model via the Python bindings for llama.cpp (llama-cpp-python). Here is an example of my Python script successfully calling the LLM and returning the correct response for get_current_weather:

[screenshot: Python script output showing a successful get_current_weather tool call]
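A minimal sketch of this kind of standalone call with llama-cpp-python looks roughly like this (the model path and tool spec below are placeholders, not the exact script from the screenshot; the chat_format/tokenizer setup follows the llama-cpp-python Functionary example):

```python
# Sketch only: standalone Functionary tool-call test with llama-cpp-python.
# Paths and the tool definition are placeholders.
from llama_cpp import Llama
from llama_cpp.llama_tokenizer import LlamaHFTokenizer

llm = Llama(
    model_path="/data/models/functionary-7b-v2.1.q4_0.gguf",  # placeholder path
    chat_format="functionary-v2",
    tokenizer=LlamaHFTokenizer.from_pretrained("meetkai/functionary-7b-v2.1-GGUF"),
    n_ctx=8192,
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the weather in New York?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }],
    tool_choice="auto",
)

# Expect finish_reason "tool_calls" with a get_current_weather call.
print(response["choices"][0]["message"])
```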

The problem is, when I issue an almost identical request via Home Assistant and your Extended OpenAI Conversation integration, I get the following error:
[screenshot: error response shown in Home Assistant]

Here are the Home Assistant Logs:

2024-03-16 01:27:32.831 INFO (MainThread) [custom_components.extended_openai_conversation] Prompt for /data/models/huggingface/models--meetkai--functionary-7b-v2.1-GGUF/snapshots/4386d5a19700bd0ae1582574e3de13a218fb1c8e/functionary-7b-v2.1.q4_0.gguf: [{'role': 'system', 'content': "I want you to act as smart home manager of Home Assistant.\nI will provide information of smart home along with a question, you will truthfully make correction or answer using information provided in one sentence in everyday language.\n\nCurrent Time: 2024-03-16 01:27:32.826442-04:00\n\nAvailable Devices:\n```csv\nentity_id,name,state,aliases\nweather.clusterhome,Local Weather,partlycloudy,\n```\n\nThe current state of devices is provided in available devices.\nUse execute_services function only for requested action, not for current states.\nDo not execute service without user's confirmation.\nDo not restate or appreciate what user says, rather make a quick inquiry."}, {'role': 'user', 'content': 'What is the weather in New York?'}]
2024-03-16 01:27:37.786 INFO (MainThread) [custom_components.extended_openai_conversation] Response {'id': 'chatcmpl-05d7ce5d-c3d2-4fd1-ac71-1af5266ea5d4', 'choices': [{'finish_reason': 'tool_calls', 'index': 0, 'message': {'role': 'assistant', 'function_call': {'arguments': '{}', 'name': ' get_current_weather'}, 'tool_calls': [{'id': 'call_zrdjmFbhRmjp7l9010Jvr4EP', 'function': {'arguments': '{}', 'name': ' get_current_weather'}, 'type': 'function'}]}}], 'created': 1710566857, 'model': '/data/models/huggingface/models--meetkai--functionary-7b-v2.1-GGUF/snapshots/4386d5a19700bd0ae1582574e3de13a218fb1c8e/functionary-7b-v2.1.q4_0.gguf', 'object': 'chat.completion', 'usage': {'completion_tokens': 2, 'prompt_tokens': 349, 'total_tokens': 351}}
2024-03-16 01:27:37.790 ERROR (MainThread) [custom_components.extended_openai_conversation] function ' get_current_weather' does not exist
Traceback (most recent call last):
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 196, in async_process
    query_response = await self.query(user_input, messages, exposed_entities, 0)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 384, in query
    return await self.execute_tool_calls(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/extended_openai_conversation/__init__.py", line 477, in execute_tool_calls
    raise FunctionNotFound(function_name)
custom_components.extended_openai_conversation.exceptions.FunctionNotFound: function ' get_current_weather' does not exist

If you inspect the logs above closely, it looks like Extended OpenAI Conversation is not passing the spec I defined for get_current_weather. Basically, I think the tools/functions part of the payload is not being passed to the LLM. What am I doing wrong here? Any tips?

Btw, here are the options I'm passing into your plugin. Notice that I am in fact defining the get_current_weather spec:
[screenshot: Extended OpenAI Conversation options with the get_current_weather function spec]
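For reference, a get_current_weather spec in the functions option would look something like the following. This is an illustrative guess rather than the exact contents of the screenshot; the template lookup against weather.clusterhome is an assumption:

```yaml
# Illustrative example of the functions option, not the exact configured spec.
- spec:
    name: get_current_weather
    description: Get the current weather for a location.
    parameters:
      type: object
      properties:
        location:
          type: string
          description: The city to get the weather for.
      required:
        - location
  function:
    type: template
    value_template: "The weather is {{ states('weather.clusterhome') }}."
```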

@rfam13

rfam13 commented Mar 19, 2024

https://huggingface.co/TheBloke/NexusRaven-V2-13B-GGUF

This is the only model I have had any luck calling functions with, specifically execute_services. It does work if you instruct it what to do in the prompt. I am having success even with the Q2 quantization.

EDIT: Sorry, I misread the title; I thought you were using LocalAI.

In case you were wondering, the 13B runs on my 14th-gen i9 with 64GB DDR5, with a response time of 5-10 seconds after the initial model load. I assume this would be greatly reduced by using a GPU.

@thisIsLoading

@rfam13 Do you mind sharing what you did? Did you just install LocalAI with that model and hook it up with HA? Was there some tweaking involved regarding the prompt format?

@jekalmin
Owner

@FutureProofHomes

Thanks for reporting an issue.
It seems the LLM tries to call a function named ' get_current_weather' (with a leading space) while your defined function name is 'get_current_weather'.

In order to match ' get_current_weather' with 'get_current_weather', you should change the code here like below:

function_name = tool.function.name.strip()

and here like below:

function_name = message.function_call.name.strip()

Ideally, the LLM wouldn't include an extra space in function names, but it happens.
Since it's a workaround rather than a solution, I haven't applied this to the code yet.
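For anyone hitting this with other local models, here is a minimal, self-contained sketch of where the .strip() workaround sits in the tool-call handling. The function and variable names below are illustrative assumptions, not the integration's exact code in __init__.py:

```python
import json

class FunctionNotFound(Exception):
    """Raised when the model calls a function that is not configured."""

def execute_tool_calls(message, functions_by_name):
    # Sketch only: some local models emit names like " get_current_weather"
    # with a leading space, so normalize the name before looking it up.
    results = []
    for tool in message["tool_calls"]:
        function_name = tool["function"]["name"].strip()
        handler = functions_by_name.get(function_name)
        if handler is None:
            raise FunctionNotFound(f"function '{function_name}' does not exist")
        arguments = json.loads(tool["function"]["arguments"] or "{}")
        results.append(handler(**arguments))
    return results
```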

@FutureProofHomes
Author

FutureProofHomes commented Mar 26, 2024

Thank you @jekalmin. I actually found this solution a few days ago and should have closed the ticket.

Perhaps we could consider merging this change into main, because it wouldn't harm names that don't have the extra whitespace, i.e. the workaround shouldn't negatively impact or break any existing code, right?

@jekalmin
Owner

@FutureProofHomes

Yes, it should not break any existing code.
Although it is not ideal, as you suggested, it would help with digging into local LLMs.

I would probably merge this in the next release.

@FutureProofHomes
Author

FutureProofHomes commented Mar 29, 2024

I got everything working by sacrificing conversation history. The Pydantic schema behind the Functionary model isn't exactly the same as the OpenAI payloads, I guess. These are the changes I made to __init__.py.

Commented out self.history:
[screenshot: self.history commented out in __init__.py]

Commented out self.history again and removed exclude_none=True param from model_dump():
[screenshot: second self.history change and model_dump() without exclude_none=True]

Stripped whitespace, again removed the exclude_none=True param from model_dump(), and had to comment out appending tool calls to the messages list.
[screenshot: name .strip(), model_dump() change, and tool-call append commented out]
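Taken together, the edits amount to something like the following sketch. The names and structure here are a rough approximation for illustration only; the screenshots above show the actual diff:

```python
# Rough sketch of the edits described above; not the integration's exact code.
def handle_response(response, messages):
    # Dump the message WITHOUT exclude_none=True so Functionary's stricter
    # Pydantic validation sees every key it expects.
    message = response.choices[0].message.model_dump()

    # Strip stray whitespace from the function name the model returned.
    if message.get("function_call"):
        message["function_call"]["name"] = message["function_call"]["name"].strip()

    # Skip appending the assistant tool-call message back onto the running
    # message list, and skip persisting history entirely (this is what
    # sacrifices conversation history).
    # messages.append(message)                    # commented out
    # self.history[conversation_id] = messages    # commented out

    return message
```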

@jekalmin would you like me to create a Works with Functionary branch on the repo?

NOTE: I also had to slightly modify Functionary's validation logic. You can see that in the ticket here: MeetKai/functionary#136

@FutureProofHomes
Author

FutureProofHomes commented Mar 29, 2024

@jekalmin it would be really awesome if you could get this up and running on your local machine and then help figure out how to get history working. You could then claim your extension works with a fully local LLM. It can control lights, locks, run scripts, and report the status of the home. If we fine-tune this model with more HA data, it could be as powerful as OpenAI's models.

[screenshot: example conversation with the local model controlling devices]

@thisIsLoading

@FutureProofHomes Impressive. What would you need history for then? Isn't it constrained by the small context size anyway?

@FutureProofHomes
Author

FutureProofHomes commented Mar 29, 2024

My current context threshold is set to 13000 with this model, which should be plenty to handle normal conversation history, I think. Here's an example of how it behaves currently without history. In this case the model doesn't know what I mean when I say "turn them on," because it has no historical context.

[screenshot: conversation where "turn them on" fails without history]

@FutureProofHomes
Author

In case anyone is interested:
[screenshot]

@thisIsLoading

Ah, got it. Yeah, being able to refer to at least the last couple of messages is important.

Re: your context, the model itself only supports an 8k context size:

[screenshot: model card showing 8k context length]

It should still be plenty to remember the last few messages, though. Agreed.

@FutureProofHomes
Author

Good catch. Will set my Extended OpenAI Conversation config to 8k. Thanks.

@FutureProofHomes
Author

FYI - I believe this is the root cause of our issue. I'm fine with closing this issue if we think we should.

@jekalmin
Owner

jekalmin commented Apr 3, 2024

Thanks @FutureProofHomes for the work done!
Please give me some time to catch up to where you are.

BramNH added a commit to BramNH/extended_openai_conversation that referenced this issue Apr 5, 2024
…h the Functionary LLM. Source: fixed by FutureProofHomes in jekalmin#174 (comment)
@Anto79-ops

Hello, I wanted to mention I'm also interested in this. I'm using LocalAI, and this model works most of the time as-is with this integration:

https://huggingface.co/macadeliccc/laser-dolphin-mixtral-2x7b-dpo

I'm trying to get the new Llama-3-Instruct 70B to work with this integration. It does not seem to work well, because I think the function-calling syntax is a bit different. Does anyone know how to modify this model.yaml to work with this integration's functions?

name: llama3-8b-instruct
mmap: true
parameters:
  model: huggingface://second-state/Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct-Q5_K_M.gguf

template:
  chat_message: |
    <|start_header_id|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}<|end_header_id|>

    {{ if .FunctionCall -}}
    Function call:
    {{ else if eq .RoleName "tool" -}}
    Function response:
    {{ end -}}
    {{ if .Content -}}
    {{.Content -}}
    {{ else if .FunctionCall -}}
    {{ toJson .FunctionCall -}}
    {{ end -}}
    <|eot_id|>
  function: |
    <|start_header_id|>system<|end_header_id|>

    You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
    <tools>
    {{range .Functions}}
    {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
    {{end}}
    </tools>
    Use the following pydantic model json schema for each tool call you will make:
    {'title': 'FunctionCall', 'type': 'object', 'properties': {'arguments': {'title': 'Arguments', 'type': 'object'}, 'name': {'title': 'Name', 'type': 'string'}}, 'required': ['arguments', 'name']}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
    Function call:
  chat: |
    <|begin_of_text|>{{.Input }}
    <|start_header_id|>assistant<|end_header_id|>
  completion: |
    {{.Input}}
context_size: 8192
f16: true
stopwords:
- <|im_end|>
- <dummy32000>
- "<|eot_id|>"
usage: |
      curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
          "model": "llama3-8b-instruct",
          "messages": [{"role": "user", "content": "How are you doing?", "temperature": 0.1}]
      }'
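Not an answer to the template question, but a quick way to check whether a given model.yaml actually produces OpenAI-style tool calls is to hit the LocalAI endpoint directly with a tools payload. A minimal sketch using the openai Python client; the base_url, api_key, and tool definition are assumptions for testing, matching the config above:

```python
# Minimal sketch: probe a LocalAI model for OpenAI-style tool calling.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-local")  # assumed endpoint

response = client.chat.completions.create(
    model="llama3-8b-instruct",
    messages=[{"role": "user", "content": "What is the weather in New York?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }],
    tool_choice="auto",
)

# If the template works, finish_reason should be "tool_calls" and the
# function name should come back without stray whitespace.
print(response.choices[0].finish_reason)
print(response.choices[0].message.tool_calls)
```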
