
Updates to dev UI #300

Open
gorkem opened this issue May 7, 2024 · 9 comments
Assignees
Labels
CLI (Topics related to the CLI *Salmon not pink), ux (Anything in the field of UX)

Comments

@gorkem
Contributor

gorkem commented May 7, 2024

Describe the problem you're trying to solve
The Dev UI should do more to help application developers integrate with the model.

Describe the solution you'd like

  • Generate/show example code as the parameters and prompts are entered (a rough sketch follows this list).
  • A way to see the JSON requests and responses exchanged with the server.
  • Hide or drop items in the preferences list to highlight the more frequently used ones.
  • Ability to see the README.md from the ModelKit, if it has one.
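
As a rough sketch of the first item, the generated snippet might look something like this, assuming the llama.cpp /completion endpoint discussed later in this thread; the URL and parameter values are illustrative only:

    // Hypothetical output of the "generate example code" feature;
    // the endpoint and parameter values would mirror what the user
    // entered in the Dev UI.
    const response = await fetch("http://localhost:8080/completion", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        prompt: "Tell me a joke", // prompt entered in the UI
        temperature: 0.8,         // parameters set in the UI
        n_predict: 128,
      }),
    });
    const data = await response.json();
    console.log(data.content);    // llama.cpp returns the generated text in `content`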
@gorkem gorkem added the enhancement New feature or request label May 7, 2024
@bmicklea bmicklea added ux Anything in the field of UX and removed enhancement New feature or request labels May 13, 2024
@annigro

annigro commented May 16, 2024

UX Design

@gorkem
Contributor Author

gorkem commented May 21, 2024

Here is some more clarification on the requests.

POST /completion is an API specific to the llama.cpp server. There is example code available for its usage, which we will adapt for the code generation feature.

POST /v1/chat/completions The chat completion endpoint is compatible with the OpenAI endpoint. We can use existing OpenAPI libraries for code generation.
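
As one possible illustration of that compatibility (an assumption on my part: this uses the official OpenAI Node SDK pointed at the local server; the base URL and model name are placeholders):

    // Sketch only: the OpenAI SDK talking to an OpenAI-compatible
    // local endpoint. Base URL, API key, and model are placeholders.
    import OpenAI from "openai";

    const client = new OpenAI({
      baseURL: "http://localhost:8080/v1", // assumed local server address
      apiKey: "not-needed-locally",        // SDK requires a value even if unused
    });

    const completion = await client.chat.completions.create({
      model: "local-model", // placeholder; the server serves its loaded model
      messages: [{ role: "user", content: "Hello!" }],
    });
    console.log(completion.choices[0].message.content);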

Options

The initial thought was to reduce the options to match OpenAI's. However, on further thought, and considering that the options are supported by both endpoints, we should continue to offer all options but categorize them better.

Categories

Text Generation Controls

  • temperature: Controls randomness.
  • top_k: Limits token selection to the most probable tokens.
  • top_p: Limits token selection by cumulative probability.
  • min_p: Sets a minimum probability threshold.
  • n_predict: Maximum number of tokens to predict. (This could be a category of its own.)

Sampling and Diversity

  • tfs_z: Tail-free sampling to control diversity.
  • typical_p: Locally typical sampling for natural text.
  • presence_penalty, frequency_penalty: Adjust likelihood based on token presence and frequency.
  • mirostat, mirostat_tau, mirostat_eta: Control complexity and randomness.
  • repeat_penalty, repeat_last_n: Control repetition.

Advanced Settings and Customization

  • grammar, json_schema: Enable grammar-based sampling, optionally constrained by a JSON schema.
  • api_key

Probability and Statistical Controls

  • n_probs: Provides probabilities for the top N tokens.
  • min_keep: Ensures a minimum number of tokens are returned.
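
Taken together, a single /completion request body touching each category might look like the following; the values are illustrative placeholders, not recommended defaults:

    // Illustrative /completion payload grouped by the categories above;
    // all values are placeholders.
    const body = {
      prompt: "Once upon a time",
      // Text generation controls
      temperature: 0.7,
      top_k: 40,
      top_p: 0.9,
      min_p: 0.05,
      n_predict: 256,
      // Sampling and diversity
      typical_p: 1.0,
      presence_penalty: 0.0,
      frequency_penalty: 0.0,
      repeat_penalty: 1.1,
      repeat_last_n: 64,
      // Probability and statistical controls
      n_probs: 0,
    };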

Single or Multiple Page

I have not found a good reason to keep the multi-page layout. Going back and forth between pages to change parameters is unintuitive. I suggest we do a single-page implementation.

@annigro

annigro commented May 22, 2024

We talked about how the generated code for chat mode can get really long and therefore messy. @gorkem I assume the first two lines are your solution. How would it look in the UI?

@javisperez
Contributor

@annigro no, Gorkem's first two lines are related to internal code usage, and should be transparent to the UI apart from one line. We talked about doing something like this for it:

message: [{ actor: 'user', content: 'foo bar fooz' }]

and when it is too long:

message: [{ actor: 'user', content: 'foo bar ...' }] <-- ellipsis for the message's content only, and only if it is too long.

but clicking on "copy code" would always copy the whole thing, regardless of the ellipsis.
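
A minimal sketch of that display-versus-copy split (the helper name and cutoff are hypothetical, not actual Dev UI code):

    // Hypothetical helper: truncate message content for display only;
    // the untruncated text is kept for the "copy code" action.
    const MAX_DISPLAY_LENGTH = 80; // assumed cutoff, not a decided value

    function displayContent(content: string): string {
      return content.length > MAX_DISPLAY_LENGTH
        ? content.slice(0, MAX_DISPLAY_LENGTH) + "..."
        : content;
    }

    // The rendered snippet would show displayContent(message.content),
    // while the copy button serializes the original messages array.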

@javisperez
Contributor

@gorkem I gave the OpenAI API a try, but it looks like both the payload and the response are different from the ones we already have in llama.cpp (which makes sense). I vote to keep using the llama.cpp completion endpoint instead of redoing everything, including the API layer, just to support the OpenAI v1 endpoints. Thoughts?
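
For context, the two request/response shapes differ roughly like this (a simplified sketch, not the full schemas):

    // Simplified comparison; field lists are not exhaustive.
    // llama.cpp /completion: prompt string in, content string out.
    const llamaRequest = { prompt: "Hello", n_predict: 64 };
    // response: { content: "...", stop: true, ... }

    // OpenAI-style /v1/chat/completions: messages array in, choices array out.
    const openaiRequest = {
      model: "local-model", // placeholder name
      messages: [{ role: "user", content: "Hello" }],
    };
    // response: { choices: [{ message: { role: "assistant", content: "..." } }], ... }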

@annigro

annigro commented Jun 12, 2024

@gorkem
Contributor Author

gorkem commented Jun 18, 2024

I did a pass and left a few comments.

@annigro

annigro commented Jul 2, 2024

@bmicklea bmicklea added the CLI Topics related to the CLI *Salmon not pink label Aug 6, 2024