I connected localGPT to my corpus-building system to train LLMs #481
clearsitedesigns started this conversation in Show and tell
I designed a system that builds a data store on any topic. I ingest the corpus with my custom overlap embedding function. Then I use localGPT to run a series of chained commands, using the model as a stand-in in between. The model is tuned to respond (I adjust several hyperparameters for this) to a series of questions that I pose against the content in the chain, and it outputs the data as a "new training" source for an LLM, in the form of question-and-answer pairs. A secondary config adds a Mad Libs-style question-and-answer path to vary how the questions are asked. From there, I save the output as question-and-answer instruct pairs, and a series of validation checks helps determine where hallucinations occur. A rough sketch of the flow follows.
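To make the flow a bit more concrete, here is a minimal sketch of the pair-generation path. Everything in it is illustrative: `ask()` stands in for whatever wraps the localGPT chain call, and the chunk sizes, templates, and file name are placeholders rather than my actual settings.

```python
import json
import random

def overlap_chunks(text, size=1000, overlap=200):
    """Split text into fixed-size windows that overlap, so facts near a
    boundary land in at least two chunks (stand-in for my overlap embedding prep)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# Mad Libs-style templates: the same topic gets asked several different ways.
TEMPLATES = [
    "What does the passage say about {topic}?",
    "Summarize how {topic} is treated in this text.",
    "Explain {topic} here as if teaching a beginner.",
    "List the key claims this passage makes about {topic}.",
]

def make_question(topic):
    """Pick a random template so question phrasing varies across cycles."""
    return random.choice(TEMPLATES).format(topic=topic)

def build_pairs(chunks, topics, ask, n_cycles=2000, out_path="qa_pairs.jsonl"):
    """Run the chain for n_cycles and write instruct-style Q/A pairs as JSONL.
    `ask(question, context)` is a placeholder for the localGPT chain call."""
    with open(out_path, "w") as f:
        for _ in range(n_cycles):
            chunk = random.choice(chunks)
            question = make_question(random.choice(topics))
            answer = ask(question, chunk)
            f.write(json.dumps({
                "instruction": question,
                "input": chunk,   # keep the source chunk for validation later
                "output": answer,
            }) + "\n")
```

Each record follows the common instruction/input/output convention, so the resulting JSONL can feed most fine-tuning scripts directly.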
I thought I would share a little information about this. The llama.cpp timings below are just a sample; the system runs in the background for 2,000 cycles (the typical amount it takes to adjust a LoRA adapter).
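For reference, the `ask()` placeholder above could be backed by llama-cpp-python, which is also what prints the `llama_print_timings` blocks you see below. The model path and sampling parameters here are illustrative, not my exact configuration:

```python
from llama_cpp import Llama

# verbose=True makes llama.cpp print the timing blocks shown in the sample output
llm = Llama(model_path="models/your-model.gguf", n_ctx=2048, verbose=True)

def ask(question, context):
    """Minimal prompt assembly; the real flow runs through localGPT's chain."""
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    out = llm(prompt, max_tokens=512, temperature=0.7)
    return out["choices"][0]["text"].strip()
```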
```
llama_print_timings: load time = 2014.40 ms
llama_print_timings: sample time = 388.07 ms / 551 runs ( 0.70 ms per token, 1419.83 tokens per second)
llama_print_timings: prompt eval time = 42211.96 ms / 989 tokens ( 42.68 ms per token, 23.43 tokens per second)
llama_print_timings: eval time = 32695.94 ms / 550 runs ( 59.45 ms per token, 16.82 tokens per second)
llama_print_timings: total time = 76212.72 ms
Length of raw answer in tokens: 348
Length of query in tokens: 5
Llama.generate: prefix-match hit
llama_print_timings: load time = 2014.40 ms
llama_print_timings: sample time = 12.78 ms / 18 runs ( 0.71 ms per token, 1408.56 tokens per second)
llama_print_timings: prompt eval time = 43205.15 ms / 1000 tokens ( 43.21 ms per token, 23.15 tokens per second)
llama_print_timings: eval time = 1055.25 ms / 18 runs ( 58.63 ms per token, 17.06 tokens per second)
llama_print_timings: total time = 44436.19 ms
Length of raw answer in tokens: 5
llama_print_timings: load time = 2014.40 ms
llama_print_timings: sample time = 356.10 ms / 501 runs ( 0.71 ms per token, 1406.93 tokens per second)
llama_print_timings: prompt eval time = 33024.80 ms / 778 tokens ( 42.45 ms per token, 23.56 tokens per second)
llama_print_timings: eval time = 29268.26 ms / 500 runs ( 58.54 ms per token, 17.08 tokens per second)
llama_print_timings: total time = 63490.88 ms
Length of raw answer in tokens: 318
```
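As for the validation checks: one cheap heuristic is to measure how much of an answer's vocabulary actually appears in the source chunk it was generated from, and flag low-overlap answers for review. This is only a sketch of the idea (the threshold and stopword list are placeholders), not the full validation stack:

```python
import re

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it", "that", "this"}

def grounding_score(answer, source_chunk):
    """Fraction of the answer's content words found in the source chunk.
    A low score suggests the answer may be hallucinated rather than grounded."""
    words = [w for w in re.findall(r"[a-z']+", answer.lower()) if w not in STOPWORDS]
    if not words:
        return 0.0
    source = set(re.findall(r"[a-z']+", source_chunk.lower()))
    return sum(w in source for w in words) / len(words)

def flag_hallucinations(pairs, threshold=0.5):
    """Yield Q/A records whose answers share too little vocabulary with their source."""
    for pair in pairs:
        if grounding_score(pair["output"], pair["input"]) < threshold:
            yield pair
```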