fix up the verification agent
alnutile committed May 7, 2024
1 parent ed206d8 commit 0a783f6
Showing 7 changed files with 626 additions and 37 deletions.
11 changes: 10 additions & 1 deletion README.md
@@ -115,4 +115,13 @@ Per the Laravel docs https://laravel.com/docs/11.x/reverb
## Modules
We are using this package: https://github.com/nWidart/laravel-modules

Make sure to register your module in `bootstrap/providers.php`


## Ollama Notes

Make sure to run
```bash
launchctl setenv OLLAMA_NUM_PARALLEL 3
```
and restart the service so it can handle multiple requests at a time.
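
If Ollama is installed through Homebrew, restarting its background service so the new setting is picked up might look like this (a sketch; assumes the standard Homebrew `ollama` service):

```bash
# Restart the launchd-managed Ollama service so OLLAMA_NUM_PARALLEL takes effect
brew services restart ollama
```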
32 changes: 9 additions & 23 deletions app/Domains/Agents/VerifyResponseAgent.php
@@ -18,43 +18,29 @@ public function verify(VerifyPromptInputDto $input): VerifyPromptOutputDto
$verifyPrompt = $input->verifyPrompt;

$prompt = <<<EOT
As a data verification assistant please review the following and return
a response that cleans up the original "LLM RESPONSE" included below.
What is key for you to do is that this is a RAG system so if the original "LLM RESPONSE" does not
line up with the data in the "CONTEXT" then remove any questionable text and
numbers. See VERIFY PROMPT for any additional information. The output here
will go directly to the user in a chat window so please reply accordingly.
Your Response will not include anything about the verification process; you are just a proxy to the original LLM RESPONSE.
Your Response will be just that, cleaned up for chat.
DO NOT include text like "Here is the cleaned-up response"; the user should not even know your step happened :)
Your response will NOT be a list like below but will just follow the formatting of the "LLM RESPONSE".
As a Data Integrity Officer, please review the following and return only what remains after you clean it up.
DO NOT include text like "Here is the cleaned-up response"; the user should not even know your step happened in the process.
DO NOT use any information outside of this context.
### Included are the following sections
- ORIGINAL PROMPT: The question from the user
- CONTEXT:
- LLM RESPONSE: The response from the LLM system using the original prompt and context
- VERIFY PROMPT: The prompt added to help clear up the required output.
Just return the text as if answering the initial user's prompt "ORIGINAL PROMPT".
Using the CONTEXT, make sure the LLM RESPONSE is accurate and just clean it up if not.
### START ORIGINAL PROMPT
{$originalPrompt}
$originalPrompt
### END ORIGINAL PROMPT
### START CONTEXT
{$context}
$context
### END CONTEXT
### START LLM RESPONSE
{$llmResponse}
$llmResponse
### END LLM RESPONSE
### START VERIFY PROMPT
{$verifyPrompt}
### END VERIFY PROMPT
EOT;

//put_fixture("verified_prompt_not_working.txt", $prompt, false);

Log::info('[LaraChain] VerifyResponseAgent::verify', [
'prompt' => $prompt,
]);
2 changes: 1 addition & 1 deletion config/horizon.php
@@ -211,7 +211,7 @@
'queue' => ['ollama'],
'balance' => 'auto',
'autoScalingStrategy' => 'time',
'maxProcesses' => 1,
'maxProcesses' => env('OLLAMA_NUM_PARALLEL', 1),
'maxTime' => 0,
'maxJobs' => 0,
'memory' => 256,
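
Since `maxProcesses` now reads `env('OLLAMA_NUM_PARALLEL', 1)`, the same variable can also be set in the application's `.env` file so Horizon's Ollama workers scale to match (a sketch; the value 3 mirrors the README note above):

```bash
# .env: match the value exported to the Ollama service
OLLAMA_NUM_PARALLEL=3
```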
2 changes: 1 addition & 1 deletion package-lock.json

Some generated files are not rendered by default.

16 changes: 14 additions & 2 deletions tests/fixtures/messages_llama3.json

Large diffs are not rendered by default.

18 changes: 9 additions & 9 deletions tests/fixtures/ollama_chat_results.json
@@ -1,15 +1,15 @@
{
"model": "phi3",
"created_at": "2024-05-06T10:01:23.675322Z",
"model": "llama3",
"created_at": "2024-05-07T15:37:02.999098Z",
"message": {
"role": "assistant",
"content": " ###\n\nWhat are real-world applications of Retrieval Augmented Generation?\n===\nRetrieval Augmented Generation (RAG) has a wide array of real-world applications across various domains, significantly enhancing the capabilities of AI systems in producing more accurate and contextually relevant outputs. Here are some prominent use cases:\n\n1. **Question Answering Systems**: RAG powers advanced Q&A platforms that efficiently retrieve information from extensive databases to generate coherent answers. This is particularly useful for educational platforms, customer service bots, and knowledge dissemination in research fields like medicine, law, and science.\n\n2. **Data-to-Text Generation**: In industries where generating insights from data reports or visualizations (like charts, graphs) is crucial, RAG models can extract relevant information across multiple sources to create comprehensive business intelligence documents. This helps in summarizing complex datasets into actionable insights for decision-mayer companies and researchers.\n\n3. **Multimedia Understanding**: Beyond textual data, RAG's ability to integrate multimodal inputs (images, videos, audio) enables it to understand context better. For example, in e-commerce, a chatbot can retrieve product descriptions or reviews from various sources when users query about specific items, enhancing the shopping experience with richer information.\n\n4. **Personal Assistants**: In virtual assistants and smart personal devices, RAG models contribute to delivering more accurate responses by pulling relevant content from a vast array of data sources, improving user interaction and satisfaction.\n\n5. **Language Generation for Content Creation**: News agencies and content creators utilize RAG models to generate high-quality articles or reports, ensuring the content is not only engaging but also factually accurate by leveraging external information databases.\n\n6. **Educational Tools**: In educational technology, RAG can be employed in adaptive learning platforms that provide students with personalized explanations and answers based on their queries and the retrieved relevant information from diverse textual sources.\n\n7. **Legal and Compliance Fields**: Law firms and regulatory bodies use RAG to quickly access legal precedents, statutes, or case laws to support arguments in court cases or compliance documentation, enhancing the accuracy of legal documents and analyses.\n\n8. **Healthcare Information Retrieval**: Health professionals can utilize RAG models to extract critical patient data from various medical records and databases for accurate diagnoses, treatment plans, and research on diseases by integrating information across disparate sources.\n\nThese applications demonstrate the versatility of RAG in addressing complex challenges that require not just generating natural language responses but also ensuring those are grounded in relevant facts and contextual understanding from a multitude of data sources. As technology evolves, we can expect even more innovative uses of RAG across industries seeking to enhance the accuracy, efficiency, and depth of AI-driven solutions."
"content": "You're referring to Alpha, a measure of internal consistency developed by Lee J. Cronbach!\n\nCronbach's Alpha is a statistical measure that calculates the reliability of a test or scale based on its internal structure. The estimate varies depending on several factors, including:\n\n1. **Number of items**: More items generally lead to higher Alpha values.\n2. **Item difficulty**: Items with similar difficulties tend to increase Alpha.\n3. **Item overlap**: Overlapping item content can boost Alpha.\n4. **Test length**: Longer tests often have lower Alpha values due to increased measurement error.\n\nHere are some general guidelines for interpreting Cronbach's Alpha estimates:\n\n* **Alpha > 0.9**: Very high reliability; the test or scale is highly consistent and reliable.\n* **Alpha = 0.8-0.89**: High reliability; the test or scale is generally consistent, but with some minor inconsistencies.\n* **Alpha = 0.7-0.79**: Moderate reliability; the test or scale has some inconsistencies, but still provides a reasonable measure of what it's supposed to measure.\n* **Alpha = 0.6-0.69**: Low reliability; the test or scale is prone to errors and inconsistent measurements.\n* **Alpha < 0.6**: Very low reliability; the test or scale is unreliable and may not be a good measure of what it's supposed to measure.\n\nKeep in mind that these are general guidelines, and the interpretation of Alpha values depends on the specific context and research question being investigated. It's always a good idea to consult with a statistician or researcher familiar with your field to better understand how to interpret Cronbach's Alpha estimates for your particular study."
},
"done": true,
"total_duration": 27623276250,
"load_duration": 3146162458,
"prompt_eval_count": 1928,
"prompt_eval_duration": 4706168000,
"eval_count": 652,
"eval_duration": 19762069000
"total_duration": 14696055042,
"load_duration": 1058272083,
"prompt_eval_count": 19,
"prompt_eval_duration": 118677000,
"eval_count": 344,
"eval_duration": 13448675000
}
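
For reference, a fixture like this matches the JSON that Ollama's chat endpoint returns when streaming is disabled; a minimal request producing such a response might look like the following (a sketch; assumes Ollama's default local host and port):

```bash
# Single, non-streamed chat completion from a local Ollama instance (default port 11434)
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    {"role": "user", "content": "How should I interpret Cronbach Alpha estimates?"}
  ],
  "stream": false
}'
```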