AISI Inspect integration #6

jacobthebanana · 2024-06-24T22:12:53Z

PR Type

Feature

Short Description

Integrate AISI Inspect for RAG and for RAGAS Evaluation.

To run these examples, install inspect-ai and run:

cd veval/
inspect eval run_inspect_rag_solver.py

xeon27 · 2024-06-25T16:58:02Z

veval/systems/template.py

@@ -102,3 +111,70 @@ def get_cfg(self):
        if self._cfg is None:
            raise ValueError("System config not set.")
        return self._cfg.as_dict()
+
+    def get_inspect_tool(


What's the purpose of implementing both get_inspect_tool and get_inspect_solver? Do we need both, or is one of them sufficient?

xeon27 · 2024-06-25T16:59:48Z

veval/systems/template.py

+            async def solve(state: TaskState, generate: Generate) -> TaskState:
+                query = state.user_prompt.text
+                async with concurrency("document_search", max_concurrency):
+                    response = self.invoke(query, documents)


The response here already contains both the retrieved context and the generated answer. Why are we passing it through the chain_of_thought(), generate(), and self_critique() pipeline again?
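To make the concern concrete, here is a minimal sketch of a solver step that records the RAG system's answer and context directly, with no further generation afterwards. It uses plain asyncio stand-ins rather than Inspect's actual classes: `RagResponse`, `SketchState`, the toy `invoke`, and `solve_direct` are all hypothetical names, and an `asyncio.Semaphore` is assumed here as a rough substitute for the `concurrency(...)` context manager in the diff.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class RagResponse:
    """Hypothetical stand-in for what the system's invoke() returns."""
    context: list[str]
    answer: str


@dataclass
class SketchState:
    """Hypothetical stand-in for Inspect's TaskState."""
    query: str
    completion: str = ""
    metadata: dict = field(default_factory=dict)


def invoke(query: str, documents: list[str]) -> RagResponse:
    # Toy retrieval: keep documents that share at least one word with the query.
    words = set(query.lower().split())
    context = [d for d in documents if words & set(d.lower().split())]
    answer = f"Answer based on {len(context)} document(s)."
    return RagResponse(context=context, answer=answer)


async def solve_direct(
    state: SketchState, documents: list[str], limiter: asyncio.Semaphore
) -> SketchState:
    # Bound concurrent retrieval calls (assumed analogue of concurrency(...)).
    async with limiter:
        response = invoke(state.query, documents)
    # Store the answer and raw context as-is; no extra generate() pass follows.
    state.completion = response.answer
    state.metadata["context"] = response.context
    return state


def run_sketch() -> SketchState:
    docs = ["Inspect is an evaluation framework", "RAGAS scores RAG pipelines"]
    state = SketchState(query="what is inspect")

    async def main() -> SketchState:
        limiter = asyncio.Semaphore(2)
        return await solve_direct(state, docs, limiter)

    return asyncio.run(main())
```

With this shape, the scorer would read the answer and context straight from the state, which is roughly what dropping chain_of_thought() and self_critique() from the plan amounts to.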

xeon27 · 2024-06-25T17:38:46Z

veval/metrics/template.py

+                metric_function=inspect_metric_fn,
+            )()
+            for ragas_feature_name in RAGAS_FEATURE_NAMES


The current implementation calculates all available metrics irrespective of the task. Modify this to compute only the metrics specified in the task config (YAML file).
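One possible shape for that restriction, as a sketch only. The contents of `RAGAS_FEATURE_NAMES` below and the `metrics` key in the task config are assumptions for illustration (the actual values in this repository may differ), and `select_metric_names` is a hypothetical helper, not an existing function.

```python
from typing import Iterable, Optional

# Assumed example values; the real list lives in veval/metrics/template.py.
RAGAS_FEATURE_NAMES = [
    "faithfulness",
    "answer_relevancy",
    "context_precision",
    "context_recall",
]


def select_metric_names(
    available: Iterable[str], requested: Optional[Iterable[str]] = None
) -> list[str]:
    """Return the subset of `available` named in `requested`.

    If `requested` is None, fall back to every available metric.
    Unknown names raise instead of being silently dropped.
    """
    available = list(available)
    if requested is None:
        return available
    requested = list(requested)
    unknown = [name for name in requested if name not in available]
    if unknown:
        raise ValueError(f"Unknown metrics in task config: {unknown}")
    requested_set = set(requested)
    # Keep the ordering of `available` so scorer order stays stable.
    return [name for name in available if name in requested_set]


# Hypothetical task config, as it might look after loading the YAML file.
task_cfg = {"metrics": ["context_recall", "faithfulness"]}
selected = select_metric_names(RAGAS_FEATURE_NAMES, task_cfg.get("metrics"))
```

The list comprehension in the diff would then iterate over `selected` instead of `RAGAS_FEATURE_NAMES`.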

AISI Inpsect Integration: Added baseline RAG via Solver and via Tool use. Fixed embedding dimensionality mismatch by specifying embed model name.

8871847

jacobthebanana changed the base branch from main to develop June 24, 2024 22:13

jacobthebanana added 6 commits June 24, 2024 19:44

AISI Inpsect Integration [WIP]: Implemented metrics via RAGAS, reduction is not yet ready.

a720117

Basic RAG: Revised implementation of loading embeddings from cache.

c270e6c

AISI Inpsect Integration: Implemented reduction of RAGAS metrics.

d3256f0

AISI Inpsect Integration: Replaced summarized response with raw context in solver. Revised nan handling in reducer.

605d893

AISI Inpsect Integration: Added option to evaluate only on RAGAS metrics selected in yaml config.

d387350

AISI Inpsect Integration: Eliminated chain of thought and self-critique from Inspect plan.

b94268b

xeon27 approved these changes Jul 5, 2024

xeon27 marked this pull request as ready for review July 5, 2024 16:31

xeon27 merged commit 032ed7f into develop Jul 5, 2024

xeon27 deleted the aisi-inspect-integration branch July 5, 2024 16:32
