Update question router evaluation rake task to return the answer message #608

exfalsoquodlibet · 2025-10-14T07:39:48Z

This is so that the downstream evaluation pipeline works as expected, as per alphagov/govuk-chat-evaluation#73.

…o also return the answer message The evaluation:generate_question_routing_response task now returns the answer message in its output. This is needed for downstream evaluation of the question router behaviour.

…ned answer Adjusted the specs for evaluation:generate_question_routing_response to verify that the 'answer' field is returned correctly. 'answer' is nil for genuine_rag classification and includes the message for other classifications. This ensures tests match the updated task behavior.
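For readers following along, here is a minimal sketch of the output shape the commit messages above describe. It is not the task's actual implementation: the `Answer` struct and the `routing_response_json` helper are hypothetical stand-ins, and only the three JSON keys (`classification`, `confidence_score`, `answer`) are taken from the specs in this PR.

```ruby
require "json"

# Hypothetical stand-in for the application's answer model. Only the
# attributes mentioned in this PR are included.
Answer = Struct.new(
  :question_routing_label,
  :question_routing_confidence_score,
  :message,
  keyword_init: true,
)

# Hypothetical helper (not from the PR): builds the JSON string that the
# task is described as printing, including the answer message.
def routing_response_json(answer)
  {
    classification: answer.question_routing_label,
    confidence_score: answer.question_routing_confidence_score,
    answer: answer.message,
  }.to_json
end

# A genuine_rag answer has no message, so "answer" serialises as null.
rag = Answer.new(
  question_routing_label: "genuine_rag",
  question_routing_confidence_score: 0.2,
  message: nil,
)
puts routing_response_json(rag)

# Other classifications carry the message through.
unclear = Answer.new(
  question_routing_label: "unclear_intent",
  question_routing_confidence_score: 0.2,
  message: "Sorry, can you say that again?",
)
puts routing_response_json(unclear)
```

Under these assumptions the `answer` key is always present, and downstream consumers would see `null` rather than a missing key for `genuine_rag`.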

kevindew

Amazing work on the Ruby!

Just a couple of changes for long lines (and I think we can drop a test)

kevindew · 2025-10-14T07:53:42Z

spec/lib/tasks/evaluation_spec.rb

      end
    end

+    it "the answer message is null when classification is genuine_rag" do


I don't think you need this test as we're not actually testing a conditional in the code.

But a tip for the future: in these RSpec tests, use the it method so that the description forms a sentence. For this one it would probably be: it "has a nil answer for a genuine_rag classification"

kevindew · 2025-10-14T07:55:19Z

spec/lib/tasks/evaluation_spec.rb

    it "outputs the response as JSON to stdout" do
      ClimateControl.modify(INPUT: input) do
-        answer = build(:answer, question_routing_label: "genuine_rag", question_routing_confidence_score: 0.2)
+        answer = build(:answer, question_routing_label: "unclear_intent", question_routing_confidence_score: 0.2, message: "Sorry, can you say that again?")


Suggested change

-        answer = build(:answer, question_routing_label: "unclear_intent", question_routing_confidence_score: 0.2, message: "Sorry, can you say that again?")
+        answer = build(:answer,
+                       question_routing_label: "unclear_intent",
+                       question_routing_confidence_score: 0.2,
+                       message: "Sorry, can you say that again?")

Just because this line has got a little long

kevindew · 2025-10-14T08:12:34Z

spec/lib/tasks/evaluation_spec.rb

        expect { Rake::Task[task_name].invoke("openai") }
-          .to output("{\"classification\":\"genuine_rag\",\"confidence_score\":0.2}\n").to_stdout
+          .to output("{\"classification\":\"unclear_intent\",\"confidence_score\":0.2,\"answer\":\"Sorry, can you say that again?\"}\n").to_stdout


Suggested change

-        expect { Rake::Task[task_name].invoke("openai") }
-          .to output("{\"classification\":\"genuine_rag\",\"confidence_score\":0.2}\n").to_stdout
-          .to output("{\"classification\":\"unclear_intent\",\"confidence_score\":0.2,\"answer\":\"Sorry, can you say that again?\"}\n").to_stdout
+        expected_output = {
+          classification: "unclear_intent",
+          confidence_score: 0.2,
+          answer: "Sorry, can you say that again?",
+        }
+        expect { Rake::Task[task_name].invoke("openai") }
+          .to output("#{expected_output.to_json}\n").to_stdout

Sorry, this suggestion hasn't quite worked out properly because you only changed one line, but I think the line had got a bit long, so it's best to pull the data out into a hash and convert it to JSON.
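As a quick sanity check on that refactor, the sketch below compares the hand-escaped string from the original spec line with the string built from a hash via `to_json`. It relies only on Ruby's standard `json` library; the variable names `escaped_literal` and `built_from_hash` are illustrative and not from the PR.

```ruby
require "json"

# The expected data as a plain hash, as in the suggestion above.
expected_output = {
  classification: "unclear_intent",
  confidence_score: 0.2,
  answer: "Sorry, can you say that again?",
}

# The hand-escaped literal from the original spec line.
escaped_literal = "{\"classification\":\"unclear_intent\",\"confidence_score\":0.2,\"answer\":\"Sorry, can you say that again?\"}\n"

# Hash#to_json emits compact JSON (no spaces) with keys in insertion order,
# so the generated string matches the literal byte for byte.
built_from_hash = "#{expected_output.to_json}\n"

puts built_from_hash == escaped_literal # prints true
```

The hash form keeps each line short and avoids escaping every quote by hand, which seems to be the motivation for the suggestion.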

…onse Remove unnecessary test and shorten long lines following reviewer's comments

exfalsoquodlibet · 2025-10-14T09:10:13Z

Thanks @kevindew. I think I have implemented the changes you suggested.

kevindew

Super, nice work 😎

exfalsoquodlibet added 2 commits October 14, 2025 08:33

updated the rake task evaluation:generate_question_routing_response t…

c9bd69d

…o also return the answer message The evaluation:generate_question_routing_response task now returns the answer message in its output. This is needed for downstream evaluation of the question router behaviour.

govuk-ci temporarily deployed to govuk-chat-fix-question-olwlmq October 14, 2025 07:42 Inactive

fixed RuboCop linting issues in spec/lib/tasks/evaluation_spec.rb

81bc373

govuk-ci temporarily deployed to govuk-chat-fix-question-olwlmq October 14, 2025 07:55 Inactive

kevindew reviewed Oct 14, 2025

Tidy up tests for rake evaluation task generate_question_routing_resp…

faed0f1

…onse Remove unnecessary test and shorten long lines following reviewer's comments

govuk-ci temporarily deployed to govuk-chat-fix-question-olwlmq October 14, 2025 09:09 Inactive

kevindew reviewed Oct 14, 2025

kevindew approved these changes Oct 14, 2025

exfalsoquodlibet merged commit 81e5b71 into main Oct 14, 2025
12 checks passed

exfalsoquodlibet deleted the fix-question-router-rake-task branch October 14, 2025 09:27
