Skip to content

Conversation

exfalsoquodlibet
Copy link
Contributor

This is so that the downstream evaluation pipeline works as expected as for alphagov/govuk-chat-evaluation#73

…o also return the answer message

The evaluation:generate_question_routing_response task now returns
the answer message in its output. This is needed for downstream
evaluation of the question router behaviour.
…ned answer

Adjusted the specs for evaluation:generate_question_routing_response
to verify that the 'answer' field is returned correctly. 'answer'
is nil for genuine_rag classification and includes the message for
other classifications. This ensures tests match the updated task behavior.
@govuk-ci govuk-ci temporarily deployed to govuk-chat-fix-question-olwlmq October 14, 2025 07:42 Inactive
@govuk-ci govuk-ci temporarily deployed to govuk-chat-fix-question-olwlmq October 14, 2025 07:55 Inactive
Copy link
Member

@kevindew kevindew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work on the Ruby!

Just a couple of changes for long lines (and I think we can drop a test)

end
end

it "the answer message is null when classification is genuine_rag" do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this test as we're not actually testing a conditional in the code.

But a tip for future: For these RSpec tests use the it method to form a sentence. So for this it would probably be it "has a nil answer for a genuine_rag classification"

it "outputs the response as JSON to stdout" do
ClimateControl.modify(INPUT: input) do
answer = build(:answer, question_routing_label: "genuine_rag", question_routing_confidence_score: 0.2)
answer = build(:answer, question_routing_label: "unclear_intent", question_routing_confidence_score: 0.2, message: "Sorry, can you say that again?")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
answer = build(:answer, question_routing_label: "unclear_intent", question_routing_confidence_score: 0.2, message: "Sorry, can you say that again?")
answer = build(:answer,
question_routing_label: "unclear_intent",
question_routing_confidence_score: 0.2,
message: "Sorry, can you say that again?")

Just because this line has got a little long

Comment on lines 332 to 333
expect { Rake::Task[task_name].invoke("openai") }
.to output("{\"classification\":\"genuine_rag\",\"confidence_score\":0.2}\n").to_stdout
.to output("{\"classification\":\"unclear_intent\",\"confidence_score\":0.2,\"answer\":\"Sorry, can you say that again?\"}\n").to_stdout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
expect { Rake::Task[task_name].invoke("openai") }
.to output("{\"classification\":\"genuine_rag\",\"confidence_score\":0.2}\n").to_stdout
.to output("{\"classification\":\"unclear_intent\",\"confidence_score\":0.2,\"answer\":\"Sorry, can you say that again?\"}\n").to_stdout
expected_output = {
classification: "unclear_intent",
confidence_score: 0.2,
answer: "Sorry, can you say that again?",
}
expect { Rake::Task[task_name].invoke("openai") }
.to output("#{expected_output.to_json}\n").to_stdout

Sorry this suggestion hasn't quite worked out properly because you only changed one line, but I think the line had got a bit long so it's best to pull the data out and convert to json

…onse

Remove unnecessary test and shorten long lines following reviewer's comments
@govuk-ci govuk-ci temporarily deployed to govuk-chat-fix-question-olwlmq October 14, 2025 09:09 Inactive
@exfalsoquodlibet
Copy link
Contributor Author

Thanks @kevindew. I think I have implemented the changes you suggested.

Copy link
Member

@kevindew kevindew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super, nice work 😎

@exfalsoquodlibet exfalsoquodlibet merged commit 81e5b71 into main Oct 14, 2025
12 checks passed
@exfalsoquodlibet exfalsoquodlibet deleted the fix-question-router-rake-task branch October 14, 2025 09:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants