-
Notifications
You must be signed in to change notification settings - Fork 227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: allow manual repair edit #261
base: main
Are you sure you want to change the base?
Conversation
WIP because I want to add a couple of tests in the repair API |
@@ -73,6 +76,12 @@ async def post_repair_run( | |||
project_id: str, task_id: str, run_id: str, input: RepairRunPost | |||
) -> TaskRun: | |||
task, run = task_and_run_from_id(project_id, task_id, run_id) | |||
|
|||
# the repair must match the task's output schema if structured output is enabled | |||
if task.output_json_schema: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not found a centralized utility that could be used to do the validation for this. The validation in the adapter does not seem to be usable outside the context of a generation run.
Is there a better existing method to check the object against this task's output schema?
if (!repair_run) { | ||
return | ||
} | ||
repair_run.output.output = repair_output_edited |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way it currently works is that the repair run would still be attributed to whichever model generated the repair output that was repaired. Would you rather want the repair run to be different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we should attribute it to the person making the edit. A standard Human datasource should work!
ohhh, can't wait! Let's pretty print for the text-area if it's JSON? Makes it easier to edit. Basically: |
function validate_output(text: string) { | ||
if (task.output_json_schema) { | ||
try { | ||
// TODO: validate against the JSON schema instead of just checking if it is valid JSON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scosman - I'd want to validate the schema on the client-side here, but the webapp does not have a library for JSON schema validation.
Is there any particular library you would prefer to use for JSON schema validation on the client side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can leave validation to server! We just shouldn't save it without validation, and should make a decent error message there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added validation on the server side, but doing it on the client too would be nice as it allows validating in realtime and doing things like disabling the submit button while invalid.
The current UI for editing the repair requires that you first save to apply the edit to the unsaved repair, and then you submit the repair. If we do server-only, the validation feedback would not be on this same form.
I could change it to make the edit form directly submit to the API (instead of applying to the run without submitting) if you think that makes more sense - then server validation errors could come through on this same form
output: { | ||
...repair_run.output, | ||
output: repair_output_edited, | ||
source: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried source: null
and source: undefined
instead of prefilling like this, but the API validation does not allow that
<div | ||
class="text-xs font-medium text-error flex flex-row mb-2 items-center" | ||
> | ||
<svg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This SVG comes from the Warning
component
{#if task.output_json_schema} | ||
{#if repair_output_edited_valid} | ||
<div class="text-xs font-medium text-gray-500 flex flex-row mb-2"> | ||
<svg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This SVG is the same used elsewhere in the page to show Structure Valid
.
Maybe worth wrapping each SVG into components and giving them names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
totally. I try to do that when I duplicate it.... but as you can tell I'm not religious about it
4737e61
to
364849c
Compare
364849c
to
3c45a75
Compare
3c45a75
to
7e0ba5f
Compare
Added the following changes:
Kapture.2025-03-22.at.13.51.51.mp4 |
This looks great. Few quick comments, but I'll be able to look in more depth next week! |
What does this PR do?
This PR adds support for manually editing a task run repair. After
Attempt repair
, the user has a new button (Edit
) that turns the repair output display into a text area that allows users editing the repair attempt.This PR also fixes the missing validation of the repair output in the repair API. There was a bug that allowed repair outputs that do not match the structured output schema, which might have caused downstream problems due to corrupt shapes.
This is an acceptable solution but it has a major limitation due to requiring the user to first
Attempt (automatic) repair
. In the case of aTaskRun
accessed from theDataset
page, the user does not have the ability to select the model used during the repair - it is by default the same as the one that generated the original run output. If a run used a small model (e.g. the small Llama models), the user may be stuck as the small model may fail to even pass validation duringAttempt repair
, which means the user then never gets access to theEdit
.New

Edit
button:Edit
mode:Error state (if JSON does not match schema):

Related Issues
Manual edit requested: #115
Contributor License Agreement
I, @leonardmq, confirm that I have read and agree to the Contributors License Agreement.
Checklists