[Question] Custom metric type #775

us2547 · 2024-10-30T04:35:59Z

Hi,
Is there a way to define a custom metric that outputs a dictionary or tuple?
I need a metric that groups the Scorer values and reports how many times each value occurs.
The values in my case are strings.
It seems that the only output type a "metric" supports is "float".
Here is an example "metric" that is currently not supported:

from inspect_ai.scorer import Metric, Score, metric

@metric
def item_count() -> Metric:
    """
    Currently not working. Inspect expects a float value for the metric.
    """
    def metric(scores: list[Score]) -> tuple:
        # Count occurrences of each score value, keyed by its string form.
        count_dict: dict[str, int] = {}
        for score in scores:
            score_value = str(score.value)
            count_dict[score_value] = count_dict.get(score_value, 0) + 1
        return tuple(count_dict.items())
    return metric

Thank you.


jjallaire · 2024-10-30T17:32:23Z

cc @dragonstyle

dragonstyle · 2024-10-30T17:36:30Z

What version of Inspect are you using? Metrics in recent versions of Inspect are allowed to return any of the following types:

str | int | float | bool
list[str | int | float | bool]
dict[str, str | int | float | bool | None]

Downstream components should automatically handle metrics that return lists or dicts, hopefully behaving as you'd expect them to.
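As a rough illustration of the dict return type listed above, here is a minimal sketch of the counting logic from the original question. It is written against a stand-in `Score` dataclass (an assumption, so the snippet runs without Inspect installed) that models only the `value` field; in a real task the function body would sit inside an `@metric`-decorated factory.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Union


# Stand-in for inspect_ai.scorer.Score (assumption): only `value` is modeled.
@dataclass
class Score:
    value: Union[str, int, float, bool]


def item_count(scores: list[Score]) -> dict[str, int]:
    """Count how often each score value occurs, keyed by its string form."""
    return dict(Counter(str(score.value) for score in scores))


counts = item_count([Score("a"), Score("b"), Score("a")])
print(counts)  # {'a': 2, 'b': 1}
```

Returning a `dict[str, int]` keeps the result within the dict type listed above, unlike the tuple of pairs in the original example.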

us2547 · 2024-10-31T13:56:35Z

I'm using the latest Inspect, version inspect_ai==0.3.42. I have a custom scorer that returns the following Score (example):

    # Debug
    scorer_value = "test-value"
    return_score = Score(
        value=scorer_value,
        answer=state.output.completion,
        explanation=scorer_explanation,
        metadata=metadata,
    )
    return return_score

The "value" is properly reflected as "test-value" in the JSON log and in inspect view. However, if I use a custom metric, by the time Score.value reaches it the value has been converted to a float and shows as 0.0. Is there a way to pass string values to a custom metric for processing?

dragonstyle · 2024-10-31T14:28:42Z

I can confirm that I'm seeing the same issue (the score arrives with a float value in the metric). I am investigating now, but I suspect the issue is related to how we've implemented sample reducing (and it not dealing very nicely with strings, by default). I'll respond here once I've buttoned this down - sorry for the issue and thanks for reporting it.

If a user produces a score whose value is a string, that value is coerced to a float when it is "reduced" using the default mean reducer. For strings, this means that when the Score arrives at the custom metric it carries the reduced value, which has already been coerced to a float. This fix is minimal: it implements support for string values in the mean reducer, returning the most common string value (or the first string value if none is most common). Fixes #775
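The reduction rule described above can be sketched roughly as follows. This is an illustrative reading of that description, not Inspect's actual reducer: the name `reduce_values` is hypothetical, and "none is most common" is interpreted here as a tie for the highest count.

```python
from collections import Counter
from typing import Union

Value = Union[str, int, float, bool]


def reduce_values(values: list[Value]) -> Value:
    """Reduce one sample's per-epoch values to a single value (sketch)."""
    if not values:
        raise ValueError("no values to reduce")
    if all(isinstance(v, str) for v in values):
        counts = Counter(values)
        top = max(counts.values())
        # Return the first value, in arrival order, that reaches the highest
        # count, so a tie falls back to the earliest string.
        return next(v for v in values if counts[v] == top)
    # Numeric and boolean values: plain arithmetic mean.
    # Mixed string/numeric input is out of scope for this sketch.
    return sum(float(v) for v in values) / len(values)


print(reduce_values(["x", "y", "y"]))  # y
print(reduce_values(["x", "y"]))       # x
```

With a rule like this a string value stays a string through reduction, so a custom metric would receive "test-value" rather than 0.0.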

dragonstyle · 2024-11-04T13:04:07Z

I'm reverting the original fix to this as it caused additional regressions elsewhere. I will attempt to fix again today.

dragonstyle · 2024-11-05T14:19:24Z

We ended up reverting the second fix to this issue as well. I will take another crack at it soon!

dragonstyle self-assigned this Oct 31, 2024

dragonstyle mentioned this issue Oct 31, 2024

Better support string values for scores #785

Merged

5 tasks

jjallaire closed this as completed in #785 Oct 31, 2024

dragonstyle reopened this Nov 4, 2024

dragonstyle mentioned this issue Nov 4, 2024

Don’t run reducer against single epoch samples #799

Merged

5 tasks

jjallaire-aisi closed this as completed in #799 Nov 4, 2024

dragonstyle reopened this Nov 5, 2024

us2547 mentioned this issue Nov 12, 2024

Validation error when reading Eval log #834

Open
