[Question] Custom metric type #775
cc @dragonstyle
What version of Inspect are you using? Metrics in recent versions of Inspect are allowed to return any of the following types:
str | int | float | bool
list[str | int | float | bool]
dict[str, str | int | float | bool | None]
The downstream parts of the system should automatically handle metrics that return lists or dicts, hopefully behaving as you'd expect them to.
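To make the dict return shape concrete, here is a minimal plain-Python sketch of the aggregation the question asks for: counting how often each string value occurs. It deliberately does not use the Inspect API; the function name `count_values` is hypothetical, and wiring it into an actual metric is left out.

```python
from collections import Counter


def count_values(values: list[str]) -> dict[str, int]:
    """Count occurrences of each string value (hypothetical helper).

    The result fits the dict[str, str | int | float | bool | None] shape
    listed above, since every count is an int.
    """
    # Counter preserves first-seen order of keys when converted to a dict.
    return dict(Counter(values))


if __name__ == "__main__":
    print(count_values(["yes", "no", "yes"]))
```

Whether the string values actually reach a metric unchanged depends on how scores are reduced first, which is what the rest of this thread is about.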
I'm using latest inspect, version:
The "value" is properly reflected as "test-value" in the json and in |
I can confirm that I'm seeing the same issue (the score arrives with a float value in the metric). I am investigating now, but I suspect the issue is related to how we've implemented sample reducing (which, by default, does not deal very nicely with strings). I'll respond here once I've buttoned this down. Sorry for the issue and thanks for reporting it.
If a user produces a score whose value is a string, when that value is ‘reduced’ using the default mean reducer, it is coerced to a float. For strings, this means that when the Score arrives at the custom metric, it will carry the reduced value, which has been coerced to a float. This fix is minimal: it implements support for string values in the mean reducer, providing the most common string value (or the first string value if none is most common). Fixes #775
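The string handling described above can be sketched roughly as follows. This is not the actual Inspect reducer code; the function name `reduce_strings` is hypothetical, and it assumes that "none is most common" means a tie for the highest count.

```python
from collections import Counter


def reduce_strings(values: list[str]) -> str:
    """Pick a representative string from several values (hypothetical sketch).

    Returns the most common string. Assumption: if several strings tie for
    the highest count, fall back to the first value in the input.
    """
    if not values:
        raise ValueError("cannot reduce an empty list of values")

    counts = Counter(values)
    top = max(counts.values())
    # Collect every distinct string that reaches the highest count.
    winners = [value for value, count in counts.items() if count == top]

    # A single winner is the most common value; otherwise use the first value.
    return winners[0] if len(winners) == 1 else values[0]


if __name__ == "__main__":
    print(reduce_strings(["a", "b", "b"]))
    print(reduce_strings(["x", "y"]))
```

With this behaviour a string value stays a string after reduction, instead of being coerced to a float.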
I'm reverting the original fix for this, as it caused additional regressions elsewhere. I will attempt to fix it again today.
We ended up reverting the second fix for this issue as well. I will take another crack at it soon!
Hi,
Is there a way to define a custom metric that outputs a dictionary or tuple?
I need to create a metric that aggregates the Scorer values by value and displays the number of occurrences of each value.
The values in my case are strings.
It seems that the only output type supported for a "metric" is "float".
Here is an example "metric" which is currently not supported:
Thank you.