plot_results(): Are there any frameworks that allow summarising and visualising inspect logs? #704

Open
sohaibimran7 opened this issue Oct 16, 2024 · 2 comments


@sohaibimran7

Many evaluation tools have frameworks for summarising and visualising results; one example is Zeno for lm-eval-harness. I understand that summarisation and visualisation needs can be quite diverse and that one tool may not work for everyone. Still, I think that if inspect ai logs could be easily summarised and visualised, researchers could iterate faster.
I wrote a very quick and dirty class for visualising a list of EvalLogInfos for my own experiments. I was wondering what other people use, and whether there is interest in results summarisation and visualisation support for inspect.

@jjallaire-aisi
Collaborator

This is definitely something we are interested in supporting more deeply! We are soon going to make it possible to run analysis code on top of an eval-set and then display the results in the viewer. At the same time, we will hopefully discover some useful common idioms and tools that we can provide. Would love to hear from people on this thread about what the general shape of the requirements is!

@sohaibimran7
Author

sohaibimran7 commented Oct 16, 2024

I personally would value the following in a visualisation framework:

  1. Ability to categorise logs by {log_dir, run_id, task, dataset, scorer and model}
  2. Ability to categorise more finely based on substrings of {model, task, log_dir}
  3. Ability to rename categories and their elements, and to sort and filter by category using custom sort and filter functions.
  4. Ability to map each category to a plotting element {x axis, y axis, x offset, y offset, colour, horizontal and vertical faceting in a multi-plot figure}
  5. Ability to plot any figure I like (bar charts, box plots, violin plots etc.)
  6. Extensibility
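
To make items 1 to 4 a bit more concrete, here is a rough sketch of the shape I have in mind. Everything in it is hypothetical: `LogRecord`, `substring_category` and `summarise` are names made up for illustration and are not part of inspect, the scores are invented, and populating the records from real logs is left out. It only uses the Python standard library and produces plain rows that any plotting library could consume.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical flattened summary of one eval log (not an inspect type)."""
    log_dir: str
    run_id: str
    task: str
    dataset: str
    scorer: str
    model: str
    score: float


def substring_category(field, labels, default="other"):
    """Return a categoriser that labels a record by the first substring found in `field`.

    `labels` maps substring -> display name, which also covers renaming categories.
    """
    def categorise(record):
        value = getattr(record, field)
        for substring, label in labels.items():
            if substring in value:
                return label
        return default
    return categorise


def summarise(records, roles, keep=None, sort_key=None):
    """Group records by the categorisers in `roles` and average their scores.

    `roles` maps a plotting element name (e.g. "x", "colour") to a categoriser.
    `keep` is an optional custom filter; `sort_key` an optional custom row sort key.
    """
    groups = defaultdict(list)
    for record in records:
        if keep is None or keep(record):
            key = tuple(categorise(record) for categorise in roles.values())
            groups[key].append(record.score)
    rows = [
        {**dict(zip(roles, key)), "mean_score": mean(scores), "n": len(scores)}
        for key, scores in groups.items()
    ]
    # Default ordering: sort by the category values, in the order the roles were given.
    return sorted(rows, key=sort_key or (lambda row: tuple(row[r] for r in roles)))


# Invented example data, purely for illustration.
records = [
    LogRecord("logs/baseline", "r1", "arc_easy", "arc", "match", "openai/gpt-4o", 0.5),
    LogRecord("logs/finetuned", "r2", "arc_easy", "arc", "match", "openai/gpt-4o", 1.0),
    LogRecord("logs/baseline", "r3", "arc_easy", "arc", "match", "anthropic/claude-3-5-sonnet", 0.75),
    LogRecord("logs/baseline", "r4", "arc_challenge", "arc", "match", "openai/gpt-4o", 0.25),
]

roles = {
    "x": lambda r: r.task,  # categorise by an exact field
    "colour": substring_category("model", {"openai": "OpenAI", "anthropic": "Anthropic"}),
}

rows = summarise(records, roles)
baseline_rows = summarise(records, roles, keep=lambda r: "baseline" in r.log_dir)

for row in rows:
    print(row)
```

Keeping the categorisers as plain functions is what I would hope gives the extensibility in item 6: a new way of slicing the logs is just another function, and the output rows stay independent of whichever plotting library draws the figure.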
