Enumerate models and datasets accessible from HELM #21

Open
b-fenelon wants to merge 1 commit into main from helm-helpers

b-fenelon (Contributor) commented Sep 22, 2025

This is a WIP for collecting all 'officially' available HELM models and scenarios together with their metadata. The goal is to expose the applicable HELM APIs in MAGNET for resolving evaluation queries. Right now, models and scenarios each have a simple helper class that attempts to collect callable definitions and link them to their metadata.

Models

The published model descriptions and deployments (i.e. the docs list) are defined in HELM under /helm/config/model_[deployments, metadata].yaml and are read in helm/benchmark/config_registry.py. This helper removes models tagged as unsupported or deprecated so that the metadata and the usable models stay aligned.
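The filtering and linking step described above could look roughly like the sketch below. It works on plain dictionaries shaped like parsed YAML entries rather than calling HELM itself. The field names (`name`, `model_name`, `tags`), the tag strings in `EXCLUDED_TAGS`, the fallback of `model_name` to the deployment name, and the `ModelEntry` and `link_models` names are all assumptions made for illustration, not the helper in this PR.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Assumed tag strings; substitute whatever HELM actually uses.
EXCLUDED_TAGS = frozenset({"DEPRECATED_MODEL_TAG", "UNSUPPORTED_MODEL_TAG"})


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical record linking one model to its deployments."""
    name: str
    tags: Tuple[str, ...] = ()
    deployments: Tuple[str, ...] = ()


def link_models(
    metadata: Iterable[dict],
    deployments: Iterable[dict],
    excluded_tags: frozenset = EXCLUDED_TAGS,
) -> List[ModelEntry]:
    """Drop models carrying an excluded tag and attach deployment names."""
    # Group deployment names by the model they serve.
    by_model: Dict[str, List[str]] = {}
    for dep in deployments:
        # Assumption: a deployment without model_name serves a model
        # with the same name as the deployment.
        model_name = dep.get("model_name", dep["name"])
        by_model.setdefault(model_name, []).append(dep["name"])

    entries: List[ModelEntry] = []
    for meta in metadata:
        tags = tuple(meta.get("tags", []))
        if excluded_tags.intersection(tags):
            continue  # unsupported or deprecated: skip
        entries.append(
            ModelEntry(meta["name"], tags, tuple(by_model.get(meta["name"], [])))
        )
    return entries


# Toy data, not real HELM entries.
sample_metadata = [
    {"name": "org/model-a", "tags": ["TEXT_MODEL_TAG"]},
    {"name": "org/model-old", "tags": ["DEPRECATED_MODEL_TAG"]},
]
sample_deployments = [
    {"name": "host/model-a", "model_name": "org/model-a"},
    {"name": "org/model-old"},
]
linked = link_models(sample_metadata, sample_deployments)
```

Returning an empty `deployments` tuple instead of dropping the entry keeps models that have metadata but no deployment visible, which may be useful when auditing the two files against each other.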

Scenarios

Scenarios (i.e. the docs list) do not have a single exhaustive source. Instead, RunSpecFunction definitions from /helm/benchmark/run_specs/*.py are read in helm/benchmark/run_spec.py. If you can instantiate a RunSpec, its scenario_spec attribute contains the HELM path to a Scenario definition. This helper replicates the discovery logic of run_spec.py to also collect all Scenario definitions from helm/benchmark/scenarios/*.py. The current gap is matching each Scenario to a RunSpec.
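To make the matching gap concrete, here is a minimal, self-contained sketch of the idea: run spec functions that need no arguments can be called directly to read the scenario path, while those with required arguments are reported as unresolved. The `ScenarioSpec`, `RunSpec`, `REGISTRY`, and `run_spec_function` definitions below are simplified stand-ins written for this example, not imports from HELM, and the toy scenario paths are made up.

```python
import inspect
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ScenarioSpec:
    """Stand-in: holds the dotted path to a Scenario class."""
    class_name: str


@dataclass(frozen=True)
class RunSpec:
    """Stand-in: only the fields needed for this sketch."""
    name: str
    scenario_spec: ScenarioSpec


# Hypothetical registry filled by the decorator below.
REGISTRY: Dict[str, Callable[..., RunSpec]] = {}


def run_spec_function(name: str):
    """Register a function that builds a RunSpec under the given name."""
    def wrap(fn: Callable[..., RunSpec]) -> Callable[..., RunSpec]:
        REGISTRY[name] = fn
        return fn
    return wrap


@run_spec_function("toy_qa")
def toy_qa() -> RunSpec:
    return RunSpec("toy_qa", ScenarioSpec("pkg.scenarios.toy.ToyQAScenario"))


@run_spec_function("toy_subject")
def toy_subject(subject: str) -> RunSpec:
    return RunSpec(
        f"toy_subject:subject={subject}",
        ScenarioSpec("pkg.scenarios.toy.ToySubjectScenario"),
    )


def resolve_scenarios(
    registry: Dict[str, Callable[..., RunSpec]],
) -> Tuple[Dict[str, str], Dict[str, List[str]]]:
    """Split run specs into resolved scenario paths and missing arguments."""
    resolved: Dict[str, str] = {}
    unresolved: Dict[str, List[str]] = {}
    named_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    for name, fn in registry.items():
        required = [
            p.name
            for p in inspect.signature(fn).parameters.values()
            if p.default is inspect.Parameter.empty and p.kind in named_kinds
        ]
        if required:
            unresolved[name] = required  # arguments must be supplied first
        else:
            resolved[name] = fn().scenario_spec.class_name
    return resolved, unresolved


resolved, unresolved = resolve_scenarios(REGISTRY)
```

The `unresolved` mapping lists exactly which arguments would need to be accepted or enumerated before those run specs can be matched to a Scenario.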

Next steps:

  • Accept (or enumerate all possible) arguments to resolve Scenario -> RunSpec
  • Standardize helper methods/outputs
  • Create example query scenarios (e.g. Model Family A, Dataset D)
