Enumerate models and datasets accessible from HELM by b-fenelon · Pull Request #21 · AIQ-Kitware/aiq-magnet

b-fenelon · 2025-09-22T18:36:09Z

This is a WIP that collects all 'officially' available models and scenarios from HELM, along with their metadata. The goal is to expose the applicable HELM APIs in MAGNET so that evaluation queries can be resolved against them. Right now, models and scenarios each have a simple helper class that attempts to collect the callable definitions and link them to their metadata.

Models

The published model descriptions and deployments (i.e. the list in the docs) can be found in HELM under helm/config/model_deployments.yaml and helm/config/model_metadata.yaml, and are read in helm/benchmark/config_registry.py. This helper removes models tagged as unsupported or deprecated so that the metadata and the list of models stay aligned.
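As a rough illustration of the filtering step, here is a minimal sketch that works on plain dicts shaped like entries parsed from the two YAML files. It does not call HELM. The tag strings, the `model_name` fallback, the sample entries, and the `usable_models` name are all assumptions made for this example and should be checked against the HELM version in use.

```python
from typing import Dict, List, Set

# Assumed tag names; verify against the installed HELM version.
EXCLUDED_TAGS: Set[str] = {"DEPRECATED_MODEL_TAG", "UNSUPPORTED_MODEL_TAG"}


def usable_models(metadata: List[dict], deployments: List[dict]) -> Dict[str, List[str]]:
    """Map each non-excluded model name to the names of its deployments.

    Models carrying an excluded tag, or with no deployment at all, are dropped.
    """
    deployments_by_model: Dict[str, List[str]] = {}
    for dep in deployments:
        # Assumption: a deployment without "model_name" serves the model of the same name.
        model_name = dep.get("model_name", dep["name"])
        deployments_by_model.setdefault(model_name, []).append(dep["name"])

    result: Dict[str, List[str]] = {}
    for meta in metadata:
        if EXCLUDED_TAGS & set(meta.get("tags", [])):
            continue  # deprecated or unsupported
        deps = deployments_by_model.get(meta["name"])
        if deps:
            result[meta["name"]] = deps
    return result


# Hypothetical entries standing in for the parsed YAML content.
metadata = [
    {"name": "org-a/model-1", "tags": ["TEXT_MODEL_TAG"]},
    {"name": "org-a/model-old", "tags": ["TEXT_MODEL_TAG", "DEPRECATED_MODEL_TAG"]},
    {"name": "org-b/model-2", "tags": []},
]
deployments = [
    {"name": "host-x/model-1", "model_name": "org-a/model-1"},
    {"name": "host-y/model-1", "model_name": "org-a/model-1"},
    {"name": "host-x/model-old", "model_name": "org-a/model-old"},
]

usable = usable_models(metadata, deployments)
```

In this toy data, `org-a/model-old` is dropped because of its tag and `org-b/model-2` is dropped because nothing deploys it.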

Scenarios

Scenarios (i.e. the list in the docs) do not have a single exhaustive source. Instead, the RunSpecFunction definitions in /helm/benchmark/run_specs/*.py are read in helm/benchmark/run_spec.py. If a RunSpec can be instantiated, its scenario_spec attribute contains the HELM path to a Scenario definition. This helper replicates the run_spec.py discovery logic to also collect all Scenario definitions from helm/benchmark/scenarios/*.py. The remaining gap is matching each Scenario to a RunSpec.
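The collection step could look roughly like the sketch below, which scans already-imported modules for subclasses of a base class. It is not the helper from this PR: `collect_subclasses`, the stand-in `Scenario` class, and the synthetic `demo_scenarios` module are hypothetical, so that the example runs without HELM installed. With HELM, the module list would presumably be built by walking helm/benchmark/scenarios with `pkgutil.iter_modules` and `importlib.import_module`.

```python
import inspect
import types
from typing import Dict, Iterable, Type


def collect_subclasses(modules: Iterable[types.ModuleType], base: Type) -> Dict[str, Type]:
    """Return {dotted path: class} for every subclass of `base` defined in `modules`."""
    found: Dict[str, Type] = {}
    for module in modules:
        for name, obj in inspect.getmembers(module, inspect.isclass):
            defined_here = obj.__module__ == module.__name__  # skip re-exported imports
            if defined_here and issubclass(obj, base) and obj is not base:
                found[f"{module.__name__}.{name}"] = obj
    return found


class Scenario:
    """Stand-in for the HELM Scenario base class (hypothetical, for this example only)."""


# Build a synthetic module holding one scenario and one unrelated class.
demo = types.ModuleType("demo_scenarios")
demo.ToyScenario = type("ToyScenario", (Scenario,), {"__module__": "demo_scenarios"})
demo.Unrelated = type("Unrelated", (), {"__module__": "demo_scenarios"})
demo.Scenario = Scenario  # imported base class; must not be reported

scenarios = collect_subclasses([demo], Scenario)
```

Keying the result by dotted path keeps it comparable with the class path stored in a RunSpec's scenario_spec.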

Next steps:

Accept (or enumerate all possible) arguments to resolve Scenario -> RunSpec
Standardize helper methods/outputs
Create example query scenarios (e.g. Model Family A, Dataset D)
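One possible shape for the Scenario -> RunSpec resolution is to invert the scenario path stored on each RunSpec and report which collected scenarios have no match. The sketch below uses toy dataclasses in place of the HELM types. `ToyScenarioSpec`, `ToyRunSpec`, `match_scenarios`, and all paths are hypothetical, and the `class_name` field is assumed to hold the dotted path to the Scenario class.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyScenarioSpec:
    class_name: str  # assumed: dotted path to a Scenario class
    args: dict = field(default_factory=dict)


@dataclass
class ToyRunSpec:
    name: str
    scenario_spec: ToyScenarioSpec


def match_scenarios(
    scenario_paths: List[str], run_specs: List[ToyRunSpec]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return (scenario path -> run spec names, scenario paths with no run spec)."""
    by_scenario: Dict[str, List[str]] = defaultdict(list)
    for run_spec in run_specs:
        by_scenario[run_spec.scenario_spec.class_name].append(run_spec.name)

    matched = {p: sorted(by_scenario[p]) for p in scenario_paths if p in by_scenario}
    unmatched = sorted(set(scenario_paths) - set(matched))
    return matched, unmatched


# Hypothetical inputs: two collected scenarios, two run specs sharing one scenario.
paths = ["toy.scenarios.a.AScenario", "toy.scenarios.b.BScenario"]
specs = [
    ToyRunSpec("a:subject=x", ToyScenarioSpec("toy.scenarios.a.AScenario", {"subject": "x"})),
    ToyRunSpec("a:subject=y", ToyScenarioSpec("toy.scenarios.a.AScenario", {"subject": "y"})),
]

matched, unmatched = match_scenarios(paths, specs)
```

The unmatched list is the part that still needs argument enumeration, since a scenario only shows up here once some RunSpec has actually been instantiated for it.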

Add helpers to wrangle HELM models and scenarios

8dacc4b

b-fenelon wants to merge 1 commit into main from helm-helpers
