feat(dataset): dataset as config pilot code #47

Draft: wants to merge 7 commits into main

emptymalei (Member) commented Mar 28, 2024

Resolves #46
Depends on #75

A pilot study of dataset as configs.

Background

#15
#46

Experiment

In this PR, we implement a small example of how to define a dataset using a YAML file. The example provides the file datasets/minimal.yaml:

version: 0.1.0
models:
  brownian_motion:
    definition:
      system:
        sigma: 1
        delta_t: 1
      initial_condition:
        x0: 0
    args:
      n_steps: 100

Running the command

poetry run hamilflow gen datasets/minimal.yaml

generates the dataset, which can then be saved to a specified location.
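
To make the idea concrete, here is a rough sketch of how a parsed config could drive data generation. It is not the code in this PR: the config is written as a plain dict (what a YAML parser would return, assuming the nesting shown in the example), and `simulate_brownian_motion`, `GENERATORS`, and `generate_from_config` are hypothetical names. The step size (standard deviation `sigma * sqrt(delta_t)`) and the choice to return `n_steps + 1` points including the initial one are assumptions as well.

```python
import math
import random

# Parsed form of datasets/minimal.yaml, written as a dict so the sketch
# needs no YAML parser. The nesting is an assumption.
config = {
    "version": "0.1.0",
    "models": {
        "brownian_motion": {
            "definition": {
                "system": {"sigma": 1, "delta_t": 1},
                "initial_condition": {"x0": 0},
            },
            "args": {"n_steps": 100},
        }
    },
}


def simulate_brownian_motion(system, initial_condition, n_steps, seed=None):
    """Return [x_0, ..., x_n_steps] for a 1D walk with Gaussian steps.

    Assumption: each step has standard deviation sigma * sqrt(delta_t).
    """
    rng = random.Random(seed)
    step_std = system["sigma"] * math.sqrt(system["delta_t"])
    positions = [float(initial_condition["x0"])]
    for _ in range(n_steps):
        positions.append(positions[-1] + rng.gauss(0.0, step_std))
    return positions


# Hypothetical registry mapping model names in the config to generators.
GENERATORS = {"brownian_motion": simulate_brownian_motion}


def generate_from_config(config, seed=None):
    """Generate one trajectory per model listed under `models`."""
    datasets = {}
    for name, spec in config["models"].items():
        generator = GENERATORS[name]
        # `definition` describes the model, `args` controls the generation run.
        datasets[name] = generator(**spec["definition"], **spec["args"], seed=seed)
    return datasets
```

Calling `generate_from_config(config, seed=0)` returns a dict with one list of positions per model. A registry like this is one possible answer to whether the approach extends to other models: each new model would only need a generator that accepts the keys found under `definition` and `args`.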

A few things are ignored in this example:

  • Saving. This implementation skips saving the dataset, since that part is trivial.
  • Version check. In principle, we should check that the version of the package matches the version in the config.

How to Improve this Minimal Example

  1. YAML supports custom tags, written with a ! prefix, that mark a node as a custom type. We could use them to define models in a better way.
  2. This is a CLI example. The same can be done for the Python interface, with something like dataset = generate_dataset(path_to_config).
  3. We should run a meaningful validation before generating the data. The validation checks that the config works and reports clear error messages, e.g., when the version of the package doesn't match the version specified in the config.
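
As a rough illustration of the validation step in item 3, the sketch below collects every problem it finds and raises a single error listing them all. `ConfigError`, `validate_config`, and `PACKAGE_VERSION` are hypothetical names, the required keys (`definition`, `args`) follow the minimal example, and the exact-match version rule is an assumption; a real check would read the installed package version and might accept compatible versions.

```python
class ConfigError(ValueError):
    """Raised when a dataset config cannot be used to generate data."""


# Hypothetical constant; a real check would read the installed package version.
PACKAGE_VERSION = "0.1.0"


def validate_config(config, package_version=PACKAGE_VERSION):
    """Check a parsed config and raise ConfigError listing all problems found."""
    errors = []

    config_version = config.get("version")
    if config_version is None:
        errors.append("missing top-level key 'version'")
    elif str(config_version) != package_version:
        errors.append(
            f"config was written for version {config_version}, "
            f"but the installed package is version {package_version}"
        )

    models = config.get("models")
    if not isinstance(models, dict) or not models:
        errors.append("'models' must be a non-empty mapping")
    else:
        for name, spec in models.items():
            if not isinstance(spec, dict):
                errors.append(f"model '{name}' must be a mapping")
                continue
            for key in ("definition", "args"):
                if key not in spec:
                    errors.append(f"model '{name}' is missing the key '{key}'")

    if errors:
        details = "\n".join(f"  - {error}" for error in errors)
        raise ConfigError("invalid dataset config:\n" + details)
```

Reporting all problems at once, rather than stopping at the first, saves the user from fixing a config one error at a time.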

A few questions

  1. Will this work for all models?
  2. Is it worth the hassle at all?


This was referenced Mar 28, 2024

github-actions bot commented Jun 2, 2024

PR Preview Action v1.4.7
🚀 Deployed preview to https://kausalflow.github.io/hamilflow/pr-preview/pr-47/
on branch gh-pages at 2024-06-02 19:44 UTC

Development

Successfully merging this pull request may close these issues.

Dataset as Configs