As the number of things we want to modify in the benchmarks increases, we need to discuss how to handle this situation. On the one hand, it is important that we have a consistent benchmark that is exactly the same every time. On the other hand, we would like to allow users to try out and compare different backends, optimizers, etc. Introducing this flexibility raises design questions for the benchmarks themselves, but also for the dashboard, CLI, and other components. I am opening this issue to serve as a place for discussing these design choices.
Some of the questions we need to address are:
Will we create two separate images, one for the official benchmark and one intended for experimentation?
For the experimentation image, how do we handle parameter selection in the dashboard and the CLI?
How do we pass the selected parameters to the experimentation image?
Feel free to propose solutions and add new questions as they come up.
**kwargs?
None are needed for the official benchmark (i.e. the reference implementation we provide for each task). We can still offer the official benchmark with hardcoded params.
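To make the `**kwargs` suggestion concrete, here is a rough sketch of how a single entry point could serve both cases. Every name in it (`OFFICIAL_PARAMS`, `run_benchmark`, the `official` flag) and every parameter value is hypothetical and only for illustration; nothing here reflects the project's actual code. The function just resolves and returns the parameters, whereas a real task would go on to run the benchmark with them.

```python
# Hypothetical sketch: names and values below are assumptions, not project code.

# Hardcoded parameters for the official (reference) benchmark.
OFFICIAL_PARAMS = {"backend": "mpi", "optimizer": "sgd", "lr": 0.1}


def run_benchmark(official=True, **kwargs):
    """Resolve the parameters for a benchmark run.

    official=True  -> always the hardcoded parameters; overrides are rejected
                      so the official benchmark stays identical every time.
    official=False -> hardcoded parameters act as defaults that kwargs override.
    """
    if official:
        if kwargs:
            raise ValueError("the official benchmark does not accept overrides")
        return dict(OFFICIAL_PARAMS)

    # Reject misspelled or unsupported parameter names early.
    unknown = set(kwargs) - set(OFFICIAL_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    return {**OFFICIAL_PARAMS, **kwargs}


if __name__ == "__main__":
    print(run_benchmark())                                   # official run
    print(run_benchmark(official=False, optimizer="adam"))  # experimentation run
```

With this shape, the dashboard and CLI would only need to collect a flat set of key/value pairs for experimentation runs, and the official run needs no parameters at all. Whether that happens in one image or two is still open.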