Benchmark flexibility design discussion #85

mmilenkoski · 2020-04-27T17:04:41Z

As the number of things we want to modify in the benchmarks increases, we need to discuss how to handle this situation. On the one hand, it is important that we have a consistent benchmark that is exactly the same every time. On the other hand, we would like to allow users to try out and compare different backends, optimizers, etc. Introducing this flexibility raises questions about the design of the benchmarks, but also about the design of the dashboard, CLI, and other components. I am opening this issue to serve as a place for discussing these design choices.

Some of the questions we need to address are:

Will we create separate images, one for the official benchmark and one intended for experimentation?
For the experimentation image, how do we handle parameter selection in the dashboard and the CLI?
How do we pass the selected parameters to the experimentation image?

Feel free to propose solutions and add new questions as they come up.
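
To make the parameter-passing question concrete, here is one possible shape for it, offered only as a starting point for discussion. It assumes the CLI collects `--param KEY=VALUE` pairs and hands them to the experimentation image as JSON in a single environment variable. The variable name `MLBENCH_EXPERIMENT_PARAMS`, the `--param` flag, and all function names are hypothetical and do not exist in the current code.

```python
import argparse
import json
import os

# Hypothetical environment variable name, used only for this sketch.
ENV_VAR = "MLBENCH_EXPERIMENT_PARAMS"


def parse_cli(argv):
    """Collect repeated --param KEY=VALUE flags into a dict of strings."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--param", action="append", default=[], metavar="KEY=VALUE")
    args = parser.parse_args(argv)

    params = {}
    for item in args.param:
        key, sep, value = item.partition("=")
        if not sep or not key:
            parser.error(f"expected KEY=VALUE, got {item!r}")
        params[key] = value
    return params


def to_env(params):
    """CLI/dashboard side: serialize the selection for the container environment."""
    return {ENV_VAR: json.dumps(params, sort_keys=True)}


def from_env(environ=None):
    """Image side: read the selection back; an unset variable means no overrides."""
    environ = os.environ if environ is None else environ
    return json.loads(environ.get(ENV_VAR, "{}"))


if __name__ == "__main__":
    selected = parse_cli(["--param", "optimizer=adam", "--param", "lr=0.01"])
    print(from_env(to_env(selected)))
```

Values arrive as strings in this sketch, so the image would still need to convert and validate them before use.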

martinjaggi · 2020-04-27T19:18:04Z

**kwargs ?
None are needed for the official benchmark (i.e. the reference implementation we provide for each task). We can still offer the official benchmark with hardcoded params.
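
A minimal sketch of how the `**kwargs` idea could look, assuming a single entry point whose defaults are the hardcoded official parameters. All names and default values here (`OFFICIAL_PARAMS`, `run_benchmark`, `run_official_benchmark`, and the parameter values) are placeholders for illustration and are not taken from the existing benchmarks.

```python
# Placeholder defaults for illustration; the real official values would be
# whatever the reference implementation of each task hardcodes.
OFFICIAL_PARAMS = {
    "backend": "mpi",
    "optimizer": "sgd",
    "lr": 0.1,
    "batch_size": 128,
}


def run_benchmark(**kwargs):
    """Build the effective config: official defaults, overridden by kwargs."""
    unknown = set(kwargs) - set(OFFICIAL_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    config = {**OFFICIAL_PARAMS, **kwargs}
    # A run counts as official only if nothing differs from the defaults.
    official = config == OFFICIAL_PARAMS
    return dict(config, official=official)


def run_official_benchmark():
    """Official benchmark: no kwargs, so the hardcoded params are used as-is."""
    return run_benchmark()


if __name__ == "__main__":
    print(run_official_benchmark())
    print(run_benchmark(optimizer="adam", lr=0.001))
```

Flagging each result as official or not would let the dashboard keep experimental runs separate from the reference numbers.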

Panaetius mentioned this issue May 4, 2020

Add Image Recognition Benchmark with DistributedDataParallel mlbench/mlbench-benchmarks#42

Merged
