Description
A collection of issues and potential improvements discovered in code or project work:
1) The parallelization of `simulate_scenarios` seems to suffer from fork poisoning on Linux machines, making the parallel computation slower than the sequential one. Discovered by @fabianliebig, feel free to create a sub-issue describing your findings in detail.
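A quick way to test whether the default `fork` start method is the culprit is to force `spawn` before anything else runs (hypothetical workaround; whether `simulate_scenarios` respects the globally set start method is an assumption):

```python
import multiprocessing as mp

if __name__ == "__main__":
    # On Linux the default start method is "fork", which can carry over
    # poisoned state (locks, thread pools) from the parent process.
    # "spawn" starts workers from a clean interpreter instead.
    mp.set_start_method("spawn", force=True)

    # ... then call simulate_scenarios(...) as usual and compare timings
```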
2) When run with `parallel_runs=True`, the environment is seemingly not correctly passed to the workers. This causes any warning settings to be dropped and usually results in the output being completely swarmed by warning messages coming in from all workers -> very annoying in Jupyter notebooks.
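The mechanism is easy to reproduce without the library: Python warning filters are process-local state and are not applied in spawned workers (minimal sketch, independent of `simulate_scenarios`):

```python
import multiprocessing as mp
import warnings

def worker(i: int) -> int:
    # The parent's "ignore" filter is not active here, so this is emitted
    warnings.warn(f"noise from worker {i}")
    return i

if __name__ == "__main__":
    warnings.filterwarnings("ignore")  # silences warnings in the parent only
    with mp.get_context("spawn").Pool(2) as pool:
        pool.map(worker, range(2))  # the output is still swarmed by warnings
```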
3) Related to the previous point: It is common to use a `RandomRecommender` as baseline for a simulation. However, because it is non-predictive, it spits out a warning each time that it should not be called with an objective. But it is not possible to set up the random scenario campaign without the objective, because the simulation module requires it for access to the targets.
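Until this is handled internally, a user-side workaround could look like the following (hypothetical sketch: the warning is matched by message, and the exact wording/category used by the library is an assumption; per point 2, the filter also does not reach parallel workers). `scenarios` and `lookup` are assumed to be set up as for a normal simulation:

```python
import warnings

from baybe.simulation import simulate_scenarios

with warnings.catch_warnings():
    # Suppress the repeated "objective is ignored" warning coming from the
    # random baseline; the message pattern is an assumption.
    warnings.filterwarnings("ignore", message=".*objective.*")
    results = simulate_scenarios(
        scenarios, lookup, batch_size=2, n_doe_iterations=10
    )
```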
4) The calculation of cumulative best values is not really reasonable for multi-target scenarios. Example: when run with a batch size > 1, a batch might contain a really good value for target A, and likewise a really good value for target B, so the cumulative curves will look really good. But unless these two good values belong to the same point, this is not useful information. In practice, we are interested in the average properties of the best point, not in the best target values independent of which points they belonged to. This can produce extremely distorted views for larger batch sizes. We could add options to specify what counts as the "best" point, e.g. lowest loss or highest desirability.
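To illustrate with toy numbers (hypothetical data; the equal-weight mean is just a stand-in for a proper desirability score):

```python
import pandas as pd

# One batch of measurements for two maximization targets
batch = pd.DataFrame({"A": [0.9, 0.1], "B": [0.2, 0.95]})

# Per-target cumulative best mixes values from different points:
per_target_best = batch.max()  # A=0.90, B=0.95 -> no single point achieves both

# Best single point under a scalar merit (equal-weight mean as a stand-in):
merit = batch.mean(axis=1)  # point 0: 0.55, point 1: 0.525
best_point = batch.loc[merit.idxmax()]  # point 0: A=0.90, B=0.20
```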
5) It appears the random seed is not correctly passed or treated. Here is an example where all campaigns use the random recommender (in the first iteration), so they should all result in the exact same starting point, but they do not:
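The invariant being violated, sketched with plain NumPy (a hypothetical stand-in for the library's random recommender, not the original example):

```python
import numpy as np

def random_recommendation(seed: int, n_points: int = 2) -> np.ndarray:
    # Stand-in for a first-iteration random recommendation
    rng = np.random.default_rng(seed)
    return rng.uniform(size=(n_points, 3))

# Identical seeds must yield identical starting points; per this report,
# the equivalent campaign runs currently do not satisfy this.
assert np.array_equal(random_recommendation(42), random_recommendation(42))
```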
