While integration of BMI init config auto-generation capabilities was done in #607, practical performance testing was not conducted. Given #637 and the fact that DMOD currently only implements an object store dataset backing, there may be practical issues with the current implementation; e.g., it may produce configs perfectly correctly but take an impractical or excessive amount of time to complete (compared to the job needing the configs).
First, analysis is needed of the running time in various scenarios, given the current implementation and a more practical, off-the-shelf hardware configuration (i.e., at most, a small cluster of desktop-level machines). Depending on the results, adjustments should be made to the implementation to optimize it for current dataset capabilities. Where possible, this should be done in a way that lends itself well to future dataset backings (i.e., #593), which may or may not have the same IO performance characteristics and thus may need (or benefit from) certain differences in the implementation.
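A simple wall-clock harness could drive that analysis. The sketch below only illustrates the idea; `generate_configs` is a placeholder standing in for the real generation-and-write step against the object store backing, and the domain sizes are arbitrary:

```python
import time


def generate_configs(catchment_ids):
    """Placeholder for the real BMI init config generation + dataset write step."""
    return {cid: {"model": "noah_owp"} for cid in catchment_ids}


def time_generation(catchment_ids):
    """Wall-clock one generation run so it can be compared to the consuming job's runtime."""
    start = time.perf_counter()
    generate_configs(catchment_ids)
    return time.perf_counter() - start


# Sweep domain sizes to see how generation time scales relative to the job needing the configs.
for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} catchments: {time_generation([f'cat-{i}' for i in range(n)]):.2f} s")
```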
I added support for writing config files to various archive formats in this PR. Here is an example of writing config files on the fly to a gzipped archive file. Compression is not required (and of course slows things down). I would be interested to see the performance of writing to just a tar archive.
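For illustration, a minimal sketch of that pattern using Python's standard `tarfile` and `io` modules (not the PR's actual helper API; the `write_configs_to_archive` name and the config contents are hypothetical):

```python
import io
import json
import tarfile


def write_configs_to_archive(archive_path, configs):
    """Write generated BMI init configs directly into a gzipped tar archive.

    `configs` maps archive member names to already-generated config contents.
    Mode "w:gz" enables gzip compression; "w" would produce a plain tar instead.
    """
    with tarfile.open(archive_path, mode="w:gz") as archive:
        for name, config in configs.items():
            data = json.dumps(config, indent=2).encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            # addfile() consumes a file-like object, so each config is written on the
            # fly from memory without creating an intermediate file on disk.
            archive.addfile(info, io.BytesIO(data))


# Example: two per-catchment configs written on the fly.
write_configs_to_archive(
    "bmi_configs.tar.gz",
    {
        "cat-1/config.json": {"model": "noah_owp", "forcing": "cat-1.csv"},
        "cat-2/config.json": {"model": "noah_owp", "forcing": "cat-2.csv"},
    },
)
```

Switching the mode to `"w"` gives the uncompressed tar variant whose performance is the open question above.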
This issue will also be useful in conducting benchmarks: it shows an (albeit naive) approach to generating config files concurrently.
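A naive concurrent approach along those lines might look like the following sketch (not the linked issue's actual code; `generate_config` and its contents are placeholders for the real DMOD generator), fanning generation out across processes with `concurrent.futures`:

```python
from concurrent.futures import ProcessPoolExecutor


def generate_config(catchment_id):
    """Build one BMI init config; stands in for the real, CPU/IO-bound generator."""
    return catchment_id, {"model": "noah_owp", "forcing": f"{catchment_id}.csv"}


def generate_all(catchment_ids, workers=4):
    """Fan generation out across worker processes and collect results into one mapping."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(generate_config, catchment_ids))


if __name__ == "__main__":
    configs = generate_all([f"cat-{i}" for i in range(100)])
    print(len(configs), "configs generated")
```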