Tracking Issue: Make loading of massive and/or remote runs faster #8

Open
wookayin opened this issue Nov 10, 2022 · 0 comments
Labels: enhancement (Small improvements), new feature (New feature request)

wookayin commented Nov 10, 2022

Fetching and loading many runs (dozens or hundreds) from remote (or local) machines, each of which can be fairly large on disk, can be slow and resource-intensive on the machine where get_runs is called, even when parallelized. Network traffic for remote runs can also be substantial and is often the bottleneck.

For example, it took more than 2 minutes to fetch 24 real-world experiment runs from a remote node over SSH, where each run has ~100K rows (and dozens of summary tags) and each event file is quite fat (~30MB each, ~700MB of network traffic in total) because non-scalar artifacts/tensors are saved in it. On top of that, CPU consumption is heavy, spread across the 8 local processes.
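As a rough sanity check on the numbers above, here is a back-of-the-envelope sketch. The helper name `transfer_estimate` is made up for illustration and is not part of expt; the inputs are the approximate figures quoted in this issue.

```python
from typing import Tuple


def transfer_estimate(num_runs: int, mb_per_run: float, elapsed_s: float) -> Tuple[float, float]:
    """Return (total MB transferred, end-to-end MB/s) for a batch of runs.

    Hypothetical helper for illustration only; not an expt API.
    """
    total_mb = num_runs * mb_per_run
    return total_mb, total_mb / elapsed_s


# 24 runs, ~30MB per event file, a bit over 2 minutes (120 s used as a lower bound).
total_mb, mb_per_s = transfer_estimate(num_runs=24, mb_per_run=30, elapsed_s=120)
print(f"total = {total_mb} MB, end-to-end rate <= {mb_per_s} MB/s")
```

This gives 720 MB in total, in line with the ~700MB observed. Since the load took longer than 120 seconds, the end-to-end rate (transfer plus parsing) was at most about 6 MB/s.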

Ideally, parsing such large-scale experiment runs would be much faster. In comparison, TensorBoard can load run data at this scale almost instantly when using --load_fast mode (i.e., rustboard). There are several ideas and steps towards this goal:

  • Use native code (rustboard) to parse TensorBoard event files.
  • Run extraction of scalar data remotely rather than locally; this would significantly reduce the required network traffic and the communication overhead.
  • Run a remote expt daemon/helper process, which would enable incremental data loading. Alternatively, can we fetch raw data from TensorBoard?
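To illustrate the second idea (extracting scalars on the remote side), here is a minimal sketch under stated assumptions. It does not parse real event files: `Event`, `iter_scalars`, `pack_scalars`, and `unpack_scalars` are hypothetical names, with `Event` standing in for an already-parsed record. The point is only that dropping non-scalar payloads before anything crosses the network could shrink the transfer considerably.

```python
import json
import zlib
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple

Row = Tuple[int, str, float]  # (step, tag, value)


@dataclass
class Event:
    """Hypothetical stand-in for one parsed event-file record."""
    step: int
    tag: str
    scalar: Optional[float] = None  # set only for scalar summaries
    blob: bytes = b""               # non-scalar payload (image, tensor, ...)


def iter_scalars(events: Iterable[Event]) -> Iterator[Row]:
    """Yield only the scalar rows, skipping non-scalar payloads."""
    for ev in events:
        if ev.scalar is not None:
            yield (ev.step, ev.tag, ev.scalar)


def pack_scalars(events: Iterable[Event]) -> bytes:
    """Would run on the remote node: keep scalars only, then compress."""
    rows = list(iter_scalars(events))
    return zlib.compress(json.dumps(rows).encode("utf-8"))


def unpack_scalars(payload: bytes) -> List[Row]:
    """Would run locally: restore the (step, tag, value) rows."""
    return [tuple(row) for row in json.loads(zlib.decompress(payload))]


if __name__ == "__main__":
    # Synthetic run: 1000 scalar points plus a few bulky non-scalar records.
    events = []
    for step in range(1000):
        events.append(Event(step, "loss", scalar=1.0 / (step + 1)))
        if step % 100 == 0:
            events.append(Event(step, "activations", blob=bytes(10_000)))

    payload = pack_scalars(events)
    blob_bytes = sum(len(ev.blob) for ev in events)
    print(f"non-scalar bytes skipped: {blob_bytes}, bytes sent: {len(payload)}")
```

In a real setup the packing step would have to run on the remote machine (for instance, as a command invoked over SSH or inside a helper daemon), and the serialization format here (JSON plus zlib) is just one convenient choice for the sketch.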
wookayin added the enhancement and new feature labels on Nov 10, 2022