A collection of performance improvements #231

Open
alfh opened this issue Nov 19, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@alfh
Contributor

alfh commented Nov 19, 2023

Is your feature request related to a problem? Please describe.

I would like to improve the performance of map generation so that my CPU is working as hard as possible when generating the maps of both large and small countries.

Manual tests show that, on my PC, it would be possible to go from 3 days to 30 minutes for generating Canada, which has almost 5000 tiles. (That does not include contour lines; I have not tested that yet.)

Describe the solution you'd like

These are just my high-level thoughts for now. I think a separate feature request should be raised for each bullet point, but I would like some initial feedback on the points and on the loosely planned "road" to better performance.
I am really open to input here, both on what would be wise and on the order in which we should tackle the issues / sub-issues.

  1. Add a Benchmark.md file or similar, where one can list the time taken for various countries, using various PCs and a specific version of wahooMapsCreator. This gives us visibility of any performance improvements, and gives users an idea of what performance to expect.
  2. Use osmium ID renumbering on the two files generated in filter_tags_from_country_osm_pbf_files. This would take a few seconds, but it would reduce the memory use in the subsequent "extract" steps, since that depends quite a bit on the highest IDs in the files.
  3. Start using asyncio to launch external programs. This has the most potential, and is probably quite easy. It would allow us to have "CPU count minus 1" external programs running in parallel for at least some of the steps. Some of the steps are not really multi-threaded, so a single invocation only uses one CPU core.
  4. Split the actual invocation of the external program for each step into a separate method. This allows each step to have a "for" loop that uses asyncio to set up tasks and then awaits them at the end.
  5. For the extract step, generate a JSON file with X number of tiles, and then invoke osmium extract with it. In the beginning this could be just 5-10 tiles in each batch (the memory usage is proportional to the number of tiles). I think this will be a great time saver for large countries.
  6. For the extract step, going further, the number of tiles in the batch could be increased, so that (total number of tiles) / (number of CPU cores - 1) is used as the batch size. Here the memory requirement increases, but I am also looking into changes to osmium itself, which would reduce that memory usage 10x (but currently with some performance degradation).
  7. Add more unit tests for countries (not requiring the large pbf files to be checked in), so we can see that our changes do not affect the resulting map files.
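
As a rough illustration of points 5 and 6, the sketch below splits a tile list into batches and builds one osmium extract config per batch. The tile dictionary keys (`x`, `y`, `left`, `bottom`, `right`, `top`), the output file naming, and all function names are assumptions made for this example and are not taken from wahooMapsCreator. The config layout (`directory`, `extracts`, `output`, `bbox`) follows the JSON config format that `osmium extract --config` accepts.

```python
import json
import math
import os


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def batch_size_for_cpus(tile_count, cpu_count=None):
    """Point 6: (total number of tiles) / (number of CPU cores - 1), rounded up."""
    workers = max(1, (cpu_count or os.cpu_count() or 2) - 1)
    return max(1, math.ceil(tile_count / workers))


def build_extract_config(tiles, output_dir):
    """Build the dict for one osmium extract config file.

    Each tile is a hypothetical dict with x, y and bounding box keys.
    """
    return {
        "directory": str(output_dir),
        "extracts": [
            {
                "output": f"{tile['x']}_{tile['y']}.osm.pbf",  # assumed naming
                "bbox": [tile["left"], tile["bottom"], tile["right"], tile["top"]],
            }
            for tile in tiles
        ],
    }


def write_batch_configs(tiles, output_dir, batch_size):
    """Write one JSON config per batch and return the config file paths."""
    os.makedirs(output_dir, exist_ok=True)
    paths = []
    for index, batch in enumerate(chunk(tiles, batch_size)):
        path = os.path.join(output_dir, f"extract_batch_{index}.json")
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(build_extract_config(batch, output_dir), handle, indent=2)
        paths.append(path)
    return paths
```

Each resulting file could then be passed to `osmium extract --config <file> <input.osm.pbf>`. Starting with a small `batch_size` (5-10) keeps memory use low; `batch_size_for_cpus` gives the larger batch size from point 6.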

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Additional context

Consider whether a minimum RAM requirement should be set, perhaps 8 GB?

https://realpython.com/async-io-python/ is a good article on asyncio, in addition to the Python manual.
I also looked a bit at trio, https://trio.readthedocs.io/en/stable/, but I think the standard asyncio is good enough for our use.
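
To make the asyncio idea concrete, here is a minimal sketch of launching external programs in parallel with a cap of "CPU count minus 1" running at once. It uses only the standard library (`asyncio.create_subprocess_exec`, `asyncio.Semaphore`, `asyncio.gather`). The function names `run_command` and `run_all` are made up for this example and do not reflect the prototype branch.

```python
import asyncio
import os


async def run_command(cmd, limiter):
    """Run one external program and return (return code, decoded stdout)."""
    async with limiter:  # wait here until a slot is free
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        stdout, _stderr = await proc.communicate()
        return proc.returncode, stdout.decode()


async def run_all(commands, max_parallel=None):
    """Run all commands, with at most `max_parallel` running at the same time.

    Defaults to CPU count minus 1, but never less than 1.
    """
    if max_parallel is None:
        max_parallel = max(1, (os.cpu_count() or 2) - 1)
    limiter = asyncio.Semaphore(max_parallel)
    # gather returns the results in the same order as `commands`
    return await asyncio.gather(*(run_command(cmd, limiter) for cmd in commands))
```

A step would build its list of command lines in a "for" loop (for example one osmium call per tile or per batch) and then call `asyncio.run(run_all(commands))` once at the end.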

@alfh alfh added the enhancement New feature or request label Nov 19, 2023
@alfh
Contributor Author

alfh commented Nov 19, 2023

Ref 3:
I'm prototyping the use of asyncio in a branch of mine for now: https://github.com/alfh/wahooMapsCreator/tree/use_asyncio_for_external_programs
