feat: share host conda package cache with Docker containers#1080
Closed
nh13 wants to merge 8 commits intobioconda:masterfrom
Closed
feat: share host conda package cache with Docker containers#1080nh13 wants to merge 8 commits intobioconda:masterfrom
nh13 wants to merge 8 commits intobioconda:masterfrom
Conversation
Mount the host's conda package cache directory read-only inside the Docker build container. This allows the container to reuse cached repodata and packages instead of re-downloading them from scratch. Changes: - Add _get_host_pkgs_dir() helper to locate the host's pkgs directory - Add share_host_cache parameter to RecipeBuilder (default: True) - Mount host pkgs dir as read-only volume in build_recipe() - Update BUILD_SCRIPT_TEMPLATE to prepend the mounted cache to pkgs_dirs - Add --no-share-host-cache CLI flag for opt-out The :ro mount prevents any corruption of the host cache. Conda handles read-only pkgs_dirs gracefully. If the host pkgs dir is not found, the feature is silently skipped with no change in behavior. Saves 30-90s per Docker build by avoiding redundant repodata downloads.
4305e9e to
900c3f3
Compare
Use conda config --system flag to avoid interference with CONDARC env var inside the container, and ensure writable pkgs dir comes first.
…kgs_dirs The CONDARC env var may reference a host path not mounted into the container. Create the file if missing so conda config writes and later pkgs_dirs resolution work correctly.
conda-build's clean_pkg_cache() calls PackageCacheData.first_writable() on each pkgs_dir individually without catching NoWritablePkgsDirError, so read-only entries in pkgs_dirs are not supported. Instead, symlink the repodata cache files and package archives from the read-only host mount into the container's writable /opt/conda/pkgs directory.
Member
Author
|
Thank-you for reviewing! I don't have permission to merge (squash), any chance you can do that? |
Member
Author
|
bump, I don't have merge permissions |
Member
|
Did you / do you still want to complete the test plan? |
Member
Author
|
Closing this: I misunderstood, CI runners start fresh each run so there's no warm host cache to share |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
pkgs_dirs[0]) read-only inside Docker build containerspkgs_dirs_get_host_pkgs_dir()helper that triesconda.base.contextfirst, falls back to path heuristic--no-share-host-cacheCLI flag for opt-outSavings
30-90s per Docker build by avoiding redundant repodata downloads (~100+ MB from conda-forge).
Risks
:romount prevents any corruption of host cache:romeans no locks are createdTest plan
pytest test/bioconda-utils build recipes/ config.yml --dockeron a sample recipe--no-share-host-cacheto confirm opt-out works