Add caching of downloaded actions to achieve parity with the tool cache functionality #3551
base: main
Signed-off-by: Tucker Fowler <[email protected]>
if (hasActionArchiveCache && externalCachingEnabled)
{
    executionContext.Output($"Saving archive file to cache at '{cacheArchiveFile}'");
    Directory.CreateDirectory(Path.GetDirectoryName(cacheArchiveFile));
    File.Copy(archiveFile, cacheArchiveFile, true);
}
By adding the bit of code in the PR we are able to step into the middle of the actions download functionality and save the downloaded archive to the cache folder without any change to the rest of the logic flow. On subsequent runs, that archive is found and used by all runners as intended.
Further, we also attempted to mount the NFS as the temp folder, just as a “let’s try it and see what happens” kind of thing, knowing that we would be persisting far too much data. That also did not work.
According to our testing, a mounted NFS cache directory, whether for the “tool” or the “action” cache, HAS to live outside of any of the directories created by the runner. This means that for a mounted persistent cache to work, the actions caching process HAS to function the way the tool cache functions, i.e., by persisting the downloaded archive to the provided cache directory.
requested by Account Manager -
We are building out a NOVEL approach to actions and tool caching in ARCv2 that requires a small change to the runner's action caching process.
Right now the runner has some mechanisms for caching tools and actions to decrease workflow run time on GitHub-hosted runners. However, the caches currently depend on pre-deployment cache population using the https://github.com/actions/action-versions scripts in a multi-stage Docker build.
During a fast track engagement with Ken Muse, we implemented those caches in our self-hosted runners. However, we use an inordinate number of tools and actions across our organization, so the time to build the Docker image with those caches is measured in hours and the result is a massive Docker image.
We have found a way to cache in real time using a many-to-one GCP Filestore NFS share mounted as a Persistent Volume in our GKE cluster deployment of the actions runner scale set. The Persistent Volume Claim can then be mounted into the runner container's filesystem.
The tool cache, which is a standing GitHub solution, uses an environment variable to point to a custom directory and stores downloaded tools in that directory. The flow is as follows: check the cache for the tool; if found, use it; if not found, download it, persist it in the tool cache directory, and then use it; repeat. This makes it possible to mount our NFS to an arbitrary directory and have the downloaded tools persisted in that directory for further use.
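The tool cache flow described above can be sketched as a read-through cache. This is a minimal Python model, not the runner's or toolkit's actual implementation: `get_tool`, the `name/version` directory layout, and the `download` callback are all hypothetical names chosen for illustration.

```python
import tempfile
from pathlib import Path
from typing import Callable


def get_tool(name: str, version: str, cache_dir: Path,
             download: Callable[[Path], None]) -> Path:
    """Return the tool directory, downloading and persisting it on a miss."""
    tool_dir = cache_dir / name / version  # hypothetical layout
    if tool_dir.is_dir():
        return tool_dir  # cache hit: reuse what an earlier run persisted

    # Cache miss: download into a staging directory inside the cache, then
    # rename it into place so a half-finished download never looks like a hit.
    tool_dir.parent.mkdir(parents=True, exist_ok=True)
    staging = Path(tempfile.mkdtemp(dir=tool_dir.parent))
    download(staging)
    staging.rename(tool_dir)
    return tool_dir
```

Because the miss path writes into `cache_dir` itself, pointing `cache_dir` at an NFS mount is enough for every runner sharing that mount to see the tool after the first download.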
The actions cache mechanism is different, however. The directory that the actions cache environment variable points to is only used to check for previously downloaded actions. The cache directory is in no way used to store downloaded actions so that they can be persisted. The logic flow is as follows: check the cache directory for the action; if found, use it; if not found, download it -> move it to _temp -> expand it to a random GUID-named file -> use the decompressed action; end. This makes it impossible to persist actions in the same way that the tool cache persists tools. In other words, the actions cache is never populated with downloaded actions, even when the cache variable is provided.
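To make the contrast concrete, here is a small Python model of the actions flow as described above. It is a sketch under assumptions, not the runner's C# code: `resolve_action`, its parameters, and the flat `cache_dir / archive_name` layout are hypothetical.

```python
import shutil
import uuid
from pathlib import Path
from typing import Callable


def resolve_action(archive_name: str, cache_dir: Path, temp_dir: Path,
                   download: Callable[[Path], None]) -> Path:
    """Return a directory holding the expanded action."""
    cached = cache_dir / archive_name  # hypothetical layout
    if cached.is_file():
        archive = cached  # the cache is only ever read here
    else:
        # Miss: the archive lands in the temp directory and stays there.
        archive = temp_dir / archive_name
        download(archive)

    # Expand into a randomly named location under the temp directory.
    dest = temp_dir / uuid.uuid4().hex
    dest.mkdir(parents=True)
    shutil.unpack_archive(archive, dest)  # format inferred from the extension
    return dest
```

Nothing in the miss branch writes to `cache_dir`, so in this model a second run with the same `archive_name` downloads again unless something outside the function populated the cache beforehand.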
This pull request implements a logic gate that depends on an environment variable being set to true: if remote caching is enabled, we copy the downloaded SHA.tar.gz/zip to the actions cache location so that all runners across the ARC ecosystem have access to the cached actions/tools in real time after the first download.
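The gate can be modeled in a few lines of Python. This is an illustrative sketch only: the variable name comes from this PR, but `maybe_save_to_cache`, `external_caching_enabled`, and the case-insensitive comparison against "true" are assumptions, and the runner's own parsing of the variable may differ.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional

ENV_FLAG = "ACTIONS_RUNNER_ACTION_ARCHIVE_EXTERNAL_CACHING_ENABLED"


def external_caching_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Assumed parsing: enabled only when the variable equals 'true'."""
    env = os.environ if env is None else env
    return env.get(ENV_FLAG, "").strip().lower() == "true"


def maybe_save_to_cache(archive_file: Path,
                        cache_archive_file: Optional[Path],
                        env: Optional[Mapping[str, str]] = None) -> bool:
    """Copy a freshly downloaded archive into the cache when the gate is open."""
    # Both conditions must hold: a cache location exists and the flag is set.
    if cache_archive_file is None or not external_caching_enabled(env):
        return False  # unchanged behaviour: nothing is persisted
    cache_archive_file.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(archive_file, cache_archive_file)  # overwrites if present
    return True
```

With the flag unset the function is a no-op, which corresponds to keeping the current download-without-caching behaviour as the default.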
This solution has been tested out using the contribution docs.
This added functionality makes the use of real time NFS tool and actions caching possible.
Actual Changes
Adds the environment variable ACTIONS_RUNNER_ACTION_ARCHIVE_EXTERNAL_CACHING_ENABLED, which maintains current actions caching functionality (downloads actions without caching) by default while allowing for external caching functionality when set to true.