Add precompilation and generate custom sysimage for production #73
Description
TL;DR: We need to add generation of a custom Julia sysimage to the Docker build process.
Julia is a strongly typed dynamic language with a JIT compilation process.
Because of its dynamic nature, each method is compiled the first time it is called with a given set of argument types.
When developing packages, this is okay as things typically change quite rapidly.
In production contexts, it incurs a significant cold startup time that could be avoided.
I've experimented with combinations of two approaches to reduce this startup time: including precompilation signatures and generating a custom system image (a "sysimage").
I used the script below, with additional timing information added to initialise_data():
using TOML
using ReefGuideAPI
cfg = TOML.parsefile(".config.toml")
@time ReefGuideAPI.initialise_data(cfg)
Precompilation signatures are directives to the Julia compiler so it is aware what methods should be compiled prior to first use.
This reduces the time to first use (referred to as TTFX, or Time-To-First-X, where X is whatever arbitrary work needs to be done) at the cost of package load times.
- https://julialang.org/blog/2021/01/precompile_tutorial/
- https://github.com/JuliaLang/PrecompileTools.jl
- https://github.com/rikhuijzer/PrecompileSignatures.jl
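For context, a precompilation directive can also be written by hand with Base's `precompile` function. The sketch below is illustrative only: `scale` is a hypothetical function standing in for package code, not something from ReefGuide.

```julia
# Hypothetical function standing in for package code.
scale(x::Float64, factor::Int) = x * factor

# Request compilation for one concrete signature before the first call.
# `precompile` returns `true` when compilation for that signature succeeded.
compiled = precompile(scale, (Float64, Int))

println("precompiled: ", compiled)
println("result: ", scale(2.5, 4))
```

When such a directive sits at the top level of a package module, the compiled code can be stored in the package's precompile cache, which is roughly what the tools linked above automate.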
I use PrecompileSignatures.jl to generate these signatures as well as PrecompileTools.jl to force compilation of additional methods defined outside of ReefGuide.
The below is added to ReefGuide:
using PrecompileSignatures: @precompile_signatures
using PrecompileTools
# ... usual package code ...
# Auto-generate precompilation signatures for ReefGuide
@precompile_signatures(ReefGuideAPI)
# Force precompilation of methods that slow down initial use.
@setup_workload begin
    @compile_workload begin
        GeoParquet.read(joinpath(pkgdir(ReefGuideAPI), "assets", "dummy.parq"))
        GDF.read(joinpath(pkgdir(ReefGuideAPI), "assets", "dummy.gpkg"))
    end
end
In addition to the above, I compile a custom sysimage with PackageCompiler.jl.
Using a custom sysimage should virtually eliminate package load times.
However, it should only be used for production deployments, as packages compiled into a sysimage cannot be updated without rebuilding the sysimage.
Timings are below.
For each configuration, I run the script mentioned above twice, each time in a fresh Julia session.
The first represents a completely fresh environment.
The second represents an environment that has some package code already precompiled, as might happen in a Docker container.
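One way to separate package load time from TTFX in a single script is sketched below. This is an assumption about how such a split could be measured, not the exact instrumentation used for the numbers that follow; `TOML` stands in for `ReefGuideAPI` and `first_call` is a hypothetical stand-in for `ReefGuideAPI.initialise_data(cfg)`.

```julia
# Time package loading separately from the first call (TTFX).
t0 = time_ns()
using TOML                               # stand-in for `using ReefGuideAPI`
load_seconds = (time_ns() - t0) / 1e9

# Hypothetical workload standing in for `ReefGuideAPI.initialise_data(cfg)`.
first_call() = TOML.parse("name = \"demo\"")

first_seconds = @elapsed first_call()    # includes any JIT compilation
second_seconds = @elapsed first_call()   # method is already compiled

println("package load: ", load_seconds, " s")
println("first call:   ", first_seconds, " s")
println("second call:  ", second_seconds, " s")
```

The gap between the first and second call gives a rough idea of how much of the TTFX is compilation rather than actual work.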
Before any changes
First run (fresh environment):
Initialise data: 18.763472 seconds (81.43 M allocations: 5.875 GiB, 18.52% gc time, 77.83% compilation time: 2% of which was recompilation)
Total startup time including package load: 49.976926 seconds (100.74 M allocations: 6.975 GiB, 8.33% gc time, 30.50% compilation time: 5% of which was recompilation)
Second run (package code already precompiled):
Initialise data: 18.633717 seconds (81.43 M allocations: 5.875 GiB, 17.79% gc time, 75.64% compilation time: 2% of which was recompilation)
Total startup time including package load: 26.232391 seconds (91.18 M allocations: 6.398 GiB, 14.65% gc time, 55.75% compilation time: 5% of which was recompilation)
With precompile signatures
First run (fresh environment):
Initialise data: 15.782065 seconds (65.51 M allocations: 5.071 GiB, 22.82% gc time, 73.08% compilation time: 3% of which was recompilation)
Total startup time including package load: 72.208752 seconds (85.03 M allocations: 6.184 GiB, 5.87% gc time, 16.99% compilation time: 6% of which was recompilation)
Second run (package code already precompiled):
Initialise data: 16.247786 seconds (65.51 M allocations: 5.071 GiB, 23.22% gc time, 69.85% compilation time: 3% of which was recompilation)
Total startup time including package load: 24.201942 seconds (75.42 M allocations: 5.604 GiB, 17.75% gc time, 49.20% compilation time: 6% of which was recompilation)
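As indicated above, the initial package load time increases (total startup on the first run rises from ~50 secs to ~72 secs) in exchange for a slight decrease in TTFX (initialise_data() drops from ~18.7 secs to ~16 secs).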
In our specific case, adding more precompilation directives will likely make initial package load times even longer.
So what if we bake ReefGuide into the Julia sysimage?
using PackageCompiler
using ReefGuideAPI
create_sysimage(["ReefGuideAPI"]; sysimage_path="ReefGuideSysImage.dll")
# Need to tell Julia to compile for all deployment relevant architectures later; cpu_target=""
# Now should be able to start with custom sysimage:
# julia -q -J ReefGuideSysImage.dll
Note: Compilation of the custom sysimage takes 8-9 mins on my laptop.
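The `cpu_target` comment above could be addressed along the following lines. This is a sketch under assumptions, not a tested build script: the target string is the multi-architecture value used for official x86-64 Julia builds and would need adjusting for other deployment hardware, and `Libdl.dlext` is used so the file extension matches the build platform (for example `so` inside a Linux container rather than `dll`).

```julia
using Libdl            # provides `Libdl.dlext`, the platform's shared-library extension
using PackageCompiler

# Assumption: x86-64 deployment targets only. Covers a generic baseline plus
# optimised code paths for newer CPU families.
cpu_target = "generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)"

# Name the sysimage with the extension native to the build platform.
sysimage_path = "ReefGuideSysImage." * Libdl.dlext

create_sysimage(["ReefGuideAPI"]; sysimage_path=sysimage_path, cpu_target=cpu_target)
```

Building for several targets is likely to lengthen compilation and enlarge the resulting file, so it may be worth restricting the list to the architectures actually deployed to.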
With custom sysimage
First run:
Initialise data: 10.726559 seconds (43.51 M allocations: 4.013 GiB, 26.24% gc time, 53.83% compilation time)
Total startup time including package load: 10.740834 seconds (43.52 M allocations: 4.013 GiB, 26.20% gc time, 53.86% compilation time)
Second run:
Initialise data: 10.767349 seconds (43.52 M allocations: 4.013 GiB, 27.89% gc time, 54.52% compilation time)
Total startup time including package load: 10.778395 seconds (43.52 M allocations: 4.013 GiB, 27.86% gc time, 54.54% compilation time)
In other words, package load times are virtually eliminated, and it takes ~10 secs to run initialise_data().
About 4 secs of this is spent loading the raster stack, which could be removed once the new improvements are confirmed to be working.
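In a deployment it may be useful to confirm that the process really started from the custom sysimage rather than the default one. A minimal sketch, assuming the sysimage file name begins with `ReefGuideSysImage` as in the build step above (the prefix is otherwise hypothetical):

```julia
# Path of the system image the current Julia process was started with.
sysimage_path = unsafe_string(Base.JLOptions().image_file)

# Assumed file-name prefix of the custom sysimage.
expected_prefix = "ReefGuideSysImage"
using_custom = startswith(basename(sysimage_path), expected_prefix)

println("sysimage: ", sysimage_path)
println("custom sysimage in use: ", using_custom)
```

Logging this at service startup would make it easier to spot a container that silently fell back to the stock sysimage and is paying the full package load cost.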