solver: add per-step CPU and memory resource limits #6569
jirimoravcik wants to merge 1 commit into moby:master
Conversation
Pull request overview
Adds support for specifying per-exec Linux cgroup CPU/memory constraints via LLB OpMetadata (so limits affect runtime execution but not the cache key), plumbed from the Dockerfile frontend/client through the solver into OCI spec generation.
Changes:
- Introduces LinuxResources into pb.OpMetadata and propagates it through vertex options, exec op handling, and executor metadata.
- Adds client-side LLB APIs (WithLinuxResources, MemoryLimit, CPUQuota, etc.) and Dockerfile/frontend attribute parsing to set these limits.
- Extends OCI spec generation (Linux-only) to apply the resource limits, with integration/unit tests covering behavior and cache-key invariance.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| solver/types.go | Extends VertexOptions to carry Linux resource constraints. |
| solver/pb/ops.proto | Adds LinuxResources message + linux_resources field to OpMetadata. |
| solver/pb/ops.pb.go | Regenerates protobuf Go bindings for new metadata field/message. |
| solver/pb/ops_vtproto.pb.go | Regenerates vtproto fast-paths to include LinuxResources. |
| solver/pb/caps.go | Introduces and registers capability exec.meta.linux.resources. |
| solver/llbsolver/vertex.go | Propagates OpMetadata.LinuxResources into vertex options. |
| solver/llbsolver/ops/exec.go | Threads linux resource limits into executor.Meta for exec. |
| frontend/dockerui/config.go | Adds new frontend attr keys + stores parsed linux resources in config. |
| frontend/dockerui/attr.go | Implements parsing of linux resource attrs into pb.LinuxResources. |
| frontend/dockerfile/dockerfile2llb/convert.go | Applies linux resource constraints to RUN when cap supported. |
| frontend/dockerfile/dockerfile_test.go | Adds integration test verifying cgroup memory limit from Dockerfile attrs. |
| executor/executor.go | Extends executor.Meta to include LinuxResources. |
| executor/oci/spec.go | Hooks linux resource option generation into spec creation pipeline. |
| executor/oci/spec_linux.go | Implements OCI spec mutations for CPU/memory/cpuset limits on Linux. |
| executor/oci/spec_windows.go | Returns a clear error for linux resource limits on Windows. |
| executor/oci/spec_freebsd.go | Returns a clear error for linux resource limits on FreeBSD. |
| executor/oci/spec_darwin.go | Returns a clear error for linux resource limits on Darwin. |
| client/llb/meta.go | Adds public llb.LinuxResources struct for the client API. |
| client/llb/state.go | Adds metadata plumbing and constraint opts for linux resources. |
| client/llb/exec.go | Emits capability when linux resources are present in metadata. |
| client/llb/exec_test.go | Adds unit tests for metadata presence, merge behavior, and cache-key invariance. |
| client/client_test.go | Adds integration test validating cgroup limits during solves. |
Do I understand correctly that when the same step has different resource definitions, it is undefined which definition is actually used for the step that runs?
Hmm, yeah, that is problematic. Do you think having the limits as part of the cache key would be better? I think it's common to set the same limits for many builds, which would still work correctly and would prevent the "undefined" behavior present in the current implementation.
No, having them in the cache key would make the steps run multiple times, and cache import would not work if the resource limits do not match, which I think is worse. Ideally, we can make it so that the more relaxed resource limits win. Even having the build that was started first win would be an improvement. These cases may be simpler to implement if the resources are not part of the LLB checksum (e.g., they can't be inside Meta then). If this becomes too complicated, just documenting that the behavior is undefined may be an option.
Ok, makes sense. I implemented logic that chooses the most "relaxed" resources. Happy to address any feedback. Thanks!
Add support for setting cgroup resource limits (memory, memory-swap, cpu-shares, cpu-period, cpu-quota, cpuset-cpus, cpuset-mems) on individual build steps. Signed-off-by: Jiří Moravčík <jiri.moravcik@gmail.com>
Force-pushed from c48b5d8 to 7546f2e
@tonistiigi Hello, I rebased the PR to have only one commit. Can you run the CI? Thanks!
Adds support for setting cgroup resource limits on individual build steps: memory, memory-swap, cpu-shares, cpu-period, cpu-quota, cpuset-cpus, and cpuset-mems. These correspond to the legacy Docker build API. When several builds set different limits for the same step, the most relaxed limits are chosen. The limits do not affect the cache key.
I think we should document that BuildKit steps can run in parallel anyway, and mention that these limits only apply to a single step. (On the other hand, I still feel this can be useful.)
Closes #593