# Improve handling of parallel CUDA stacks #179
**Possible approach: explicit matrix layers**

Define a matrix layer (for example `pytorch`); higher layers that depend on it are automatically duplicated for each variant of the matrix layer.

Sketch (requires reserving `{` and `}` in layer names):

```toml
[[runtimes]]
name = "cpython-3.11"
# ...

[[frameworks]]
name = "pytorch-{variant}"
requirements = [
    "torch==2.7.0",
]
variants = {
    "cpu" = {
        "requirements" = [],
    },
    "cu128" = {
        "requirements" = [
            "torch @ https://download.pytorch.org/whl/cu128/torch-2.7.0%2Bcu128-cp311-cp311-win_amd64.whl",
        ],
    },
}
# ...

[[frameworks]]
name = "docling"
# Builds against pytorch-cpu, runs against any variant
frameworks = ["pytorch-{cpu}"]
# ...

[[applications]]
name = "docling-{variant}"
# Builds against every pytorch variant
# Use a comma separated list to build against a subset of variants
# Only one `*` is permitted in the framework list
frameworks = ["pytorch-{*}"]
# ...
```

The application definition above would be shorthand for:

```toml
[[applications]]
name = "docling-{variant}"
variants = {
    "cpu" = {
        "frameworks" = ["pytorch-cpu"],
    },
    "cu128" = {
        "frameworks" = ["pytorch-cu128"],
    },
}
# ...
```

Frameworks would also be permitted to specify star-variant dependencies, which would be suitable for use cases like splitting …

The sketch uses the "variant" terminology because this feature is intended specifically for the same cases as the "wheel variants" work. Locking two different variants of a layer should give a consistent set of requirements; if they're arbitrarily different, then those are different layer definitions, not layer variants. The syntax is designed such that, as the standardisation work progresses, we should be able to specify the relevant wheel variant selection criteria as part of the layer variant definitions.
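For illustration, the star-variant shorthand described above could be expanded mechanically into explicit per-variant definitions. A rough Python sketch (`expand_star_variants` is a hypothetical helper for this issue, not part of the venvstacks API; configs are modelled as plain dicts):

```python
# Hypothetical sketch: expand a "layer-{*}" framework reference into one
# concrete definition per variant of the matrix layer it refers to.
def expand_star_variants(app, variant_names):
    """Return the explicit variants form of an application definition.

    `app` is a dict with "name" and "frameworks" keys; any "{*}" in a
    framework reference is replaced once per variant name.
    """
    expanded = {}
    for variant in variant_names:
        frameworks = [fw.replace("{*}", variant) for fw in app["frameworks"]]
        expanded[variant] = {"frameworks": frameworks}
    return {
        # The name keeps its "{variant}" placeholder; it is resolved
        # per-variant when the concrete layers are built.
        "name": app["name"],
        "variants": expanded,
    }


app = {"name": "docling-{variant}", "frameworks": ["pytorch-{*}"]}
print(expand_star_variants(app, ["cpu", "cu128"]))
```

With the two variants from the sketch above, this yields the same explicit `variants` table that the shorthand is defined to mean.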
---

**Possible approach: compatible framework layers**

Instead of automatically duplicating higher layers, allow higher layers to declare "one of X" style dependencies (potentially based on a layer tagging mechanism rather than naming specific layers). One layer is nominated as the layer to use when building, and the resulting layer lock is checked for consistency with the other compatible layers.

Edit: after expanding on the explicit matrix idea, it's hard to see any real benefits in requiring users to define each compatible layer separately. That kind of flexibility makes sense when deployment is a true "mix and match" exercise, but outside wheel variants, that isn't the intended usage model for venvstacks.
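The lock-consistency check mentioned above could look something like the following sketch (hypothetical helper, not venvstacks code; a layer lock is modelled as a simple mapping from project name to pinned version or URL, with an explicit set of projects that the variants are allowed to pin differently):

```python
# Hypothetical sketch: verify that two locked variants of a layer agree
# on everything except the projects the variant is allowed to override.
def check_variant_consistency(lock_a, lock_b, allowed_to_differ):
    """Raise ValueError if the two locks are not variant-compatible."""
    # Both variants must pin exactly the same set of projects.
    if set(lock_a) != set(lock_b):
        mismatched = sorted(set(lock_a) ^ set(lock_b))
        raise ValueError(f"Variants pin different project sets: {mismatched}")
    # Pins may only differ for the projects the variant exists to change.
    conflicts = sorted(
        name
        for name in lock_a
        if lock_a[name] != lock_b[name] and name not in allowed_to_differ
    )
    if conflicts:
        raise ValueError(f"Inconsistent pins outside variant scope: {conflicts}")


cpu_lock = {"torch": "2.7.0", "numpy": "2.2.0"}
cu128_lock = {"torch": "2.7.0+cu128", "numpy": "2.2.0"}
# Passes: only "torch" differs, and that difference is expected.
check_variant_consistency(cpu_lock, cu128_lock, allowed_to_differ={"torch"})
```

If the locks were arbitrarily different, the check would fail, which matches the rule above that such layers are separate layer definitions rather than variants of one layer.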
---

Just to make sure we're on the same page: the problem we want to avoid is having two copies of the same docling dependencies on users' machines. When we want to deploy an incremental upgrade to the docling framework, we want to deploy only one docling framework layer and two app layers: one for CUDA and one for CPU. The matrix solution sounds like it would put two copies of the docling framework on users' machines. I think these two issues are relevant: …
---

Given the dealbreaker, I reworked the matrix idea to allow for framework layers that are built against a default version of a matrix layer, but are expected to be runtime compatible with all the variants of that layer.
---

Pre-requisite steps before embarking on the wheel variant support: …
---

The matrix idea looks reasonable to me. Wanted to point out two things: (1) the …
---

The Python ecosystem doesn't currently have a universal way of handling variations in low level hardware support outside the combination of operating system and CPU architecture represented in wheel platform tags. While there is work in progress to comprehensively address that issue via "wheel variants", `venvstacks` still needs its own mechanism for handling this problem (preferably in a way that will work with, rather than against, the ongoing wheel variant design work).

As a concrete example, consider the following scenario: `docling` is to be exposed via a `venvstacks` application layer. For this scenario, the desired outcome is that there is a single `docling` framework layer that the app layer combines with the relevant `pytorch` layers.

Some potential approaches are considered in the comments below.

Other issues potentially impacted by this one:

- `show` command that summarises a parsed stack definition (#159): as the mapping from the stack config as written to the deployed stacks gets more complex, it becomes harder to visualise