Wishlist: Vulkan backend for Linux #1751

Open
tidux opened this issue Jan 4, 2025 · 3 comments
Labels: enhancement, low priority, wontfix

Comments


tidux commented Jan 4, 2025

The Asahi Linux project has shipped Vulkan drivers for Apple's M-series chips. Adding support for this stack would allow better cluster management (e.g. exo) for multi machine mlx setups. It would also, in theory, allow running AI models on other unified memory hardware with Linux support, such as AMD APUs. This would actually fulfill a major design goal of Vulkan (and its AMD ancestor Mantle) by providing a common library for unified compute, without the overhead of OpenCL.

Member

awni commented Jan 6, 2025

This is a pretty major undertaking and it's unlikely we will have bandwidth to work on it in the near future.

We'd need a Vulkan runtime back-end and presumably we'd need to rewrite all of the compute kernels in OpenGL SL (?).

In theory it's all doable. Given that the Vulkan API is similar to Metal, it should be pluggable as a back-end in MLX. It would be pretty interesting to see what this looks like if someone is interested in working on it.

> Adding support for this stack would allow better cluster management (e.g. exo) for multi machine mlx setups

Curious, why so?

awni added the enhancement, wontfix, and low priority labels Jan 6, 2025
Author

tidux commented Jan 6, 2025

> Curious, why so?

Linux lets you pass GPU devices into a container, even in a Kubernetes cluster. This means that rather than running something like ansible arm_mac_cluster -a "exo 1>/dev/null 2>&1 &!" to start the MLX-using processes, you can apply a Kubernetes Deployment manifest and run it like a real cluster program. If you aren't using exo and need to run your own MLX jobs via a batch framework, there are Kubernetes options like Argo, or non-containerized, HPC-focused batch frameworks. Linux also won't throw a screaming fit if you try to run new binaries without manually approving them through the GUI on each host.
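
Editor's sketch of the Deployment approach described above, not something posted in the thread. It builds a minimal Deployment manifest as JSON, which kubectl accepts alongside YAML. The extended resource name `example.com/gpu`, the image name, and the helper `mlx_deployment` are all hypothetical placeholders; a real cluster would use whatever resource name its GPU device plugin advertises.

```python
import json

# Hypothetical placeholders: neither name comes from the thread.
GPU_RESOURCE = "example.com/gpu"  # extended resource a device plugin would advertise
IMAGE = "registry.example.com/mlx-worker:latest"


def mlx_deployment(name: str, replicas: int, gpus_per_pod: int = 1) -> dict:
    """Return a minimal apps/v1 Deployment that requests GPUs for each pod."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "worker",
                            "image": IMAGE,
                            # Extended resources are requested under limits.
                            "resources": {"limits": {GPU_RESOURCE: gpus_per_pod}},
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(mlx_deployment("mlx-worker", replicas=4), indent=2))
```

Assuming the script is saved as make_deployment.py, the output could be piped straight to the cluster with `python make_deployment.py | kubectl apply -f -`.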

@alyssarosenzweig

> without the overhead of OpenCL.

What does this mean? Asahi Linux also ships conformant OpenCL 3.0 drivers. I would expect similar performance between OpenCL and Vulkan compute, since both use the same backend compiler.

> We'd need a Vulkan runtime back-end and presumably we'd need to rewrite all of the compute kernels in OpenGL SL (?).

Not necessarily: anything that can compile to appropriate SPIR-V would work, and that includes HLSL, GLSL, etc.

For OpenCL, it'd likely be OpenCL C, but it could also be anything that targets appropriate SPIR-V.
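
Editor's sketch of the GLSL-to-SPIR-V route mentioned above, not code from the thread or from MLX. It assumes the glslangValidator tool from the Khronos glslang project is installed; the elementwise-add shader and the helper names (`glslang_command`, `looks_like_spirv`, `compile_glsl`) are illustrative only. The one hard fact used is the SPIR-V magic number 0x07230203, which opens every SPIR-V module.

```python
import shutil
import struct
import subprocess
import tempfile
from pathlib import Path

SPIRV_MAGIC = 0x07230203  # first 32-bit word of every SPIR-V module

# Illustrative compute kernel: c[i] = a[i] + b[i].
ADD_GLSL = """#version 450
layout(local_size_x = 64) in;
layout(std430, set = 0, binding = 0) readonly buffer A { float a[]; };
layout(std430, set = 0, binding = 1) readonly buffer B { float b[]; };
layout(std430, set = 0, binding = 2) writeonly buffer C { float c[]; };
void main() {
    uint i = gl_GlobalInvocationID.x;
    if (i < uint(a.length())) c[i] = a[i] + b[i];
}
"""


def glslang_command(src, out):
    """Command line that compiles GLSL to SPIR-V under Vulkan semantics (-V)."""
    return ["glslangValidator", "-V", str(src), "-o", str(out)]


def looks_like_spirv(blob: bytes) -> bool:
    """Check the magic number in either byte order."""
    if len(blob) < 4:
        return False
    little = struct.unpack("<I", blob[:4])[0]
    big = struct.unpack(">I", blob[:4])[0]
    return SPIRV_MAGIC in (little, big)


def compile_glsl(source: str):
    """Return SPIR-V bytes, or None when glslangValidator is not installed."""
    if shutil.which("glslangValidator") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "add.comp"  # the .comp suffix selects the compute stage
        out = Path(tmp) / "add.spv"
        src.write_text(source)
        subprocess.run(glslang_command(src, out), check=True, capture_output=True)
        return out.read_bytes()


if __name__ == "__main__":
    blob = compile_glsl(ADD_GLSL)
    print("tool not found" if blob is None else looks_like_spirv(blob))
```

The same SPIR-V check would apply to modules produced from HLSL or any other front end, which is the point of the comment above: the driver consumes SPIR-V, not a particular source language.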

3 participants