
Run Nvidia Vulkan headless without Xorg! #24

boberfly opened this issue Feb 27, 2019 · 17 comments
@boberfly
Contributor

Hi all,

I was looking around and discovered something kind of amazing that can potentially remove the need for the nv_vulkan_wrapper entirely. This is what I did:

  1. Make a file for testing in $HOME/.local/share/vulkan/icd.d/nvidiaegl_icd.json
  2. Fill it with the following, which is a copy of Nvidia's standard ICD except that it points to a different library (I am using 415.27 of Nvidia's driver):
    { "file_format_version" : "1.0.0", "ICD": { "library_path": "libEGL_nvidia.so.0", "api_version" : "1.1.84" } }
  3. Copy primus_vk.json to $HOME/.local/share/vulkan/implicit_layer.d/ as usual, pointing to the actual primus_vk.so and NOT the nv_vulkan_wrapper at all.
  4. Run a Vulkan app with just the layer enabled: ENABLE_PRIMUS_LAYER=1 your_vulkan_app (there is a shell sketch of this right after the list).
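
For anyone who wants to reproduce this quickly, here is a minimal shell sketch of the steps above. It assumes primus_vk.json is in the current directory and uses the api_version from my 415.27 driver; adjust both for your own setup.

mkdir -p $HOME/.local/share/vulkan/icd.d $HOME/.local/share/vulkan/implicit_layer.d
# User-local ICD that loads Nvidia's EGL library instead of the GLX one
cat > $HOME/.local/share/vulkan/icd.d/nvidiaegl_icd.json <<'EOF'
{ "file_format_version" : "1.0.0", "ICD": { "library_path": "libEGL_nvidia.so.0", "api_version" : "1.1.84" } }
EOF
# Implicit layer manifest, pointing at primus_vk.so (not nv_vulkan_wrapper)
cp primus_vk.json $HOME/.local/share/vulkan/implicit_layer.d/
# Run any Vulkan app with the layer enabled
ENABLE_PRIMUS_LAYER=1 your_vulkan_app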

I'm doing this with The-Forge, and the log file now tells me it is using my Nvidia card instead of my AMD card. I am also not using an Optimus laptop here, so this seems to work on a dual-GPU setup in a regular PC. My display is going through a Vega Frontier Edition, but rendering happens on a Quadro K2000 (I will test on an RTX 2070 later and see if it works).

I have noticed some crazy stutter though, so there might be something else that needs to be fixed here.

@leonmaxx

leonmaxx commented Feb 27, 2019

Quite interesting, I will surely test this approach this weekend. I will probably need to modify bumblebee so it does not start Xorg, or create a script that enables the discrete video card and loads the nvidia modules for testing.

Edit: optirun already has a --no-xorg option :).
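
That would make the whole thing a one-liner. As a hedged example (assuming a working bumblebee install plus the ICD/layer setup from the first comment):

# power up the discrete card, load the nvidia modules, and run headless
ENABLE_PRIMUS_LAYER=1 optirun --no-xorg your_vulkan_app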

@leonmaxx

I have noticed some crazy stutter though

Does your test app have V-Sync enabled?

@boberfly
Contributor Author

@leonmaxx possibly, you're right that V-Sync might be the issue here. Maybe I need to apply that modesetting flag too; I'll try that when I get the chance, cheers for the tip!

As for the bumblebee question, that is probably right, but I can't use it here as I get some error. I am not worried about power usage on my workstation though, so no big deal... :)

@boberfly
Contributor Author

One thing I noticed is that vulkaninfo will display that the device does exist, but it segfaults when looking for available outputs, which makes sense as we are loading a Vulkan instance from EGL in an environment it is not expecting (perhaps that modesetting flag nvidia-drm.modeset=1 is the key to enabling this?).
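
For anyone wanting to try that flag: it is the standard nvidia-drm kernel mode setting switch, not something specific to this layer, so it can be enabled either on the kernel command line or via modprobe configuration, e.g.:

# on the kernel command line
nvidia-drm.modeset=1
# or in /etc/modprobe.d/nvidia-drm.conf
options nvidia-drm modeset=1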

@leonmaxx

@boberfly If you use kernel 4.17+, you can use in-kernel PCIe power management. All you need to do is unload the nvidia kernel modules and set /sys/bus/pci/*device*/power/control to auto, and the kernel will put your card into a suspended state. It will wake up automatically when the nvidia kernel modules are loaded.
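
A rough sketch of that sequence, assuming the discrete card sits at a PCI address like 0000:01:00.0 (substitute your own device, e.g. as reported by lspci):

# unload the nvidia modules so nothing holds the card open
sudo modprobe -r nvidia_drm nvidia_modeset nvidia_uvm nvidia
# let the kernel (4.17+) runtime-suspend the device
echo auto | sudo tee /sys/bus/pci/devices/0000:01:00.0/power/control
# verify: should report "suspended" once the card is idle
cat /sys/bus/pci/devices/0000:01:00.0/power/runtime_status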

@leonmaxx

If interested, I use this udev rule to apply power management automatically on boot (nvrtpm.rules):

ACTION=="add|change", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030200", ATTR{power/control}="auto"
ACTION=="add|change", SUBSYSTEM=="pci", ATTR{vendor}=="0x10de", ATTR{class}=="0x030000", ATTR{power/control}="auto"
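
In case it helps anyone reproduce this: a typical way to install such a rule is to drop the file into /etc/udev/rules.d/ and reload udev (the exact rule filename/priority is up to you):

sudo cp nvrtpm.rules /etc/udev/rules.d/
sudo udevadm control --reload
sudo udevadm trigger --subsystem-match=pci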

@boberfly
Contributor Author

Sadly, anything past kernel 4.15 absolutely does not work on my machine: the amdgpu driver doesn't seem to work anymore in 4.17 (it did before), and from 4.18 onwards there is something weird with my hardware topology that causes a kernel panic on startup, something I need to file a bug about but haven't had the chance or patience yet...

@leonmaxx

I can confirm it's working! I just launched WoT with the suggested changes to the .json and ENABLE_PRIMUS_LAYER=1 optirun --no-xorg wine WorldOfTanks.exe, and it works!

@leonmaxx

If anyone wants to test, I added a primus-vk-headless package to my copr repository, which works without Xorg.

@boberfly
Contributor Author

Good to hear @leonmaxx. Is the performance decent, and do you have nvidia-drm.modeset=1 set, with vsync on/off?

@boberfly
Contributor Author

I have a patch set coming soon which adds 2 environment variables to select which GPU does what via vendorID:deviceID hex numbers; at least in my case I need this with 2 discrete GPUs.
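
Purely as an illustration of the idea (the variable names and IDs below are placeholders, not necessarily what the patch will use):

# hypothetical variable names; values are vendorID:deviceID in hex, e.g. as shown by lspci -nn
PRIMUS_VK_DISPLAY_DEVICE=1002:6863 PRIMUS_VK_RENDER_DEVICE=10de:0ffe ENABLE_PRIMUS_LAYER=1 your_vulkan_app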

@leonmaxx

Performance is good, I have a stable 60 FPS with V-Sync on. Mouse latency seems to be better.
I do not have nvidia-drm.modeset=1 set. I use only the nvidia kernel module; the other modules, nvidia-drm and nvidia-modeset, are disabled using alias ... off.
My notebook has a GeForce 1050 without hardware outputs (PCI device class 0x302, 3D accelerator), which is possibly why it works without modeset.
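
For completeness, that alias ... off trick is plain modprobe configuration, something like the following in a file under /etc/modprobe.d/ (the filename is arbitrary):

# /etc/modprobe.d/disable-nvidia-kms.conf
alias nvidia-drm off
alias nvidia-modeset off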

@leonmaxx

And I didn't notice any stuttering.

@boberfly
Contributor Author

boberfly commented Mar 1, 2019

@leonmaxx hey, I got vsync working; it has fixed the hitching, but it is very, very slow on The-Forge unfortunately.

Also, I just made a PR which allows setting env vars for which GPU to use for display and rendering, if you want to test it, but I guess you don't need it on Optimus laptops...

@wirr00

wirr00 commented May 10, 2019

This seems to be working like a charm; in fact, it is the only way for me to get wine/dxvk up and running on an (Optimus) Nvidia GTX 860M using proprietary drivers v418.56 on Debian. If I can contribute some test results, please let me know.

@felixdoerre
Owner

@wirr00 I also have the proprietary drivers v418.56 and am using primus_vk as described in the README (with wrapper and libGL.so).

Generally, regarding the idea of always using the Vulkan driver from libEGL.so: I have no idea what the real difference is between the different Vulkan drivers shipped by Nvidia. On my system I can see ICDs in libEGL_nvidia.so.0, libGL.so.1, and libGLX_nvidia.so.0. I cannot tell what the difference is between those versions and which version has what advantages. I'd like to stick to libGL, as this seems to be the Vulkan ICD that is installed "normally". However, libGL seems to misbehave in that it requires the secondary X server from bumblebee, while libEGL_nvidia does not. Does anyone know of any documentation/explanation of what these libraries are supposed to be for and what their differences are?
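
One quick way to check which of these libraries the installed ICD manifests actually reference is to grep the manifests (paths vary by distro; these are the common locations):

grep -H library_path /usr/share/vulkan/icd.d/*.json /etc/vulkan/icd.d/*.json $HOME/.local/share/vulkan/icd.d/*.json 2>/dev/null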

@boberfly
Contributor Author

Hi @felixdoerre
Not sure about documentation, but my assumption is that Linux distros will eventually default to Wayland. GLX context creation was always tied to the X server, while on Wayland you would use EGL for context creation, so that is probably why Nvidia ships two libraries, one for each situation. I think the end goal is to use Xwayland for legacy GLX contexts once Wayland takes over from the X server as the default, but Nvidia will ship the GL/GLX lib for years to come while stable distros like RHEL/CentOS use X as the default...
