[macOS] Window freezes and flashes magenta unpredictably on random file loads #17

Closed
foxnne opened this issue Jul 11, 2023 · 9 comments · Fixed by #44

Owner

foxnne commented Jul 11, 2023

This has been a long-standing issue since moving to zig-gamedev. I'm unsure of the cause and it seems to be rather random: some file loads trigger it and some don't, and it's not the same files every time. It can be forced to happen by loading many files, e.g. by packing a full project.

I suspect that the issue is related to this; however, I'm unable to verify that until the Dawn lib is updated.

I did try to debug the application using Xcode, which did at least reveal the following error messages:

2023-07-11 13:09:07.434221-0500 Pixi[71341:647781] Metal GPU Frame Capture Enabled
2023-07-11 13:09:07.434349-0500 Pixi[71341:647781] Metal API Validation Enabled
info: [zgpu] High-performance device has been selected:
info: [zgpu]   Name: Apple M2
info: [zgpu]   Driver: Metal driver on macOS Version 13.4.1 (Build 22F82)
info: [zgpu]   Adapter type: discrete_gpu
info: [zgpu]   Backend type: metal
2023-07-11 13:09:18.445450-0500 Pixi[71341:647781] +[CATransaction synchronize] called within transaction
2023-07-11 13:09:19.581168-0500 Pixi[71341:647781] [default] CGSWindowShmemCreateWithPort failed on port 0
2023-07-11 13:09:45.459204-0500 Pixi[71341:648487] Execution of the command buffer was aborted due to an error during execution. Caused GPU Timeout Error (00000002:kIOGPUCommandBufferCallbackErrorTimeout)
...
2023-07-11 13:09:45.465631-0500 Pixi[71341:648487] Execution of the command buffer was aborted due to an error during execution. Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored)
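For anyone comparing captures of this bug, here is a minimal sketch (not part of Pixi) that tallies the Metal command-buffer error tags in a saved Xcode console log and reports when the first one appeared. The function name `summarize_gpu_errors` and both regular expressions are assumptions based only on the line format shown in the log above; other macOS versions may print these lines differently.

```python
import re
from collections import Counter

# Assumed format of the error tag at the end of a failed command buffer line,
# e.g. "(00000002:kIOGPUCommandBufferCallbackErrorTimeout)".
ERROR_TAG = re.compile(r"\((\d{8}):(kIOGPUCommandBufferCallbackError\w+)\)")
# Assumed format of the leading wall-clock timestamp, without the UTC offset.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)")


def summarize_gpu_errors(log_text):
    """Count command-buffer errors by name.

    Returns (counts, first_seen), where first_seen is the timestamp of the
    first error line that carries one, or None if no such line exists.
    """
    counts = Counter()
    first_seen = None
    for line in log_text.splitlines():
        tag = ERROR_TAG.search(line)
        if tag is None:
            continue  # not a command-buffer error line
        counts[tag.group(2)] += 1
        if first_seen is None:
            stamp = TIMESTAMP.match(line)
            if stamp:
                first_seen = stamp.group(1)
    return counts, first_seen
```

Running it over several captures would show whether the timeout always precedes the "submissions ignored" errors and how long after startup the first timeout appears, which may help when reporting the problem upstream.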
foxnne added the "help wanted" label on Jul 11, 2023
foxnne added this to the 0.1 milestone on Jul 11, 2023
foxnne self-assigned this on Jul 11, 2023
Owner Author

foxnne commented Aug 9, 2023

After moving to mach-core, which at the time was the only way to get an updated Dawn binary, this issue is nearly solved. The massive memory leak we previously had in Dawn now seems to be gone, or at least very much reduced.

However, the bug remains. It's way harder to predict now: I can spam the Pack Project button, loading several files at once, packing everything, and releasing the files, and the issue will not occur for a large number of attempts. I have not found a reliable way to reproduce it; all I know is that it eventually happens on a file load.

I have distributed debug statements through the loading function, and when I did eventually trigger the bug, all of the debug statements were still printed. In other words, the loading code itself ran to completion, which leads me to believe that it's still Dawn having the issue on macOS.

As much as I would love to fix this now, I just do not know where to begin to do so. I have spoken with slimsag and it seems the initial leak was not present in his testing on an older macOS version. This bug could be something that only happens on the OS version I'm on (13.4.1) with Dawn.

foxnne removed this from the 0.1 milestone on Aug 18, 2023
Owner Author

foxnne commented Oct 5, 2023

I'm fairly certain this has to do with Dawn. Other users are experiencing similar issues here: zig-gamedev/zig-gamedev#411

Contributor

kamidev commented Oct 5, 2023

> As much as I would love to fix this now, I just do not know where to begin to do so. I have spoken with slimsag and it seems the initial leak was not present in his testing on an older macOS version. This bug could be something that only happens on the OS version I'm on (13.4.1) with Dawn.

I also saw the magenta flashing on macOS 13.4.1. But my current memory freeze problems are on macOS 14.0. I get no flashing there, just a complete freeze that forces me to reboot (see zig-gamedev/zig-gamedev#411).

Update: If I run video streams or other memory-intensive things in the background, the glfw apps do flash magenta on macOS 14.0, too.

foxnne pinned this issue on Oct 7, 2023
Owner Author

foxnne commented Oct 9, 2023

> Update: If I run video streams or other memory-intensive things in the background, the glfw apps do flash magenta on macOS 14.0, too.

Not sure if you have a handy way of testing, but I've noticed that for me the behavior mainly happens when I'm loading files. I think that's similar to what you describe. In Pixi I can basically trigger a file load on a button press, and it's fairly easy to repeat the behavior using that.

Owner Author

foxnne commented Dec 19, 2023

Update: After updating to Sonoma 14.1.2, it seems this issue is worse. I experience the magenta screen and hard freeze far more frequently. I spoke with pdoane and it seems this issue is not present in sysgpu, mach's answer to Dawn. I believe there are a few blockers here before Pixi can use sysgpu but I'll be swapping over as soon as possible.

Contributor

kamidev commented Dec 19, 2023

> Update: After updating to Sonoma 14.1.2, it seems this issue is worse. I experience the magenta screen and hard freeze far more frequently. I spoke with pdoane and it seems this issue is not present in sysgpu, mach's answer to Dawn. I believe there are a few blockers here before Pixi can use sysgpu but I'll be swapping over as soon as possible.

Thanks! I just updated to 14.1.2, haven't checked yet. Interesting that sysgpu is unaffected!

Owner Author

foxnne commented Dec 21, 2023

I've created a sysgpu branch that uses a set of compatible generated ImGui bindings. I'm really hoping to have that swapped over tomorrow, but swapping ImGui bindings takes a ton of changes. Anyway, when that's done it will be a good test to make sure the issue isn't present in their implementation, and if it works well on all platforms I'll merge it into the main branch and finally close this issue. It's been driving me crazy trying to draw some assets with all the constant freezes.

Owner Author

foxnne commented Dec 22, 2023

Good news! I've completed the painful task of switching ImGui bindings, which allows us to easily swap between Dawn and sysgpu backends. I can confirm I have had zero freezes/crashes with sysgpu. However, I will keep the sysgpu branch separate and develop there until a single bug that causes broken Windows builds gets ironed out. Once that fix is merged, I'll merge the sysgpu branch into main and we can finally close this awful issue. :)

Owner Author

foxnne commented Jan 17, 2024

I've merged the sysgpu branch into main. Currently there are a few misalignments between the development of pixi and mach-sysgpu/mach-core, so it's not set as the default yet. However, you can easily enable it by running `zig build run -Duse_sysgpu=true`. We will be frozen on this current version of mach-core until sysgpu is re-enabled.
