Harvester memory leakage [Bug] #16925
Comments
Can you provide logs? Does this happen consistently? Are all plots healthy (no lookup failures due to bad drives, etc.)? |
In addition to the log, can you give us a bit more info about your farm as well? How many plots, compression levels, if you have a mix of classic (non-compressed) and compressed, etc. |
For example, on a PC with a full node: K33 10 8%. The total is about 60 TiB, distributed across three PCs (one full node + 2 harvesters). I am experiencing the same issue on all three. One of the harvesters has 32 GB of RAM. The other PCs have 6 GB. |
Same problem when using compressed plot farming on Windows 10. |
I'm experiencing this same issue, but only with my remote harvester, and I have a large footprint. It's a Dell server with 2x 8-core 2.1 GHz CPUs, 384 GB of RAM, and 40,708 C05 plots. It doesn't seem to matter what I set the swapfile to. After less than a day, a virtual memory exhaustion event shows up in the System Logs and the node stops providing any plots to my main farmer. start_harvester.exe continues to run most of the time; it just does nothing, so my PowerShell script that monitors it doesn't trigger an alert. I had previously run all of these plots off of my main harvester, which also had about 37,000 more plots. It didn't have virtual memory issues like this. It did have rather high harvester latency, which is why I moved half of the plots to a remote harvester. |
I found the solution to this problem! At least, this fixed it on both my farmer and my remote harvester. It looks like I had encountered this problem on my original farmer 6 months ago, but had forgotten what the fix was. I had modified a PowerShell script that I'd found online to fix it. After running this, I've not had any low virtual memory errors. The issue is that Windows Defender keeps trying to scan or block some of the files involved (I'm not sure if it's the executables or the plots). I assume that it gets in the middle of the plot checks, so the harvester has to abandon them and redo the checks, but doesn't release the memory used. Over time, this creates a problem on the system. So, I modified this script to exclude the chia directories, the key executables, and all .plot files. For some reason, I can't find where I originally got the PowerShell script. Attached is my version if anyone else wants to use it. Note that my file has a .txt extension, but you'll need to change that to .ps1 and run it within PowerShell. |
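For readers without the attachment, the kind of exclusions described in this comment could be generated roughly as follows. This is a minimal sketch, not the attached script: the plot directory `C:\ChiaPlots` and the function name `build_defender_exclusions` are assumptions for illustration only. The one external piece it relies on is the `Add-MpPreference` cmdlet with its `-ExclusionPath`, `-ExclusionProcess`, and `-ExclusionExtension` parameters. The code only builds the command strings, so they can be reviewed before being pasted into an elevated PowerShell session.

```python
def build_defender_exclusions(directories, processes, extensions):
    """Return PowerShell commands that add Windows Defender exclusions.

    Nothing is executed here; the caller decides whether to run the commands.
    """
    def quote(value):
        # PowerShell single-quoted strings escape a quote by doubling it.
        return "'" + value.replace("'", "''") + "'"

    commands = []
    for path in directories:
        commands.append(f"Add-MpPreference -ExclusionPath {quote(path)}")
    for name in processes:
        commands.append(f"Add-MpPreference -ExclusionProcess {quote(name)}")
    for ext in extensions:
        commands.append(f"Add-MpPreference -ExclusionExtension {quote(ext)}")
    return commands


if __name__ == "__main__":
    # Hypothetical plot directory; replace with the real install and plot paths.
    for command in build_defender_exclusions(
        directories=[r"C:\ChiaPlots"],
        processes=["start_harvester.exe"],
        extensions=[".plot"],
    ):
        print(command)
```

Keeping the command generation separate from execution makes it easy to check exactly what will be excluded, since broad exclusions reduce what Defender scans.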
Thank you for your solution, but I always disable MSWD (fxxk MSWD) as soon as Windows is installed, so I don't think this will solve my problem. Anyway, I will try it. And if the problem truly is caused by MSWD, I will want to beat the shit out of it. My workaround is to use the GUI instead of the CLI. The GUI farmer still leaks, but for some reason memory usage levels off when it gets close to the maximum virtual memory and does not grow further, even if you close the GUI and restart it without restarting the OS. However, the GUI starts to act strangely after running for a long time (about 2 weeks; this problem is long-standing and existed before the Chia version with compressed plots), and you will need to restart it. With the CLI, however, each time you invoke a chia process (such as "chia farm summary" to check the farmer), memory usage grows, and this eventually leads to OOM. My farmer server only runs chia and uTorrent (for PT downloads, using very little memory), so I can tolerate a memory leak with an upper limit, and it now works well. However, I would still like to see a final solution to this problem. |
Darn. I hoped that what I'd found would help someone else. Are you running any other antivirus or anti-malware on your system? I experienced the OOM error even when I ran the full node GUI on this system, as well--though not as often as when running only the harvester. The Windows Defender exclusions fixed it both for the full GUI and the harvester. |
Maybe MSWD did not actually shut down; I will check later. |
My workaround is a daily harvester reboot using a task scheduler. The memory leak is not very severe, but of course it is not an elegant solution. |
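A scheduled restart like this could also be made conditional on memory use. The sketch below is an illustration under stated assumptions, not anything posted in the thread. It assumes that restarting the harvester service (rather than the whole machine) is enough, that `chia` is on the PATH, and that the 8 GiB limit is a reasonable placeholder. It parses the output of `tasklist /FO CSV /NH`, whose "Mem Usage" column reports the working set rather than committed virtual memory, so it is only a rough proxy for the leak. The function names are hypothetical.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(output):
    """Map PID -> memory usage in KiB from `tasklist /FO CSV /NH` output."""
    usage = {}
    for row in csv.reader(io.StringIO(output)):
        if len(row) < 5:
            continue  # skips informational lines such as "INFO: No tasks ..."
        pid, mem = row[1], row[4]
        # "19,857,224 K" -> 19857224; tolerant of locale-specific separators.
        digits = "".join(ch for ch in mem if ch.isdigit())
        if pid.isdigit() and digits:
            usage[int(pid)] = int(digits)
    return usage


def needs_restart(usage_kib, limit_gib):
    """True if any listed process exceeds the limit (given in GiB)."""
    limit_kib = limit_gib * 1024 * 1024
    return any(kib > limit_kib for kib in usage_kib.values())


def check_and_restart(limit_gib=8.0, image="start_harvester.exe"):
    """Restart the harvester if its memory use is above the limit (Windows only)."""
    listing = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    if needs_restart(parse_tasklist_csv(listing), limit_gib):
        # Assumption: `chia start harvester -r` restarts only the harvester.
        subprocess.run(["chia", "start", "harvester", "-r"], check=True)
        return True
    return False
```

Running `check_and_restart()` from a scheduled task every few minutes would restart the harvester only when needed, and it would also catch the case mentioned earlier in the thread where the process is still alive but no longer doing useful work due to memory exhaustion.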
Can you guys make sure Windows Defender and other antivirus software are out of the picture (use exclusions or, worst case, disable them)? Plots are big files, so I can see how they might cause issues with scanners and memory usage. Let us know if you can diagnose the issue further. It seems like this is system-specific, since it happens only on certain machines. |
I've tried various solutions. I turned off the antivirus, added exclusions, uninstalled antivirus software, and disabled various background applications, but there is still a leak when the harvester is running. If chia is not running, there is no leak. I don't understand what the reason might be. |
Can you check the task list and see whether the chia harvester is allocating the memory or whether it is some other process? We appreciate the screenshots, and I've been using Google Translate on my phone to look at them, but we don't see the process name listed on the memory displays. Thanks for your report! The fact that you are not using compressed plots greatly reduces the search surface, so we would be interested in hearing your results. Also, which patched version of Windows 10 are you running? Which CPU? We still think this is machine-specific. |
I am continuing to study this bug. I left the PC farming chia overnight. The following indicators increased overnight: committed memory, cached memory, and paged pool. I looked at it in RAMMap, and I think chia does not work correctly with the plot files and the blockchain database file itself. As I understand it, Windows caches them in RAM. In RAMMap, you can clear this cache by clicking "Empty Standby List". The memory is cleared instantly and there is no need to restart the PC. There is even a small program on GitHub that can do this. You can configure it to run in the Windows Task Scheduler, for example, once an hour. Apparently, we can only wait for the developers to fix it. |
Empty standby list only cleans up RAM. It does not help with a huge swap. |
Yes, it really only cleans up RAM, but the SWAP file grows just when RAM is overflowing. After I added the Empty standby list |
Closing. Reopen if you still see this with recent versions.
What happened?
The harvester crashes after running for a few days. It appears to be a memory leak.
Resource-Exhaustion-Detector:
Windows successfully diagnosed a low virtual memory condition. The following programs consumed the most virtual memory: start_harvester.exe (5716) consumed 20,333,797,376 bytes
It started after enabling compressed plot farming.
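To put the figure from the Resource-Exhaustion-Detector message above into perspective, the byte count can be converted to GiB. The following is a small sketch that assumes the event text keeps the `name (pid) consumed N bytes` shape quoted in this report; the function name `parse_consumers` is hypothetical.

```python
import re

# Matches entries such as "start_harvester.exe (5716) consumed 20,333,797,376 bytes".
CONSUMER = re.compile(
    r"(?P<name>\S+) \((?P<pid>\d+)\) consumed (?P<bytes>[\d,]+) bytes"
)


def parse_consumers(message):
    """Return (process name, pid, GiB consumed) for each entry in the message."""
    results = []
    for match in CONSUMER.finditer(message):
        size = int(match.group("bytes").replace(",", ""))
        results.append((match.group("name"), int(match.group("pid")), size / 2**30))
    return results


if __name__ == "__main__":
    text = (
        "The following programs consumed the most virtual memory: "
        "start_harvester.exe (5716) consumed 20,333,797,376 bytes"
    )
    for name, pid, gib in parse_consumers(text):
        print(f"{name} (PID {pid}): {gib:.2f} GiB")  # 18.94 GiB for this report
```

About 18.94 GiB of virtual memory for the harvester alone is roughly three times the 6 GB of physical RAM on the weakest machine described here.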
Weakest PC info:
Windows 10, 6 GB RAM, GTX 1660, 23 GB free on C: after reboot
The problem persists on 3 different PCs.
Current config:
Version
2.1.1
What platform are you using?
Windows
What ui mode are you using?
GUI
Relevant log output
No response