
Large memory use #30

Open
davidvancemartin opened this issue Jun 17, 2019 · 4 comments


@davidvancemartin

Great software! I'm running into a memory issue, though, that I was hoping you could help with. I have a simple for loop with on the order of ~50-500 iterations, and inside each one I run wotan.flatten on a Kepler light curve (long 30-minute cadence, so arrays of ~60,000 elements). The aim is to assess how well different detrending methods perform.

I am finding, according to Activity Monitor, that a very large amount of memory is being used, on the order of tens of GB. It also makes it impossible to cancel out of the Python loop with Ctrl-C.

Even a single call to wotan.flatten takes about 100 MB of memory per run, and that memory does not seem to be released until I restart Python.

Have you encountered this before? Is there an easy way of freeing the memory? I have tried garbage collection (gc.collect()) without luck.
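(As an aside: one general workaround for leaks in compiled extensions, when gc.collect() doesn't help, is to run each iteration in a short-lived child process, so the OS reclaims all of its memory when the process exits. This is a generic sketch using only the standard library; the payload below is a placeholder, not actual wotan code.)

```python
import subprocess
import sys

# Placeholder payload: in practice this would call wotan.flatten and
# write its result to disk; here it just simulates a large allocation.
payload = "data = [0.0] * 60_000; print(len(data))"

for _ in range(3):
    # Each run starts a fresh interpreter; everything it allocates is
    # returned to the OS when the child process exits.
    result = subprocess.run([sys.executable, "-c", payload],
                            capture_output=True, text=True, check=True)
    print(result.stdout.strip())
```

The trade-off is the per-iteration interpreter startup cost, which is usually negligible next to a 60,000-point detrending run.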

Thanks!

@hippke
Owner

hippke commented Jun 18, 2019

Thanks! This sounds like the same numba memory leak I encountered in the past. Can you check whether it goes away with method='medfilt'? Not that you should use the median; it's just to check the memory.

@davidvancemartin
Author

Hey Michael, thank you for the response. I just checked: method='median' (medfilt raised an error) and method='mean' do not seem to have a significant memory leak. 500 iterations using method='mean' took about 150 MB of memory (although that usage does seem to persist until Python is restarted).

Specifying method='biweight' (or not setting the method at all, which I believe defaults to 'biweight') seems to be the cause of the issue.

@hippke
Owner

hippke commented Jun 18, 2019

Thanks! I have narrowed the problem down to a known bug in numba, which is discussed here. It will probably be fixed in the next numba release. In the meantime, I will try to implement a workaround and will post updates on the progress here.
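(A quick way to confirm that numba's compiled code is responsible, using a standard numba feature rather than anything wotan-specific: disable the JIT via the NUMBA_DISABLE_JIT environment variable before numba is imported. Decorated functions then run as plain Python, slowly, but without the compiled code path that leaks.)

```python
import os

# Must be set before numba (or anything that imports it, e.g. wotan)
# is imported, otherwise the JIT is already active.
os.environ["NUMBA_DISABLE_JIT"] = "1"

# import numba        # @numba.jit functions now execute as plain Python
# from wotan import flatten
print("JIT disabled:", os.environ["NUMBA_DISABLE_JIT"] == "1")
```

If the memory growth disappears with the JIT disabled, that points squarely at the numba-compiled path.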

Regarding medfilt, you have to give the window_length in cadences, i.e. as an integer number of data points that is not too small. That is probably why you got the error.
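(To illustrate what a window "in cadences" means, here is a hypothetical pure-Python sliding median, not wotan's implementation: window_length counts data points rather than days, so a fixed number of samples slides along the series regardless of the time sampling.)

```python
def sliding_median(values, window_length):
    """Median filter where window_length is in cadences (samples),
    an odd integer >= 3, not a duration in days."""
    if window_length < 3 or window_length % 2 == 0:
        raise ValueError("window_length must be an odd integer >= 3")
    half = window_length // 2
    trend = []
    for i in range(len(values)):
        # Window shrinks at the edges; median of the sorted window.
        window = sorted(values[max(0, i - half):i + half + 1])
        trend.append(window[len(window) // 2])
    return trend

# A single outlier is removed by a 3-cadence window:
print(sliding_median([1, 1, 1, 10, 1, 1, 1], 3))  # -> [1, 1, 1, 1, 1, 1, 1]
```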

And yes, giving no method defaults to the biweight.

@hippke
Owner

hippke commented Jun 18, 2019

Hmm, I'm trying to make a simple example that shows the leak, but I can't reproduce it. Can you run this code on your machine to check whether it leaks? Perhaps it only happens with certain versions of numba, numpy, etc.
My versions are (from pip list):

  • numba 0.42.0
  • numpy 1.16.4
  • wotan 1.0.6

Test code which doesn't leak:

import psutil
import numpy as np
from wotan import flatten

points = 60000
flux = np.random.normal(1, 0.0001, points)  # synthetic flux around 1.0
time = np.linspace(0, 1000, points)
base_mem = psutil.virtual_memory()[3]  # bytes of system memory in use at start

for idx in range(100):
    flatten_lc, trend_lc = flatten(time, flux, method='biweight', window_length=0.2, return_trend=True)
    new_mem = psutil.virtual_memory()[3]
    # Growth since the loop started; should stay flat if nothing leaks.
    print('Memory usage (MB):', (new_mem - base_mem) / 1024 / 1024)
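(An alternative measurement, in case psutil is not installed: the standard-library tracemalloc module tracks Python-level allocations per iteration. One caveat: it only sees allocations made through Python's allocator, so a leak inside numba-compiled native code may not show up here, which makes it a useful cross-check. A sketch with a stand-in workload:)

```python
import tracemalloc

tracemalloc.start()

for idx in range(3):
    # Stand-in workload; replace with the flatten() call when profiling wotan.
    data = [0.0] * 60000
    current, peak = tracemalloc.get_traced_memory()
    print('Python-level memory (MB):', current / 1024 / 1024)

tracemalloc.stop()
```

If tracemalloc stays flat while psutil (whole-process) memory keeps growing, the leak is on the native side rather than in Python objects.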
