Dear anndataR team,
The memory usage of anndataR is much higher than that of the Python interface of anndata.
For the same file (https://datasets.cellxgene.cziscience.com/df829d08-839f-4f77-8581-5edc9705a5a7.h5ad), the memory usage from Python is around 3.19 GB:
```
              total        used        free      shared  buff/cache   available
Mem:       65420264    26791720    33263416       30476     5365128    38050976
Swap:             0           0           0
```
And the memory usage from anndataR is around 5.95 GB, even when the data matrix is not loaded:
```
Error reading element X of type <csr_matrix>
'R_Calloc' could not allocate memory (1971718190 of 8 bytes)

> system('free')
              total        used        free      shared  buff/cache   available
Mem:       65420252    49919548    14864860       34552      635844    14910220
Swap:             0           0           0
```
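For reproducibility, the read that produced this was roughly as follows (a minimal sketch; the exact `read_h5ad()` call and local file path are assumptions on my part, the file itself is the one linked above):

```r
# Minimal sketch of the read that triggers the error above
# (call and local path are illustrative).
library(anndataR)

adata <- read_h5ad("df829d08-839f-4f77-8581-5edc9705a5a7.h5ad")
```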
After forcing garbage collection, memory is rapidly freed up:
```
> gc()
           used  (Mb) gc trigger    (Mb)   max used    (Mb)
Ncells  6336582 338.5   13590895   725.9    8839828   472.1
Vcells 13390530 102.2 4377879324 33400.6 5930311734 45244.7

> system('free')
              total        used        free      shared  buff/cache   available
Mem:       65420252     3617396    61166988       34552      635868    61212372
Swap:             0           0           0
```
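For what it's worth, this is roughly how I bracketed the read with OS-level and R-level memory reports (a sketch using only `system('free')` and `gc()`, as in the outputs above; the helper function name is mine):

```r
# Report OS-level and R-level memory around the read (illustrative helper).
report_mem <- function(label) {
  cat("==", label, "==\n")
  system("free")   # OS view, as shown in the outputs above
  print(gc())      # R view (Ncells/Vcells, max used)
}

report_mem("before read")
adata <- tryCatch(
  anndataR::read_h5ad("df829d08-839f-4f77-8581-5edc9705a5a7.h5ad"),
  error = function(e) {
    message("read failed: ", conditionMessage(e))
    NULL
  }
)
report_mem("after read")
```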
The same out-of-memory error is raised when reading the same dataset on a cluster with much larger RAM (45.44 GB), where the process is automatically killed.
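In case it helps narrow things down, I could also try a backed read so that X stays on disk. If I understand the API correctly (the `to` argument and the `"HDF5AnnData"` value are assumptions on my part), that would look like:

```r
# Possible way to keep X on disk instead of materialising it in memory
# (assumes read_h5ad() accepts to = "HDF5AnnData"; please correct me if not).
adata_backed <- anndataR::read_h5ad(
  "df829d08-839f-4f77-8581-5edc9705a5a7.h5ad",
  to = "HDF5AnnData"
)
```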
Would you please look into this issue, as there might be something in the backend causing this excessive RAM usage?
Many thanks,
YH