Dear anndataR team,
The memory usage of anndataR is much higher than that of the Python interface of anndata.
For the same file (https://datasets.cellxgene.cziscience.com/df829d08-839f-4f77-8581-5edc9705a5a7.h5ad), the memory usage from Python is around 3.19 GB:
```
              total        used        free      shared  buff/cache   available
Mem:       65420264    26791720    33263416       30476     5365128    38050976
Swap:             0           0           0
```
And the memory usage from anndataR is around 5.95 GB, even when the data matrix is not loaded:
```
Error reading element X of type <csr_matrix>
'R_Calloc' could not allocate memory (1971718190 of 8 bytes)

> system('free')
              total        used        free      shared  buff/cache   available
Mem:       65420252    49919548    14864860       34552      635844    14910220
Swap:             0           0           0
```
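For reproducibility, the read that produced this was roughly as follows (a minimal sketch; the exact `read_h5ad()` call and local file path are assumptions on my part, the file itself is the one linked above):

```r
# Minimal sketch of the read that triggers the error above
# (call and local path are illustrative).
library(anndataR)

adata <- read_h5ad("df829d08-839f-4f77-8581-5edc9705a5a7.h5ad")
```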
After forcing garbage collection, memory is rapidly freed up:
```
> gc()
           used  (Mb) gc trigger    (Mb)   max used    (Mb)
Ncells  6336582 338.5   13590895   725.9    8839828   472.1
Vcells 13390530 102.2 4377879324 33400.6 5930311734 45244.7

> system('free')
              total        used        free      shared  buff/cache   available
Mem:       65420252     3617396    61166988       34552      635868    61212372
Swap:             0           0           0
```
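For what it's worth, this is roughly how I bracketed the read with OS-level and R-level memory reports (a sketch using only `system('free')` and `gc()`, as in the outputs above; the helper function name is mine):

```r
# Report OS-level and R-level memory around the read (illustrative helper).
report_mem <- function(label) {
  cat("==", label, "==\n")
  system("free")   # OS view, as shown in the outputs above
  print(gc())      # R view (Ncells/Vcells, max used)
}

report_mem("before read")
adata <- tryCatch(
  anndataR::read_h5ad("df829d08-839f-4f77-8581-5edc9705a5a7.h5ad"),
  error = function(e) {
    message("read failed: ", conditionMessage(e))
    NULL
  }
)
report_mem("after read")
```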
The same out-of-memory error is raised when reading the same dataset on a cluster with much larger RAM (45.44 GB), where the process is automatically killed.
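In case it helps narrow things down, I could also try a backed read so that X stays on disk. If I understand the API correctly (the `to` argument and the `"HDF5AnnData"` value are assumptions on my part), that would look like:

```r
# Possible way to keep X on disk instead of materialising it in memory
# (assumes read_h5ad() accepts to = "HDF5AnnData"; please correct me if not).
adata_backed <- anndataR::read_h5ad(
  "df829d08-839f-4f77-8581-5edc9705a5a7.h5ad",
  to = "HDF5AnnData"
)
```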
Would you please look into this issue, as there might be something in the backend causing this excessive RAM usage?
Many thanks,
YH