Skip to content

aksarkar/anmf

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Amortized NMF

This package provides an implementation of NMF with performance comparable to randomized PCA on sparse data, e.g. arising from scRNA-seq of $10^6$ cells at $10^4$ genes.

The key advances are: (1) the use of amortized inference, which allows the model to be fit on a GPU using minibatch SGD; (2) support for sparse matrices, which allows the entire data matrix to be stored on GPU memory.

Installation

pip install https://www.github.com/aksarkar/anmf.git#egg=anmf

Usage

import anmf
import scipy.sparse as ss
import torch.data.utils as td

# This matrix has shape (593844, 16002) and is 99% sparse
x = ss.load_npz('immune-cell-census.npz')

# This automatically moves to the GPU if available
data = anmf.ExprDataset(x)
loader = td.DataLoader(data, batch_size=32, collate_fn=data.collate_fn)
model = anmf.ANMF(input_dim=x.shape[1], hidden_dim=128, latent_dim=10)
model.fit(loader, max_epochs=10)

# In amortized inference, we learn a function from data to loadings
l = model.loadings(x)
f = model.factors

Releases

No releases published

Packages

No packages published

Languages