Add uwot.approx_pow option to RunUMAP for better reproducibility #9449

Open
wants to merge 1 commit into
base: develop

@daskelly daskelly commented Nov 1, 2024

Dear Seurat Team,

The issue of UMAP reproducibility with the uwot method has been raised before (#6722), and as noted in that PR it was discussed in a uwot issue, jlmelville/uwot#46. Specifically, the observation is that OS differences in the underlying implementation of some C/C++ libraries cause very small numerical differences across OSs, and these in turn produce different UMAP results. An option that corrects this was added in uwot version 0.1.8, as discussed in jlmelville/uwot#46. Given that Seurat requires uwot >0.1.10, I thought it would be useful to expose this option in Seurat's RunUMAP functions.
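For context, here is a minimal sketch of the underlying uwot option, called directly rather than through Seurat. It assumes the uwot package is installed and uses the built-in iris data purely as a stand-in; the variable names are illustrative, and it does not show how this PR wires the option into RunUMAP.

```r
# Sketch only: call uwot::umap() directly with and without approx_pow.
# iris is a placeholder dataset; min_dist/spread are set so that the
# gradient actually involves a power computation.
library(uwot)

X <- as.matrix(iris[, 1:4])

set.seed(42)
emb_exact <- umap(X, n_neighbors = 15, min_dist = 0.3, spread = 1,
                  approx_pow = FALSE)

set.seed(42)
emb_approx <- umap(X, n_neighbors = 15, min_dist = 0.3, spread = 1,
                   approx_pow = TRUE)

# Both are n x 2 coordinate matrices; they are not expected to match
# each other, since the two settings follow different numerical paths.
dim(emb_exact)
dim(emb_approx)
```

With approx_pow = TRUE, uwot uses an approximation to the power function in the UMAP gradient instead of the platform's own implementation, which is the source of the cross-OS differences described above.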

I tested this option on two systems (Mac and CentOS). With uwot.approx_pow = FALSE the UMAPs differ between the two systems, but with uwot.approx_pow = TRUE the results are identical. I am not sure why #6722 noted that approx_pow did not solve the issue -- it does solve the problem in my example. Details of the test, a simple UMAP on pbmc_small, are below:

Running on Mac laptop

remotes::install_github('daskelly/seurat@umap-approx-pow')  # assumes the remotes package is installed
suppressPackageStartupMessages(library(Seurat))
Sys.info()['sysname']
##  sysname 
## "Darwin"
RunUMAP(pbmc_small, verbose = FALSE, dims = 1:5, uwot.approx_pow = FALSE) |>
  Embeddings(object = _, 'umap') |> head(3)
##                  umap_1    umap_2
## ATGCCAGAACGACT 4.692863  1.759652
## CATGGCCTGTGCAT 5.494108  1.453728
## GAACCTGATGAACC 2.188469 -5.069190
RunUMAP(pbmc_small, verbose = FALSE, dims = 1:5, uwot.approx_pow = TRUE) |>
  Embeddings(object = _, 'umap') |> head(3)
##                   umap_1    umap_2
## ATGCCAGAACGACT 0.1691852 0.2062277
## CATGGCCTGTGCAT 1.4661827 0.7792036
## GAACCTGATGAACC 4.8212224 0.0872156

Running on CentOS

remotes::install_github('daskelly/seurat@umap-approx-pow')  # assumes the remotes package is installed
suppressPackageStartupMessages(library(Seurat))
Sys.info()['sysname']
## sysname 
## "Linux"
RunUMAP(pbmc_small, verbose = FALSE, dims = 1:5, uwot.approx_pow = FALSE) |>
  Embeddings(object = _, 'umap') |> head(3)
##                  umap_1    umap_2
## ATGCCAGAACGACT 2.765658  3.131156
## CATGGCCTGTGCAT 2.465478  2.309039
## GAACCTGATGAACC 2.971336 -2.502957
RunUMAP(pbmc_small, verbose = FALSE, dims = 1:5, uwot.approx_pow = TRUE) |>
  Embeddings(object = _, 'umap') |> head(3)
##                   umap_1    umap_2
## ATGCCAGAACGACT 0.1691852 0.2062277
## CATGGCCTGTGCAT 1.4661827 0.7792036
## GAACCTGATGAACC 4.8212224 0.0872156

The results above show that the UMAP embeddings are identical across OSs only when uwot.approx_pow = TRUE.
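The printed values above are rounded to about seven significant digits, so a stricter check is to compare the full embedding matrices. A minimal sketch in base R follows; `compare_embeddings` and the file paths are hypothetical helpers for illustration, not part of Seurat or this PR. On each machine one would first save the matrix with `saveRDS(Embeddings(obj, 'umap'), path)`.

```r
# Hypothetical helper: compare two embedding matrices saved with saveRDS()
# on different machines. Reports exact equality and the largest deviation.
compare_embeddings <- function(path_a, path_b) {
  a <- readRDS(path_a)
  b <- readRDS(path_b)
  stopifnot(identical(dim(a), dim(b)))
  list(
    identical    = identical(a, b),
    max_abs_diff = max(abs(a - b))
  )
}

# Self-contained demonstration with placeholder matrices standing in
# for embeddings produced on two systems.
m1 <- matrix(c(0.5, 1.5, 2.5, 3.5), nrow = 2)
m2 <- m1
m3 <- m1 + 1e-9

f1 <- tempfile(fileext = ".rds"); saveRDS(m1, f1)
f2 <- tempfile(fileext = ".rds"); saveRDS(m2, f2)
f3 <- tempfile(fileext = ".rds"); saveRDS(m3, f3)

same   <- compare_embeddings(f1, f2)
differ <- compare_embeddings(f1, f3)
```

Here `same$identical` is TRUE, while `differ$identical` is FALSE even though the two matrices would print identically at the default precision.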
