Strong impact of number of ADT PCs on WNN UMAP? #9482

erflynn · 2024-11-14T19:20:31Z

I'm running WNN following the tutorial on a dataset of about 200,000 skin cells. I noticed that the choice of the number of ADT PCs has a huge impact on the UMAP -- the UMAP clearly looks better with more ADT PCs, despite the elbow plot indicating we should likely use fewer. I am using the exact same settings (30 RNA PCs) across runs and vary only the number of ADT PCs. Seurat version 4.3.0.

wnn <- FindMultiModalNeighbors(
  wnn,
  reduction.list = list("pca", "apca"),
  dims.list = list(1:30, 1:n_adt_pcs),
  modality.weight.name = "RNA.weight",
  prune.snn = 1/20 # adjusted to avoid small clusters
)
wnn <- RunUMAP(
  wnn,
  nn.name = "weighted.nn",
  reduction.name = "wnn.umap",
  reduction.key = "wnnUMAP_"
)

Have you noticed this before? Is this expected behavior? What would you recommend doing to proceed?

erflynn · 2024-11-20T21:08:45Z

Following up on this with a reprex --
Using the bmcite dataset and code in the [WNN tutorial](https://satijalab.org/seurat/articles/weighted_nearest_neighbor_analysis), we also see the pattern that too few PCs from ADT or RNA lead to poor UMAPs; however, the numbers at which this occurs are much lower (e.g. 3 or 5 PCs rather than 10 or 15). It does seem to be the case that after "enough" PCs, the UMAP looks relatively similar? (Though in other cases, particularly in sub-clustering, we've found that "too many" PCs make the UMAP look worse.)
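A sweep like the one described above could be sketched roughly as follows. This is only an illustration, not the exact code that was run: `sweep_adt_pcs` and its arguments are hypothetical names, the grid of PC counts is arbitrary, and it assumes the object already carries the "pca" (RNA) and "apca" (ADT) reductions built as in the tutorial.

```r
# Hypothetical helper (not part of Seurat or the tutorial): vary the number
# of ADT PCs while holding the RNA PCs fixed, and keep one WNN UMAP plot
# per setting. Assumes `obj` already has "pca" and "apca" reductions.
sweep_adt_pcs <- function(obj, adt_pc_grid = c(3, 5, 10, 18), n_rna_pcs = 30) {
  plots <- list()
  for (n_adt in adt_pc_grid) {
    # Rebuild the weighted neighbor graph with the current ADT PC count
    tmp <- Seurat::FindMultiModalNeighbors(
      obj,
      reduction.list = list("pca", "apca"),
      dims.list = list(1:n_rna_pcs, 1:n_adt),
      modality.weight.name = "RNA.weight"
    )
    # Embed the weighted neighbor graph
    tmp <- Seurat::RunUMAP(
      tmp,
      nn.name = "weighted.nn",
      reduction.name = "wnn.umap",
      reduction.key = "wnnUMAP_"
    )
    # Store the plot under a name that records the setting
    plots[[paste0("adt_pcs_", n_adt)]] <-
      Seurat::DimPlot(tmp, reduction = "wnn.umap") +
      ggplot2::ggtitle(paste(n_adt, "ADT PCs"))
  }
  plots
}
```

With the tutorial's object (called `bm` there), something like `patchwork::wrap_plots(sweep_adt_pcs(bm))` should lay the embeddings out side by side for comparison. Note that UMAP layouts also differ between runs for reasons unrelated to the PC count, so the panels are best compared for overall cluster separation rather than exact positions.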

I understand that a low number of PCs does not capture the variation in the data, making the nearest-neighbor space for that modality poor. Since WNN considers how well both the RNA and ADT PCs predict cell identity in each space when determining the weights, this adds some noise.

However, I do not understand why, in the original example, selecting more than 15 ADT PCs results in a much cleaner WNN UMAP even though it does not make sense given the elbow plot. We've also checked, and these additional PCs do not associate with RNA cell type; ADT-only UMAPs with more PCs still look very messy. Can you lend any insight as to why adding these PCs would be helpful? Given this, how do you recommend choosing the number of ADT PCs in larger datasets?
