Incorrect Layer Naming and Barcode Misalignment When Merging Seurat Objects Missing an Assay #9462

Open
maartenciers opened this issue Nov 8, 2024 · 0 comments
Labels
bug Something isn't working

When merging multiple Seurat objects that include RNA, ADT, HTO, and CMO assays, an issue arises if one object (e.g., NDE005) is missing the CMO assay for experimental reasons. After merging, the layer names of the CMO assay are incorrect: both the layer labels and the barcodes they contain are shifted. This misalignment could severely impact downstream analyses, particularly for researchers who have yet to demultiplex their samples after merging or who did not notice that an assay was missing.

CODE TO GET THE BUG:

seuratObj = merge(x = NDE005_seuratObj,
                  y = c(NDE010_seuratObj,NDE011_seuratObj,NDE012_seuratObj),
                  project = "Cells")

I was merging four Seurat objects, NDE005, NDE010, NDE011, and NDE012, where only NDE005 lacks the CMO assay. After merging, incorrect layer names were observed for the CMO assay:

> seuratObj[["CMO"]]
Assay (v5) data with 12 features for 72632 cells
First 10 features:
 CMO301, CMO302, CMO303, CMO304, CMO305, CMO306, CMO310, CMO312, CMO307,
CMO308 
Layers:
 counts.NDE005, counts.NDE010, counts.NDE011, data.NDE005,
scale.data.NDE005, data.NDE010, scale.data.NDE010, data.NDE011,
scale.data.NDE011 

The layer names should accurately reflect the samples: there should be counts.NDE012, data.NDE012, and scale.data.NDE012, and no counts.NDE005, data.NDE005, or scale.data.NDE005.

Upon further inspection of the layers of the CMO assay, their data are shifted too:

  • counts.NDE005 incorrectly contains barcodes from NDE010
  • counts.NDE010 contains barcodes from NDE011
  • counts.NDE011 contains barcodes from NDE012

colnames(seuratObj[["CMO"]]$counts.NDE005)
#Output:
TTTGTTGTCCCGATCT-NDE010, ... -NDE10, ...

It appears that everything in that assay was shifted by one object, which I only noticed because I always add a sample ID to the barcodes!
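The shift described above could be detected with a check along these lines. This is only a sketch: `find_mislabeled_layers()` is a hypothetical helper (not part of Seurat), and it assumes that every barcode ends in `-<sampleID>` and every layer name ends in `.<sampleID>`, as in my objects. The second barcode in the toy input is made up for illustration.

```r
# Hypothetical helper (not part of Seurat): return the names of layers whose
# barcodes do not carry the sample ID that the layer name claims.
# `layer_barcodes` is a named list: layer name -> character vector of barcodes.
# Assumes layer names end in ".<sampleID>" and barcodes end in "-<sampleID>".
find_mislabeled_layers = function(layer_barcodes) {
  ok = vapply(names(layer_barcodes), function(layer) {
    sample_id = sub("^.*\\.", "", layer)               # text after the last "."
    suffix = sub("^.*-", "", layer_barcodes[[layer]])  # text after the last "-"
    all(suffix == sample_id)
  }, logical(1))
  names(ok)[!ok]
}

# With a real merged object, the input could probably be built like this
# (assumes SeuratObject v5 is loaded; not run here):
# assay = seuratObj[["CMO"]]
# layers = SeuratObject::Layers(assay)
# layer_barcodes = setNames(
#   lapply(layers, function(l) colnames(SeuratObject::LayerData(assay, layer = l))),
#   layers)

# Toy input mirroring the shift reported in this issue
layer_barcodes = list(
  counts.NDE005 = c("TTTGTTGTCCCGATCT-NDE010"),  # barcode from the output above
  counts.NDE010 = c("AAACCCAAGAAACACT-NDE010")   # made-up barcode
)
bad = find_mislabeled_layers(layer_barcodes)
print(bad)
```

An empty result would mean every layer only holds barcodes from the sample it is named after.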

I can easily reproduce the issue, and I found a temporary workaround: adding a placeholder CMO assay to the NDE005 object before merging.

# Create an empty sparse matrix for the missing Assay5 before merging the objects
fake_CMO = Matrix::sparseMatrix(i = integer(0),
                                j = integer(0),
                                x = numeric(0),
                                dims = c(length(rownames(NDE012_seuratObj[["CMO"]])),
                                         length(colnames(NDE005_seuratObj))))
# Set row names (CMO features) and column names (NDE005 barcodes)
rownames(fake_CMO) = rownames(NDE012_seuratObj[["CMO"]])
colnames(fake_CMO) = colnames(NDE005_seuratObj)
# Add the all-zero placeholder assay to the object that lacks CMO
NDE005_seuratObj[["CMO"]] = CreateAssay5Object(counts = fake_CMO)
# The bug no longer occurs because all assay types are present in all objects

This bug can mislead researchers into using incorrect data for downstream analysis; in my case it particularly affected sample demultiplexing. Although it occurs only in niche cases, a warning or error when merging Seurat objects with mismatched assays would be helpful.
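Until something like that exists, a user-side guard before `merge()` might look like the sketch below. `check_assays_match()` is a hypothetical helper, not a Seurat function; it works on a named list of assay-name vectors, which I assume could be collected with `SeuratObject::Assays()` on each object.

```r
# Hypothetical helper (not part of Seurat): stop with an error if the objects
# to be merged do not all carry the same set of assays.
# `assay_names` is a named list: one character vector of assay names per object.
check_assays_match = function(assay_names) {
  all_assays = unique(unlist(assay_names, use.names = FALSE))
  # For each object, which assays from the union are absent?
  missing = lapply(assay_names, function(a) setdiff(all_assays, a))
  missing = missing[lengths(missing) > 0]
  if (length(missing) > 0) {
    msg = vapply(names(missing), function(n) {
      paste0(n, " lacks: ", paste(missing[[n]], collapse = ", "))
    }, character(1))
    stop("Assays differ between objects; ", paste(msg, collapse = "; "))
  }
  invisible(TRUE)
}

# With real objects, the input could be built like this (not run here):
# objs = list(NDE005 = NDE005_seuratObj, NDE010 = NDE010_seuratObj,
#             NDE011 = NDE011_seuratObj, NDE012 = NDE012_seuratObj)
# check_assays_match(lapply(objs, SeuratObject::Assays))

# Toy input mirroring the situation in this issue
assay_names = list(
  NDE005 = c("RNA", "ADT", "HTO"),
  NDE010 = c("RNA", "ADT", "HTO", "CMO")
)
result = tryCatch(check_assays_match(assay_names),
                  error = function(e) conditionMessage(e))
print(result)
```

Failing loudly here seems safer than a warning, since the merged object otherwise looks valid and the mislabeled layers are easy to miss.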

> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-redhat-linux-gnu
Running under: AlmaLinux 9.4 (Seafoam Ocelot)

Matrix products: default
BLAS/LAPACK: FlexiBLAS OPENBLAS-OPENMP;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Brussels
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] dplyr_1.1.4                 scCustomize_2.1.2          
 [3] readxl_1.4.3                writexl_1.5.0              
 [5] DropletUtils_1.24.0         scDblFinder_1.18.0         
 [7] SingleCellExperiment_1.26.0 SummarizedExperiment_1.34.0
 [9] Biobase_2.64.0              GenomicRanges_1.56.0       
[11] GenomeInfoDb_1.40.1         IRanges_2.38.1             
[13] S4Vectors_0.42.1            BiocGenerics_0.50.0        
[15] MatrixGenerics_1.16.0       matrixStats_1.3.0          
[17] readr_2.1.5                 ggridges_0.5.6             
[19] ggpubr_0.6.0                tidyr_1.3.1                
[21] ggplot2_3.5.1               stringr_1.5.1              
[23] tibble_3.2.1                Seurat_5.1.0               
[25] SeuratObject_5.0.2          sp_2.1-4                   

loaded via a namespace (and not attached):
  [1] spatstat.sparse_3.1-0     bitops_1.0-8             
  [3] lubridate_1.9.3           httr_1.4.7               
  [5] RColorBrewer_1.1-3        tools_4.4.1              
  [7] sctransform_0.4.1         backports_1.5.0          
  [9] utf8_1.2.4                R6_2.5.1                 
 [11] HDF5Array_1.32.0          lazyeval_0.2.2           
 [13] uwot_0.2.2                rhdf5filters_1.16.0      
 [15] withr_3.0.1               gridExtra_2.3            
 [17] progressr_0.14.0          cli_3.6.3                
 [19] spatstat.explore_3.3-1    fastDummies_1.7.3        
 [21] spatstat.data_3.1-2       pbapply_1.7-2            
 [23] Rsamtools_2.20.0          R.utils_2.12.3           
 [25] scater_1.32.1             parallelly_1.38.0        
 [27] limma_3.60.4              generics_0.1.3           
 [29] shape_1.4.6.1             BiocIO_1.14.0            
 [31] ica_1.0-3                 spatstat.random_3.3-1    
 [33] car_3.1-2                 Matrix_1.7-0             
 [35] ggbeeswarm_0.7.2          fansi_1.0.6              
 [37] abind_1.4-5               R.methodsS3_1.8.2        
 [39] lifecycle_1.0.4           yaml_2.3.10              
 [41] edgeR_4.2.1               snakecase_0.11.1         
 [43] carData_3.0-5             rhdf5_2.48.0             
 [45] SparseArray_1.4.8         Rtsne_0.17               
 [47] paletteer_1.6.0           grid_4.4.1               
 [49] promises_1.3.0            dqrng_0.4.1              
 [51] crayon_1.5.3              miniUI_0.1.1.1           
 [53] lattice_0.22-6            beachmat_2.20.0          
 [55] cowplot_1.1.3             pillar_1.9.0             
 [57] metapod_1.12.0            rjson_0.2.21             
 [59] xgboost_1.7.8.1           future.apply_1.11.2      
 [61] codetools_0.2-20          leiden_0.4.3.1           
 [63] glue_1.7.0                spatstat.univar_3.0-0    
 [65] data.table_1.15.4         vctrs_0.6.5              
 [67] png_0.1-8                 spam_2.10-0              
 [69] cellranger_1.1.0          gtable_0.3.5             
 [71] rematch2_2.1.2            S4Arrays_1.4.1           
 [73] mime_0.12                 survival_3.7-0           
 [75] statmod_1.5.0             bluster_1.14.0           
 [77] fitdistrplus_1.2-1        ROCR_1.0-11              
 [79] nlme_3.1-164              RcppAnnoy_0.0.22         
 [81] irlba_2.3.5.1             vipor_0.4.7              
 [83] KernSmooth_2.23-24        colorspace_2.1-1         
 [85] ggrastr_1.0.2             tidyselect_1.2.1         
 [87] compiler_4.4.1            curl_5.2.1               
 [89] BiocNeighbors_1.22.0      DelayedArray_0.30.1      
 [91] plotly_4.10.4             rtracklayer_1.64.0       
 [93] scales_1.3.0              lmtest_0.9-40            
 [95] digest_0.6.36             goftest_1.2-3            
 [97] spatstat.utils_3.0-5      XVector_0.44.0           
 [99] htmltools_0.5.8.1         pkgconfig_2.0.3          
[101] sparseMatrixStats_1.16.0  fastmap_1.2.0            
[103] rlang_1.1.4               GlobalOptions_0.1.2      
[105] htmlwidgets_1.6.4         UCSC.utils_1.0.0         
[107] shiny_1.9.1               DelayedMatrixStats_1.26.0
[109] zoo_1.8-12                jsonlite_1.8.8           
[111] BiocParallel_1.38.0       R.oo_1.26.0              
[113] BiocSingular_1.20.0       RCurl_1.98-1.16          
[115] magrittr_2.0.3            scuttle_1.14.0           
[117] GenomeInfoDbData_1.2.12   dotCall64_1.1-1          
[119] patchwork_1.2.0           Rhdf5lib_1.26.0          
[121] munsell_0.5.1             Rcpp_1.0.13              
[123] viridis_0.6.5             reticulate_1.38.0        
[125] stringi_1.8.4             zlibbioc_1.50.0          
[127] MASS_7.3-60.2             plyr_1.8.9               
[129] parallel_4.4.1            listenv_0.9.1            
[131] ggrepel_0.9.5             forcats_1.0.0            
[133] deldir_2.0-4              Biostrings_2.72.1        
[135] splines_4.4.1             tensor_1.5               
[137] hms_1.1.3                 circlize_0.4.16          
[139] locfit_1.5-9.10           igraph_2.0.3             
[141] spatstat.geom_3.3-2       ggsignif_0.6.4           
[143] RcppHNSW_0.6.0            reshape2_1.4.4           
[145] ScaledMatrix_1.12.0       XML_3.99-0.17            
[147] scran_1.32.0              ggprism_1.0.5            
[149] tzdb_0.4.0                httpuv_1.6.15            
[151] RANN_2.6.1                purrr_1.0.2              
[153] polyclip_1.10-7           future_1.34.0            
[155] scattermore_1.2           janitor_2.2.0            
[157] rsvd_1.0.5                broom_1.0.6              
[159] xtable_1.8-4              restfulr_0.0.15          
[161] RSpectra_0.16-2           rstatix_0.7.2            
[163] later_1.3.2               viridisLite_0.4.2        
[165] beeswarm_0.4.0            GenomicAlignments_1.40.0 
[167] cluster_2.1.6             timechange_0.3.0         
[169] globals_0.16.3 