Merge pull request #42 from saezlab/NewVignettes

Small corrections
saezlab · Mar 25, 2024 · 8bb8ba9
2 parents c6576fe + 44a91c1
commit 8bb8ba9
Showing 5 changed files with 25 additions and 23 deletions.
diff --git a/vignettes/FunctionalAndStructuralPipeline.Rmd b/vignettes/FunctionalAndStructuralPipeline.Rmd
@@ -387,9 +387,9 @@ We are going to create a combined view with the receptors in the intraview as ta
 
 ```{r message=FALSE, warning=FALSE}
 # Create views and combine them
-receptor_view <- create_initial_view(as.data.frame(t(hvg_lig)))
+receptor_view <- create_initial_view(as.data.frame(t(hvg_recep)))
 
-ligand_view <- create_initial_view(as.data.frame(t(hvg_recep))) %>% 
+ligand_view <- create_initial_view(as.data.frame(t(hvg_lig))) %>% 
   add_paraview(geometry, l = 200, family = "gaussian")
 
 lig_recep_view <- receptor_view %>% add_views(create_view("paraview.ligand.200", ligand_view[["paraview.200"]]$data, "para.lig.200"))
@@ -404,7 +404,7 @@ Let's look at important interactions. An additional way to reduce the number of
 
 ```{r}
 misty_results_lig_recep %>%
-  plot_interaction_heatmap("para.lig.200", clean = TRUE, cutoff = 2, trim.measure ="gain.R2", trim = 25)
+  plot_interaction_heatmap("para.lig.200", clean = TRUE, cutoff = 2, trim.measure ="gain.R2", trim = 10)
 ```
 
 Remember that MISTy does not only infer interactions between ligands and their respective receptor, but rather all possible interactions between ligands and receptors. We can visualize one of the interactions with high importance:

diff --git a/vignettes/FunctionalPipelinePathwayActivityLigands.Rmd b/vignettes/FunctionalPipelinePathwayActivityLigands.Rmd
@@ -71,7 +71,7 @@ geometry <- GetTissueCoordinates(seurat_vs, scale = NULL)
 
 ## Pathway activity
 
-Now we create a Seurat object with pathway activities inferred from [`PROGENy`](https://saezlab.github.io/progeny/index.html). We delete the PROGENy assay done by Kuppe et al. and load a model matrix with the top 1000 significant genes for each of the 15 available pathways. We then extract the genes that are both common to the PROGENy model and the snRNA-seq assay from the Seurat object. We compute the weighted sum of both and scale them to infer the pathway activity. We save the result in a Seurat assay and clean the row names to handle problematic variables.
+Now we create a Seurat object with pathway activities inferred from [`PROGENy`](https://saezlab.github.io/progeny/index.html). We delete the PROGENy assay done by Kuppe et al. and load a model matrix with the top 1000 significant genes for each of the 14 available pathways. We then extract the genes that are both common to the PROGENy model and the snRNA-seq assay from the Seurat object. We estimate the pathway activity with a multivariate linear model using [`decoupleR`](https://saezlab.github.io/decoupleR/). We save the result in a Seurat assay and clean the row names to handle problematic variables.
 
 ```{r}
 seurat_vs[['progeny']] <- NULL
@@ -165,15 +165,15 @@ With the collected results, we can now answer the following questions:
 
 ### 1. To what extent can the analyzed surrounding tissues' activities explain the pathway activity of the spot compared to the intraview?
 
-Here we can look at two different statistics: multi.R2 shows the total variance explained by the multiview model. gain.R2 shows the increase in explainable variance from the paraview.
+Here we can look at two different statistics: `multi.R2` shows the total variance explained by the multiview model. `gain.R2` shows the increase in explainable variance from the paraviews.
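
As a reading aid for the two statistics, here is a minimal sketch in base R. It assumes that `gain.R2` is the difference between the multiview R2 and the intraview-only R2, both in percent. The data frame `r2`, its `intra.R2` column, and all values are hypothetical illustrations, not results from the vignette.

```r
# Hypothetical per-target R2 values in percent (not taken from the vignette)
r2 <- data.frame(
  target   = c("A", "B", "C"),
  intra.R2 = c(40, 55, 10),
  multi.R2 = c(46, 55.2, 18)
)

# Assumed definition: gain is what the additional views add beyond the intraview
r2$gain.R2 <- r2$multi.R2 - r2$intra.R2

# Keep only targets whose gain exceeds 0.5 percentage points,
# similar in spirit to trimming a plot on gain.R2
kept <- r2[r2$gain.R2 > 0.5, ]
print(kept)
```

A target with a high `multi.R2` but a negligible `gain.R2` is already well explained by the intraview, which is why both statistics are worth inspecting together.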
 
 ```{r}
 misty_results %>%
-  plot_improvement_stats("gain.R2") %>%
-  plot_improvement_stats("multi.R2")
+  plot_improvement_stats("multi.R2") %>%
+  plot_improvement_stats("gain.R2")
 ```
 
-The paraview particularly increases the explained variance for TGFb and PI3K. In general, the significant gain in R2 can be interpreted as the following:
+The paraviews particularly increase the explained variance for TGFb and PI3K. In general, the significant gain in R2 can be interpreted as the following:
 
 "We can better explain the expression of marker X when we consider additional views other than the intrinsic view."
 
@@ -202,7 +202,7 @@ SpatialFeaturePlot(seurat_vs, features = c("tnfa", "nfkb"), image.alpha = 0)
 
 We can observe a correlation between high TNFa activity and high NFkB activity.
 
-Now we repeat this analysis with the pathway activity paraview. With `trim` we display only targets with a value above 0.5 for `gain.R2`.
+Now we repeat this analysis with the pathway activity paraview. With `trim` we display only targets with a value above 0.5% for `gain.R2`.
 
 ```{r}
 misty_results %>%

diff --git a/vignettes/FunctionalPipelinePathwaySpecific.Rmd b/vignettes/FunctionalPipelinePathwaySpecific.Rmd
@@ -95,11 +95,11 @@ SpatialFeaturePlot(seurat_vs, feature = c("ID1", "NID2"), keep.scale = NULL)
 
 ## Misty views
 
-Now we need to create the Misty views of interest. We are interested in the relationship of TGF-beta responsive genes in the same spot (intraview) and the five closest spots (paraview). Therefore we choose the family `constant` which will select the five nearest neighbors. Depending on the goal of the analysis, different families can be applied.
+Now we need to create the Misty views of interest. We are interested in the relationship of TGF-beta responsive genes in the same spot (intraview) and the ten closest spots (paraview). Therefore we choose the family `constant` which will select the ten nearest neighbors. Depending on the goal of the analysis, different families can be applied.
 
 We are also intrigued about the relationship of VEGF-responsive genes with TGF-beta responsive genes in the broader tissue. For this, we again create an intra- and paraview, this time for VEGF, but from this view, we only need the paraview. In the next step, we add it to the TGF-beta views to achieve our intended views.
 
-```{r}
+```{r message=FALSE, warning=FALSE}
 TGFb_views <- create_initial_view(t(expression[TGFb_footprints,]) %>% as_tibble()) %>%
   add_paraview(geometry, l=10, family = "constant")
 
@@ -126,15 +126,15 @@ With the collected results, we can now answer the following questions:
 
 ### 1. To what extent can the surrounding tissues' gene expression explain the gene expression of the spot compared to the intraview?
 
-Here we can look at two different statistics: multi.R2 shows the total variance explained by the multiview model. gain.R2 shows the increase in explainable variance from the paraview.
+Here we can look at two different statistics: `multi.R2` shows the total variance explained by the multiview model. `gain.R2` shows the increase in explainable variance from the paraviews.
 
 ```{r}
 misty_results %>%
   plot_improvement_stats("gain.R2") %>%
   plot_improvement_stats("multi.R2")
 ```
 
-The paraview particularly increases the explained variance for COMP, ID1, and COL4A1. In general, the significant gain in R2 can be interpreted as the following:
+The paraviews particularly increase the explained variance for COMP, ID1, and COL4A1. In general, the significant gain in R2 can be interpreted as the following:
 
 "We can better explain the expression of marker X when we consider additional views other than the intrinsic view."
 
@@ -161,9 +161,9 @@ We can observe that COL4A1 and ID1 are a significant predictor for the expressio
 SpatialFeaturePlot(seurat_vs, features = c("ID1", "SMAD7"), image.alpha = 0)
 ```
 
-We can see that in spots with ID1 mRNA often SMAD7 is also expressed.
+Areas with high levels of ID1 mRNA expression also tend to show high SMAD7 expression.
 
-Now we repeat this analysis with the TGF-beta paraview. With `trim` we display only targets with a value above 0.5 for `gain.R2`. To set an importance threshold we apply `cutoff`.
+Now we repeat this analysis with the TGF-beta paraview. With `trim` we display only targets with a value above 0.5% for `gain.R2`. To set an importance threshold we apply `cutoff`.
 
 ```{r}
 misty_results %>%

diff --git a/vignettes/MistyRStructuralAnalysisPipelineC2L.Rmd b/vignettes/MistyRStructuralAnalysisPipelineC2L.Rmd
@@ -88,7 +88,7 @@ Based on the plots, we can observe that some cell types are found more frequentl
 
 ## MISTy views
 
-First, we need to define an intraview that captures the cell type proportions within a spot. To capture the distribution of cell type proportions in the surrounding tissue, we add a paraview. For this vignette, the radius we choose is the distance to the nearest neighbor plus the standard deviation. We calculate the weights of each spot with `family = gaussian`. Then we run MISTy and collect the results.
+First, we need to define an intraview that captures the cell type proportions within a spot. To capture the distribution of cell type proportions in the surrounding tissue, we add a paraview. For this vignette, the radius we choose is the mean of the distance to the nearest neighbor plus the standard deviation. We calculate the weights of each spot with `family = gaussian`. Then we run MISTy and collect the results.
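
The radius rule described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the vignette's own chunk: the toy `geometry` coordinates are hypothetical, and base `dist()` stands in for the `distances` package that the vignette loads.

```r
# Toy spot coordinates (hypothetical; the vignette uses the Visium slide geometry)
geometry <- data.frame(
  row = c(0, 0, 1, 1, 3),
  col = c(0, 1, 0, 1, 3)
)

# Pairwise Euclidean distances between all spots
geom_dist <- as.matrix(dist(geometry))

# Exclude each spot's zero distance to itself
diag(geom_dist) <- Inf

# Distance from every spot to its nearest neighbor
dist_nn <- apply(geom_dist, 1, min)

# Radius: mean nearest-neighbor distance plus one standard deviation, rounded up
paraview_radius <- ceiling(mean(dist_nn) + sd(dist_nn))
print(paraview_radius)
```

Adding the standard deviation to the mean widens the radius enough to reach the nearest neighbor of most spots even when the spacing between spots is irregular.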
 
 ```{r message=FALSE, warning=FALSE}
 # Calculating the radius
@@ -150,7 +150,7 @@ SpatialFeaturePlot(seurat_vs, keep.scale = NULL, features = c("Fib","CM"), image
 
 We can observe that areas with high proportions of cardiomyocytes have low proportions of fibroblasts and vice versa.
 
-Now we repeat this analysis with the paraview. With `trim` we display only targets with a value above 1.75 for `gain.R2`. To set an importance threshold we apply `cutoff`.
+Now we repeat this analysis with the paraview. With `trim` we display only targets with a value above 1.75% for `gain.R2`. To set an importance threshold we apply `cutoff`.
 
 ```{r}
 misty_results %>% plot_interaction_heatmap(view = "para.126", clean = TRUE, 

diff --git a/vignettes/MistyRStructuralAnalysisPipelineDOT.Rmd b/vignettes/MistyRStructuralAnalysisPipelineDOT.Rmd
@@ -53,7 +53,7 @@ library(distances)
 
 ## Get and load the data
 
-For this showcase, we use a 10X Visium spatial slide from [Kuppe et al., 2022](https://doi.org/10.1038/s41586-022-05060-x), where they created a spatial multi-omic map of human myocardial infarction. The tissue example data comes from the human heart of patient 14 which is in a later state after myocardial infarction. The Seurat object contains, among other things, the spot coordinates on the slides which we will need for decomposition First, we have to download and extract the file:
+For this showcase, we use a 10X Visium spatial slide from [Kuppe et al., 2022](https://doi.org/10.1038/s41586-022-05060-x), where they created a spatial multi-omic map of human myocardial infarction. The tissue example data comes from the human heart of patient 14 which is in a later state after myocardial infarction. The Seurat object contains, among other things, the spot coordinates on the slides which we will need for decomposition. First, we have to download and extract the file:
 
 ```{r}
 # Download the data
@@ -70,7 +70,7 @@ spatial_data <- readRDS("ACH005/ACH005.rds")
 geometry <- GetTissueCoordinates(spatial_data, cols = c("imagerow", "imagecol"), scale = NULL)
 ```
 
-For deconvolution, we additionally need a reference single-cell data set containing a gene x cell count matrix and a vector containing the corresponding cell annotations. Kuppe et al., 2022, obtained from each sample isolated nuclei from the remaining tissue that they used for snRNA-seq. The data corresponding to the same patient as the spatial data will be used as reference data in `DOT`. First download the file:
+For deconvolution, we additionally need a reference single-cell data set containing a gene x cell count matrix and a vector containing the corresponding cell annotations. Kuppe et al., 2022, isolated nuclei from each sample's remaining tissue for snRNA-seq. The data corresponding to the same patient as the spatial data will be used as reference data in `DOT`. First download the file:
 
 ```{r}
 download.file("https://www.dropbox.com/scl/fi/sq24xaavxplkc98iimvpz/hca_p14.rds?rlkey=h8cyxzhypavkydbv0z3pqadus&dl=1",
@@ -89,7 +89,7 @@ ref_ct <- ref_data$celltypes
 
 ## Deconvolution with DOT
 
-Next, we need to set up the DOT object. The two inputs we need are the count matrix and pixel coordinates of the spatial data and the count matrix and cell annotations of the single-cell reference data.
+Next, we need to set up the DOT object. The inputs we need are the count matrix and pixel coordinates of the spatial data and the count matrix and cell annotations of the single-cell reference data.
 
 ```{r message=FALSE, warning=FALSE}
 dot.srt <-setup.srt(srt_data = spatial_data@assays$Spatial@counts, srt_coords = geometry) 
@@ -134,6 +134,8 @@ Based on the plots, we can observe that some cell types are found more frequentl
 
 ## MISTy views
 
+First, we need to define an intraview that captures the cell type proportions within a spot. To capture the distribution of cell type proportions in the surrounding tissue, we add a paraview. For this vignette, the radius we choose is the mean of the distance to the nearest neighbor plus the standard deviation. We calculate the weights of each spot with `family = gaussian`. Then we run MISTy and collect the results.
+
 ```{r message=FALSE, warning=FALSE}
 # Calculating the radius
 geom_dist <- as.matrix(distances(geometry))  
@@ -164,13 +166,13 @@ misty_results %>%
   plot_improvement_stats("gain.R2")
 ```
 
-The paraview particularly increases the explained variance for adipocytes and mast cells. In general, the significant gain in R^2^ can be interpreted as the following:
+The paraview particularly increases the explained variance for adipocytes. In general, the significant gain in R^2^ can be interpreted as the following:
 
 "We can better explain the expression of marker X when we consider additional views other than the intrinsic view."
 
 #### 2. What are the specific relations that can explain the contributions?
 
-To explain the contributions, we can visualize the importance of each cell type in predicting the cell type distribution for each view separately. With `trim`, we display only targets with a value above 50 for `multi.R2`. To set an importance threshold we would apply `cutoff`.
+To explain the contributions, we can visualize the importance of each cell type in predicting the cell type distribution for each view separately. With `trim`, we display only targets with a value above 50% for `multi.R2`. To set an importance threshold we would apply `cutoff`.
 
 First, for the intrinsic view:
 
@@ -212,7 +214,7 @@ misty_results %>% plot_interaction_heatmap(view = "para.126",
                                            trim.measure = "gain.R2") 
 ```
 
-Here, we select the target adipocytes, as we know from previous analysis that the paraview contributes a large part to explaining its distribution. The best predictor for adipocytes are Myeloid cells. To better identify the localization of the two cell types, we set the color scaling to a smaller range, as there are a few spots with a high proportion, which makes the distribution of spots with a low proportion difficult to recognize.
+Here, we select the target adipocytes, as we know from previous analysis that adipocytes have the highest `gain.R2`. The best predictor for adipocytes are Myeloid cells. To better identify the localization of the two cell types, we set the color scaling to a smaller range, as there are a few spots with a high proportion, which makes the distribution of spots with a low proportion difficult to recognize.
 
 ```{r fig.height=7, fig.width=5, message=FALSE}
 draw_maps(geometry,