Here we load the essential packages, set up the input and output directories, and load visium_samples_manifest.xlsx from the data directory, which contains information about the Visium samples. We also generate a coords data frame holding the coordinates of all samples, which is essential for the analysis later.
We will use BayesSpace (Zhao et al. 2021) for spatially informed clustering. For a benchmark of spatially informed clustering methods, see Yuan et al. (2024).
Here we create a merged Seurat object for all spatial data from the individual objects created in Spatial Data Processing, and integrate them using RPCA integration.
Code
## Import the per-sample objects and merge them
merged <- lapply(samples_info$sample_id, function(x) {
  seurat <- readRDS(paste0(inputDir, x, ".rds"))
  seurat$sample_id <- x
  # Prefix cell names and image keys with the sample ID so they stay unique after merging
  seurat <- RenameCells(seurat, add.cell.id = x)
  names(seurat@images) <- x
  seurat@images[[x]]@key <- x
  seurat[["percent.mt"]] <- PercentageFeatureSet(object = seurat, pattern = "^MT-")
  return(seurat)
})
merged <- purrr::reduce(merged, merge)
merged <- JoinLayers(merged, assay = 'Spatial')

## Apply QC filters: remove spots with fewer than 500 genes or more than 20% mitochondrial reads
merged <- subset(merged, subset = nFeature_Spatial > 500 & percent.mt < 20)

## RPCA data integration
merged[["Spatial"]] <- split(merged[["Spatial"]], f = merged$sample_id)
merged <- NormalizeData(merged)
merged <- FindVariableFeatures(merged)
merged <- ScaleData(merged)
merged <- RunPCA(merged)
merged <- IntegrateLayers(
  object = merged,
  method = RPCAIntegration,
  new.reduction = "integrated.rpca",
  verbose = TRUE
)
merged <- RunUMAP(merged, reduction = 'integrated.rpca', dims = 1:50)
merged <- JoinLayers(merged)
merged <- AddMetaData(merged, coord)
3 BayesSpace Clustering
The workflow described below is an adaptation of the one described here.
1- BayesSpace works with SingleCellExperiment objects, so we need to convert our Seurat object to this format
2- We add the coordinate information to the colData
3- The object is then processed using the spatialPreprocess function
4- We stitch the samples together by adding a shift of 150 units to the coordinates, creating a 1 x 14 grid
5- We use qTune to find the optimal number of clusters
6- We pick 23 clusters
7- We then add the cluster information back to the Seurat object
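The stitching step (step 4) can be illustrated with a small base R sketch. This is not the code used in the analysis: the column names (`sample_id`, `row`, `col`), the helper `stitch_samples`, and the choice to shift along the column axis are assumptions made for illustration. The idea is only that each sample is offset by a multiple of 150 units so that samples sit side by side without overlapping.

```r
# Toy spot table: three samples, each with its own array coordinates.
# Column names (sample_id, row, col) are assumptions for this sketch.
coords <- data.frame(
  sample_id = rep(c("S1", "S2", "S3"), each = 4),
  row = rep(c(0, 0, 1, 1), times = 3),
  col = rep(c(0, 2, 1, 3), times = 3)
)

# Hypothetical helper: shift each sample along the column axis by a
# multiple of `shift`, so samples form a 1 x n grid.
stitch_samples <- function(coords, shift = 150) {
  # 0 for the first sample, 1 for the second, and so on
  sample_index <- match(coords$sample_id, unique(coords$sample_id)) - 1
  coords$col <- coords$col + shift * sample_index
  coords
}

stitched <- stitch_samples(coords)

# Column range occupied by each sample after stitching
col_ranges <- tapply(stitched$col, stitched$sample_id, range)
```

A shift larger than the widest sample keeps spots from different samples from being treated as neighbors during clustering.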
Here we use the markers, the distribution of cell type fractions from the deconvolution results, and the histology to annotate the clusters, as described in our manuscript (elhossiny_manuscript?). RCTD generates two outputs, all_weights and sub_weights. all_weights is the result of RCTD full mode, in which all cell types are used and each receives a weight; for sub_weights, the algorithm iteratively selects a subset of cell types that are likely to be present on the pixel.
The following code extracts the weights from the RCTD results and adds them to the merged Seurat object.
The cluster annotations are written to BayesSpace_23_spatial_clusters_annotation.xlsx within the outputs/Spatial_Clustering/ directory. We merged certain clusters that had a Pearson correlation of ~1.0, which indicates potential over-clustering.
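As a rough, self-contained illustration of that step (not the code used in this analysis), the base R sketch below assembles per-spot sub_weights into a spot-by-cell-type table. The list layout of `rctd_results`, the cell type names, and the helper `weights_to_matrix` are hypothetical stand-ins, not the exact structure of an RCTD object. Cell types that were not selected for a spot get a weight of zero.

```r
# Hypothetical stand-in for per-spot RCTD output: each element holds the
# named sub_weights vector for one spot (names are cell types).
rctd_results <- list(
  spot_1 = list(sub_weights = c(Tumor = 0.7, Fibroblast = 0.3)),
  spot_2 = list(sub_weights = c(Tcell = 1.0)),
  spot_3 = list(sub_weights = c(Fibroblast = 0.4, Tcell = 0.6))
)

# Hypothetical helper: build a spots x cell-types matrix, filling zeros
# for cell types that were not selected on a given spot.
weights_to_matrix <- function(results, slot = "sub_weights") {
  cell_types <- sort(unique(unlist(lapply(results, function(r) names(r[[slot]])))))
  mat <- matrix(
    0,
    nrow = length(results), ncol = length(cell_types),
    dimnames = list(names(results), cell_types)
  )
  for (spot in names(results)) {
    w <- results[[spot]][[slot]]
    mat[spot, names(w)] <- w
  }
  mat
}

weights <- weights_to_matrix(rctd_results)
weights_df <- as.data.frame(weights)
```

A data frame like `weights_df`, with row names matching the spot names of the Seurat object, is the kind of input that `AddMetaData` accepts, as with the coordinates earlier in this document.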
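The correlation check can be sketched in base R as follows. The toy `profiles` matrix, the 0.99 cutoff, and the use of single-linkage grouping are assumptions for illustration; the text above does not specify which cluster-level profiles were correlated or how the threshold was applied.

```r
# Toy cluster-by-feature profile matrix (rows = clusters). Values are made up.
profiles <- rbind(
  c1 = c(1, 2, 3, 4, 5),
  c2 = c(2, 4, 6, 8, 10.1),  # nearly proportional to c1
  c3 = c(5, 1, 4, 2, 3)
)

# Pairwise Pearson correlation between cluster profiles
cors <- cor(t(profiles), method = "pearson")

# Group clusters whose correlation exceeds the assumed cutoff by cutting
# a single-linkage tree built on the distance 1 - correlation.
cutoff <- 0.99
tree <- hclust(as.dist(1 - cors), method = "single")
merge_groups <- cutree(tree, h = 1 - cutoff)
```

Clusters that share a value in `merge_groups` (here `c1` and `c2`) are candidates for merging; the final decision would still rest on the markers and histology.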
R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS 26.0
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Detroit
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] htmlwidgets_1.6.4 compiler_4.4.0 fastmap_1.2.0 cli_3.6.2
[5] tools_4.4.0 htmltools_0.5.8.1 yaml_2.3.8 rmarkdown_2.27
[9] knitr_1.47 jsonlite_1.8.8 xfun_0.52 digest_0.6.35
[13] rlang_1.1.3 evaluate_0.23
References
Dann, Emma, Neil C Henderson, Sarah A Teichmann, Michael D Morgan, and John C Marioni. 2022. “Differential Abundance Testing on Single-Cell Data Using k-Nearest Neighbor Graphs.” Nature Biotechnology 40 (2): 245–53.
Yuan, Zhiyuan, Fangyuan Zhao, Senlin Lin, Yu Zhao, Jianhua Yao, Yan Cui, Xiao-Yong Zhang, and Yi Zhao. 2024. “Benchmarking Spatial Clustering Methods with Spatially Resolved Transcriptomics Data.” Nature Methods 21 (4): 712–22.
Zhao, Edward, Matthew R Stone, Xing Ren, Jamie Guenthoer, Kimberly S Smythe, Thomas Pulliam, Stephen R Williams, et al. 2021. “Spatial Transcriptomics at Subspot Resolution with BayesSpace.” Nature Biotechnology 39 (11): 1375–84.