GLM-based PCA using the SpaNorm model. The null model consists of the library size effects, batch effects, and the gene mean. GLM-PCA is approximated by regressing the null model out of the data and performing PCA on the resulting residuals (Pearson or deviance).
Usage
SpaNormPCA(
spe,
nsvgs = 3000,
ncomponents = 50,
svg.fdr = 1,
BSPARAM = bsparam(),
BPPARAM = SerialParam(),
residuals = c("deviance", "pearson"),
name = "PCA"
)
# S4 method for class 'SpatialExperiment'
SpaNormPCA(
spe,
nsvgs = 3000,
ncomponents = 50,
svg.fdr = 1,
BSPARAM = bsparam(),
BPPARAM = SerialParam(),
residuals = c("deviance", "pearson"),
name = "PCA"
)
Arguments
- spe
a SpatialExperiment or Seurat object containing a SpaNorm model fit, with the count data stored in the 'counts' or 'data' assay respectively.
- nsvgs
the number of spatially variable genes (SVGs) to use for PCA.
- ncomponents
the number of components to compute.
- svg.fdr
the false discovery rate (FDR) threshold for SVG calling.
- BSPARAM
a BiocSingularParam object specifying which algorithm should be used to perform the PCA.
- BPPARAM
a BiocParallelParam object specifying whether the PCA should be parallelized.
- residuals
the type of residuals to use for PCA. Either "deviance" (default) or "pearson".
- name
the name of the reducedDim to store the PCA results.
Value
a SpatialExperiment or Seurat object with PCA results. For SpatialExperiment objects, these are stored in the reducedDims.
Details
SpaNorm PCA reuses the SpaNorm model fitted for data normalisation to approximate a GLM-based PCA, as described in Townes et al. (Genome Biology, 2019). The null model represents the library size effects, batch effects, and the gene mean. Regressing out these covariates leaves the deviance or Pearson residuals, on which PCA is performed to approximate GLM-PCA.
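The residual-based approximation described above can be illustrated with a minimal base R sketch. It is not the SpaNorm implementation: it assumes a simple Poisson null model with only library size and gene mean terms (no batch effects, no negative binomial dispersion), and the simulated data and all variable names are hypothetical.

```r
set.seed(1)

# Hypothetical simulated counts: 200 genes x 100 spots with Poisson noise
n_genes <- 200
n_spots <- 100
lib_size <- runif(n_spots, 500, 2000)      # per-spot library size
gene_prop <- rgamma(n_genes, shape = 1)
gene_prop <- gene_prop / sum(gene_prop)    # per-gene mean proportion
mu_true <- outer(gene_prop, lib_size)      # expected counts, genes x spots
counts <- matrix(rpois(n_genes * n_spots, mu_true), nrow = n_genes)

# Drop genes with no counts so the fitted means are strictly positive
counts <- counts[rowSums(counts) > 0, , drop = FALSE]

# Null model (assumed Poisson): mean = gene proportion x library size
mu_hat <- outer(rowSums(counts) / sum(counts), colSums(counts))

# Pearson residuals: (y - mu) / sqrt(mu)
pearson <- (counts - mu_hat) / sqrt(mu_hat)

# Deviance residuals: sign(y - mu) * sqrt(2 * (y * log(y / mu) - (y - mu)))
ylogy <- ifelse(counts > 0, counts * log(counts / mu_hat), 0)
dev <- sign(counts - mu_hat) * sqrt(pmax(2 * (ylogy - (counts - mu_hat)), 0))

# PCA on the residuals, with spots as observations
pca <- prcomp(t(dev), center = TRUE, scale. = FALSE, rank. = 10)
dim(pca$x)
```

Swapping `dev` for `pearson` in the `prcomp()` call gives the Pearson-residual variant, mirroring the choice offered by the `residuals` argument.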
Examples
library(SpatialExperiment)
library(ggplot2)
data(HumanDLPFC)
HumanDLPFC = SpaNorm(HumanDLPFC, sample.p = 0.05, df.tps = 2, tol = 1e-2)
#> (1/2) Fitting SpaNorm model
#> 201 cells/spots sampled to fit model
#> iter: 1, estimating gene-wise dispersion
#> iter: 1, log-likelihood: -1149654.530080
#> iter: 1, fitting NB model
#> iter: 1, iter: 1, log-likelihood: -1149654.530080
#> iter: 1, iter: 2, log-likelihood: -817875.051427
#> iter: 1, iter: 3, log-likelihood: -730289.096272
#> iter: 1, iter: 4, log-likelihood: -715477.819485
#> iter: 1, iter: 5, log-likelihood: -713420.649743
#> iter: 1, iter: 6, log-likelihood: -713065.316554
#> iter: 1, iter: 7, log-likelihood: -712982.402574
#> iter: 1, iter: 8, log-likelihood: -712957.598604 (converged)
#> iter: 2, estimating gene-wise dispersion
#> iter: 2, log-likelihood: -712613.990872
#> iter: 2, fitting NB model
#> iter: 2, iter: 1, log-likelihood: -712613.990872
#> iter: 2, iter: 2, log-likelihood: -712457.521967
#> iter: 2, iter: 3, log-likelihood: -712448.597511 (converged)
#> iter: 3, log-likelihood: -712448.597511 (converged)
#> (2/2) Normalising data
HumanDLPFC = SpaNormSVG(HumanDLPFC)
#> (1/3) Retrieving SpaNorm model
#> (2/3) Fitting Null SpaNorm model
#> 201 cells/spots sampled to fit model
#> iter: 1, estimating gene-wise dispersion
#> iter: 1, log-likelihood: -1149654.530080
#> iter: 1, fitting NB model
#> iter: 1, iter: 1, log-likelihood: -1149654.530080
#> iter: 1, iter: 2, log-likelihood: -818186.913558
#> iter: 1, iter: 3, log-likelihood: -736604.099951
#> iter: 1, iter: 4, log-likelihood: -723768.349498
#> iter: 1, iter: 5, log-likelihood: -722449.976988
#> iter: 1, iter: 6, log-likelihood: -722432.852076 (converged)
#> iter: 2, estimating gene-wise dispersion
#> iter: 2, log-likelihood: -722418.533962
#> iter: 2, fitting NB model
#> iter: 2, iter: 1, log-likelihood: -722418.533962
#> iter: 2, iter: 1, log-likelihood: -722418.533962
#> iter: 2, iter: 1, log-likelihood: -722418.533962
#> iter: 2, iter: 2, log-likelihood: -722418.533962
#> iter: 2, iter: 2, log-likelihood: -722418.533962
#> iter: 2, iter: 2, log-likelihood: -722418.533962
#> iter: 2, iter: 3, log-likelihood: -722418.533962 (converged)
#> iter: 3, log-likelihood: -722418.533962 (converged)
#> (3/3) Finding SVGs
#> 1244 SVGs found (FDR < 0.05)
HumanDLPFC = SpaNormPCA(HumanDLPFC)
reducedDims(HumanDLPFC)
#> List of length 1
#> names(1): PCA
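The non-default arguments can be combined as in the following sketch, which repeats the model fit from the example above and then stores a second PCA based on Pearson residuals under a separate name. The values `ncomponents = 20` and `name = "PCA_pearson"` are illustrative choices rather than recommendations, and the sketch assumes the SpaNorm and SpatialExperiment packages are installed.

```r
library(SpaNorm)
library(SpatialExperiment)

# Fit the SpaNorm model and call SVGs, as in the example above
data(HumanDLPFC)
HumanDLPFC = SpaNorm(HumanDLPFC, sample.p = 0.05, df.tps = 2, tol = 1e-2)
HumanDLPFC = SpaNormSVG(HumanDLPFC)

# Default PCA on deviance residuals, stored as "PCA"
HumanDLPFC = SpaNormPCA(HumanDLPFC)

# Second PCA on Pearson residuals, stored under a different (illustrative) name
HumanDLPFC = SpaNormPCA(
  HumanDLPFC,
  residuals = "pearson",
  ncomponents = 20,
  name = "PCA_pearson"
)

# Both results should now be listed among the reduced dimensions
reducedDimNames(HumanDLPFC)
```

Storing each result under its own `name` keeps the two embeddings available side by side for comparison.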