
GLM-based PCA using the SpaNorm model. The null model consists of the library size effects, batch effects, and the gene mean. GLM-PCA is approximated by regressing the null model out of the data and performing PCA on the resulting residuals (Pearson or deviance).

Usage

SpaNormPCA(
  spe,
  nsvgs = 3000,
  ncomponents = 50,
  svg.fdr = 1,
  BSPARAM = bsparam(),
  BPPARAM = SerialParam(),
  residuals = c("deviance", "pearson"),
  name = "PCA"
)

# S4 method for class 'SpatialExperiment'
SpaNormPCA(
  spe,
  nsvgs = 3000,
  ncomponents = 50,
  svg.fdr = 1,
  BSPARAM = bsparam(),
  BPPARAM = SerialParam(),
  residuals = c("deviance", "pearson"),
  name = "PCA"
)

Arguments

spe

a SpatialExperiment or Seurat object containing a SpaNorm model fit, with the count data stored in the 'counts' or 'data' assay, respectively.

nsvgs

the number of SVGs to use for PCA.

ncomponents

the number of components to compute.

svg.fdr

the FDR threshold for SVG calling.

BSPARAM

a BiocSingularParam object specifying which algorithm should be used to perform the PCA.

BPPARAM

a BiocParallelParam object specifying whether the PCA should be parallelized.

residuals

the type of residuals to use for PCA. Either "deviance" (default) or "pearson".

name

the name of the reducedDim to store the PCA results.

Value

a SpatialExperiment or Seurat object with PCA results. For SpatialExperiment objects, these are stored in the reducedDims.

Details

SpaNorm PCA uses the SpaNorm model fitted for data normalisation to approximate a GLM-based PCA, as described in Townes et al. (Genome Biology, 2019). The model used for normalisation represents the library size effects and the gene mean. Regressing out these covariates leaves the deviance or Pearson residuals, on which PCA is performed to approximate GLM-PCA.
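
The residual-based approximation can be illustrated with a minimal base R sketch. This is not SpaNorm's implementation: it assumes a simple Poisson null model (gene proportion times library size) in place of the SpaNorm fit, it runs on simulated counts, and all variable names are hypothetical.

```r
set.seed(1)

# Simulated toy counts: genes in rows, spots in columns (not real data)
n_genes <- 200
n_spots <- 50
size_factors <- runif(n_spots, 0.5, 2)
gene_means <- rgamma(n_genes, shape = 2, rate = 0.5)
counts <- matrix(
  rpois(n_genes * n_spots, outer(gene_means, size_factors)),
  nrow = n_genes, ncol = n_spots
)

# Drop all-zero genes so that every expected count is strictly positive
counts <- counts[rowSums(counts) > 0, , drop = FALSE]

# Assumed null model: expected count = gene proportion x library size
lib_size <- colSums(counts)
gene_prop <- rowSums(counts) / sum(counts)
mu <- outer(gene_prop, lib_size)

# Pearson residuals under the Poisson null
pearson_res <- (counts - mu) / sqrt(mu)

# Deviance residuals under the Poisson null, with y * log(y / mu) := 0 at y = 0
ylogy <- ifelse(counts > 0, counts * log(counts / mu), 0)
deviance_res <- sign(counts - mu) * sqrt(pmax(2 * (ylogy - (counts - mu)), 0))

# PCA on the residuals, treating spots as observations
pca <- prcomp(t(deviance_res), center = TRUE, scale. = FALSE, rank. = 10)
dim(pca$x)
```

Swapping `deviance_res` for `pearson_res` in the `prcomp()` call gives the Pearson variant, mirroring the choice offered by the `residuals` argument.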

Examples


library(SpatialExperiment)
library(ggplot2)

data(HumanDLPFC)

HumanDLPFC = SpaNorm(HumanDLPFC, sample.p = 0.05, df.tps = 2, tol = 1e-2)
#> (1/2) Fitting SpaNorm model
#> 201 cells/spots sampled to fit model
#> iter:  1, estimating gene-wise dispersion
#> iter:  1, log-likelihood: -1149654.530080
#> iter:  1, fitting NB model
#> iter:  1, iter:  1, log-likelihood: -1149654.530080
#> iter:  1, iter:  2, log-likelihood: -817875.051427
#> iter:  1, iter:  3, log-likelihood: -730289.096272
#> iter:  1, iter:  4, log-likelihood: -715477.819485
#> iter:  1, iter:  5, log-likelihood: -713420.649743
#> iter:  1, iter:  6, log-likelihood: -713065.316554
#> iter:  1, iter:  7, log-likelihood: -712982.402574
#> iter:  1, iter:  8, log-likelihood: -712957.598604 (converged)
#> iter:  2, estimating gene-wise dispersion
#> iter:  2, log-likelihood: -712613.990872
#> iter:  2, fitting NB model
#> iter:  2, iter:  1, log-likelihood: -712613.990872
#> iter:  2, iter:  2, log-likelihood: -712457.521967
#> iter:  2, iter:  3, log-likelihood: -712448.597511 (converged)
#> iter:  3, log-likelihood: -712448.597511 (converged)
#> (2/2) Normalising data
HumanDLPFC = SpaNormSVG(HumanDLPFC)
#> (1/3) Retrieving SpaNorm model
#> (2/3) Fitting Null SpaNorm model
#> 201 cells/spots sampled to fit model
#> iter:  1, estimating gene-wise dispersion
#> iter:  1, log-likelihood: -1149654.530080
#> iter:  1, fitting NB model
#> iter:  1, iter:  1, log-likelihood: -1149654.530080
#> iter:  1, iter:  2, log-likelihood: -818186.913558
#> iter:  1, iter:  3, log-likelihood: -736604.099951
#> iter:  1, iter:  4, log-likelihood: -723768.349498
#> iter:  1, iter:  5, log-likelihood: -722449.976988
#> iter:  1, iter:  6, log-likelihood: -722432.852076 (converged)
#> iter:  2, estimating gene-wise dispersion
#> iter:  2, log-likelihood: -722418.533962
#> iter:  2, fitting NB model
#> iter:  2, iter:  1, log-likelihood: -722418.533962
#> iter:  2, iter:  1, log-likelihood: -722418.533962
#> iter:  2, iter:  1, log-likelihood: -722418.533962
#> iter:  2, iter:  2, log-likelihood: -722418.533962
#> iter:  2, iter:  2, log-likelihood: -722418.533962
#> iter:  2, iter:  2, log-likelihood: -722418.533962
#> iter:  2, iter:  3, log-likelihood: -722418.533962 (converged)
#> iter:  3, log-likelihood: -722418.533962 (converged)
#> (3/3) Finding SVGs
#> 1244 SVGs found (FDR < 0.05)
HumanDLPFC = SpaNormPCA(HumanDLPFC)
reducedDims(HumanDLPFC)
#> List of length 1
#> names(1): PCA
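
The arguments documented above can also be set explicitly. The following sketch repeats the model fit and SVG calling from the example, then requests Pearson residuals and stores the result under a separate name. The values `nsvgs = 1000`, `ncomponents = 20` and the name "PCA.pearson" are arbitrary choices for illustration, and the output is not shown because it has not been verified here.

```r
library(SpaNorm)
library(SpatialExperiment)

data(HumanDLPFC)

# Fit the SpaNorm model and call SVGs, as in the example above
HumanDLPFC <- SpaNorm(HumanDLPFC, sample.p = 0.05, df.tps = 2, tol = 1e-2)
HumanDLPFC <- SpaNormSVG(HumanDLPFC)

# PCA on Pearson residuals, stored under a separate reducedDim name
# (argument values are illustrative, not recommendations)
HumanDLPFC <- SpaNormPCA(
  HumanDLPFC,
  nsvgs = 1000,
  ncomponents = 20,
  residuals = "pearson",
  name = "PCA.pearson"
)
reducedDimNames(HumanDLPFC)
```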