This vignette demonstrates how to generate detailed gene module reports using the gm_report() function from the scigenex R package. The reports are created in Bookdown format and provide an integrated view of gene module analysis from any Seurat-supported technology.
The gm_report() function combines cluster information, functional annotation, and visualization into a shareable HTML report. It can also optionally leverage AI-based tools for cell-type annotation.
Before running gm_report(), you must have a ClusterSet (the examples in this vignette also pass the matching Seurat object).
A ClusterSet can be built in several ways:
- Run select_genes() and then gene_clustering() on a Seurat object or matrix.
- Apply filtering functions (filter_cluster_size(), filter_cluster_sd()) to select modules based on gene number, variance, or other criteria.
- Call cluster_set_from_seurat() with a Seurat object and the result from Seurat::FindAllMarkers().
- Call cluster_set_from_seurat() with a Seurat object and a named vector (clusters as values, gene names as names).

To use the gm_report() function, you need to load scigenex, Seurat, and an organism-related annotation package (here, org.Hs.eg.db).
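As a minimal sketch of the loading step, the snippet below checks that the three packages named above are installed before attaching them. The installation check and the message are illustrative additions, not part of scigenex; only the package names come from this vignette.

```r
# Packages named in this vignette
pkgs <- c("scigenex", "Seurat", "org.Hs.eg.db")

# requireNamespace() returns FALSE (instead of failing) when a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
not_installed <- pkgs[!installed]

if (length(not_installed) > 0) {
  message("Please install first: ", paste(not_installed, collapse = ", "))
} else {
  # Attach the packages so their functions can be called without a prefix
  for (pkg in pkgs) {
    library(pkg, character.only = TRUE)
  }
}
```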
We will demonstrate with built-in example datasets from the scigenex package.
- We will use the pbmc3k_medium dataset, which is a subset of the pbmc3k dataset from SeuratData.
- The pbmc3k_medium_clusters ClusterSet stores the gene modules and was created from the pbmc3k_medium object using the SciGeneX pipeline.

The report will be saved in the folder specified by out_dir. Open index.html in your browser to view it.
# Unset verbosity to avoid cluttering the output
set_verbosity(0)
# Load example gene module clusters and Seurat object
load_example_dataset("7871581/files/pbmc3k_medium_clusters")
load_example_dataset("7871581/files/pbmc3k_medium")
# Create a temporary directory for the report output
tmp_dir <- tempdir()
# Run gm_report() to generate the report
gm_report(
  cluster_set = pbmc3k_medium_clusters[1:3, ],
  smp_stage = "adult",
  annotation_src = "CC",
  bioc_org_db = "org.Hs.eg.db",
  seurat_object = pbmc3k_medium,
  report_title = "PBMC Gene Module Report",
  report_subtitle = "PBMC 3k Medium Dataset",
  report_author = "scigenex Vignette",
  out_dir = file.path(tmp_dir, "pbmc_report"),
  quiet = FALSE
)
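Once the call above has finished, the entry page can be located programmatically. This is a small sketch that assumes index.html sits directly inside out_dir, as the text above suggests; adjust the path if your report layout differs.

```r
# Same output folder as in the gm_report() call above
out_dir <- file.path(tempdir(), "pbmc_report")
index_file <- file.path(out_dir, "index.html")

if (file.exists(index_file)) {
  # Open the report in the default browser during interactive sessions only
  if (interactive()) utils::browseURL(index_file)
} else {
  message("No report found at: ", index_file)
}
```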
As indicated previously, gm_report() can also be run from a Seurat object together with cell-specific marker genes obtained from Seurat::FindAllMarkers(). This is a convenient way to create a Seurat analysis report. In this case, we first need to run the cluster_set_from_seurat() function.
# Unset verbosity to avoid cluttering the output
set_verbosity(0)
markers <- Seurat::FindAllMarkers(pbmc3k_medium, only.pos = TRUE)
cs <- cluster_set_from_seurat(pbmc3k_medium, markers, p_val_adj = 0.001)
# Create a temporary directory for the report output
tmp_dir <- tempdir()
# Run gm_report() to generate the report
gm_report(
  cluster_set = cs[1:3, ],
  seurat_object = pbmc3k_medium,
  annotation_src = "CC",
  bioc_org_db = "org.Hs.eg.db",
  api_key = NULL, # Optional: Gemini API key for AI-based annotation
  report_title = "PBMC Gene Module Report",
  report_subtitle = "PBMC 3k Medium Dataset",
  report_author = "scigenex Vignette",
  out_dir = file.path(tmp_dir, "pbmc_report"),
  quiet = FALSE
)
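The named-vector route listed among the ways to build a ClusterSet can be prepared with base R alone. The sketch below uses a made-up toy table in place of a real Seurat::FindAllMarkers() result; the `<=` cutoff and the rule of keeping the first cluster for a duplicated gene are assumptions made for illustration, and the final call is left commented because its exact argument form is not shown in this vignette.

```r
# Toy stand-in for the data frame returned by Seurat::FindAllMarkers()
# (only the columns used here; values are made up for illustration)
markers <- data.frame(
  gene = c("CD3D", "IL7R", "MS4A1", "CD79A", "CD3D"),
  cluster = factor(c("0", "0", "1", "1", "1")),
  p_val_adj = c(1e-10, 1e-5, 1e-8, 0.2, 1e-4),
  stringsAsFactors = FALSE
)

# Keep markers passing the adjusted p-value cutoff (0.001, as in the call above)
sig <- markers[markers$p_val_adj <= 0.001, ]

# Assumption: a gene is assigned to one cluster only, so keep its first occurrence
sig <- sig[!duplicated(sig$gene), ]

# Named vector: cluster labels as values, gene names as names
clusters <- stats::setNames(as.character(sig$cluster), sig$gene)
print(clusters)

# Hypothetical call, following the description in this vignette:
# cs <- cluster_set_from_seurat(pbmc3k_medium, clusters)
```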
When running the gm_report() function on spatial transcriptomics data, you can visualize gene modules in the context of tissue architecture. The function supports Seurat objects with spatial data and can generate spatial plots. Here we will use a tiny subset of the 10x Genomics lymph node Visium (V1) dataset.
# Create a temporary directory for the report output
tmp_dir <- tempdir()
# Load example gene module clusters and Seurat object for spatial data
load_example_dataset("7870305/files/lymph_node_tiny_clusters_2")
load_example_dataset("7870305/files/lymph_node_tiny_2")
# Run gm_report() for spatial transcriptomics data
gm_report(
  cluster_set = lymph_node_tiny_clusters_2[1:2, ],
  seurat_object = lymph_node_tiny_2,
  annotation_src = "CC",
  bioc_org_db = "org.Hs.eg.db",
  is_spatial_exp = TRUE,
  SpatialFeaturePlot_params = list(pt.size.factor = 3000),
  SpatialDimPlot_params = list(pt.size.factor = 3000),
  out_dir = file.path(tmp_dir, "spatial_report")
)
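The SpatialFeaturePlot_params and SpatialDimPlot_params arguments in the call above are plain named lists. How gm_report() consumes them internally is not shown in this vignette; the sketch below only illustrates, with a hypothetical plot_stub() function, a common R pattern for merging user-supplied parameters over defaults and forwarding them with do.call().

```r
# Hypothetical stand-in for a plotting function (not part of Seurat or scigenex)
plot_stub <- function(pt.size.factor = 1.6, alpha = 1) {
  list(pt.size.factor = pt.size.factor, alpha = alpha)
}

# Defaults that a wrapper might define, and the user-supplied overrides
default_params <- list(pt.size.factor = 1.6, alpha = 1)
user_params <- list(pt.size.factor = 3000)

# modifyList() keeps the defaults and replaces only the entries given by the user
merged_params <- utils::modifyList(default_params, user_params)

# do.call() forwards the list entries as named arguments
used <- do.call(plot_stub, merged_params)
str(used)
```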
In the same way, we can apply gm_report() to spatial transcriptomics data with markers identified using the Seurat::FindAllMarkers() function.
# Create a temporary directory for the report output
tmp_dir <- tempdir()
# Load the example Seurat object for spatial data
load_example_dataset("7870305/files/lymph_node_tiny_2")
# Identify marker genes and build a ClusterSet from them
markers <- Seurat::FindAllMarkers(lymph_node_tiny_2, only.pos = TRUE)
cs <- cluster_set_from_seurat(
  lymph_node_tiny_2,
  markers,
  p_val_adj = 0.001,
  assay = "Spatial"
)
# Run gm_report() for spatial transcriptomics data
gm_report(
  cluster_set = cs[1:2, ],
  seurat_object = lymph_node_tiny_2,
  smp_species = "Homo sapiens",
  smp_region = "total",
  smp_organ = "lymph node",
  smp_stage = "adult",
  annotation_src = "CC",
  bioc_org_db = "org.Hs.eg.db",
  subsample_by_ident_params = list(nbcell = 10),
  is_spatial_exp = TRUE,
  SpatialFeaturePlot_params = list(pt.size.factor = 3000), # Object was created with an older Seurat version
  SpatialDimPlot_params = list(pt.size.factor = 3000), # Object was created with an older Seurat version
  out_dir = file.path(tmp_dir, "spatial_report")
)
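Both spatial examples write to the same spatial_report folder. Whether gm_report() overwrites, reuses, or rejects an existing folder is not stated here, so a cautious option is to give each run its own directory. The helper below, make_report_dir(), is hypothetical and uses only base R; it builds a path but does not create the folder.

```r
# Hypothetical helper: build a distinct, timestamped output path for each report
make_report_dir <- function(prefix, root = tempdir()) {
  stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")
  path <- file.path(root, paste0(prefix, "_", stamp))
  # Append a counter if a folder with this name already exists
  i <- 1
  candidate <- path
  while (dir.exists(candidate)) {
    candidate <- paste0(path, "_", i)
    i <- i + 1
  }
  candidate
}

out_dir_markers <- make_report_dir("spatial_report_markers")
# out_dir_markers can then be passed as out_dir = out_dir_markers to gm_report()
```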
The gm_report() function accepts many parameters to tailor the analysis and appearance:
- api_key for cell-type predictions via Gemini AI.
- is_spatial_exp = TRUE if working with spatial transcriptomics (ST) data.

The gm_report() function in scigenex offers a powerful way to analyze and document gene modules using Seurat and ClusterSet objects. It supports spatial transcriptomics, functional annotation, and AI-based interpretation.
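Because api_key is a secret, it is usually better not to hard-code it in a script. The sketch below reads it from an environment variable; the variable name GEMINI_API_KEY is an assumption made for illustration, not something scigenex defines, and falling back to NULL mirrors the api_key = NULL value used earlier.

```r
# GEMINI_API_KEY is a hypothetical variable name chosen for this example
key <- Sys.getenv("GEMINI_API_KEY", unset = "")

# Use NULL when no key is available, as in the api_key = NULL call shown earlier
api_key <- if (nzchar(key)) key else NULL

if (is.null(api_key)) {
  message("No API key found in the environment.")
}
# api_key can then be passed to gm_report(..., api_key = api_key)
```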
For more information, please refer to the scigenex documentation.