Introduction

This vignette demonstrates how to generate a detailed gene module report using the gm_report() function from the scigenex R package. Reports are created in Bookdown format and provide an integrated view of gene module analysis for any Seurat-supported technology:

  • Single-cell RNA-seq experiments
  • Spatial transcriptomics (ST) experiments (Visium, Xenium, Merscope…)
  • Bulk RNA-seq experiments

The gm_report() function combines cluster information, functional annotation, and visualization into a shareable HTML report. It also optionally leverages AI-based tools for cell-type annotation.

What You Need

Before running gm_report(), you must have:

  • A ClusterSet object, representing gene modules.
  • A corresponding Seurat object containing your expression data, cell cluster annotation, spatial location…
  • Optional: An API key for Gemini AI to enhance cell-type annotation.

How to Create a ClusterSet

A ClusterSet can be built in several ways:

  • From the SciGeneX pipeline


  • From a Seurat object based on cell-specific marker genes
  • From a Seurat object based on any markers of interest
    • Use cluster_set_from_seurat() with a Seurat object and a named vector (clusters with gene_names as names).
    • Objective:
      • Construct gene modules based on known or identified pathways.
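
As a minimal sketch of the named vector described above, the values are cluster identifiers and the names are gene symbols. The genes and cluster numbers below are illustrative only, and seurat_obj in the commented call is a hypothetical object; the positional call mirrors how cluster_set_from_seurat() is used later in this vignette.

```r
# Illustrative gene symbols and cluster assignments (not taken from a real analysis)
genes <- c("CD3E", "CD3D", "MS4A1", "CD79A")

# Named vector: values are cluster ids, names are gene names
gene_clusters <- setNames(c(1, 1, 2, 2), genes)

# Number of genes per module
print(table(gene_clusters))

# Assumed usage, with a hypothetical Seurat object named seurat_obj:
# cs <- cluster_set_from_seurat(seurat_obj, gene_clusters)
```
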

Loading libraries

To use the gm_report() function, you need to load scigenex, Seurat, and an organism annotation package (here org.Hs.eg.db).
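
A minimal sketch of this step is shown below. Checking for installed packages before attaching them is an addition made here for robustness and is not required by scigenex.

```r
# Packages named in the text above
pkgs <- c("scigenex", "Seurat", "org.Hs.eg.db")

# Identify packages that are not installed
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Please install: ", paste(missing_pkgs, collapse = ", "))
} else {
  library(scigenex)
  library(Seurat)
  library(org.Hs.eg.db)
}
```
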

Some examples

Running gm_report() from a Seurat single-cell experiment and corresponding SciGeneX modules

We will demonstrate with built-in example datasets from the scigenex package:

  • The pbmc3k_medium dataset is a subset of the pbmc3k dataset from SeuratData.
  • The pbmc3k_medium_clusters ClusterSet stores the gene modules and was created from the pbmc3k_medium object using the SciGeneX pipeline.

The report will be saved in the folder specified by out_dir. Open index.html in your browser to view it.

# Unset verbosity to avoid cluttering the output
set_verbosity(0)

# Load example gene module clusters and Seurat object
load_example_dataset("7871581/files/pbmc3k_medium_clusters")
load_example_dataset("7871581/files/pbmc3k_medium")

# Create a temporary directory for the report output
tmp_dir <- tempdir()

# Run gm_report() to generate the report
gm_report(
  cluster_set = pbmc3k_medium_clusters[1:3,],
  smp_stage = "adult",
  annotation_src = "CC",
  bioc_org_db = "org.Hs.eg.db",
  seurat_object = pbmc3k_medium,
  report_title = "PBMC Gene Module Report",
  report_subtitle = "PBMC 3k Medium Dataset",
  report_author = "scigenex Vignette",
  out_dir = file.path(tmp_dir, "pbmc_report"),
  quiet = FALSE
)
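
The entry point of the report is index.html inside out_dir. A minimal sketch for opening it from R follows, assuming the report was written to the pbmc_report folder of the session temporary directory as in the call above. utils::browseURL() is part of base R.

```r
# Path to the report entry point (same out_dir as in the call above)
report_index <- file.path(tempdir(), "pbmc_report", "index.html")

# Open the report only if it exists and the session is interactive
if (file.exists(report_index) && interactive()) {
  utils::browseURL(report_index)
} else {
  message("Report not found or session not interactive: ", report_index)
}
```
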

Running gm_report() from a Seurat single-cell experiment and corresponding Seurat markers

As indicated previously, gm_report() can be run from a Seurat object together with cell-specific marker genes obtained from Seurat::FindAllMarkers(). This is a convenient way to create a Seurat analysis report. In this case, we first need to run the cluster_set_from_seurat() function.

# Unset verbosity to avoid cluttering the output
set_verbosity(0)

markers <- Seurat::FindAllMarkers(pbmc3k_medium, only.pos = TRUE)
cs <- cluster_set_from_seurat(pbmc3k_medium, markers, p_val_adj = 0.001)

# Create a temporary directory for the report output
tmp_dir <- tempdir()

# Run gm_report() to generate the report
gm_report(
  cluster_set = cs[1:3,],
  seurat_object = pbmc3k_medium,
  annotation_src = "CC",
  bioc_org_db = "org.Hs.eg.db",
  api_key = NULL,  # Optional: Gemini API key for AI-based annotation
  report_title = "PBMC Gene Module Report",
  report_subtitle = "PBMC 3k Medium Dataset",
  report_author = "scigenex Vignette",
  out_dir = file.path(tmp_dir, "pbmc_report"),
  quiet = FALSE
)
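
To make the effect of the p_val_adj threshold concrete, the sketch below filters a toy marker table and groups the remaining genes by cluster. It only mimics three of the columns returned by Seurat::FindAllMarkers(). How cluster_set_from_seurat() applies the threshold internally (for example, whether the bound is inclusive) is an assumption here, and the values are invented for illustration.

```r
# Toy table with the same column names as Seurat::FindAllMarkers() output
toy_markers <- data.frame(
  gene      = c("CD3E", "CD3D", "MS4A1", "CD79A", "LYZ"),
  cluster   = factor(c("0", "0", "1", "1", "2")),
  p_val_adj = c(1e-10, 5e-4, 1e-8, 0.02, 1e-6),
  stringsAsFactors = FALSE
)

# Keep markers passing the adjusted p-value threshold (inclusive bound assumed)
kept <- toy_markers[toy_markers$p_val_adj <= 0.001, ]

# One gene module per cell cluster
modules <- split(kept$gene, droplevels(kept$cluster))
print(modules)
```
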

Running gm_report() from a Seurat spatial experiment and corresponding SciGeneX modules

When gm_report() is run on spatial transcriptomics data, gene modules can be visualized in the context of tissue architecture. The function supports Seurat objects containing spatial data and can generate spatial plots. Here we will use a tiny subset of the 10x Genomics lymph node Visium (V1) dataset.

# Create a temporary directory for the report output
tmp_dir <- tempdir()

# Load example gene module clusters and Seurat object for spatial data
load_example_dataset("7870305/files/lymph_node_tiny_clusters_2")
load_example_dataset("7870305/files/lymph_node_tiny_2")

# Run gm_report() for spatial transcriptomics data
gm_report(
  cluster_set = lymph_node_tiny_clusters_2[1:2,],
  seurat_object = lymph_node_tiny_2,
  annotation_src = "CC",
  bioc_org_db = "org.Hs.eg.db",
  is_spatial_exp = TRUE,
  SpatialFeaturePlot_params = list(pt.size.factor = 3000),
  SpatialDimPlot_params = list(pt.size.factor = 3000),
  out_dir = file.path(tmp_dir, "spatial_report")
)

Running gm_report() from a Seurat spatial experiment and corresponding Seurat markers

In the same way, we can apply gm_report() to spatial transcriptomics data with markers identified using the Seurat::FindAllMarkers() function.

# Create a temporary directory for the report output
tmp_dir <- tempdir()

# Load the example Seurat object for spatial data
load_example_dataset("7870305/files/lymph_node_tiny_2")

markers <- Seurat::FindAllMarkers(lymph_node_tiny_2, only.pos = TRUE)
cs <- cluster_set_from_seurat(lymph_node_tiny_2, markers, p_val_adj = 0.001, assay = "Spatial")

# Run gm_report() for spatial transcriptomics data
gm_report(
  cluster_set = cs[1:2,],
  seurat_object = lymph_node_tiny_2,
  smp_species = "Homo sapiens",
  smp_region = "total",
  smp_organ = "lymph node",
  smp_stage = "adult",
  annotation_src = "CC",
  bioc_org_db = "org.Hs.eg.db",
  subsample_by_ident_params = list(nbcell = 10),
  is_spatial_exp = TRUE,
  SpatialFeaturePlot_params = list(pt.size.factor = 3000),  # Object was created with an older Seurat version
  SpatialDimPlot_params = list(pt.size.factor = 3000),  # Object was created with an older Seurat version
  out_dir = file.path(tmp_dir, "spatial_report")
)

Customizing Your Report

The gm_report() function accepts many parameters to tailor the analysis and appearance:

  • Plot customization
    • plot_profiles_params
    • FeaturePlot_params
    • SpatialDimPlot_params
  • Selective content
    • Enable or disable specific report sections with the section argument.
  • AI-based annotation
    • Provide an api_key for cell-type predictions via Gemini AI.
  • Spatial transcriptomics
    • Set is_spatial_exp = TRUE if working with ST data.
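
A minimal sketch of preparing such options is given below. It assumes that each *_params list is forwarded to the Seurat plotting function of the same name, which is why pt.size is used for FeaturePlot_params. The environment variable GEMINI_API_KEY is a hypothetical name chosen for this example, and the gm_report() call is left commented out because it requires the example objects loaded earlier.

```r
# Read the API key from an environment variable instead of hard-coding it
# (GEMINI_API_KEY is a hypothetical variable name)
api_key <- Sys.getenv("GEMINI_API_KEY", unset = NA)
if (is.na(api_key) || !nzchar(api_key)) {
  api_key <- NULL  # fall back to a report without AI-based annotation
}

# Plot options; values are illustrative
custom_params <- list(
  FeaturePlot_params = list(pt.size = 0.5),
  SpatialDimPlot_params = list(pt.size.factor = 3000)
)

# Assumed usage with the objects from the first example:
# gm_report(
#   cluster_set = pbmc3k_medium_clusters[1:3,],
#   seurat_object = pbmc3k_medium,
#   api_key = api_key,
#   FeaturePlot_params = custom_params$FeaturePlot_params,
#   out_dir = file.path(tempdir(), "pbmc_report_custom")
# )
```
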

Conclusion

The gm_report() function in scigenex offers a convenient way to analyze and document gene modules using Seurat and ClusterSet objects. It supports spatial transcriptomics, functional annotation, and AI-based interpretation.
For more information, please refer to the scigenex documentation.