This function compares two lists of gene sets using a chosen statistic, such as the Jaccard index or the hypergeometric test.
compare_genesets(
  set_1 = NULL,
  set_2 = NULL,
  stat = c("jaccard", "hypergeom", "intersection", "union", "size_set_1", "size_set_2",
    "diff_set_1", "diff_set_2"),
  background = NULL
)
set_1: A list containing gene sets to be compared.
set_2: A list containing gene sets to be compared.
stat: The statistic to be computed between gene sets. It can be one of "jaccard", "hypergeom", "intersection", "union", "size_set_1", "size_set_2", "diff_set_1" (specific to set_1) or "diff_set_2" (specific to set_2). The background is taken into account. Note that the hypergeometric test checks for enrichment.
background: The background (universe) to consider. Defaults to the non-redundant list of elements merged from set_1 and set_2. For instance, you may provide a vector containing all genes of the genome.
A matrix of comparison results where each row corresponds to a gene set in set_1, and each column corresponds to a gene set in set_2.
The Jaccard index is a measure of similarity between two sets, defined as the size of their intersection divided by the size of their union. The hypergeometric test is used to determine whether the overlap between two sets is larger than expected by chance. The "intersection" method simply computes the size of the intersection between a gene set from set_1 and a gene set from set_2. The "union", "size_set_1", "size_set_2", "diff_set_1" and "diff_set_2" methods compute, respectively, the union of the two sets, the size of the gene sets from set_1, the size of the gene sets from set_2, the genes that are specific to set_1, and the genes that are specific to set_2.
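The two main statistics described above can be sketched in base R. This is an illustration only, not the internal code of compare_genesets(): the helper names (jaccard, hypergeom, pairwise) and the toy gene identifiers are hypothetical, and two choices are assumptions, namely the one-sided upper-tail p-value (chosen to match "enrichment") and the restriction of each set to the background before testing.

```r
# Hypothetical toy inputs; gene names are made up for illustration.
set_1 <- list(a = c("g1", "g2", "g3"), b = c("g4", "g5"))
set_2 <- list(x = c("g2", "g3", "g6"), y = c("g5", "g7", "g8"))

# Default background as described above: the non-redundant union
# of all elements from set_1 and set_2.
background <- unique(c(unlist(set_1), unlist(set_2)))

# Jaccard index: size of the intersection over size of the union.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# One-sided hypergeometric p-value P(X >= overlap).
# Assumption: sets are first restricted to the background.
hypergeom <- function(a, b, bg) {
  a <- intersect(a, bg)
  b <- intersect(b, bg)
  overlap <- length(intersect(a, b))
  phyper(overlap - 1,
         m = length(a),               # "successes" in the background
         n = length(bg) - length(a),  # remaining background genes
         k = length(b),               # number of genes drawn
         lower.tail = FALSE)
}

# Apply a pairwise function to every combination of gene sets:
# rows follow s1, columns follow s2.
pairwise <- function(f, s1, s2, ...) {
  res <- matrix(NA_real_, nrow = length(s1), ncol = length(s2),
                dimnames = list(names(s1), names(s2)))
  for (i in seq_along(s1)) {
    for (j in seq_along(s2)) {
      res[i, j] <- f(s1[[i]], s2[[j]], ...)
    }
  }
  res
}

jac  <- pairwise(jaccard, set_1, set_2)
pval <- pairwise(hypergeom, set_1, set_2, bg = background)
# jac["a", "x"] is 2 / 4 = 0.5: two shared genes out of four distinct genes.
```

Both results have the layout described in the return value, with one row per gene set in set_1 and one column per gene set in set_2. Supplying a larger background (for example, all genes of the genome) changes only the hypergeometric p-values in this sketch, not the Jaccard indices.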