Pathway Over-representation Analysis

To run a pathway over-representation analysis (ORA), first upload a list of gene identifiers with their associated fold-change in gene expression values (and P values), as described here.
There are two ways to do this:

  • Upload all genes from your array dataset, not just the differentially expressed (DE) genes. Probes that map to multiple different genes should be removed first. The pathway ORA tool then uses the proportion of DE genes on the whole array to determine whether a particular pathway is significant.
  • Upload only a subset of genes. The subset is analyzed using a slightly different algorithm that does not take gene expression values into account, because the algorithm cannot know the proportion of DE genes on the array. For the same reason, this analysis cannot handle data from multiple conditions.

If several probes map to the same gene, their values are averaged for the purposes of the pathway ORA.
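
The probe averaging described above can be pictured with a minimal Python sketch. It is an illustration only, not the tool's own code: it assumes an arithmetic mean, handles only fold-change values, and the function name average_probes and the (gene_id, fold_change) input layout are hypothetical.

```python
from collections import defaultdict


def average_probes(rows):
    """Collapse probe-level rows to one mean fold-change per gene.

    rows: iterable of (gene_id, fold_change) pairs (hypothetical layout).
    Assumes an arithmetic mean; the tool's exact averaging is not documented here.
    """
    grouped = defaultdict(list)
    for gene_id, fold_change in rows:
        grouped[gene_id].append(fold_change)
    # One averaged value per gene identifier.
    return {gene: sum(values) / len(values) for gene, values in grouped.items()}


if __name__ == "__main__":
    probes = [("GENE_A", 2.0), ("GENE_A", 4.0), ("GENE_B", -1.5)]
    print(average_probes(probes))
```

Averaging before the analysis means each gene is counted once, however many probes represent it on the array.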

A list of pathways associated with the uploaded genes will be returned.

To run the pathway ORA, click the red Pathway ORA button at the top of the page.
This takes you to a page where you can choose the parameters for the pathway over-representation analysis.

First, specify whether you are analyzing an entire array dataset or just a subset of genes. If you analyze a subset of genes using the entire-dataset algorithm, or vice versa, your results will NOT be correct. If you are analyzing a complete array dataset, choose the following parameters for the pathway over-representation analysis:

  • Fold-Change Cutoff (+/-): choose the fold-change in gene expression threshold used to determine which genes are differentially expressed. Default = +/- 1.5.
  • Expression P-Value Cutoff: choose the P value threshold, applied to the P value associated with each fold-change in gene expression value, used to determine which genes are differentially expressed. Default P < 0.05.
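
As a rough illustration of how the two cutoffs combine, the Python sketch below labels a gene as up-regulated, down-regulated or unchanged. It is a sketch under stated assumptions rather than the tool's implementation: it assumes signed fold-change values (negative for down-regulation), assumes the fold-change cutoff is inclusive, and the function name classify_gene is hypothetical.

```python
def classify_gene(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Label one gene using the default cutoffs from the parameter list.

    Assumptions: fold_change is signed (e.g. -2.0 means two-fold down),
    the fold-change cutoff is inclusive, and the P value must be strictly
    below p_cutoff (matching "Default P < 0.05").
    """
    if p_value >= p_cutoff:
        return "unchanged"
    if fold_change >= fc_cutoff:
        return "up"
    if fold_change <= -fc_cutoff:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    for fc, p in [(2.0, 0.01), (-1.6, 0.04), (1.2, 0.001), (3.0, 0.2)]:
        print(fc, p, classify_gene(fc, p))
```

A gene must pass both cutoffs to count as DE, so a large fold-change with a P value at or above the threshold is still treated as unchanged.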

Now choose the analysis algorithm and multiple testing correction method:

  • Choose algorithm: several statistical methods are available to determine whether pathways are significantly associated with DE genes: Hypergeometric, Fisher and Chi Square.
  • Choose Correction Method: two options to correct for multiple testing are included: the Benjamini & Hochberg correction for the false discovery rate (FDR) and the more conservative Bonferroni correction.
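
To make these options concrete, here is a minimal Python sketch of a hypergeometric over-representation test and the two correction methods. It shows the textbook forms of these calculations, not the tool's own code; the function names and the example counts are hypothetical, and the tool may differ in details such as how the gene background is defined.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """Upper-tail probability P(X >= k) for the hypergeometric distribution.

    N: genes on the array, K: genes in the pathway,
    n: DE genes on the array, k: DE genes in the pathway.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(pvalues):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini & Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: 20000 genes, 100 in the pathway, 500 DE, 10 DE in pathway.
    p = hypergeom_pvalue(10, 100, 500, 20000)
    print("pathway P value:", p)
    raw = [p, 0.04, 0.03]
    print("Bonferroni:", bonferroni(raw))
    print("Benjamini & Hochberg:", benjamini_hochberg(raw))
```

Bonferroni scales every P value by the number of pathways tested, which is why it is the more conservative choice; the Benjamini & Hochberg adjustment scales each P value by the number of tests divided by its rank, so it controls the FDR while retaining more pathways.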

Hit submit.
A new page will be returned showing the pathways that are significantly associated with up-regulated genes.
Click the green button to see pathways that are significantly associated with down-regulated genes.
Click on the 'summary' link to see information for all genes in the pathway. The interactions in the pathway, along with overlaid gene expression data, can be visualized by clicking on the 'visualize' link.