  1. General
    1. What is agriGO?
    2. What is GO?
    3. What is updated in agriGO compared with EasyGO?
    4. Why are my results in agriGO different from those in EasyGO?
    5. What are the unique features of agriGO compared with other GO web servers/databases?
    6. How do users evaluate agriGO?
  2. Analysis tools
    1. What is SEA analysis?
    2. Which statistical method should I choose in the SEA tool?
    3. What is PAGE analysis?
    4. What is BLAST4ID?
    5. How to use the comparison tool?
    6. Which tool should I choose?
    7. Why does agriGO recommend multi-test adjustment for the p-value?
    8. What does the flash bar chart mean and how do I use it?
    9. Why does the graphical/chart image not display on my PC?
    10. We found that the enriched functional categories changed when we used different values for Minimum number of mapping entries (Advanced options), for example, 1 or 5. Based on what is described online, the list obtained with 1 should include the one obtained with 5, correct? However, some categories in the list from 5 are not included in the list from 1. Although the P values are the same in both cases, the corrected P values are different. Do you have any comments on that? What should we do?
    11. I used 20000 probesets at 4 different time points in the PAGE analysis, but no enriched GO terms were detected. Why?
  3. Datatype, update and GO annotation in agriGO
    1. How many datatypes are supported by agriGO?
    2. How does agriGO obtain its data sources?
    3. How often does agriGO update?
    4. Can I check results produced by EasyGO using agriGO?
    5. How can I get agriGO to add a new species/datatype?
    6. Could you please explain how you define GO terms for each probe?
    7. Given any GO term, is its relative abundance (the ratio of contained probes to total probes) in the Affymetrix array and the soybean genome the same/similar?
  4. Miscellaneous questions
    1. What is your lab focusing on?
    2. What features were used in agriGO construction?
    3. Who is using agriGO?
What is agriGO?
agriGO is designed to help experimental biologists identify enriched Gene Ontology (GO) terms in a list of microarray probe sets or gene identifiers (with or without expression information). It is also a GO-related database. agriGO focuses especially on agricultural species.
 
What is GO?
"The Gene Ontology (GO) project provides a controlled vocabulary to describe gene and gene product attributes in any organism. The GO project is a collaborative effort to address the need for consistent descriptions of gene products in different databases. The GO collaborators are developing three structured, controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner. There are three separate aspects to this effort: first, we write and maintain the ontologies themselves; second, we make cross-links between the ontologies and the genes and gene products in the collaborating databases, and third, we develop tools that facilitate the creation, maintenance and use of ontologies." Definition from http://www.geneontology.org/
 
What is updated in agriGO compared with EasyGO?
agriGO is the successor of EasyGO, and it goes further:
1. We created a new website interface and redesigned the database structure and scripts of agriGO. As a result, both page loading and analysis are faster.
2. The agriGO service focuses especially on agricultural species. The number of supported species has been extended to 38, covering as many as 283 datatypes.
3. We added new analysis tools, such as PAGE analysis and the BLAST4ID tool.
4. The result output and information are richer than in EasyGO.
5. agriGO can also work as a GO database with search and download services.
 
What are the unique features of agriGO compared with other GO web servers/databases?
agriGO provides strong support for agricultural species. It is not limited to SEA: gene set enrichment analysis (GSEA), implemented with the PAGE method, is also available. In addition, the BLAST4ID tool handles ID transfer and annotation, and search and download functions are provided. agriGO produces rich outputs such as graphical results, bar charts and a hierarchical tree, which together give a comprehensive view of the biological meaning of the user's input data.
 
How do users evaluate agriGO?
Comments from Faculty of 1000 Biology:
This paper describes a new bioinformatic resource that will be of great use to any plant scientist carrying out genomic studies.
agriGO provides an intuitive and relatively user-easy platform for carrying out Gene Ontology (GO) analyses of genomic data from over 30 plant species. The tools provided include Singular Enrichment Analysis (SEA), which analyses a simple gene list for GO enrichment, and Parametric Analysis of Gene Set Enrichment (PAGE), which takes expression levels into account when analyzing GO enrichment. The platform provides publication quality outputs.
 
Why are my results in agriGO different from those in EasyGO?
This can happen. First check whether you selected the same parameters in both tools. Note that agriGO provides multiple backgrounds, and the choice of background may affect the result. More importantly, the GO annotation sources of agriGO and EasyGO differ somewhat, which may also lead to different results.
 
What is SEA analysis?
SEA stands for Singular Enrichment Analysis, a traditional but widely used method. SEA is designed to identify enriched Gene Ontology (GO) terms in a list of microarray probe sets or gene identifiers. Finding enriched GO terms corresponds to finding enriched biological facts, and the enrichment level of a term is judged by comparing the query list to the background population from which the query list is derived.
 
Which statistical method should I choose in the SEA tool?
When the input list is compared with a pre-computed background, or is a subset of the reference list, choose the hypergeometric test or Fisher's exact test; choose Fisher's test only when your query list is quite small. When the input list has few or no intersections with the reference list, the Chi-square test is more appropriate.
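As an illustration of the hypergeometric case, here is a minimal Python sketch of a one-sided enrichment p-value for a single GO term. It is not agriGO's own code; the function name and the variable names (`k`, `n`, `K`, `N`) are chosen here for illustration only.

```python
from math import comb  # requires Python 3.8+


def hypergeom_enrichment_p(k, n, K, N):
    """Probability of seeing at least k annotated genes in a query list.

    k: query genes annotated with the GO term
    n: size of the query list
    K: background genes annotated with the GO term
    N: size of the background (the query is assumed to be a subset of it)
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the upper tail P(X = i) for i = k .. min(n, K).
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 8 of 100 query genes carry a term that
    # annotates 200 of 20000 background genes (about 1 expected by chance).
    print(hypergeom_enrichment_p(8, 100, 200, 20000))
```

A small p-value means the term appears in the query list more often than random sampling from the background would explain.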
 
What is PAGE analysis?
PAGE is Parametric Analysis of Gene Set Enrichment [Kim and Volsky 2005, BMC Bioinformatics]. The PAGE method relies on the Central Limit Theorem, which makes it simple and efficient. Unlike SEA, it takes expression levels into account and can deal with a long list of genes/probesets. PAGE calculates a Z score for each GO term and applies a two-tailed test; the p-value is calculated with R.
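The Z score in the PAGE paper is Z = (Sm - mu) * sqrt(m) / sigma, where mu and sigma are the mean and standard deviation of all expression values, and Sm is the mean of the m values that belong to one GO term. The Python sketch below follows that formula. It is only an illustration: agriGO calculates its p-values with R, and the use of the population standard deviation here is an assumption.

```python
from math import erfc, sqrt
from statistics import mean, pstdev


def page_z_score(all_values, set_values):
    """Z score of one gene set against the whole expression data set."""
    mu = mean(all_values)        # mean of all expression values
    sigma = pstdev(all_values)   # population SD (an assumption of this sketch)
    m = len(set_values)          # number of genes mapped to the GO term
    return (mean(set_values) - mu) * sqrt(m) / sigma


def two_tailed_p(z):
    """Two-tailed p-value of a Z score under the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))


if __name__ == "__main__":
    # Hypothetical log ratios for the whole array and for one GO term.
    all_values = [-2.0, -1.0, 0.0, 1.0, 2.0]
    set_values = [1.0, 2.0]
    z = page_z_score(all_values, set_values)
    print(z, two_tailed_p(z))
```

A positive Z score indicates that the genes of the term are, on average, expressed above the overall mean, and a negative one below it.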
 
What is BLAST4ID?
BLAST4ID is not an analysis tool but an auxiliary one, used mainly for two purposes: 1. converting IDs that are not supported by agriGO into supported ones; 2. using a BLAST search to annotate your sequences with GO terms.
 
How to use the comparison tool?
A comparison tool for SEA results is provided as an optional tool. Users can upload a list of session IDs to run the comparison. For PAGE analysis, the comparison function is already integrated into the normal output.
 
Which tool should I choose?
It depends on what data you have. If you only have a list of identifiers, or are only interested in the identifiers themselves, choose SEA. If you want to take expression data into account and compare several datasets, try PAGE. BLAST4ID is only an auxiliary tool; use it when you need it.
 
Why does agriGO recommend multi-test adjustment for the p-value?
In statistics, the multiple comparisons (or "multiple testing") problem occurs when one considers a set, or family, of statistical inferences simultaneously. A p-value controls the type I error rate of a single statistical test. Errors in inference, including confidence intervals that fail to include their corresponding population parameters, or hypothesis tests that incorrectly reject the null hypothesis, are more likely to occur when one considers the family as a whole. Several statistical techniques have been developed to prevent this from happening, allowing significance levels for single and multiple comparisons to be directly compared. These techniques generally require a stronger level of evidence to be observed in order for an individual comparison to be deemed "significant", so as to compensate for the number of inferences being made.
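A short worked example shows how quickly the problem grows. It assumes that the m tests are independent and that each uses the same significance level alpha, which is a simplification because GO terms overlap:

```latex
P(\text{at least one false positive}) = 1 - (1 - \alpha)^{m}
\qquad \text{e.g. } \alpha = 0.05,\ m = 100:\quad 1 - 0.95^{100} \approx 0.994
```

So with 100 terms tested at 0.05 and no adjustment, at least one false positive is almost certain.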
 
What does the flash bar chart mean and how do I use it?
In the SEA result, each bar in the chart shows a percentage of genes. The input bar is the number of genes mapped to a given GO term divided by the total number of genes in the input list. The background/reference bar is calculated in the same way for the background. In the PAGE result, a bar may show the Z-score or the mean value. To read the bar chart, simply compare the heights of the bars. In practice, selecting terms and adjusting the chart yourself helps produce a better-looking figure.
 
Why does the graphical/chart image not display on my PC?
The bar chart requires Flash Player to display correctly. Please also check whether your browser is blocking Flash, since some ad-blocking software does this. You may need different tools to display graphical results in different formats, for example Adobe Reader or an SVG browser. Contact me if you have installed the relevant tool but still cannot see the results.
 
We found that the enriched functional categories changed when we used different values for Minimum number of mapping entries (Advanced options), for example, 1 or 5. Based on what is described online, the list obtained with 1 should include the one obtained with 5, correct? However, some categories in the list from 5 are not included in the list from 1. Although the P values are the same in both cases, the corrected P values are different. Do you have any comments on that? What should we do?
The option "Minimum number of mapping entries" means that a term is used in further analysis only if at least N genes can be mapped to it; in other words, only then is the term available. In the statistical calculation step, each term is computed independently, which is why a term has the same P-value in both analyses. However, since you used different values of "Minimum number of mapping entries", the total number of terms differs between the two analyses, and so does the number of statistical tests (as mentioned, one test per term). The number of tests is the key factor in multiple-test adjustment. The default statistical method should be fine if you use a pre-computed background provided by agriGO. The multiple-test adjustment method is for you to choose according to your own judgement. Here is a tip: these methods differ in stringency; for example, the default BY method is a strict one. If you would like more terms, use a looser method; otherwise use a strict one.
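To make the effect of the number of tests concrete, here is a minimal Python sketch of the Benjamini-Yekutieli (BY) adjustment. It is a textbook implementation written for illustration, not agriGO's own code, and the p-values in the example are made up.

```python
def by_adjust(pvalues):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Harmonic-sum correction that makes BY stricter than Benjamini-Hochberg.
    c_m = sum(1.0 / k for k in range(1, m + 1))
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Walk from the largest p-value down so the result stays monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values: a stricter minimum leaves 3 terms to test,
    # a looser minimum lets 4 additional small terms into the analysis.
    few_terms = [0.001, 0.01, 0.03]
    many_terms = few_terms + [0.2, 0.5, 0.8, 0.9]
    print(by_adjust(few_terms))
    print(by_adjust(many_terms))
```

The first three terms have identical raw p-values in both runs, but their adjusted p-values are larger in the second run because more terms were tested. A term can therefore pass the cutoff with a minimum of 5 and fail it with a minimum of 1.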
 
I used 20000 probesets at 4 different time points in the PAGE analysis, but no enriched GO terms were detected. Why?
A likely reason is that too many GO terms were detected and passed to the multiple-test adjustment, so the adjusted p-values were higher than the cutoff. You can use no adjustment, set a higher cutoff, use GO slim, or set a higher 'Minimum number of mapping entries'.
 
How many datatypes are supported by agriGO?
We currently support 38 species, covering 276 datatypes. Please check the data statistics page for detailed information. We will continue to add more species and datatypes.
 
How does agriGO obtain its data sources?
Raw GO annotation data is generated by agriGO using BLAST, Pfam and InterProScan, or obtained from the B2G-FAR center or from Gene Ontology. Arabidopsis genome data is from TAIR. Rice TIGR genome data is from the Rice Genome Annotation Project. Rice KOME data is from the KOME database. Rice Gramene data is from the Gramene center. Populus genome data is collected from JGI. Soybean and Sorghum genome data is compiled from Phytozome. Grape genome data is compiled from Genoscope. Medicago genome data is from Medicago truncatula sequencing resources. Maize genome data is from MaizeSequence.org. Castor bean genome data is from the Castor Bean Genome Database. Brachypodium distachyon genome data is from Ensembl. Bovine genome data is from the Bovine Genome Database. Silkworm genome data is from SilkDB. M. grisea genome data is from the Magnaporthe grisea Database. Affymetrix CSV files and array sequences are from NetAffx.
 
How often does agriGO update?
Normally we update our database every 3 months, but we will also update agriGO whenever an important new data source becomes available. Improvements and updates to the agriGO tools are made on an irregular basis.
 
Can I check results produced by EasyGO using agriGO?
Sorry, but no. Because we rebuilt the database and redesigned the website organization, analysis results from EasyGO are not supported in agriGO.
 
How can I get agriGO to add a new customized datatype?
Several species, not limited to agricultural organisms, have been added to agriGO at users' request. Users can contact the agriGO administrator by email (adugduzhou@gmail.com) to discuss the details, and we will finish the addition within 24 hours.
 
Could you please explain how you define GO terms for each probe?
Here is the thing: the Affymetrix microarray and soybean genome backgrounds are both annotated with GO using InterProScan+BLAST (from B2G-FAR). Your submitted probes are annotated with GO using the pre-computed probe=>GO dataset, no matter which background you choose. The key factor behind the difference between your analyses is the different GO distributions in the two backgrounds, which may be caused by the number of entities, the annotation process, or the translation of probeset sequences into protein sequences, since GO annotation is normally performed on protein sequences. As there are more entities in the soybean genome background, this may be the reason for the larger number of enriched terms.
 
Given any GO term, is its relative abundance (the ratio of contained probes to total probes) in the Affymetrix array and the soybean genome the same/similar?
No. It should be, but it may not be. In other words, I cannot guarantee this. Here are some reasons: 1. The total number of entities in the two data sets is different. The microarray has fewer, and I do not know whether there is bias in the array design process. 2. The array was designed based on ESTs or flcDNA, whereas the genome, which is used to predict genes and proteins, appeared later. 3. Probeset sequences on the array have to be translated into protein sequences for GO annotation, and since they come from ESTs, this may produce differences. In short, I cannot say the abundance is the same or similar. But if you run a microarray experiment and find an interesting batch of genes, using the genome as the background is reasonable. After all, the purpose of the array is to monitor gene changes, and it does that, doesn't it?
 
Who is using agriGO?
Locations of visitors to this page
 
What is your lab focusing on?

Zhen Su's Lab: We mainly focus on plant molecular systems biology and provide bioinformatics services. Other databases and web services provided by our lab include:
EasyGO (Gene Ontology enrichment analysis tool)
plantsUPS (The Database of Plants Ubiquitin Proteasome System)
PMRD (plant microRNA database)
MtED (An expression database for Medicago).

 
What features were used in agriGO construction?
agriGO was built with MySQL, Apache, Vim, PHP and Python.