Gene Ontology (GO): selected portions of the GO ontologies
GO comes from the Gene Ontology Consortium, and contains the Directed Acyclic Graphs (DAGs) representing the is_a relationships among nodes for the three GO ontologies:
- Biological process,
- Cellular component, and
- Molecular function.
The data were downloaded during October of 2008.
Schema in CLSD
CLSD keeps two tables for each GO ontology, all derived from the GO text distribution file, gene_ontology_edit.obo. The first of each pair is the ontology subsumption DAG (based on the is_a link data within the distribution file). Its table structure is:
| Field name | Type |
|---|---|
| CHILD_ID | VARCHAR |
| CHILD_NAME | VARCHAR |
| PARENT_ID | VARCHAR |
| PARENT_NAME | VARCHAR |
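The second table of each pair is a closure on the DAG: a list of pairs in which the first entry is an is_a descendant of the second. Each entry is also paired with itself, as the GO:0005634 row in the closure output further down shows. The entries are given as IDs rather than as names.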
These "closure" tables can be used to determine whether an entry is a relative (descendant or ancestor) of any other entry within each DAG, thereby enabling or simplifying some kinds of queries. However, the closures are currently transitive only on the is_a relationship, and not on the part_of relationship.
The six tables are:
- BIOLOGICAL_PROCESS_DAG
- BIOLOGICAL_PROCESS_CLOSURE
- CELLULAR_COMPONENT_DAG
- CELLULAR_COMPONENT_CLOSURE
- MOLECULAR_FUNCTION_DAG
- MOLECULAR_FUNCTION_CLOSURE
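As an illustration of how a closure relates to its DAG, here is a minimal Python sketch that derives a closure from (child, parent) is_a pairs and uses it to test whether two terms are relatives. It is not the procedure used to build the CLSD tables, which this page does not describe; the function names `build_closure` and `are_relatives` are hypothetical, and the third edge in the sample data is an assumption made only so that the example has a grandchild.

```python
from collections import defaultdict


def build_closure(edges):
    """Return the reflexive-transitive closure of (child, parent) is_a edges
    as a set of (descendant, ancestor) pairs."""
    parents = defaultdict(set)
    nodes = set()
    for child, parent in edges:
        parents[child].add(parent)
        nodes.update((child, parent))

    closure = set()
    for node in nodes:
        # Walk upward from node, collecting every term reachable via is_a.
        stack = [node]
        seen = {node}
        while stack:
            current = stack.pop()
            closure.add((node, current))  # includes the (node, node) self pair
            for parent in parents[current]:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
    return closure


def are_relatives(closure, a, b):
    """True if a is a descendant or an ancestor of b (or a == b)."""
    return (a, b) in closure or (b, a) in closure


edges = [
    ("GO:0043073", "GO:0005634"),  # germ cell nucleus is_a nucleus (from the DAG table)
    ("GO:0045120", "GO:0005634"),  # pronucleus is_a nucleus (from the DAG table)
    ("GO:0001673", "GO:0043073"),  # ASSUMED edge, for illustration only
]
closure = build_closure(edges)
print(are_relatives(closure, "GO:0005634", "GO:0001673"))
print(are_relatives(closure, "GO:0045120", "GO:0043073"))
```

Siblings such as GO:0045120 and GO:0043073 share an ancestor but are not relatives in this sense, because neither is a descendant of the other.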
Suppose you wish to find all the "is_a children" of the nucleus (GO:0005634). Selecting the rows of CELLULAR_COMPONENT_DAG whose PARENT_ID is GO:0005634 returns:
| CHILD_ID (VARCHAR) | CHILD_NAME (VARCHAR) | PARENT_ID (VARCHAR) | PARENT_NAME (VARCHAR) |
|---|---|---|---|
| GO:0031039 | macronucleus | GO:0005634 | nucleus |
| GO:0031040 | micronucleus | GO:0005634 | nucleus |
| GO:0043073 | germ cell nucleus | GO:0005634 | nucleus |
| GO:0043076 | megasporocyte nucleus | GO:0005634 | nucleus |
| GO:0045120 | pronucleus | GO:0005634 | nucleus |
| GO:0048353 | primary endosperm nucleus | GO:0005634 | nucleus |
| GO:0048555 | generative cell nucleus | GO:0005634 | nucleus |
| GO:0048556 | microsporocyte nucleus | GO:0005634 | nucleus |
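The lookup of direct children can be sketched as follows. The original query text is not preserved on this page, so this is a hedged reconstruction rather than the exact statement: it loads a few of the rows above into an in-memory SQLite database standing in for CLSD, and the helper name `is_a_children` is hypothetical. The last sample row is an assumption, added only to show that rows with a different parent are filtered out.

```python
import sqlite3

# In-memory stand-in for CLSD; the real database and its connection
# details are not described here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CELLULAR_COMPONENT_DAG ("
    "CHILD_ID VARCHAR, CHILD_NAME VARCHAR, "
    "PARENT_ID VARCHAR, PARENT_NAME VARCHAR)"
)
rows = [
    ("GO:0031039", "macronucleus", "GO:0005634", "nucleus"),
    ("GO:0043073", "germ cell nucleus", "GO:0005634", "nucleus"),
    ("GO:0045120", "pronucleus", "GO:0005634", "nucleus"),
    # ASSUMED row (ID pairing and name), for illustration only.
    ("GO:0001673", "male germ cell nucleus", "GO:0043073", "germ cell nucleus"),
]
conn.executemany("INSERT INTO CELLULAR_COMPONENT_DAG VALUES (?, ?, ?, ?)", rows)


def is_a_children(conn, parent_id):
    """Return (CHILD_ID, CHILD_NAME) for the direct is_a children of parent_id."""
    cur = conn.execute(
        "SELECT CHILD_ID, CHILD_NAME FROM CELLULAR_COMPONENT_DAG "
        "WHERE PARENT_ID = ? ORDER BY CHILD_ID",
        (parent_id,),
    )
    return cur.fetchall()


children = is_a_children(conn, "GO:0005634")
for child_id, child_name in children:
    print(child_id, child_name)
```

Passing the ID as a bound parameter (`?`) rather than formatting it into the SQL string keeps the helper reusable for any term.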
However, if you wish to find all the cellular locations that are "is_a" descendants of the nucleus (GO:0005634), select the rows of CELLULAR_COMPONENT_CLOSURE whose PARENT_ID is GO:0005634 instead. This returns:
| CHILD_ID (VARCHAR) | PARENT_ID (VARCHAR) |
|---|---|
| GO:0001673 | GO:0005634 |
| GO:0001674 | GO:0005634 |
| GO:0001939 | GO:0005634 |
| GO:0001940 | GO:0005634 |
| GO:0005634 | GO:0005634 |
| GO:0031039 | GO:0005634 |
| GO:0031040 | GO:0005634 |
| GO:0042585 | GO:0005634 |
| GO:0043073 | GO:0005634 |
| GO:0043076 | GO:0005634 |
| GO:0043078 | GO:0005634 |
| GO:0043079 | GO:0005634 |
| GO:0043082 | GO:0005634 |
| GO:0045120 | GO:0005634 |
| GO:0048353 | GO:0005634 |
| GO:0048555 | GO:0005634 |
| GO:0048556 | GO:0005634 |
Note that this list would contain many more descendants if "part_of" transitivity were included.
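A descendant lookup against the closure table can be sketched in the same way. As before, this is a reconstruction under assumptions, not the original query: an in-memory SQLite database stands in for CLSD, the helper name `is_a_descendants` and its `include_self` flag are hypothetical, and the final sample row is assumed by analogy with the GO:0005634 self pair shown above.

```python
import sqlite3

# In-memory stand-in for CLSD, loaded with a few of the closure rows above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CELLULAR_COMPONENT_CLOSURE (CHILD_ID VARCHAR, PARENT_ID VARCHAR)"
)
rows = [
    ("GO:0001673", "GO:0005634"),
    ("GO:0005634", "GO:0005634"),  # self pair, as listed in the output above
    ("GO:0031039", "GO:0005634"),
    ("GO:0043073", "GO:0005634"),
    ("GO:0031039", "GO:0031039"),  # ASSUMED self pair, for illustration only
]
conn.executemany("INSERT INTO CELLULAR_COMPONENT_CLOSURE VALUES (?, ?)", rows)


def is_a_descendants(conn, ancestor_id, include_self=True):
    """Return the IDs of all is_a descendants of ancestor_id.

    The closure pairs each term with itself, so include_self=False
    filters that row out.
    """
    sql = "SELECT CHILD_ID FROM CELLULAR_COMPONENT_CLOSURE WHERE PARENT_ID = ?"
    if not include_self:
        sql += " AND CHILD_ID <> PARENT_ID"
    sql += " ORDER BY CHILD_ID"
    return [row[0] for row in conn.execute(sql, (ancestor_id,))]


with_self = is_a_descendants(conn, "GO:0005634")
without_self = is_a_descendants(conn, "GO:0005634", include_self=False)
print(with_self)
print(without_self)
```

Because the closure holds only IDs, recovering names would require a further lookup, for example against the CHILD_ID and CHILD_NAME columns of the corresponding DAG table.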




