TRRD data: selections from the TRRD website data
The TRRD data comes from the Institute of Cytology and Genetics and consists of an extract (circa March 2009) of selected data from the database on Transcription Regulatory Regions of Eukaryotic Genes (TRRD). This database contains data on:
- each gene containing a binding region or generating a transcription factor,
- each transcription factor regulating a gene, and
- each binding region (site) bound by a transcription factor.
Note that a gene may have multiple binding regions, a binding region may be bound by multiple transcription factors, and a transcription factor may bind multiple binding sites.
IDs take the following forms:
- Gene IDs take the form "Axxxxx"; genes also have accession IDs of the form "Species:Name",
- Transcription factor IDs take the form "Fxxxxx", and
- Binding region IDs take the form "Sxxxxx".
Fields containing these IDs are usually either primary or foreign key fields in the tables in which they appear.
In this deployment of the TRRD data, there are tables for gene data and transcription factor data, but NOT for binding region data.
To express the multiplicity of interconnections, the TRRD data for genes contains records that link binding regions and genes with transcription factors. Locally, the gene data has been separated into 2 tables, TRRD.GENES and TRRD.GENES_DR. The latter consists of the DR records from the downloaded TRRD genes file and provides cross-linking identifiers to many gene ID systems, particularly SWISSPROT and NCBI Entrez GIDs.
See the TRRD web site or the summary paper (http://www.bionet.nsc.ru/meeting/bgrs_proceedings/papers/2006/BGRS_2006_V1_017.pdf) for more details.
Linkages within TRRD are made through the IDs assigned to each gene, transcription factor, and binding region, in the forms listed above.
This data is also available from the gene-regulation.com web site via Web forms.
Within CLSD this TRRD data can, of course, be accessed via SQL commands that can merge it with other data within CLSD.
TRRD Gene data
There are 2 tables for TRRD gene data. All data is of type VARCHAR. The main table is TRRD.GENES with the following layout:
| Field name | VARCHAR Length |
|---|---|
| TRRD_GENE_ID | 20 |
| TRRD_ACC | 20 |
| TRRD_SPECIES | 60 |
| TRANSFAC_GENE_ID | 10 |
| TRDD_GENE_NAME | 100 |
These fields were selected from the gene records provided through the Web site.
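As a minimal sketch, a single gene record can be looked up by its accession ID. This assumes that TRRD_ACC in TRRD.GENES holds the "Species:Name" accession (as it does in TRRD.GENES_DR) and that the database accepts standard SQL; the accession 'Hs:ADH3' is the one used in the examples further down this page.

```sql
-- Look up the TRRD.GENES record for one gene by its "Species:Name" accession.
-- Assumption: TRRD_ACC stores accessions such as 'Hs:ADH3'.
select TRRD_GENE_ID, TRRD_ACC, TRRD_SPECIES, TRANSFAC_GENE_ID
from trrd.genes
where trrd_acc = 'Hs:ADH3'
```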
The table TRRD.GENES_DR contains entries for the "DR" records associated with a specific gene:
| Field name | VARCHAR Length |
|---|---|
| TRRD_ACC | 20 |
| TRRD_GENE_ID | 20 |
| DR_SCHEME | 15 |
| DR_SCHEME_PART1 | 30 |
| DR_SCHEME_PART2 | 100 |
Note that this table includes GO annotations for each gene. Annotations from each of the 3 main GO categories (molecular function, biological process, and cellular component) are included, and there may be multiple annotations for a single GO component.
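The GO annotations for one gene could be listed with a query along these lines. It is a sketch that assumes GO records are stored with DR_SCHEME = 'GO', with the GO identifier in DR_SCHEME_PART1 and the category and description in DR_SCHEME_PART2, as in the sample output shown further down this page.

```sql
-- List the GO annotations held in the DR records for one gene.
-- Assumption: GO entries carry DR_SCHEME = 'GO'.
select TRRD_ACC, DR_SCHEME_PART1, DR_SCHEME_PART2
from trrd.genes_dr
where trrd_acc = 'Hs:ADH3'
  and dr_scheme = 'GO'
order by DR_SCHEME_PART1
```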
TRRD Factor data
There are 7 tables for the TRRD factor data. The main table is TRRD.FACTORS with the following layout:
| Field name | VARCHAR Length |
|---|---|
| TRRD_FACTOR_ID | 20 |
| TRRD_GENE_ID | 20 |
| TRRD_SITE_ACC | 20 |
| TRRD_FACTOR_NAME | 100 |
| TRRD_FACTOR_SUBUNIT_NAME | 80 |
| TRRD_FACTOR_NAME_SYNONYM | 80 |
| TRRD_FACTOR_SPECIES | 80 |
| TRANSFAC_FACTOR_ID | 40 |
| TRRD_FACTOR_SOURCE | 40 |
Here is an example showing how to list all the FACTORS entries for gene Hs:ADH3:
select TRRD_FACTOR_ID, TRRD_GENE_ID, TRRD_SITE_ACC, TRRD_FACTOR_NAME from trrd.factors a where a.trrd_gene_id = 'Hs:ADH3'
which gets the following result:
| TRRD_FACTOR_ID (VARCHAR) | TRRD_GENE_ID (VARCHAR) | TRRD_SITE_ACC (VARCHAR) | TRRD_FACTOR_NAME (VARCHAR) |
|---|---|---|---|
| F1945.1 | Hs:ADH3 | S1945 | RARbeta; retinoic acid receptor beta |
| F1946.1 | Hs:ADH3 | S1946 | C/EBPalpha; CCAAT/enhancer binding protein alpha |
| F1947.1 | Hs:ADH3 | S1947 | C/EBPalpha; CCAAT/enhancer binding protein alpha |
| F1950.1 | Hs:ADH3 | S1950 | TFIID; |
showing 4 transcription factor entries that bind to Hs:ADH3 at 4 unique sites. Note that C/EBPalpha is listed for two of these sites.
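Because the same factor name can appear at more than one site, it may be useful to count sites per factor name. The following is a sketch in standard SQL using only the fields listed above; the alias SITE_COUNT is introduced here for illustration. For the four rows shown, it would group the two C/EBPalpha rows into a single line.

```sql
-- Count the distinct binding sites recorded per factor name for one gene.
-- SITE_COUNT is an illustrative column alias, not a field in the table.
select TRRD_FACTOR_NAME, count(distinct TRRD_SITE_ACC) as SITE_COUNT
from trrd.factors
where trrd_gene_id = 'Hs:ADH3'
group by TRRD_FACTOR_NAME
```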
The next example links the record for F1945.1 to other ID systems, by joining the FACTORS table with the GENES_DR table:
select a.TRRD_FACTOR_ID, a.TRRD_GENE_ID, a.TRRD_SITE_ACC, b.TRRD_ACC, b.TRRD_GENE_ID, DR_SCHEME, DR_SCHEME_PART1, DR_SCHEME_PART2 from trrd.factors a join trrd.genes_DR b on a.trrd_gene_id = b.trrd_acc where a.trrd_gene_id = 'Hs:ADH3' and a.trrd_factor_id = 'F1945.1'
The result is:
| TRRD_FACTOR_ID | TRRD_GENE_ID | TRRD_SITE_ACC | TRRD_ACC | TRRD_GENE_ID | DR_SCHEME | DR_SCHEME_PART1 | DR_SCHEME_PART2 |
|---|---|---|---|---|---|---|---|
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | SWISS-PROT | ADHG_HUMAN | P00326 |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | CleanEx | HGNC:251 | ADH1C |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | Ensembl | ENSG00000196616 | null |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | GenAtlas | ADH1C | null |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | GeneCards | ADH1C | null |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | GeneLynx | ADH1C | null |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | GO | GO:0005737 | Cellular component: cytoplasm |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | GO | GO:0004024 | Molecular function: alcohol dehydrogenase activity, zinc-dependent |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | GO | GO:0006066 | Biological process: alcohol metabolism |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | HGNC | HGNC:251 | ADH1C |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | HOVERGEN | P00326 | null |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | MIM | 103730 | null |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | SOURCE | ADH1C | Hs |
| F1945.1 | Hs:ADH3 | S1945 | Hs:ADH3 | A00374 | EntrezGene | ADH1C | 126 |
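The join returns one row per DR record, so the output can be narrowed to a single ID system by filtering on DR_SCHEME. The sketch below assumes the scheme label is stored exactly as 'SWISS-PROT', as in the result above, and drops the factor ID filter so that every factor entry for the gene is paired with that cross-reference.

```sql
-- Pair each factor entry for one gene with the gene's SWISS-PROT cross-reference.
-- Assumption: the scheme label is spelled 'SWISS-PROT' in DR_SCHEME.
select a.TRRD_FACTOR_ID, a.TRRD_FACTOR_NAME, b.DR_SCHEME_PART1, b.DR_SCHEME_PART2
from trrd.factors a
join trrd.genes_dr b on a.trrd_gene_id = b.trrd_acc
where a.trrd_gene_id = 'Hs:ADH3'
  and b.dr_scheme = 'SWISS-PROT'
```

The same pattern applies to the other schemes listed in the result, such as 'EntrezGene' or 'HGNC', by changing the DR_SCHEME value.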




