Expressed sequence tags (ESTs) from Caenorhabditis elegans
An expressed sequence tag (EST) is a single-pass sequence read taken from a randomly selected cDNA clone. ESTs are used to investigate the diversity of genes expressed by an organism, tissue or cell. By looking only at expressed sequences we can:
- avoid the expense of complete genome sequencing (introns and intergenic DNA are not sequenced)
- let the organism, tissue or cell show us which genes are "important", as judged by their expression levels
- assess differential gene expression by comparing stage-specific or tissue-specific datasets
- confirm splicing and coding predictions made from genomic DNA sequences
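As a rough illustration of how stage-specific or tissue-specific EST datasets might be compared, the sketch below counts ESTs per gene in two libraries and reports each gene's share of its library. It assumes every EST has already been assigned to a gene; the gene names, library contents and function names are hypothetical and are not taken from any real dataset. A real analysis would also need a statistical test, because EST counts for most genes are small.

```python
from collections import Counter

def est_frequencies(gene_ids):
    """Return {gene: fraction of the library's ESTs assigned to that gene}."""
    counts = Counter(gene_ids)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}

def compare_libraries(lib_a, lib_b):
    """Return {gene: (freq_a, freq_b)} for every gene seen in either library."""
    freq_a = est_frequencies(lib_a)
    freq_b = est_frequencies(lib_b)
    genes = sorted(set(freq_a) | set(freq_b))
    return {g: (freq_a.get(g, 0.0), freq_b.get(g, 0.0)) for g in genes}

# Hypothetical libraries: one gene identifier per sequenced EST.
embryo = ["geneA", "geneA", "geneB", "geneC"]
adult = ["geneA", "geneC", "geneC", "geneC"]

result = compare_libraries(embryo, adult)
for gene, (f_embryo, f_adult) in result.items():
    print(f"{gene}\tembryo={f_embryo:.2f}\tadult={f_adult:.2f}")
```

Dividing by library size matters because EST libraries differ in depth, so raw counts from a large library and a small one are not directly comparable.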
For C. elegans there are three sources of ESTs. Two small datasets (~1300 sequences) were produced early in the genome project, one by TIGR and the other by Washington University. However, the vast majority of ESTs have been produced by the lab of Yuji Kohara in Japan. As of 2007, Prof. Kohara's lab has deposited over 300,000 ESTs in public databases.
The "ORFeome Project" has also attempted to define the transcriptome of C. elegans by PCR-amplifying from cDNA, using primers designed to span genes predicted in the genome sequence (called "ORFs", or open reading frames).
A third sequence-based method that has been used to define and analyse the C. elegans transcriptome is serial analysis of gene expression (SAGE). Expression levels of predicted genes have also been defined using microarrays.