avoid the expense of complete genome sequencing (no introns or intergenic DNA are sequenced)
allow the organism/tissue/cell to instruct us as to what is "important" in terms of expression levels of genes
permit assessment of differential gene expression by comparing stage- or tissue-specific datasets
confirm splicing and coding predictions from genomic DNA sequences
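
As a rough illustration of how stage- or tissue-specific EST datasets might be compared, the sketch below counts the ESTs assigned to each gene in two libraries and scales the counts to ESTs per 10,000 sequenced, so that libraries of different sizes can be set side by side. The gene names, library contents, and function names are all hypothetical, and it assumes each EST has already been assigned to a gene. This is not the method used by any particular EST project.

```python
from collections import Counter


def est_frequencies(est_gene_hits):
    """Return ESTs per 10,000 sequenced for each gene in one library.

    est_gene_hits is a list with one gene name per EST (hypothetical input;
    assigning ESTs to genes is assumed to have been done already).
    """
    counts = Counter(est_gene_hits)
    total = sum(counts.values())
    return {gene: 10000 * n / total for gene, n in counts.items()}


def compare_libraries(lib_a, lib_b):
    """Pair up the normalised frequencies of every gene seen in either library."""
    freq_a = est_frequencies(lib_a)
    freq_b = est_frequencies(lib_b)
    genes = sorted(set(freq_a) | set(freq_b))
    # Genes absent from a library get a frequency of 0.0 there.
    return {g: (freq_a.get(g, 0.0), freq_b.get(g, 0.0)) for g in genes}


# Toy, made-up libraries: one gene name per EST.
embryo = ["geneA"] * 6 + ["geneB"] * 3 + ["geneC"] * 1
adult = ["geneA"] * 2 + ["geneB"] * 3 + ["geneD"] * 5

for gene, (e, a) in compare_libraries(embryo, adult).items():
    print(f"{gene}\tembryo={e:.0f}\tadult={a:.0f}")
```

With real data the counts for most genes are small, so apparent differences would need a statistical test before being interpreted as differential expression.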
For C. elegans there are three sources of ESTs. Two small (~1300 sequence) datasets were produced early in the genome project, one by TIGR and the other by Washington University. However, the vast majority of ESTs have been produced by the lab of Yuji Kohara in Japan. To date (2007) over 300,000 ESTs have been deposited by Prof. Kohara's lab in public databases.
The "Orfeome Project" has also attempted to define the transcriptome of C. elegans by PCR-amplifying from cDNA using primers predicted to span genes in the genome sequence (called "ORF" or open reading frame sequences).
A third sequence-based method that has been used to define and analyse the C. elegans transcriptome is serial analysis of gene expression (SAGE). Expression levels of predicted genes have also been defined using microarrays.
These pages were written by Mark Blaxter and last updated in early 2007. Contact the www.nematodes.org webmaster if there are problems.