BaNG - Blaxter Nematode and Neglected Genomics
  Caenorhabditis elegans
     Introduction to the biology of a model nematode
       Mark Blaxter at the Institute of Evolutionary Biology, University of Edinburgh
 

a C. elegans adult hermaphrodite

An introduction to the genome of the nematode Caenorhabditis elegans

Mark Blaxter, Institute of Evolutionary Biology, University of Edinburgh, UK

The C. elegans genome is spread across six chromosomes of approximately equal size (five autosomes and one X chromosome). It has been completely sequenced. The community genome database WormBase has full information...

The genome size is 100.2 megabases (Mb).

Humans have a genome size of 3,000 Mb.
The baker's yeast (S. cerevisiae) genome is 12 Mb.
The genomes of many other nematodes are in the same range. Brugia malayi, a filarial nematode parasite of humans, has a genome of ~95 Mb. However, Ascaris suum, the pig roundworm, has a larger germline genome (>500 Mb), which undergoes somatic diminution (the elimination of part of the germline DNA from somatic cells).
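To put these sizes side by side, the short Python sketch below expresses each genome as a multiple of the C. elegans genome. The sizes are the ones quoted above; the dictionary and function names are illustrative only.

```python
# Genome sizes in megabases (Mb), as quoted in the text above.
GENOME_SIZES_MB = {
    "Caenorhabditis elegans": 100.2,
    "Homo sapiens": 3000.0,
    "Saccharomyces cerevisiae": 12.0,
    "Brugia malayi": 95.0,  # approximate (~95 Mb)
}


def relative_size(species: str, reference: str = "Caenorhabditis elegans") -> float:
    """Return the genome size of `species` as a multiple of `reference`."""
    return GENOME_SIZES_MB[species] / GENOME_SIZES_MB[reference]


for name in GENOME_SIZES_MB:
    print(f"{name}: {relative_size(name):.2f}x the C. elegans genome")
```

On these figures the human genome is roughly thirty times the size of the C. elegans genome, and the yeast genome is roughly an eighth of it.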

The GC content is about 36% (an AT content of about 64%).
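A base composition figure like this is simply the fraction of A and T (or G and C) bases in the sequence. The following is a minimal sketch of that calculation, assuming the sequence is held as a plain string. The function name and the toy sequence are hypothetical, and ambiguous bases such as N are ignored.

```python
def base_composition(sequence: str) -> dict:
    """Return AT and GC fractions, ignoring ambiguous bases such as N."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return {"AT": at / total, "GC": gc / total}


# A toy sequence for illustration, not real C. elegans DNA.
toy = "ATATGCNAAT"
composition = base_composition(toy)
print(f"AT: {composition['AT']:.1%}, GC: {composition['GC']:.1%}")
```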

How was the genome sequenced?

First, a physical map of the genome was constructed. The map is based mostly on 17,000 cosmid clones of genomic DNA (insert size 35-40 kb). These clones were "fingerprinted" using restriction enzymes, and the fingerprints were used to order the clones into overlapping contiguous sets, or contigs. The cosmid contigs were supplemented by a set of 3,000 yeast artificial chromosome (YAC) clones (insert sizes of 100 kb and above). Because the yeast host tolerates sequences that E. coli will not, the YAC clones can "bridge" gaps between cosmid contigs. With these two resources, contigs covering >95% of all the chromosomes were assembled. The genome sequence was then generated clone by clone from these cosmid and YAC clones, supplemented by directed cloning of 'difficult' regions.
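The idea behind fingerprint-based contig building can be sketched in a few lines of Python. This is a toy illustration and not the method used by the mapping project: it assumes each clone's fingerprint is a set of restriction fragment sizes, treats two clones as overlapping when they share at least half of the smaller fingerprint (an arbitrary threshold), and matches fragment sizes exactly, whereas real fingerprints are compared within a measurement tolerance. All clone names and fragment sizes are hypothetical.

```python
from itertools import combinations


def shared_fraction(bands_a: set, bands_b: set) -> float:
    """Fraction of the smaller fingerprint's bands also present in the other."""
    if not bands_a or not bands_b:
        return 0.0
    return len(bands_a & bands_b) / min(len(bands_a), len(bands_b))


def build_contigs(fingerprints: dict, threshold: float = 0.5) -> list:
    """Group clones into contigs by linking clones whose fingerprints overlap."""
    parent = {clone: clone for clone in fingerprints}

    def find(clone):
        # Follow parent links to the representative clone of the group.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    for a, b in combinations(fingerprints, 2):
        if shared_fraction(fingerprints[a], fingerprints[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for clone in fingerprints:
        groups.setdefault(find(clone), []).append(clone)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical clones; the numbers are restriction fragment sizes in bp.
cosmids = {
    "cosmid_A": {1200, 2500, 3100, 4800},
    "cosmid_B": {900, 3100, 4800, 5200},
    "cosmid_C": {7000, 7600, 8100},
}
with_yac = dict(cosmids, yac_1={900, 5200, 6400, 7000, 7600})

print(build_contigs(cosmids))   # cosmids alone
print(build_contigs(with_yac))  # cosmids plus one bridging YAC
```

In this toy data the cosmids alone fall into two contigs, and adding the YAC joins them into one, which mirrors the bridging role described above. The sketch only groups clones; it does not order them within a contig.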

The genome is completely sequenced.

The sequence was essentially completed around Christmas 1998, making it the first animal genome to be completed. Sequencing started with the cosmid clones and then moved on to the bridging YACs. From the genome sequence, protein-coding and RNA genes have been identified, and novel features of gene organisation and chromosomal structure have been discovered.

Gene identification.

There are about 20,000 protein-coding genes. This estimate is based on the correspondence between genomic DNA sequence and cDNA sequences, and on the prediction of coding genes from the genomic sequence. As an aid to gene identification, cDNA copies of mRNAs are being "tag sequenced". Over 200,000 cDNAs have been tag sequenced and >300,000 ESTs deposited. These "expressed sequence tags", or ESTs, offer a set of snapshots of gene expression in the nematode, and have identified around half of the organism's genes. The cDNA data are used in predicting genes from the genome sequence, along with database searches for similarities between C. elegans genes and those of other organisms such as humans.
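Combining the gene count with the genome size quoted earlier gives a rough sense of gene density. The sketch below is back-of-the-envelope arithmetic on the rounded figures from this page, not a measured statistic; the function names are illustrative.

```python
GENOME_SIZE_MB = 100.2          # genome size quoted on this page
PROTEIN_CODING_GENES = 20_000   # approximate gene count quoted on this page


def kb_per_gene(genome_mb: float, genes: int) -> float:
    """Average amount of genome (in kb) per protein-coding gene."""
    return genome_mb * 1000 / genes


def genes_per_mb(genome_mb: float, genes: int) -> float:
    """Average number of protein-coding genes per megabase."""
    return genes / genome_mb


print(f"{kb_per_gene(GENOME_SIZE_MB, PROTEIN_CODING_GENES):.2f} kb per gene")
print(f"{genes_per_mb(GENOME_SIZE_MB, PROTEIN_CODING_GENES):.0f} genes per Mb")
```

On these figures there is, on average, about one protein-coding gene for every 5 kb of genome.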


These pages were written by Mark Blaxter and colleagues.