Coding the Brugia Genome Project ESTs


See here for a key to decoding Brugia malayi EST names.


Rules for coding the ESTs for submission to dbEST
From Mark Blaxter, following the 2nd FGN meeting at Smith College, MA, 14/5/95

Please see this page for very important information about how you must designate the libraries in your EST submissions.


First off, each CLONE has a unique identifier (the dbEST "CLONE_ID"), built from the three parts below.

1 Investigator ID

A two-character code for the originating lab.
This is the lab doing the sequencing, not the lab that made the library.

The current list of designators is:
AS = Al Scott
SW = Steve Williams
RR = Reda Ramzy
MB = Mark Blaxter
BS = Barton Slatko
KJ = Kunthala Jayaraman
TE = Tom Egwang
BG = Bill Gregory
TN = Tom Nutman
GW = Gary Weil

2 Library Code

A three-character library code indicating the stage from which the library was made.

The current list of designators is:

2SL = mosquito-derived second stage larval library (spliced leader) = Al Scott's
3IS = mosquito infective L3 library (spliced leader) = Al Scott's
3IC = mosquito infective L3 library (conventional) = Steve Williams'
3D6 = L3 library from day 6 post-infection (conventional) = Steve Williams'
3D9 = L3 library from day 9 post-infection (conventional) = Steve Williams'
4SL = L4 spliced leader library (spliced leader) = Al Scott's
AMC = adult male (conventional) = Steve Williams'
AFC = adult female (conventional) = Steve Williams'
MFC = microfilaria (conventional) = Steve Williams'

 

3 A four or five character clone code

This should be unique to the clone, and is normally made up of one alphabetic character followed by three numeric characters.
The alphabetic character is intended to indicate distinct phases of each lab's sequencing effort. For example, the Scott lab used A001-A100 for the first set of cDNAs. After these were hybridised to gridded clones, a non-hybridising set, B001-B800, was chosen for sequencing...

Currently, the Blaxter/Scott labs have used the following designators for their clone sets:
A series = a first set of 37 clones from the L3 SL library
B series = a second set of 230 clones from the L3 SL library
C series = a first set of about 350 clones from the L4 SL library

The Williams and Ramzy labs use A001-A999, then A1000, A1001, etc.

The Blaxter lab is now using microtitre plate addresses for all clones: E1A08 means plate E1, row A, clone 8.
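
As a rough illustration, a plate address of this form could be split into its parts as follows. This is only a sketch: the function name is hypothetical, and it assumes a plate name of one letter plus one digit and a standard 96-well layout (rows A-H, positions 01-12), neither of which is stated above.

```python
import re

# Assumed layout (not specified in the rules): plate = one letter + one digit,
# row = A-H, position = two digits in the range 01-12.
PLATE_ADDRESS = re.compile(r"^([A-Z]\d)([A-H])(\d{2})$")


def parse_plate_address(code):
    """Split a clone code such as 'E1A08' into (plate, row, position)."""
    match = PLATE_ADDRESS.match(code)
    if match is None:
        raise ValueError("not a plate address: %r" % code)
    plate, row, position = match.groups()
    number = int(position)
    if not 1 <= number <= 12:
        raise ValueError("position out of range for a 96-well plate: %r" % code)
    return plate, row, number


print(parse_plate_address("E1A08"))
```

For "E1A08" this returns plate "E1", row "A" and clone 8, matching the reading given above.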


Then, for each EST submitted, the clone code is supplemented with a primer or direction code, e.g.

SL for clones sequenced with the SL primer
5' for clones sequenced with the SK primer from SW's lambda UniZap libraries


RESULT:

The result should be codes that are intelligible to the whole community. For example:

B.malayi EST # AS3ISA001SL
clone 001 from Al Scott's A series, from an infective L3 SL library, sequenced with SL.

B.malayi EST # SWAMCB345T3
clone 345 from Steve Williams' B series, from an adult male conventional library, sequenced with T3.
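
To make the scheme concrete, here is a minimal sketch of how such a name could be split back into its parts. It is not an official tool: the function name and the regular expression are illustrative assumptions, the clone code is assumed to be either a letter plus three or four digits or a plate address like E1A08, and only the primer codes that appear on this page (SL, T3, 5') are accepted.

```python
import re

# Investigator and library designators copied from the lists above.
LABS = {
    "AS": "Al Scott", "SW": "Steve Williams", "RR": "Reda Ramzy",
    "MB": "Mark Blaxter", "BS": "Barton Slatko", "KJ": "Kunthala Jayaraman",
    "TE": "Tom Egwang", "BG": "Bill Gregory", "TN": "Tom Nutman",
    "GW": "Gary Weil",
}
LIBRARIES = {"2SL", "3IS", "3IC", "3D6", "3D9", "4SL", "AMC", "AFC", "MFC"}

# Assumption: the primer codes are limited to those mentioned on this page.
PRIMERS = ("SL", "T3", "5'")

# lab (2 chars) + library (3 chars) + clone code + primer code.
# Assumed clone forms: letter + 3-4 digits (A001, A1000) or a plate address (E1A08).
EST_NAME = re.compile(
    r"^(?P<lab>[A-Z]{2})"
    r"(?P<library>[A-Z0-9]{3})"
    r"(?P<clone>[A-Z]\d{3,4}|[A-Z]\d[A-H]\d{2})"
    r"(?P<primer>" + "|".join(re.escape(p) for p in PRIMERS) + r")$"
)


def parse_est_name(name):
    """Split an EST name such as 'AS3ISA001SL' into its component codes."""
    match = EST_NAME.match(name)
    if match is None:
        raise ValueError("unrecognised EST name: %r" % name)
    parts = match.groupdict()
    if parts["lab"] not in LABS:
        raise ValueError("unknown investigator ID: %r" % parts["lab"])
    if parts["library"] not in LIBRARIES:
        raise ValueError("unknown library code: %r" % parts["library"])
    parts["investigator"] = LABS[parts["lab"]]
    return parts


for est in ("AS3ISA001SL", "SWAMCB345T3"):
    print(est, parse_est_name(est))
```

Restricting the primer codes to a known list is what lets a four-digit clone code such as A1000 be told apart from a three-digit one; a parser that accepted arbitrary primer codes would need some other way to resolve that ambiguity.
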

Please email me (Mark.Blaxter@ed.ac.uk) if you have any comments or queries.