Filarial Genome Project

Report of the coordinating network meeting

held at Smith College, Northampton, USA
May 10-11 1996

Participants
Kunthala Jayaraman, Madras
Tom Egwang, Kampala
Reda Ramzy, Cairo
Ibrahim Kamal, Cairo
Steve Williams, Northampton
Mark Blaxter, Edinburgh
Al Scott, Baltimore

WHO Observer
Boris Dobrokhotov, WHO

Scientific Advisers
Tom Nutman, NIH
Ted Bianco, Liverpool
Sara Lustigman, New York


The meeting was called to


Task 1: Reviewing Progress To Date


The goals of the first year of the Filarial Genome Initiative were to
1.1 · initiate global cooperation in filarial genome research
1.2 · construct and distribute high quality cDNA libraries to cooperating labs
1.3 · obtain 3000 expressed sequence tag sequences from these libraries
1.4 · provide training in sequencing and sequence analysis for the endemic partner labs and set them up with the necessary equipment

1.1 The email network "filarial-genome" was initiated and now has 160 subscribers worldwide. A World Wide Web site, "Filarial Genome Network" or FilGenNet, was constructed and is now not only the home site of the Filarial Initiative but also a coordinating centre for the WHO parasite genome projects as a whole.

1.2 Five high quality cDNA libraries (from microfilaria, infective L3s [two libraries], L4 and adult male nematodes) were constructed. This completes the initial goals of the project in this area. In addition, libraries from Onchocerca and Wuchereria were constructed using funding from other sources. Over 70 aliquots of libraries had been distributed to labs worldwide by May 1996.

1.3 The expressed sequence tag (EST) sequencing project has been very successful. At the end of the year (end Jan 1996) there were 2500 Brugia ESTs in GenBank/dbEST. They were obtained from the different libraries, as shown in Table 1, through a coordinated effort linking endemic and developed country labs.

A review of the EST dataset was presented. Initial analysis suggests that the 2500 sequences represent 1000-1300 different genes, or about 10% of the total gene content of Brugia. Importantly, even in the heavily sampled L3 infective library (1200 clones sequenced), about 45% of all clones sequenced still identify new genes. The cost was estimated at between $7 and $9 per clone, or $0.02 per base. Over 70 clone requests have been dealt with by Edinburgh and Smith College. Many interesting genes have been identified, and over 100 clones have been sent out to both genome initiative and other labs for further characterisation. Significantly, many requests have come from labs outside the filarial field, indicating that the data are reaching a wide audience and that the Brugia genome initiative is having an impact on wide areas of biology. Amongst the clones sequenced are:
· copies of nearly all the previously sequenced Brugia genes
· genes encoding 80% of the ribosomal proteins of Brugia
· many clones of immunological interest, such as macrophage migration inhibition factor, natural killer cell enhancing factor and vespid allergen homologues
· many clones of nematological interest such as collagens
· many clones which identify Caenorhabditis elegans homologues
· many clones with exciting or intriguing database similarities such as LIM domain, MAD box and protein kinase domains.
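
The headline figures in the review of the EST dataset imply a few derived quantities that are easy to check. The Python sketch below works them out. The function names are illustrative only, and the results are no more precise than the rounded estimates quoted in the report.

```python
# Rough consistency check on the EST figures quoted in section 1.3.
# All inputs are the rounded estimates from the report; the helper
# names are illustrative and not part of any project software.


def implied_total_genes(distinct_genes, fraction_of_gene_content):
    """Total gene count implied if `distinct_genes` is the stated fraction."""
    return distinct_genes / fraction_of_gene_content


def implied_bases_per_clone(cost_per_clone, cost_per_base):
    """Read length (bases) implied by the per-clone and per-base costs."""
    return cost_per_clone / cost_per_base


def mean_ests_per_gene(ests, distinct_genes):
    """Average redundancy: how many ESTs were obtained per distinct gene."""
    return ests / distinct_genes


if __name__ == "__main__":
    for genes in (1000, 1300):
        total = implied_total_genes(genes, 0.10)
        redundancy = mean_ests_per_gene(2500, genes)
        print(f"{genes} genes seen: ~{total:.0f} genes in total, "
              f"{redundancy:.1f} ESTs per gene")
    for cost in (7.0, 9.0):
        bases = implied_bases_per_clone(cost, 0.02)
        print(f"${cost:.0f} per clone: ~{bases:.0f} bases per clone")
```

On these figures the 10% estimate corresponds to roughly 10,000-13,000 genes in Brugia, and the per-clone and per-base costs agree if a typical read is about 350-450 bases.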

Table 1: EST submissions to May 1996

                                    Laboratories
Stage         Library               Ramzy (Egypt) &  Egwang (Uganda),  Jayaraman (India),  Total
                                    Williams (US)    Williams (US) &   Scott (US) &
                                                     Slatko (US)       Blaxter (UK)

microfilaria  conventional library  -                331               -                   331
L3 infective  SL library            -                -                 280                 280
L3 infective  conventional library  1180             -                 -                   1180
L4            SL library            -                -                 280                 280
adult male    conventional library  544              -                 -                   544

Totals                              1724             331               560
Overall Total                                                                              2615


1.4 A successful intensive training course was held at Smith College in May 1995 and the endemic country labs were supplied with sequencing rigs and associated equipment. All the endemic labs were up and sequencing ESTs within the year. A fruitful exchange between Kampala and Madras has been established.



Task 2: Goals for the year 1996


Goals for the year 1996-1997 were discussed:
2.1 · cDNA library construction
2.2 · EST sequencing
2.3 · Genomic library construction and screening
2.4 · Gridding of libraries
2.5 · Training Workshop
2.6 · The genome database and data access

2.1 It was agreed that three additional cDNA libraries should be constructed and screened by EST sequencing. The goal is to get representative sequencing on as many of the lifecycle stages as is possible.

Table 2: New cDNA Libraries

Lifecycle Stage   Type of Library       Responsible Laboratory

L2                SL library            Scott (US)
L3-L4 transition  conventional library  Blaxter (UK)
Adult Female      conventional library  Williams (US)


Al Scott and Steve Williams announced the availability of high quality cDNA libraries from Wuchereria bancrofti and Onchocerca volvulus. While not funded by the WHO project, these libraries were inspired by its success, and will be advertised through the FilGenNet system.

2.2 EST sequencing is continuing rapidly. Participating laboratories reported an additional 600 ESTs sequenced but not yet submitted, and the total output of the labs is approaching 200 per week. Sequencing goals were agreed to equalise the number of clones sequenced per lifecycle stage. Two points were made regarding data acquisition:
(1) No reduction in discovery of new genes had been noted by the Williams lab in their sequencing of over 1500 clones from the L3 conventional library. This indicated that the library is of exceptional quality.
(2) The L4 SL cDNA library, however, is less diverse, and significant cross-screening will have to be performed.
Sequencing will continue as funds allow to reach a total of 1000 clones from each stage by the end of 1996.
Nonradioactive sequencing methods will be researched for possible transfer to the endemic labs.

Table 3: EST goals for 1996-1997

Stage             Library               Sequencing Labs  Current total  Additional sequencing  Total goal

microfilaria      conventional library  TE, SW & BS      350            650                    1000
L2                SL library            AS               0              500                    500
L3 infective      SL library            MB & AS          280            -                      -
L3 infective      conventional library  SW               1450           50                     1500
L3-L4 transition  conventional library  MB *             0              1000                   1000
L4                SL library            KJ, AS & MB      485            515                    1000
adult female      conventional library  MB *             0              1000                   1000
adult male        conventional library  RR & SW          760            240                    1000

Totals                                                   3055           3955                   7000


Notes to Table 3: Laboratories: KJ, K. Jayaraman, Madras; RR, R. Ramzy, Cairo; TE, T. Egwang, Kampala; AS, A. Scott, Baltimore; SW, S. Williams, Northampton; BS, B. Slatko, NEB.
* next to MB indicates that these sequences will be obtained using external non-WHO funding to M. Blaxter (MRC, UK).

2.3 Two genomic libraries will be constructed for the Filarial Genome Initiative.

2.3.1 A cosmid library (insert size ~40 kb, titre ~20,000 independent clones) will be constructed for distribution to the network. No good quality Brugia genomic library is currently available. This cosmid library will allow researchers to isolate and study their gene of interest on a relatively small DNA fragment. It will be constructed by M. Blaxter and A. Scott.

2.3.2 A bacterial artificial chromosome (BAC) library (average insert size 80 kb, titre 10,000 independent clones) will be constructed by Barton Slatko and Steve Williams. This library will be the substrate for subsequent genomic mapping. The BAC system was chosen in preference to cosmids and YACs in this regard because (1) most gene-rich areas appear to be easily clonable in bacteria, (2) the downstream processing of the clones is relatively simple compared to YACs, and (3) the insert size means a significant reduction in the number of clones needed compared to cosmids. The 10,000 clone library will give an eight-fold over-representation of the genome (estimated at 100 Mb). If the telomeric and satellite repetitive regions are, as expected, poorly represented, the effective over-representation will be nearer 10-fold. Initial proof of quality will be made by (a) analysis of insert size from a set of randomly selected clones and (b) hybridisation selection of gDNA clones corresponding to a small number of the ESTs.
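
The over-representation figures quoted for the genomic libraries follow from simple arithmetic, sketched below in Python. The probability and clone-number functions use the standard random-sampling estimate for clone libraries, P = 1 - (1 - f)^N, which is not mentioned in the report and assumes that inserts are drawn uniformly from the genome. The 80 Mb "clonable genome" in the last call is a hypothetical value chosen only to show how a 10-fold figure could arise; the report does not state it.

```python
import math

KB_PER_MB = 1000


def fold_coverage(n_clones, insert_kb, genome_mb):
    """Total cloned DNA divided by genome size."""
    return n_clones * insert_kb / (genome_mb * KB_PER_MB)


def prob_sequence_present(n_clones, insert_kb, genome_mb):
    """Chance that a given single-copy sequence lies in at least one clone.

    Assumes clones are sampled uniformly at random from the genome
    (an idealisation; real libraries have cloning bias).
    """
    fraction = insert_kb / (genome_mb * KB_PER_MB)
    return 1 - (1 - fraction) ** n_clones


def clones_needed(insert_kb, genome_mb, probability):
    """Clones required to reach the given probability of representation."""
    fraction = insert_kb / (genome_mb * KB_PER_MB)
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


if __name__ == "__main__":
    # Figures from the report: 100 Mb genome, 10,000 BACs of 80 kb,
    # 20,000 cosmids of about 40 kb.
    print("BAC coverage:   ", fold_coverage(10_000, 80, 100))
    print("Cosmid coverage:", fold_coverage(20_000, 40, 100))
    print("P(sequence in BAC library):",
          round(prob_sequence_present(10_000, 80, 100), 4))
    print("BACs for 99% representation:", clones_needed(80, 100, 0.99))
    # Hypothetical: only 80 Mb of the genome is readily clonable.
    print("BAC coverage of 80 Mb:", fold_coverage(10_000, 80, 80))
```

Both libraries come out at eight-fold coverage of a 100 Mb genome; the BAC library reaches that figure with half as many clones, which is the reduction in clone numbers noted in point (3).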

2.4 Gridding of cDNA and gDNA libraries was identified as an important goal for the year 1996. The libraries will be picked to 96 well microtitre plates and gridded using robotics technology. Contact has been made with Genome Sciences Inc. of St. Louis who are prepared to perform the process for a reagents-only cost. Each library will incur a $450 setup charge and $700 per 10,000 clones picked and gridded. Filter mats will cost $1500/six library sets. As many filter mats as is possible (within financial constraints) will be printed from each library for distribution to network laboratories. Funds for consumables to provide additional filter sets were identified as a core component of further funding requests. Barton Slatko and Steven Williams will coordinate library gridding and filter mat production. The Filarial Research Community will be asked to place "orders" for additional filter sets to reduce costs.
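
The picking and gridding charges quoted above can be turned into a rough per-library estimate, as in the Python sketch below. Whether the $700 charge is prorated or billed per started block of 10,000 clones is not stated in the report, so both readings are offered and the prorated default is an assumption. Filter mat costs are left out. The clone numbers are those listed in Table 4; the dictionary keys are illustrative labels.

```python
import math

SETUP_CHARGE = 450      # dollars per library (from the report)
CHARGE_PER_10K = 700    # dollars per 10,000 clones picked and gridded

# Clone numbers to be gridded, as listed in Table 4.
LIBRARIES = {
    "microfilariae conventional": 18_000,
    "L3 SL": 18_000,
    "L3 conventional": 18_000,
    "L4 SL": 18_000,
    "adult male conventional": 18_000,
    "L2 SL": 18_000,
    "L3-L4 conventional": 18_000,
    "adult female conventional": 36_000,
    "BAC": 10_000,
}


def gridding_cost(n_clones, prorate=True):
    """Setup charge plus picking/gridding charge for one library.

    prorate=True  : charge scales linearly with clone number (assumption).
    prorate=False : charge applies per started block of 10,000 clones.
    """
    if prorate:
        picking = CHARGE_PER_10K * n_clones / 10_000
    else:
        picking = CHARGE_PER_10K * math.ceil(n_clones / 10_000)
    return SETUP_CHARGE + picking


if __name__ == "__main__":
    for name, n_clones in LIBRARIES.items():
        print(f"{name:28s} {n_clones:>7,d} clones  "
              f"${gridding_cost(n_clones):,.0f}")
    total = sum(gridding_cost(n) for n in LIBRARIES.values())
    print(f"Total picking and gridding (prorated): ${total:,.0f}")
```

Keeping the two billing readings behind a single flag makes it easy to see how much the estimate depends on that unstated detail.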

Table 4: Gridded Libraries

Library                         # clones to be gridded  # filter mats to be produced

cDNA libraries (current)
  microfilariae conventional    18,000                  6
  L3 SL                         18,000                  6
  L3 conventional               18,000                  6
  L4 SL                         18,000                  6
  Adult male conventional       18,000                  6

cDNA libraries (new)
  L2 SL                         18,000                  6
  L3-L4 conventional            18,000                  6
  Adult female (conventional)   36,000                  12

gDNA libraries (new)
  BAC                           10,000                  10
  cosmid                        not gridded


The gridded BAC library will be used for constructing an STS-linked physical map. The gridded cDNA libraries will be used for screening by hybridisation and for producing reduced redundancy clone sets for EST sequencing.

2.5 The training workshops provide advanced training in sequencing and sequence analysis for trainees from the endemic labs. Three workshops are planned: the first was held in April-May (Smith College; Egyptian trainee), and the other two (New England Biolabs, Ugandan trainee; Baltimore, Indian trainee) will be held in August and September.

2.6 The informatics side of the project was the subject of much discussion. Four goals were set for 1996.

2.6.1 The first was the management of the EST dataset. A clustering system was devised and will be implemented across the entire dataset. This will allow simplified searching of the dataset, the production of easy-to-browse lists of interesting and useful clone similarities and the rationalisation of the different labs' systems. David Guiliano and Mark Blaxter are to coordinate this.

2.6.2 The second was to speed up the production of the Filarial Genome Database, FilDB. Mark Blaxter is to instruct the WHO Parasite Genomes computer scientist (to be stationed at the European Bioinformatics Institute, Cambridge) in this. The appointee, Martin Aslett, has extensive experience of acedb, the database engine, and of the Leishmania and Toxoplasma genome projects. FilDB will be made available over the internet as soon as possible. In a separate development, Mark Blaxter obtained funding from the Clark Foundation to produce a web version of the Filariasis Association Bibliography, a huge resource of over 15,000 references from the 1800s to the present day. This will be integrated with FilDB.

2.6.3 The third was to provide computer and informatics support for the endemic labs. Mark Blaxter is the contact point for database and other questions.

2.6.4 The fourth was to continue to sponsor publication of Filarial Genome Initiative papers and reviews in parasitological, tropical medicine and general journals.

Task 3: Goals for 1997-2000


The network identified goals for the next four years of the WHO initiative as follows:
3.1 EST sequencing
3.2 Genome Mapping
3.3 Other approaches to genome analysis
3.4 Training workshops
3.5 A Filarial Resource Center
3.6 Involving additional laboratories

3.1 EST sequencing


The network produced the following list of long term goals in EST sequencing for the filarial project. The network was pleased to hear that the UK MRC had awarded a grant to M. Blaxter for EST and genomic analysis in Brugia. The program of goals was designed to include this important addition to the work of the network. The MRC project will sequence from libraries not in the initial remit of the WHO project, namely the L3-L4 transition and the Adult Female libraries. The elevated total for the adult female library reflects the expectation that this library (since it includes larval transcripts) will be more diverse than the others.

Table 5: EST sequence goals for 2000

Stage             Library               Sequencing Labs  1996/97 goal  1997-1999 goal  Total goal

microfilaria      conventional library  TE, SW & BS      1000          1500            3000
L2                SL library            AS               500           1500            3000
L3 infective      SL library            MB & AS          -             -               -
L3 infective      conventional library  SW               1500          1500            3000
L3-L4 transition  conventional library  MB *             1000          1500            3000
L4                SL library            KJ, AS & MB      1000          1500            3000
adult female      conventional library  MB *             1000          1500            6000
adult male        conventional library  RR & SW          1000          1500            3000

Totals                                                   7000          10500           24000


Notes to Table 5: Laboratories: KJ, K. Jayaraman, Madras; RR, R. Ramzy, Cairo; TE, T. Egwang, Kampala; AS, A. Scott, Baltimore; SW, S. Williams, Northampton; BS, B. Slatko, NEB.
* next to MB indicates that these sequences will be obtained using external non-WHO funding to M. Blaxter (MRC, UK).

3.2 Genome Mapping Goals

It was decided that all the laboratories would start to perform genome mapping hybridisations to the gridded BAC library from 1997. This distributed approach promises to allow the network to involve more laboratories and permits the labs involved to diversify from sequencing to other genome research areas. Each lab will have the task of mapping 100 unique EST clones to the grids each year. The six labs involved will thus generate 1200-1800 mapped clones. The Blaxter lab will follow a directed strategy using BAC end clones to link contigs: a near-complete map should be available by end-1998 or early 1999.
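
The mapping target can be put in context with a short calculation, sketched below in Python. Reading the 1200-1800 range as two to three years of mapping by six labs is an assumption, as is the even spread of markers along the genome; the 100 Mb genome size is the estimate quoted for the BAC library. Function names are illustrative.

```python
def mapped_clones(n_labs, clones_per_lab_per_year, years):
    """Total EST clones mapped to the BAC grids over the given period."""
    return n_labs * clones_per_lab_per_year * years


def mean_marker_spacing_kb(genome_mb, n_markers):
    """Average distance between markers if they were evenly spread.

    Real EST markers cluster in gene-rich regions, so this is only a
    rough guide.
    """
    return genome_mb * 1000 / n_markers


if __name__ == "__main__":
    for years in (2, 3):
        total = mapped_clones(6, 100, years)
        spacing = mean_marker_spacing_kb(100, total)
        print(f"{years} years: {total} mapped clones, "
              f"about one marker per {spacing:.0f} kb")
```

On these assumptions the average marker spacing falls between about 56 and 83 kb, which is comparable to the 80 kb average insert size planned for the BAC library.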

Initial data from the EST project suggests that a lot of cross talk will be generated with the C. elegans genome project. The C. elegans genome database will in future releases include the Brugia ESTs and it will also be possible to construct sets of parallel physical maps in the C. elegans database.

3.3 Other Approaches to Genome Research

The network discussed the application of other techniques to Brugia genome research. The conclusion of the discussions was that many if not all will be applicable in the near future (1997-9) and that the network should encourage additional labs with special expertise to join the global effort.
Specific techniques discussed were:
· Pulsed Field Gel long-range mapping. This technique has already been applied by the Williams lab to Brugia research, and it was thought that it would be a valuable addition to the BAC mapping strategy to have the long-range linkage data from PFG studies. The Williams lab is to troubleshoot and develop standard protocols for wider use.
· Fluorescence In Situ Hybridisation of ESTs to chromosome preparations. The FISH technique has been used to great effect in both the Schistosoma and the Anopheles projects. It would permit a very high level of integration of the physical map, and the network agreed to seek out labs willing to collaborate on applying it to Brugia.
· Fluorescence Activated Chromosome Separation. The FACS technique has been used to separate mammalian chromosomes, and is in principle applicable to chromosomes from Brugia spermatids. Al Scott will investigate this possibility.
· Genease-Like Restriction Enzymes. The nucleotide bias in the Brugia genome means that most restriction enzymes with GC-rich recognition sequences will cut within genes. Libraries constructed from restricted DNA could be end-sequenced to identify chromosomal genes in a manner unbiased by abundance of expression. Barton Slatko is pursuing this idea.

3.4 Advanced Training

It was agreed that a genome mapping workshop would be held in Edinburgh (Blaxter lab) in Spring 1997. The Network agreed that the training and communication aspects of the WHO sponsorship were extremely valuable to both endemic and non-endemic labs.

3.5 A Filarial Resource Center

It was recognised that the load imposed on the Williams lab by requests for clones and libraries can only increase. A half-time position will be requested to service these at Smith College. The Resource Center will also handle filter requests.

3.6 Encouraging Further Labs to Join the Initiative

The network supported suggestions from the scientific advisers and member labs that the Network should be widened. While there are budgetary constraints we would like to encourage new groups to join the initiative, either to extend the sequencing or mapping sides of the project or to bring in new techniques (such as FISH). The secretary (Mark Blaxter) and coordinator (Steve Williams) were asked to pursue this goal.


Mark Blaxter
Filarial Genome Initiative Secretary