
Filarial Genome Meeting 14-18 March, 1998
REPORTS FROM THE WORKBENCHES
THE cDNA/EST SEQUENCING EFFORT
The current results of library sequencing were presented as short reports by various individuals (Barton Slatko, Mark Blaxter, Michelle Lizotte-Waniewski, Kunthala Jayaraman, Ibrahim Kamal). There are currently [05 March 1998] 13,641 Brugia malayi ESTs in the database.
| stage | number of ESTs | redundancy | comments                              |
| MF    | 2086           | 2.3        | high % ribosomal RNAs                 |
| L2SL  | 615            | 2.0        | many chimaeras in dataset             |
| L3    | 2306           | 1.3        | excellent libraries                   |
| L3D6  | 1549           | 2.4        | some short inserts                    |
| L4    | 1077           | 2.4        | high redundancy                       |
| AM    | 1974           | 1.5        | excellent library                     |
| AF    | 3775           | 1.2        | excellent library, some short inserts |
| total | 13382          | 2.1        | redundancy 1.8 without rRNAs          |
* dataset from 26 February
Slatko Laboratory MF Library
2256 ESTs have been sequenced from the MF library, with ~200 in the process of being submitted. There is a high proportion of ribosomal RNA in the library, but subtraction by hybridisation to the 18,000 clone filters (refer to Barton's and Steve's talks on subtraction, following) eliminates this problem. The average read length is over 600 nt, and 24% of the sequences are polyA+.
Blaxter Laboratory AF, L2, L4, L3D6 Libraries
The Edinburgh lab (Jen Daub) has been sequencing ESTs in collaboration with the Sanger Centre (Laurent Baron and Steve Jones) from AF (3700 ESTs), L2SL (600), L3D6 (1550) and L4SL (500) libraries. For the L2 and L4 libraries, there was high redundancy in the datasets from the first few hundred clones, so these libraries were cross-screened with pools of abundantly expressed genes, and nonhybridising clones were selected for additional sequencing rounds. For example, when the L4SL library is screened with the X, B, C ribosomal/abundant protein pools, over 50% of the clones are removed. A second screen is sometimes necessary, and it removes most of the remaining redundant clones. The AF and L3D6 libraries had a large proportion of short inserts (a problem with the ingredients of the library kit, we believe) and these are screened out by PCR before the clones are sent for sequencing. The L4 library is very redundant and Edinburgh has stopped sequencing from it. The L2 library is also of low complexity and sequencing is finished there too.
Williams Lab AM, L3D9 Libraries
The Williams lab have recently set up a collaboration with the Genome Center at Washington University, St. Louis. Clones are prepared in 96 well plates; the St. Louis Genome Center does the sequencing and gels and submits the data. One condition of the collaboration is that at least 80% of the clones in each plate must result in successful, submittable sequencing events, free of contaminating E. coli or small PCR products. For PCR reactions, they use 1/10th of the normal primer concentration. No further PCR purification is required. The cost is $8 a run for sequencing and the collaboration intends to process 10 plates a month.
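The 80% condition can be expressed as a simple per-plate check. The sketch below is only an illustration of the arithmetic: the function names are hypothetical, and a "successful" well is assumed to have already been scored by whatever criteria the labs apply (correct insert size, no E. coli contamination).

```python
WELLS_PER_PLATE = 96      # clones are prepared in 96 well plates
MIN_SUCCESS_PERCENT = 80  # threshold stated for the collaboration


def min_good_wells(wells=WELLS_PER_PLATE, percent=MIN_SUCCESS_PERCENT):
    """Smallest number of successful wells that meets the threshold.

    Integer ceiling division avoids floating point rounding issues.
    """
    return -(-wells * percent // 100)


def plate_passes(good_wells, wells=WELLS_PER_PLATE, percent=MIN_SUCCESS_PERCENT):
    """True if at least `percent` of the wells gave a submittable result."""
    return good_wells * 100 >= percent * wells


if __name__ == "__main__":
    print("minimum successful wells per plate:", min_good_wells())
    for good in (76, 77, 90):
        print(good, "good wells ->", "submit" if plate_passes(good) else "hold back")
```

For a 96 well plate the threshold works out to 77 successful wells, since 76 wells is only 79.2%.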
Jayaraman Lab L4 Library
K. J. spoke about her lab's work on L4SL ESTs. Ribosomal protein L20 makes up 12% of the library and ribosomal protein A2 is the next most abundant at 1.5%. Manual sequencing gives an average read length of 280 bases and automated sequencing gives an average of 415 bases. Novel sequences make up about 54% of the library and ribosomal RNA about 19%. gpd is one gene which appears to be differentially expressed, only in L4. Another gene, TCTP, has been over-expressed in E. coli. The lab is investigating the role of sxp (by westerns), which could lead to a positive patient diagnostic.
Ramzy Lab AM Library
The Cairo lab has been sequencing from the AM library, and has produced . Manual sequencing yields 450 bases average. Many interesting genes have been identified such as several different major sperm protein genes (msp). Many of the abundant genes are male specific.
The Resource Center
Steve spoke about the WHO-funded resource center (genome@smith.edu). Over 300 requests for libraries and clones have been processed. Phage will be sent in the future, instead of PCR products. In the Edinburgh lab, over 70 clone requests had been processed. Filter requests, for BACs and libraries, are increasing.
PCR Pre-screening Clones in Microtitre Plates
The Blaxter lab's procedure for cross screening in 96 well plates is on the WWW (www.ed.ac.uk/~ jdaub/genome_protocols). Basically, phage are picked to microtiter plates; PCR reactions are done and run on gels to eliminate those with inserts of less than 150 bp. Those of the correct size are transferred to new microtiter trays and treated with shrimp alkaline phosphatase and exonuclease I (37°C 30 min, 80°C 10 min). A 2 µl sample is run on a gel and 2 µl is used for sequencing. After sequencing, the reactions are precipitated in the tray, dried and used.
PCR Amplification for Genome Center Sequencing
Michelle Lizotte-Waniewski spoke about how the Williams lab does PCR on the libraries for submission to the Washington University Genome Center. They plate out the library, pick phage to microtiter trays and do the PCR reactions (25 µl), using only 1 pmol primer (10X less than before). They run 2-4 µl on a gel to see which are good (DNA of the correct size and amount). PCR success is variable. To be submitted for sequencing, a plate must have at least 80% positives.
Hybridisation Cross-screening Using Abundant ESTs
Barton spoke about his and Mehul Ganatra's work on hybridisation subtraction on filters of 18,000 clones selected from the MF library. The data shows a 100% increase in finding clones which are novel to the MF database and also 100% increase in finding clones novel to the entire project. Cross screening (non radioactive) was done using ribosomal RNA and ribosomal protein probes provided by Jen Daub and David Guiliano (Edinburgh). With the clustering data now available, cross screening will now be done with abundant MF sequences. Of 170 sequenced clones after the screening, 0% were members of the screened set. There is some indication that many "blues" have inserts, not just vectors. This will be followed up, as it could be due to the mass excision process for making the filters and not in the original library.
Construction of Subtraction Libraries
Steve Williams' lab has been doing subtraction in the library construction phase. One library is L3 subtracted with MF, AF and AM, and another is L3 subtracted with L3SL. From the subtraction EST analysis, subtraction does work on moderately abundant clones in both libraries, but super-abundant clones still get through. Further cross screening is still necessary, and once that is done, 70% of the sequenced clones are novel. It is possible to remake libraries with more "driver" for those clones. The L3-L3SL subtraction may show the presence of a second SL sequence, if one exists. The method works by using internal restriction sites for cloning, so it is not clear whether the "uniques" from this approach are, to some degree, parts of other cDNAs not yet fully sequenced.
Onchocerca volvulus
With respect to other projects, Michelle Lizotte-Waniewski spoke about her work on Onchocerca, funded by the McConnell Clark Foundation. From the OvL3 molting library, the average insert is 1200 base pairs. Many differences from genes in BmL3 are seen. Of 1452 cDNAs, 0.01% are E. coli, 13% are ribosomal, 10% are nonrecombinant and the redundancy is 1.6. Of the database hits, 37% hit other filarial ESTs, 35% hit non-filarial ESTs, 19% are novel to the filarial genome projects and 53% are novel to the Brugia project. Novel genes include chitinase, kinases, growth factors, etc. From the OvL3 conventional library, the average insert is 1000 bp. Currently, 863 have been sequenced and 1700 are underway. So far, 23% are nonrecombinant and 10% are E. coli. The lab is also working on O. ochengi L3, OvAM, OvAF and OvL2, and clones will be size selected to eliminate small contaminants. OvMF is also under way. They will try to sequence 2000 from each library.
Necator americanus
Mark Blaxter spoke about projects on Necator americanus (human hookworm) using a mixed adult cDNA library derived from David Pritchard, Nottingham University. Inserts >150 bp are selected. Of 206 ESTs, clustering shows 150 clusters containing only 1 EST and 17 with >1. Over 95% are novel genes for hookworms, including collagens, globins and aspartic acid proteases. Novel homologues of ASP, a gene expressed by infective L3 upon infection in the host, are abundant (13% of clones).
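As a worked example of how such cluster counts relate to a redundancy figure, the sketch below assumes that redundancy is the number of ESTs divided by the number of clusters. The report does not state this definition explicitly, so treat both the formula and the function name as assumptions.

```python
def redundancy(total_ests, singleton_clusters, multi_clusters):
    """ESTs per cluster (assumed definition: total ESTs / total clusters)."""
    clusters = singleton_clusters + multi_clusters
    if clusters == 0:
        raise ValueError("no clusters")
    return total_ests / clusters


if __name__ == "__main__":
    # Necator americanus figures quoted above: 206 ESTs,
    # 150 clusters with a single EST and 17 clusters with more than one.
    total, singles, multis = 206, 150, 17
    print("clusters:", singles + multis)
    print(f"redundancy: {redundancy(total, singles, multis):.2f}")
    # ESTs that fall in multi-member clusters, and their mean cluster size
    in_multi = total - singles
    print(f"mean size of multi-member clusters: {in_multi / multis:.1f}")
```

Under that assumption the Necator dataset has 167 clusters and a redundancy of about 1.23.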
Clustering
David Guiliano (Edinburgh) spoke on clustering and redundancy data and analysis. EST clustering is defined as grouping based on nucleic acid similarity. It is useful for evaluating cDNA libraries and for further analysis of genes. The ESTs are first filtered for ribosomal and bacterial sequences. Clusters are then generated by a stepwise approach, leading to a consensus sequence. The data is subjected to BLAST and the results analyzed. While two types of clustering approaches can be used (greedy vs. stepwise), the stepwise procedure, in which ESTs are first given identifiers, is easier to analyze and to correct for errors. This analysis shows, for example, that some non-E. coli-like bacterial sequences are present and that ribosomal RNA in the MF library reduces productivity. The redundancy data is tabulated above.
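To make "grouping based on nucleic acid similarity" concrete, here is a toy sketch that joins ESTs sharing a minimum number of k-mers (single-linkage grouping with a union-find structure). It is not the stepwise procedure used for the project: the similarity measure, the parameters `k` and `min_shared`, and the example sequences are all illustrative assumptions, and reverse complements and sequencing errors are ignored.

```python
def kmers(seq, k):
    """Set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_ests(ests, k=8, min_shared=3):
    """Group ESTs that share at least `min_shared` k-mers (single linkage).

    `ests` maps an identifier to its sequence; returns sorted lists of ids.
    """
    ids = list(ests)
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    profiles = {i: kmers(ests[i], k) for i in ids}
    for n, a in enumerate(ids):
        for b in ids[n + 1:]:
            if len(profiles[a] & profiles[b]) >= min_shared:
                parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    toy = {
        "est1": "ACGTACGTTAGCCGATAGGCTA",
        "est2": "TTAGCCGATAGGCTAACCGGTT",  # overlaps the 3' end of est1
        "est3": "GGGGGCCCCCTTTTTAAAAAGG",  # unrelated
    }
    print(cluster_ests(toy))
```

The all-against-all comparison is quadratic in the number of ESTs, which is acceptable for a sketch but not for a dataset of many thousands of sequences.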
If one removes the ribosomal proteins and rRNAs, then sequencing redundancy over the whole dataset looks very good (less than 2). The number of stage specific abundant genes is low. In the L3 dataset, novel putatively secreted genes are at high levels. For AM, the abundant stage specific genes are all novel. A consensus sequence can be generated using the cluster data (through collaboration with Steve Jones, Sanger Centre). Using an alignment program [PHRAP], the consensus sequence becomes more accurate for open reading frame (ORF) and peptide analysis. It is more robust, but it requires more than 2 sequences in a cluster to work accurately. From the data, it appears that it would not take too many clones from any one library to remove the most abundant sequences by subtraction methods. This data will be in FilDB and also in a FileMaker Pro version. It will shortly be released for free downloading from the WWW site.
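One way to see why a consensus needs more than two sequences is a column-wise majority vote over sequences that have already been aligned. The sketch below is not PHRAP and does not use base quality values; it is a simplified illustration, and the function name and the use of "N" for unresolved positions are assumptions.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Majority-vote consensus of equal-length, pre-aligned sequences.

    Columns where the two most common bases tie are reported as 'N';
    columns that contain only gaps are skipped.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        counts = Counter(base for base in column if base != gap)
        if not counts:
            continue
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            out.append("N")  # no majority, cannot resolve
        else:
            out.append(ranked[0][0])
    return "".join(out)


if __name__ == "__main__":
    # Two reads that disagree at one position cannot be resolved ...
    print(consensus(["ACGT", "ACTT"]))
    # ... but a third read breaks the tie.
    print(consensus(["ACGT", "ACTT", "ACGT"]))
```

With only two reads any disagreement is a tie, so the position stays ambiguous; a third read is the minimum needed to outvote a single sequencing error.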
With respect to FilDB, the database program is "object oriented", with objects linked through hyper-text like links. It contains papers, references, 16,000 EST sequences, cluster information, blast similarities, information on clones, etc., in a C. elegans type of organization. For example, you can search for tubulin and find all matches and papers on it; use graphical displays, see if and how family members are related, and do peptide alignments, as well. It takes up 70 Mb of space now, runs on Unix and Windows and Macs; a CD will be available with all other databases, as well. It will be updated as often as possible and you will be able to update your database by ftp.
Sandy presented the agreed upon clone and library designations, including designations for organism and whether the clones were derived from subtracted libraries or not. This will be added to the FUNK description on the www site. (A copy of "Sandy's List" is attached). A Parasitology Today article on naming genes by common protocols (FUNK) was recently published.
TRAINING SESSIONS:
CLUSTER DATA AND FilDB
Training sessions were held using FileMaker Pro and FilDB for managing and using the cluster datasets. Martin Aslett (EBI, Hinxton) and Mark Blaxter described the features of the programs and the uses of the data sets. Participants gained experience in examining the cluster database and in using the query system to select abundant or stage-specific clones and clone families, and discussed how to coordinate their use in mapping. Much useful feedback on program design and desirable features was given.
For FilDB, a reintroduction to ACeDB (the software behind the database) and to the specifics of FilDB was presented, focusing on data display and on how FilDB can be used to examine expression patterns and define gene function.
Genomic Libraries
Steve Williams spoke about genomic libraries which have been or are being constructed. These include a B. malayi lambda Fix library (12 kb maximum), a cosmid and a BAC library for Brugia. In addition, an O. volvulus genomic library in lambda Zap (Eco RI partial, 12 kb inserts maximum), and a D. immitis lambda zap express Sau IIIA partial DNA library are being constructed or are available.
Cosmid Library Quality Control
The cosmid library is likely heavily contaminated with DNA from Wolbachia, an obligate rickettsial endosymbiont which is also antigenic in Brugia infections. BLAST results and hybridization data suggest high Wolbachia contamination. End sequences of clones done by Jennifer Ware in Barton Slatko's lab and David Guiliano (Edinburgh) do not show many Brugia-type clones. Most sequences are similar to bacterial sequences (Wolbachia?) and they do not have the expected high A + T content. One end sequence matched a Brugia cDNA. A hormone receptor gene has also been found in the cosmid library by Claude Maina's lab. No Hha I repeat-containing clones were found and only 4 18S rRNA positives (expected 90) were found.
BAC Library Quality Control and Results
A 3,000 clone BAC library was constructed in Steve Williams' lab. Insert sizes are between 60 and 90 kb. For the BAC library, while there may be some Wolbachia content (20%?), the Brugia nuclear DNA content is much higher. cDNAs which may be Wolbachia hybridized to BAC filters (126 hits from 5 clones). David Guiliano (Edinburgh) has performed a number of hybridisations to the gridded library. Brugia clones (8/15) and 18S rRNA clones (1% positive) do hybridize to the BAC filters. The Hha I repeat hybridizes with about 8% positives. Three MIF-1 containing BACs were identified, and end sequence data showed noncoding DNA of the correct A + T content. About 50% of the BAC hybridizations with Brugia specific clones are successful: are the others missing from the library or simply not hybridizing? It is possible that the size of the inserts used for probes is affecting the hybridisation results. So, there is hope that the BAC library has 3-fold coverage with Brugia DNA.
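The coverage question can be put into numbers with the standard library-representation arithmetic. In the sketch below, the clone count, insert range and the tentative 20% Wolbachia fraction come from the report; the Brugia genome size of 100 Mb is NOT stated in the report and is an assumed placeholder, so the printed values are only illustrative. The probability formula is the usual Clarke-Carbon estimate, P = 1 - (1 - insert/genome)^N.

```python
def fold_coverage(n_clones, insert_bp, genome_bp, usable_fraction=1.0):
    """Average number of times the genome is represented in the library."""
    return n_clones * usable_fraction * insert_bp / genome_bp


def prob_locus_present(n_clones, insert_bp, genome_bp, usable_fraction=1.0):
    """Clarke-Carbon estimate that a given locus is in at least one clone."""
    effective_clones = n_clones * usable_fraction
    return 1.0 - (1.0 - insert_bp / genome_bp) ** effective_clones


if __name__ == "__main__":
    N_CLONES = 3000          # from the report
    BRUGIA_FRACTION = 0.80   # report suggests roughly 20% Wolbachia (tentative)
    GENOME_BP = 100_000_000  # ASSUMPTION: placeholder genome size, not from the report

    for insert_bp in (60_000, 75_000, 90_000):  # reported insert range, plus midpoint
        cov = fold_coverage(N_CLONES, insert_bp, GENOME_BP, BRUGIA_FRACTION)
        p = prob_locus_present(N_CLONES, insert_bp, GENOME_BP, BRUGIA_FRACTION)
        print(f"insert {insert_bp // 1000} kb: coverage {cov:.2f}x, P(locus present) {p:.2f}")
```

Because the result scales inversely with the assumed genome size, the calculation is best used to compare scenarios (insert size, contamination level) rather than to settle the coverage question, which the planned hybridisations and end sequencing address directly.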
Wolbachia Contamination of the BAC Library
It can be asked if any BAC has both Brugia and Wolbachia DNA. We can compare spot hybridizations with Wolbachia specific and Brugia specific cDNAs and see if there is overlap on the filters in the same spots. We will have better data when the 50 BAC ends are sequenced (60 kb) and by other tests. If there is Wolbachia DNA in the BAC library, we can use Wolbachia DNA to hybridize to the BACs and eliminate those that are Wolbachia. We may be able to identify Wolbachia clones by using closely related bacterial DNA as probes.
Can we get rid of Wolbachia DNA for new library construction? Why did the Wolbachia DNA clone more easily? Can we use pulsed field gels to separate Wolbachia chromosomal DNA (circular) from the Brugia DNA? Is it possible to cure Brugia of Wolbachia with tetracycline? Would this kill or sterilize Brugia? It might even be a way to eliminate Brugia infections. It was suggested that the BAC library be verified immediately and that methods to eliminate Wolbachia contamination be developed.
Construction of a YAC Library
Jeremy spoke about the construction of a new YAC library. DNA was digested by David Guiliano with Eco RI in high mw PFG plugs, and DNA size selected (several fractions >100 kb). We need to also check the YAC library for Wolbachia, perhaps by PCR.
Mapping Approaches
Al Scott spoke about mapping techniques and reported that the nonradioactive methods work (either Amersham or NEB kits). We would like to know if we need to strip the membranes to make them last longer. Probes will be handed out soon, as David Guiliano is making the list, from the cluster data, of the first 250 to be hybridized. As soon as we have some indication from David that the BAC library has at least 3-fold coverage of Brugia DNA, we will start hybridizing the Brugia ESTs.
We identified the need for 8-10 BAC filters per lab for the mapping. If each lab needs to generate positive hybridisation results for 10 - 50 genes, one may need to do twice that amount, based upon David's results of successes. We will send filters to endemic labs and make sure all protocols are set. Al and Barton will make fully inclusive packages for these labs including all materials and positive controls.
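The "twice that amount" estimate follows from the roughly 50% hybridisation success rate reported for Brugia specific clones on the BAC filters. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
import math


def probes_needed(target_positives, success_rate):
    """Expected number of probes to hybridise to reach the target number of positives."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return math.ceil(target_positives / success_rate)


if __name__ == "__main__":
    SUCCESS_RATE = 0.5  # about 50% of BAC hybridisations were successful
    for target in (10, 50):
        print(f"{target} positive genes -> about {probes_needed(target, SUCCESS_RATE)} probes")
```

This is an expectation only; with a 50% success rate an individual lab could need noticeably more or fewer probes than the average.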
Making BAC End Probes
For mapping onto BACs and YACs, end probes can be made (Jen Daub's method is on her web methods page). Basically nested primers are used for sequential rounds of PCR, three within the vector and one random primer for the insert. Using alternating low and high stringency PCR one can select products which are single bands and clone specific. For the PCR, BAC colonies can be picked, boiled in water and PCR done. The end sequences done by David Guiliano and sequences of the PCR products produced by Jen agree.
Genome Sequencing
Barton Slatko spoke about a Tn7 transposition system under development to map and sequence bacterial plasmid clones and perhaps cosmids and BACs as well. Jen Ware is sequencing a cosmid at this moment which has a nuclear hormone receptor in it.
David Guiliano (Edinburgh) is sequencing a BAC (containing Bm-MIF-1) in collaboration with the Sanger Centre.
Nutman Lab, NIH
Tom Nutman spoke about work involved with immunoscreening. In Onchocerca (OvL3), Brugia (BmL3) and Wuchereria (WbL3) he is looking for genes which will provide protective immunity. In BmMF, WbMF and WbL3 he is looking at genes which influence regulation of host immune responses. He is also looking through ESTs for possible DNA vaccine candidates using a PCR based technique on whole library sets to see the effect of direct injections. He is also using differential display to look at expression differences for a variety of clones.
Weil Lab, WUSTL
Gary Weil described efforts to clone candidate protective antigens from Brugia and Wuchereria expression libraries by immunoscreening with specialized sera from a longitudinal study of filariasis in Egypt. Sera from persons with incident infections ("proven susceptibles") and from persons who spontaneously clear their infections are especially interesting in this regard. Ben-Wen Li and Shaorong Zhong in his lab are involved in DNA vaccine work with Brugia antigens in rodents. Shelly Michalski in Weil's lab is studying gender specific gene expression in Brugia using differential display and subtraction methods.
Bianco Lab, LSTM
Anthony Underwood, in Ted Bianco's lab, spoke about a locus which appears to be Y chromosome specific in Brugia, confirming that it has an XX/XY system, unlike C. elegans which is XX/XO. DNA from males and females was separated using late L4 stage parasites to avoid female worms being "contaminated" with sperm (fertilisation occurs soon after the final moult). PCR RAPDs using short arbitrary primers (to give random hits) gave a male specific 2.3 kb band. It is amplified only from males, not from females and not from Onchocerca. In situ hybridizations give signals in males only and show less signal in condensed DNA in sperm. In crosses to B. pahangi, the gene is inherited with the Y chromosome. A similar locus is not found in O. volvulus using these primers. BLAST shows it to contain reverse transcriptase like sequences (with frameshifts in the ORF making it non functional), perhaps part of a defunct transposable element. In a BAC screen, many of the colonies light up. There may be pBR322 sequences in the probe. The locus has been termed toy (Tag On Y).
Lustigman Lab, NYBC
Sara Lustigman spoke of some of her work in O. volvulus. By immunoscreening the L3 cDNA library with pooled sera from putatively immune individuals from Liberia and Ecuador, a number of positives showed up, including some already known: Ov7=cystatin, Ov 17, Ov64 (Di 22), Ov 87 (a novel lectin binding protein), Ov B8 (a novel protein), Ov 93 (venom allergen homologue: at least 3, but different), Ov 66 (a 10 kd novel protein), and OvA3 (OvL3-1 homologue). These are being characterized. For example, Ov64 reduces the number of L3 worms in chambers and is located in granules that are released from the glandular esophagus through channels to the cuticle and then disappear (from molting L3 day 1); Ov87 causes a small but not significant reduction in L3 survival. Ov87 is also located in granules of L3 and by day 1 is released through channels and then disappears. The venom allergen homologue was shown to confer protection against hookworm. Would there be a similar role in Onchocerca infection? Interestingly, all the proteins are recognized by antibodies from infected as well as the putatively immune individuals. In addition, there are no differences in IgE responses against these proteins. (O. volvulus disease does give an IgE response against OvAg in infected and not in putatively immune individuals).
From the L3 EST data, Ov64 has another family member which is shorter by 8 amino acids. Ov64 is like Bm-alt-1 and was found to be L3 specific, while the second member, which we call Ov-alt-2, is expressed in all the stages. In Dirofilaria, it was shown to be L3 specific, so there are some differences among the parasites. Differences in expression (by PCR) of Ov87 in various stages are seen. Other ESTs include a cathepsin L-like protease, similar to that of Brugia; antibodies to it localized it to the granules of the L3 and SPI.
Scott Lab, JHU
Al Scott spoke about some work he is doing with MIF, as an example of mining the genome project. Some interesting genes include ndk, tpx, MIF, heat shock proteins, etc. MIF (macrophage migration inhibition factor) has been identified and alignment to other MIFs shows 40% identity and 60% similarity. W. bancrofti MIF and O. volvulus MIF genes have been cloned, as have 2 C. elegans MIFs. W. bancrofti shows 95% similarity and similarity drops for other nematodes. The genomic copy has 2 exons and the protein is located in the immune system, adrenal and neural tissue, testis, liver, kidney, brain and in macrophages. It stimulates TNF alpha production and activates B cell IL receptor 2 expression, is secreted in response to stimulation from granules, mediates septic shock, regulates glucose secretion, and plays a role in inflammation and in parasitic infections such as schistosomiasis and leishmaniasis. In Brugia, it is transcribed in all stages, higher in adults than in L3. There is evidence for stage specific and tissue specific expression and there appears to be an interaction with the human MIF in infection, perhaps to change macrophage function. The crystal structure is being worked on and the enzyme has 2 alpha helices and beta sheets as a homotrimer. It folds in a ball with a hole in the middle (same as human MIF but different kinetics and responses). If one immunizes mice with a DNA vaccine of MIF, one gets a 50% reduction in larvae at day 7 (antibody response is low); protein immunization experiments show the opposite: with an L3 challenge one sees enhanced survival on day 13 and these mice have a very big antibody response. The antibodies cross react with host MIF; perhaps host MIF is altered in infection, which allows L3 production. Al is now making monoclonals specific to mouse MIF to see if host MIF regulates parasite loads. A MIF knockout mouse is also available. Perhaps an RNA antisense inhibition strategy might be of interest.
Williams Lab, Smith College
Steve Williams spoke about some genes of interest in his lab. These include tpx (in Brugia and Onchocerca and Dirofilaria). He also spoke about the cuticlins in Onchocerca where there are 5 transcripts from 4 genes (1 with an alternate transcript). The collagens in C. elegans are complicated with 100-150 found; they have also identified 44 different ones from the OvL3 and OvL3 molting library. Expression studies (PCR) from libraries show that some are dramatically up-regulated in molting L3 and many are only in the molting L3 library.
Ramzy and Blaxter Labs, Ain Shams and ICAPB
David Guiliano spoke about work carried out by Ibrahim Kamal and himself on her-1, involved in sex determination. Sex determination in nematodes is only known in C. elegans, where XO is male and XX is female. her-1 is a soluble mediator whose receptor is tra-2A; binding sends a signal to bind to tra-3, which is a cysteine protease controlling TRA-2 function; tra-2 is then turned off when bound to TRA-3; when it is off, fem genes (FEM-1 is ankyrin, fem-2 is phosphatase and FEM-3 is unknown) interact with tra-1 zinc finger protein to turn off tra-1. In XX, tra-1 is on and fem-1, 2 and 3 are activated in males. Cloning of her-1 was aided by the finding of the EST. Ibrahim sequenced it and started expression studies on it. Genomic sequencing revealed that compared to C. elegans, the last 2 introns are identical and the first one is off a bit; in C. elegans there is a 5 kb intron but in Brugia, it is smaller. The intron-exon structure is conserved at the sequence junctions. A 2nd promoter makes an inactive short product. her-1 expression decreases as development continues in C. elegans. In Brugia, it appears to be up-regulated in L3-L4 and adults, starting on day 2, peaking on day 5, and being gone by day 10 until the adult stage, where it goes up again.
Blaxter Lab, ICAPB
David Guiliano has been searching for operons in Brugia but has yet to definitively identify any. He is looking at ribosomal proteins, where 50% are in operons in C. elegans. He has used PCR approaches using primers between the 3' end and 5' end of each pair. He has also looked at operon partners to see if hybridizations to BAC filters give double positives. He is looking at 8 operons found in C. elegans, and has tried mai-1 and gpd-2, snrp and MIF, fibrillarin and s16, RNA binding protein and tpx-1, rpPO and Tctp, rpP1 and rpl37a, and rpl5 and ATP binding protein. So far, the PCRs have been negative and most pairs give only one positive on hybridizations. He is now looking at BAC and cosmid sequencing as one method of finding operons.