1 In a few paragraphs (~1 page), describe the process by which you isolated the EST sequences. Note that this should include comments on why things might have failed (if you had failures).
2 Draw up a "molecular CV" (total of ~2 pages) giving the information below for each of the four sequences you choose to analyse. Remember to include quality data supporting each of your annotation statements.
Clone name
Insert length
Sequence length acquired and comments on sequence quality
Position of start of putative open reading frame or initiation methionine; length of open reading frame
Position of polyA tail (if present)
Best BLAST match in nonredundant database
Best BLAST match in mouse proteins
Best BLAST match in arthropod proteins
Domain and protein family membership (if any)
Cellular location inferred, including presence of signal peptide and transmembrane regions, if any
AND, FINALLY: Your putative functional annotation for the gene (in fewer than 30 words)
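Two of the items above (position and length of the putative open reading frame, and position of the polyA tail) can be checked with a short script. The following is a minimal sketch, not a prescribed method: the function names `longest_orf` and `polya_start` are hypothetical, it assumes the EST reads in the sense orientation, it only scans the three forward frames, and the 10-base threshold for a polyA run is an arbitrary assumption. A 5'-truncated EST may lack the true initiation methionine, so treat the reported start as putative.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf(seq):
    """Return (start, length_nt) of the longest ATG-to-stop ORF found in
    the three forward frames, or None if no complete ORF exists.
    start is 1-based; length_nt includes the stop codon."""
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None  # 0-based index of the current candidate ATG
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if best is None or length > best[1]:
                    best = (start + 1, length)
                start = None
    return best

def polya_start(seq, min_run=10):
    """Return the 1-based position of the first run of at least min_run
    consecutive A residues, or None if there is no such run.
    min_run=10 is an assumed threshold, not a standard value."""
    match = re.search("A{%d,}" % min_run, seq.upper())
    return match.start() + 1 if match else None

# Toy sequence (invented for illustration, not real EST data).
toy = "GG" + "ATG" + "GCC" * 3 + "TAA" + "CCGT" + "A" * 12
print(longest_orf(toy))
print(polya_start(toy))
```

The positions reported this way are only as reliable as the underlying base calls, so they should be read alongside the sequence quality data.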
3 Choose one of the sequences that had significant similarity in BLAST searches to proteins from several different organisms, and that you think might play some role in the host-parasite interaction.
a In a few paragraphs, describe the quality of the similarities identified, what the proteins matched in the database do in terms of biology or enzymology in the organisms from which they come, and what you think your gene might be doing in Heligmosomoides polygyrus.
b What role is it likely that your chosen gene plays in the host-parasite interaction between H. polygyrus and the mouse host? Remember to back up any statement you make with evidence from your bioinformatic analyses.
c What experiments could you suggest that would test the role that you have suggested your chosen gene plays in the host-parasite interaction?
(~2 pages)
4 Answer the following questions (~1 page).
a Why is an EST strategy useful for "neglected" organisms such as Heligmosomoides polygyrus?
b Why is an EST strategy also useful for model organisms such as humans and C. elegans, where the whole genome has been sequenced?
c Based on what we know about other nematodes, we predict that Heligmosomoides polygyrus will have ~18000 protein coding genes in its genome. If we want to identify as many Heligmosomoides polygyrus genes as we can, how many ESTs should we aim to sequence from how many cDNA libraries? Would sequencing the whole genome (~80 Mb) be more efficient?
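As a starting point for the arithmetic in part c, the sketch below estimates how many distinct genes a given number of ESTs would be expected to sample. It rests on a strong simplifying assumption that is not part of the question: every one of the ~18000 genes is equally represented in the library. Real cDNA libraries are dominated by a few highly expressed transcripts, so this gives an optimistic upper bound rather than a prediction. The function name `expected_genes_found` is hypothetical.

```python
def expected_genes_found(n_ests, n_genes=18000):
    """Expected number of distinct genes hit by n_ests random ESTs,
    ASSUMING all n_genes are equally abundant (an idealisation).
    Each gene is missed by one EST with probability 1 - 1/n_genes."""
    prob_gene_missed = (1.0 - 1.0 / n_genes) ** n_ests
    return n_genes * (1.0 - prob_gene_missed)

# Compare a few sampling depths under the uniform-abundance assumption.
for n in (5000, 18000, 50000, 100000):
    found = expected_genes_found(n)
    print(n, round(found), round(100.0 * found / 18000, 1))
```

Even in this idealised model the return per additional EST falls as depth increases, which is one reason to consider normalised libraries or libraries from several life-cycle stages when answering the question.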