
nematodes.org - Bioinformatics

Nematode & Neglected Genomics
@ The Blaxter Lab, Institute of Evolutionary Biology, School of Biological Sciences, The University of Edinburgh


MOTU_define.pl

Frequently asked questions (and answers)


  • What is a MOTU?
  • Input files for MOTU_define.pl
  • The base difference cutoff
  • Why do different runs give different clustering?
  • How do I parse the output?
  • What has changed recently?
  • Input files for MOTU_define.pl

  • MOTU_define.pl takes fasta-formatted DNA sequences as input. Fasta format looks like:

    >sequence name
    AGCGGTGGCGTGGCGGTGGCGGTGGCCGGTG
    AGCGGTGGCGTGGCGGTGGCGGTGGCCGGTG
    AGCGGTGGCGTGGCGGTGGCGGTGGCCGGTG

    All the sequences must be in separate fasta files. The programme copes with non-A/G/T/C base calls (all are coded "N").
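
As an illustration of the input handling described above, here is a minimal Python sketch (not part of MOTU_define.pl, which is a Perl script) that reads fasta-formatted text and codes every non-A/C/G/T base call as "N". The function names `read_fasta` and `normalise` are hypothetical, and upper-casing the sequence before recoding is an assumption made for this example.

```python
import io

def read_fasta(handle):
    """Yield (name, sequence) pairs from a fasta-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def normalise(seq):
    """Upper-case the sequence and code every non-A/C/G/T call as N.

    Upper-casing first is an assumption made for this sketch.
    """
    return "".join(base if base in "ACGT" else "N" for base in seq.upper())

# Two sequence lines containing the ambiguity codes R and Y.
example = io.StringIO(">sequence name\nAGCGGTGGCRTGG\nagcggtggcytgg\n")
records = [(name, normalise(seq)) for name, seq in read_fasta(example)]
print(records)
```

Here the ambiguity codes R and Y both come out as "N", so the single record's sequence contains only A, C, G, T and N.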

     

  • The base difference cutoff

  • MOTU_define.pl uses a simple difference cutoff to place sequences into MOTU. The cutoff is set by the user, and can be varied in different runs to explore the effects of different cutoff values on clustering. MOTU_define.pl does not explicitly use a tree to define MOTU, though of course the use of BLAST implies that some phylogenetic relationship is expected.
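
The idea of a difference cutoff can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are taken to be already aligned and of equal length (MOTU_define.pl itself relies on BLAST), positions where either sequence has "N" are skipped, and a pair is grouped when the number of differences is less than or equal to the cutoff. None of these details is confirmed by the text above, and `count_differences` and `same_motu` are hypothetical names.

```python
def count_differences(seq_a, seq_b):
    """Count mismatched positions between two pre-aligned sequences.

    Assumption: positions where either sequence has N are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for a, b in zip(seq_a, seq_b)
        if a != b and "N" not in (a, b)
    )

def same_motu(seq_a, seq_b, cutoff):
    """Assumption: a pair is grouped when differences <= cutoff."""
    return count_differences(seq_a, seq_b) <= cutoff

ref = "AGCGGTGGCGTGGCGG"
alt = "AGCGATGGCGTGGCTG"  # differs from ref at two positions

# Vary the cutoff to see how the grouping decision changes.
for cutoff in (1, 2, 3):
    print(cutoff, same_motu(ref, alt, cutoff))
```

With these two sequences the pair is split at a cutoff of 1 and grouped at cutoffs of 2 and 3, which is the kind of effect that varying the cutoff between runs lets you explore.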

     

  • Why do different runs give different clustering?

  • MOTU_define.pl uses single-linkage clustering: if a sequence is close enough (given the cutoff) to another, they are clustered. In some cases the order in which sequences are analysed can change the clustering. Consider three sequences, A, B and C. A differs from B by 2 changes and B from C by 2 changes. C differs from A by 4 changes. If we use a cutoff of 2 differences (sequences more than 2 differences apart are not linked), the order in which we add sequences will change the clustering: A then C then B, and C then A then B, will yield two clusters, as A and C will not be linked. Any other order will yield one. The distribution of clusterings across runs can yield interesting information concerning "clouds" of closely related specimens that do not robustly form distinct MOTU under specific parameters.
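
The three-sequence example can be reproduced with a small Python sketch. It assumes that each new sequence joins the first existing cluster containing a member within the cutoff, and that existing clusters are never merged afterwards. That behaviour is inferred from the example above rather than taken from the MOTU_define.pl source, and the names `linked` and `cluster_in_order` are hypothetical.

```python
from itertools import permutations

# Pairwise differences from the example: A-B 2, B-C 2, A-C 4.
DIFFS = {frozenset("AB"): 2, frozenset("BC"): 2, frozenset("AC"): 4}
CUTOFF = 2  # pairs with more than 2 differences are not linked

def linked(x, y):
    return DIFFS[frozenset((x, y))] <= CUTOFF

def cluster_in_order(order):
    """Add sequences one at a time; each joins the first cluster it links to.

    Assumption: clusters are never merged once they have been created.
    """
    clusters = []
    for seq in order:
        for cluster in clusters:
            if any(linked(seq, member) for member in cluster):
                cluster.append(seq)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([seq])
    return clusters

results = {"".join(order): cluster_in_order(order) for order in permutations("ABC")}
for order, clusters in results.items():
    print(order, len(clusters), clusters)
```

Under these assumptions, the orders ACB and CAB give two clusters and the other four orders give one, matching the description above.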

     

  • How do I parse the output?

  • We have not implemented comparison routines in the software, but the output is easily analysed in, for example, Excel-type spreadsheets, or with simple Perl scripts. We will be placing some of these Perl scripts on this site soon.



    [nematodes.org v4.0] the content of these pages is copyright Mark Blaxter and colleagues. Contact the webmaster if there are problems.