BaNG - Blaxter Nematode and Neglected Genomics
University of Edinburgh
      The Blaxter Lab at the Institute of Evolutionary Biology, University of Edinburgh
 
 


 BaNG PartiGene for OSX

version 3.0.2 (includes PartiGene v3.0.2 and trace2dbest v 3.0.1)

25/07/2006

Installing the Blaxter Lab's Neglected Genomics software packages on Mac OSX
in a few short steps (and several more, longer substeps)



BaNG PartiGene for OSX version 3.0.2     27/07/2006


PartiGene: Context

PartiGene version 3 is a suite of software tools that process sequences (typically ESTs) to produce a "partial genome" of clustered sequences and associated annotations. It produces outputs that are imported into a relational database, which is then available for querying via the world wide web.

The concepts behind the software are

            (a) a one-stop solution to analysis of partial genome datasets

            (b) use of "industry standard" external tools where appropriate

            (c) GNU licence software provision

            (d) ease of use by non-experts

The software was written to run on the UNIX-like LINUX operating system, and has been extensively tested on various "flavours" of LINUX. This installation offers a version of PartiGene customised for the Darwin UNIX-like operating system that underpins the Mac OSX system.

As Darwin does not come quite ready for PartiGene, installation involves downloading and configuring a number of external tools. This in turn involves

* use of the terminal and command-line interaction with the Mac OSX system

* use of the "sudo" system of root user access to the Mac OSX system

If you are unfamiliar with these, do not worry: it is all quite simple, and if you follow the steps below, it should be quite painless.

If you have already installed some of the stand-alone software we use (eg BLAST, phred, phrap) you will have to customise the paths to these programmes to suit your system.

We have tested the install on a number of Macintosh computers, running OS X 10.3 and OS X 10.4 (ie G4 and G5 processors).

Please do contact us at nematodebioinf@ed.ac.uk if you have any problems, stating what sort of machine you are using, which version of OS X, and also pasting in any error messages you are given...

Please see the read-mes, how-tos and User Guides for PartiGene and trace2dbest for instructions on the use of the software.

Have fun

Tam Blaxter and Mark Blaxter, BaNG 09/2005 with help from Ralf Schmid

Updated MB 06/2006


Notes on the supplied installation of PartiGene

These installation notes were developed on a Power Mac G5, with dual 1.2GHz CPUs, and 2 Gb of memory. It was running Mac OS X 10.4.6. The versions of the software used were:

FinkCommander v0.5.4;

BLAST v2.2.11;

PostgreSQL v8.0.3;

PartiGene v3.0;

and trace2dbEST v3.0.

We have not yet tested installation of the other components of our package, prot4EST and annot8r; these will follow in due course. Keep an eye on what is happening by going to www.nematodes.org and joining our user email list at nematodebioinf@ed.ac.uk. This is also the place to go to report bugs and problems. We will try to help, but please be patient.

We should point out that

# This program is free software; you can redistribute it and/or

# modify it under the terms of the GNU General Public License Version 2

# as published by the Free Software Foundation.

# This program is distributed in the hope that it will be useful,

# but WITHOUT ANY WARRANTY; without even the implied warranty of

# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

# See the GNU General Public License for more details.

Before you start: Academic Licence for phred, phrap and consed

You need to get an academic licence (free) for phred, phrap and consed from the developers of these wonderful programmes. Go to http://www.phrap.org, fill in the application form, and email it off. The developers will reply (relatively soon) with a link for you for the downloadable versions of their tools.

Before you start: Root access to your mac

You will need to be an administrative user for your Mac, and to know the administrative password. Typing "sudo" before a command in the Terminal tells the system you want to carry out the command "as if" you were the root user. Please take care!


Sources of external programmes

This is a guide to installing other people's software on your OS X Mac so that our software (the PartiGene suite) will run. We have provided in the distribution a set of source files that work, but these may be updated from time to time, and it is a good idea to check that you are using the most up-to-date version.

phred and phrap

Please note you MUST get a licence for phred and phrap to use these programmes: it is free and keys you in to the announcements of upgrades and bug fixes.

Go to http://www.phrap.org

BLAST

BLAST for Mac OSX (standalone BLAST) from http://www.ncbi.nlm.nih.gov/BLAST/download.shtml

FOR USERS WHO MAY HAVE INSTALLED BLAST PREVIOUSLY.

Different versions of BLAST perform subtly differently. These differences result in PartiGene being unable to parse the BLAST output files correctly. We recommend that you remove any older versions of BLAST on your hard disc, and reinstall the latest release from NCBI (see below). If you have more than one set of BLAST executables on your system, things may fail inscrutably.

The recommended place to install BLAST is in its own directory in /usr/local (called, boringly, blast), and to have the executable files available in /usr/local/blast/bin. We have put blast version 2.2.11 in this distribution but you should check with the NCBI web site for updates.
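
Because stray copies of BLAST are a common cause of trouble, it can help to list every copy that your shell can see before you install. The following is a minimal sketch and is not part of the BaNG distribution: the function name find_all_on_path is hypothetical, and blastall is used only as an example of a programme shipped with standalone BLAST.

```shell
# List every directory on PATH that holds an executable file called $1.
# The body runs in a subshell (note the parentheses) so that changing IFS
# here does not affect the calling shell.
find_all_on_path() (
    name=$1
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
)
```

Typing find_all_on_path blastall should then print one line per copy found. More than one line suggests that an older installation is still on your path.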

fink

fink and FinkCommander programme from http://fink.sourceforge.net/download/Index.php

(fink helps you to install:

IO-String Perl Module from http://search.cpan.org/~gaas/IO-String-1.06/String.pm

IO-Scalar Perl Module from http://search.cpan.org/~dskoll/IO-stringy-2.110

Mailer Perl Module from http://search.cpan.org/~markov/MailTools-1.67

BioPerl Perl Module from http://search.cpan.org/~craffi/Bundle-BioPerl/)

ReadLine.pm

ReadLine Perl Module from http://search.cpan.org/dist/Term-ReadLine-Gnu/

PostgreSQL

PostgreSQL programme from http://www.postgresql.org/download/


Instructions

Text in purple below is what you should type into the Terminal window at the prompt, and green indicates web addresses. Text in courier is what should appear on the screen.

Text of the form (Source -> Selfupdate) means select the option “Selfupdate” from the “Source” menu bar item.

All the commands we suggest you type are entered into a Terminal window. The Terminal application is in /Applications/Utilities. (You can also use the X11 application if it is installed.) If you are new to UNIX-like operating systems, we suggest you follow at least the basics of one of the many on-line tutorials to learn the core commands used.

Please read through the instructions below BEFORE you start so you have an overview of the process and so you can prepare any special local customisations you may need.

Your computer needs to be connected to the internet. This process will probably take a couple of hours, a lot of which is “coffee time” (waiting for installations to complete), so stay calm.

I: The BaNG Package

Copy the package "BaNG_Package.tar.gz" to your hard disc (it is available from http://www.nematodes.org/bioinformatics/).

Either double-click on it to launch the MacOS decompressor, or in a Terminal window, type

            gunzip BaNG_Package.tar.gz

and when this is finished (gunzip leaves behind a file called BaNG_Package.tar), type

            tar -xf BaNG_Package.tar

type

            cd BaNG_Package

Listing the contents of the BaNG_Package directory (type the command ls -l ./ ) should show something similar to the following:

drwxr-xr-x    11 yourname admin    374 Sep 22 17:13 BaNG_prepare

drwxrwxrwx     4 yourname admin    136 Sep 21 23:08 CLOBB

drwxrwxrwx    12 yourname admin    408 Sep 22 17:14 Notes

drwxrwxrwx     5 yourname admin    170 Sep 21 23:09 PartiGene

drwxrwxrwx     5 yourname admin    170 Sep 21 23:06 annot8r

drwxrwxrwx     6 yourname admin    204 Sep 21 23:07 blast

-rwxrwxrwx     1 yourname admin   4233 Sep 22 17:14 prepare.pl

drwxrwxrwx     4 yourname admin    136 Sep 21 23:12 prot4EST

-rwxrwxrwx     1 yourname admin  48300 Sep 22 15:05 readmeplease_v2.rtf

drwxrwxrwx     5 yourname admin    170 Sep 21 23:13 trace2dbEST
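
The decompress and extract steps above can also be done in one go. This is a sketch only: unpack_archive is a hypothetical helper name, and it assumes that the archive sits in the current directory.

```shell
# Unpack a gzipped tar archive in a single step.
# $1 is the archive name; it defaults to the package name used in this guide.
unpack_archive() {
    archive="${1:-BaNG_Package.tar.gz}"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # -z decompresses with gzip, -x extracts, -f names the archive file
    tar -xzf "$archive" || return 1
    echo "unpacked $archive"
}
```

After typing unpack_archive BaNG_Package.tar.gz you can cd into the BaNG_Package directory as described above.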


II: Install the basic setup from the BaNG Package distribution

a        The “prepare” script

Type

            ./prepare.pl prepare_dirlist.txt prepare_sourcelist.txt prepare_commandlist.txt

(You may have to enter your password.)

Text will scroll past your terminal "screen". You may see reports such as "blank source, copy line skipped" and "blank target, make line skipped" - these can be ignored. You should see lots of "succesfully"s though. The setup script will close with the lines "Completed BaNG setup for user <yourname>. have fun...", but you aren't finished yet.

(NOTE: prepare_sourcelist.txt can be edited to give different sources for packages; packages must be given in the same order as in prepare_dirlist.txt; you must be within the BaNG_Package directory to run this command.)

 

b      Using the Terminal and the bash shell

The PartiGene system was written to work in the bash shell. If this means nothing to you, it's probably a good thing, but have a look at, for example,

http://www.macdevcenter.com/pub/a/mac/2004/02/24/bash.html

for some information on what shells are and why we might use bash. To find out which shell you are using in the Terminal, type echo $SHELL.

If the reply is /bin/bash, great: nothing needs to be done. If the reply from the computer is /bin/tcsh you are in the tc shell. This is not what we want.

To change the shell into which you log in using Terminal, open Terminal > Preferences, and click the radio button next to "Execute this command (specify complete path)", and enter /bin/bash in the text box below.

Close the Preferences window and restart Terminal.

The BaNG package installer will have copied to your home directory two files, .bash_profile and .bashrc (note the leading dots). These should tell Terminal to log you in under the bash shell with all the necessary paths for running the PartiGene suite (and BLAST, phrap, phred and consed too).

NOTE: If you already run the bash shell, and have existing .bashrc and .bash_profile files, the prepare.pl script will have renamed these to .bashrc_old and .bash_profile_old. You can then use a text editor to add back your customised .bashrc and .bash_profile entries.
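
The shell check described above can be wrapped in a small function. This is a sketch only: the name is_bash_shell is hypothetical, and it does nothing more than inspect the path you give it (or $SHELL if you give none).

```shell
# Report whether a shell path names bash.
# $1 is the path to test; it defaults to the current value of $SHELL.
is_bash_shell() {
    case "${1:-$SHELL}" in
        */bash)
            echo "bash: nothing to change"
            ;;
        *)
            echo "not bash: change the Terminal login shell as described above"
            return 1
            ;;
    esac
}
```

Typing is_bash_shell with no argument reports on the shell you are logged in under.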

 


III: perl and Xcode tools for your Mac

perl

PartiGene is written in the programming language "perl". While Darwin comes with a pretty complete installation of perl functions, PartiGene uses a number of (relatively standard) "modules" which need to be added to the installed perl on your computer. One major set is “BioPerl”, a massive collection of pre-written routines for all things bioinformatic.

Installing add-on modules to perl can be problematic (and worrying) as the add-ons often rely on other add-ons, which rely on yet additional ones. These reliances are called dependencies, and making sure that all the dependencies of a new perl module are dealt with sensibly can be tricky. For this reason we recommend that you use fink, an open-source project for Mac OSX that does two things: (1) it gives you a “standard”-style unix environment wherein all sorts of new programmes can be installed to run quite happily on MacOSX and (2) it has an easy-to-use graphical front end for installing new add-ons, packages and programmes, that deals with the issue of dependencies seamlessly.

The perl installation on Mac OSX 10.4 is version 5.8.6, and this version is the most recent. FinkCommander (see below) has a “virtual package” in its list called perl586-core which is essentially an alias pointing to the Apple OSX system’s installation of perl. FinkCommander does not update this installation of perl but does allow you to add modules to it that extend its functionality.

Xcode

While OSX 10.4 is pretty fully featured, to compile and install new programmes (using fink or any other method) requires that you have installed on your Mac the latest version of the Apple Xcode tools. These are available for free from Apple’s developer site. Installation is painless.

Go to http://www.apple.com/macosx/features/xcode/ for key information about Xcode, and then to http://developer.apple.com/tools/xcode/ to download the latest version of the Xcode suite. You will have to register as an Apple Developer first (not in itself a bad thing), log in, and then follow the instructions on screen for downloading the latest Xcode (2.3 at the time of writing).

Once the file has downloaded to your hard disk (it is quite big), open the disk image, and launch the installer. Follow instructions on-screen.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
If you are installing on OSX versions prior to 10.4, you will need to check that you have the latest available version of perl. Note that you do not have to do this on OSX 10.4.

a       In FinkCommander, find what “perl” packages are available (Edit > Find > Package Search).

b       Select the latest version (perl 5.8.1 as of 20/06/2006, for example)

c       Install (Binary > Install)

Follow the on-screen instructions (selecting the default replies is usually correct here).

Most perl scripts start with #!/usr/bin/perl which will invoke Apple's /usr/bin/perl. If you have a more recent version of perl in your fink installation than in the OSX installation, you need to tell your scripts to look at the fink version.

One way of using this version of perl instead is to overwrite /usr/bin/perl using the following commands typed in to a terminal window (Note the use of sudo):

            sudo /bin/cp /sw/bin/perl5.8.1 /usr/bin/perl

If you later wish to revert to Apple's perl, you would type (for example):

            sudo /bin/cp /usr/bin/perl5.8.0 /usr/bin/perl
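
Before overwriting anything, it is worth confirming which of two perl versions really is the more recent. The following sketch compares two dotted version numbers in plain shell. The function name version_at_least is hypothetical, and it assumes the version strings contain only numbers and dots.

```shell
# Succeed if dotted version $1 is greater than or equal to dotted version $2.
# Fields are compared numerically from left to right; a missing field counts as 0.
version_at_least() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading field of each version
        fa=${a%%.*}
        fb=${b%%.*}
        [ -n "$fa" ] || fa=0
        [ -n "$fb" ] || fb=0
        if [ "$fa" -gt "$fb" ]; then return 0; fi
        if [ "$fa" -lt "$fb" ]; then return 1; fi
        # Drop the field just compared (and its dot, if there is one)
        case "$a" in *.*) a=${a#*.} ;; *) a= ;; esac
        case "$b" in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# Example with literal version numbers taken from this guide
if version_at_least 5.8.6 5.8.1; then
    echo "5.8.6 is at least as recent as 5.8.1"
fi
```

To compare real installations, the version of any perl can be printed with perl -e 'printf "%vd\n", $^V' (substituting the full path of the perl you want to check).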

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


IV: Install Fink

Fink is a community system for providing Darwin-ready UNIX packages for your Mac. It has a vast array of programmes available at the click of a mouse, and makes installing them very easy.

a       Download Fink from

         http://fink.sourceforge.net/download/Index.php

         (This page also has instructions and troubleshooting for Fink.)

b       Install Fink from this package (instructions are on-screen).

c       Install FinkCommander (see http://finkcommander.sourceforge.net/). Move the FinkCommander application to your Applications directory (just drag and drop the directory from the package to your Applications directory).

d       Launch FinkCommander and follow any instructions about updating FinkCommander. Quit FinkCommander, and perform the update as instructed. Once this is complete, launch FinkCommander again.

e       Use FinkCommander to ensure that you have the latest versions of all the packages in fink (Source -> Selfupdate). Follow the instructions that pop up from time to time: responding with an “accept default” is usually correct (except for the regional location parts).

f        Now run FinkCommander’s “update all” command (Source -> UpdateAll). Follow the instructions that pop up from time to time: responding with an “accept default” is usually correct (except for the regional location parts).

 
V: Install the relational database management programme PostgreSQL

While it is possible to install postgresql using the Fink system (see below), we recommend you install the very latest version of postgresql “manually” by following the instructions below. These instructions are reproduced and slightly adjusted from the instructions available at

            http://developer.apple.com/internet/opensource/postgres.html

a       Using the System Preferences programme, create a new user with the name 'PostgreSQL Administrator’ [or similar], the short name pgadmin, and 'Allow User to Administer this computer' selected (Apple menu > System Preferences > Accounts > New User). Note: Do NOT use the password 'postgres' for this account.

b       Download the latest version of PostgreSQL from

         http://www.postgresql.org/download/

         Navigate to the link called downloads (FTP...) and select the postgresql-8.0.3.tar.gz (or later version) file. Download it to your user area of the disc.

c       Type sudo sh (You may have to enter your password; your prompt will now look like “sh-2.05b#”)

d       Copy the downloaded file to /usr/local/. Type

         cp whereverthefileis/postgresql-8.0.3.tar.gz /usr/local/

e       Type cd /usr/local/

f        Type tar -xzvf postgresql-8.0.3.tar.gz

g       Type cd postgresql-8.0.3

h       Type ./configure --with-includes=/sw/include/ --with-libraries=/sw/lib (a lot of text will stream past the terminal window)

i        Type make (a lot of text will stream past the terminal window)

j        Type make install  (an awful lot of text will stream past the terminal window)

k       Type mkdir /usr/local/pgsql/data

l        Type chown -R pgadmin /usr/local/pgsql/data

m      Type

         exit

         su -l pgadmin

n       Type

         cp /Users/your_username/.bash_profile /Users/pgadmin/.bash_profile

         cp /Users/your_username/.bashrc /Users/pgadmin/.bashrc

         (Where your_username is your username, and pgadmin is the short name of the account created in step a. If, when running prepare.pl, you chose not to install the recommended .bash_profile, you will have to copy the bash_profile from the BaNG_Package directory)

o       Type source ~/.bash_profile

p       Type initdb -D /usr/local/pgsql/data


The system’s response in the terminal should look something like:

The files belonging to this database system will be owned by user "pgadmin".

This user must also own the server process.

The database cluster will be initialized with locale C.

fixing permissions on existing directory /usr/local/pgsql/data ... ok

creating directory /usr/local/pgsql/data/global ... ok

creating directory /usr/local/pgsql/data/pg_xlog ... ok

creating directory /usr/local/pgsql/data/pg_xlog/archive_status ... ok

creating directory /usr/local/pgsql/data/pg_clog ... ok

creating directory /usr/local/pgsql/data/pg_subtrans ... ok

creating directory /usr/local/pgsql/data/base ... ok

...[snip] ...

vacuuming database template1 ... ok

copying template1 to template0 ... ok

WARNING: enabling "trust" authentication for local connections

You can change this by editing pg_hba.conf or using the -A option the

next time you run initdb.

         (NOTE: If when running the prepare.pl script you chose not to install the recommended .bash_profile then you will have to add /usr/local/pgsql/bin to your path)

q       To start the PostgreSQL server type

         pg_ctl -D /usr/local/pgsql/data -l logfile start

r        Type createuser

         At the prompt enter your username

         Give yourself permission to create databases (and other users if you want).

s       Type exit to exit from the pgadmin user, and exit again to exit from the shell user.

t        To test the PostgreSQL server type psql -l

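         You should see something like the following (the Owner column shows the account that ran initdb, so it will read pgadmin if you followed the steps above):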

          List of databases

        Name    |  Owner   | Encoding 

     -----------+----------+-----------

      template0 | postgres | SQL_ASCII

      template1 | postgres | SQL_ASCII

     (2 rows)

         Now type createdb test, and then psql -l

         You should see the following change:

          List of databases

        Name    |  Owner   | Encoding 

     -----------+----------+-----------

      template0 | postgres | SQL_ASCII

      template1 | postgres | SQL_ASCII

      test      | postgres | SQL_ASCII

     (3 rows)
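
Several of the steps above assume that the PostgreSQL programmes can be found on your path. If a command is reported as "not found", a quick check along the following lines may help. This is a sketch only; check_on_path is a hypothetical helper name.

```shell
# Report which of the named programmes can be found on the current PATH.
# Returns 0 if all were found and 1 if any were missing.
check_on_path() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return $missing
}
```

For the steps in this section you might type check_on_path initdb pg_ctl createuser createdb psql. Any programme reported as missing suggests that /usr/local/pgsql/bin has not been added to your path.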


VI: Use FinkCommander to install BioPerl

a       In FinkCommander, find the BioPerl package (Edit > Find > Package Search).

b       Select the latest version (bioperl-pm586 at this time, for example)

c       Install (Binary > Install)

Follow the on-screen instructions (selecting the default replies is usually correct here). This installation has a lot of what are known as “dependencies” - other programmes that BioPerl needs to work with. Fink downloads these automatically (over 30 different programmes).

VII: Use FinkCommander to install the database interaction module DBI.pm

The DBI interface allows perl to “talk to” relational databases. We need the dbi package and the specific part that talks to postgresql (dbd-pg-pm).

First, dbi

a       In FinkCommander, search for dbi (Edit > Find > Package Search).

b       Select the latest version (dbi-pm586 for example if you are working in perl5.8.6)

c       Install dbi (Binary > Install)

Then dbd

d       In FinkCommander, search for dbd (Edit > Find > Package Search).

e       Select the latest version (dbd-pg-unified-pm586 for example if you are working in perl5.8.6)

f        Install dbd (Binary > Install). A truly vast amount of text will scroll past. Have a break once you have answered the initial questions from FinkCommander.

VIII: Use FinkCommander to install wget

wget is a simple interface for retrieving data from the internet.

a       In FinkCommander, search for "wget" (Edit > Find > Package Search)

b       Select "wget"

c       Install wget (Binary > Install)

 


IX: Install ReadLine and the GNU ReadLine module for perl

For reasons that are not clear to us, we have needed to install the perl module GNU ReadLine independently from FinkCommander.

a       In FinkCommander, search for "readline" (Edit > Find > Package Search)

b       Select the most current version of "readline"

c       Install readline (Binary menu > Install) if it is not already “current”.

d       Using a web browser, download from http://search.cpan.org/dist/Term-ReadLine-Gnu/ the current version of Term-ReadLine-Gnu [current version at 20/06/2006 is 1.16]. Decompress this archive by double clicking on it in the Mac Finder (Stuffit should launch and generate a folder called “Term-ReadLine-Gnu-1.16”). Move this folder to the BaNG Package directory (either using a Terminal window and the mv command, or by drag and drop in the Finder)

e       Open Terminal (or equivalent), navigate using cd to the Term-ReadLine-Gnu-1.16 folder, and type

         perl Makefile.PL --libdir=/sw/lib/ --includedir=/sw/include/

f        Type make

g       Type make test (a test mode - you should see lots of “ok”s):

PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t

t/callback....ok                   4/7 skipped: since Tk is not installed.

t/history.....ok                                                         

t/readline....ok        Try `/usr/bin/perl -Mblib t/readline.t verbose', if you will.

t/readline....ok                                                          

All tests successful, 4 subtests skipped.

Files=3, Tests=193,  1 wallclock secs ( 0.17 cusr +  0.08 csys =  0.25 CPU)

h       Type sudo make install
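
Once the perl modules are installed, one way to confirm that perl can actually load them is to ask it to load each in turn. This is a sketch only: perl_has_module is a hypothetical helper name, and the module names suggested below are our reading of the packages installed in the preceding sections.

```shell
# Report whether a perl module can be loaded.
# $1 is the module name; $2 is the perl to use (defaults to the first perl on PATH).
perl_has_module() {
    perl_bin="${2:-perl}"
    # -M loads the module; -e 1 runs a trivial programme that does nothing
    if "$perl_bin" -M"$1" -e 1 >/dev/null 2>&1; then
        echo "ok:      $1"
    else
        echo "missing: $1"
        return 1
    fi
}
```

For example, typing perl_has_module DBI, perl_has_module DBD::Pg, perl_has_module Bio::Seq, perl_has_module IO::String and perl_has_module Term::ReadLine::Gnu in turn should each reply with "ok". A reply of "missing" suggests that the corresponding installation step should be repeated.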


 What is next?

The first time you run PartiGene.pl, please go through the setup and configuration steps.

In particular, you will want to set your base blast database directory to

/usr/local/blastdb

and your vector.seq file location to

/usr/local/BaNG/lib/

As part of the setup, we installed a set of files that tell your computer to start up postgresql when the computer is started up, so you don't have to log in as the PostgreSQL administrator each time. These files are in /System/Library/StartupItems/PostgreSQL. If you want to stop postgres from starting up, just remove them (as root, or using sudo).

            What has yet to be done

            * migrate the PartiGene annot8r modules to Mac OS X Darwin

            * migrate the webPartiGene interface building module to Mac OS X Darwin

                        (this involves installing, in addition, php and the GD libraries)

            * migrate the prot4EST program to Mac OS X Darwin

            These jobs are STILL “on our to do list”.

Version history for this document

version 3: update by MB Edinburgh June 2006

versions 1 and 2: original versions by MB TB and RS September 2005

The content of these pages is copyright Mark Blaxter and colleagues. Contact the webmaster if there are problems.