
BaNG PartiGene for OSX
version 3.0.2 (includes PartiGene v3.0.2 and trace2dbest v 3.0.1)
25/07/2006
Installing Blaxter Labs Neglected Genomics software packages on Mac OSX
in a few Short Steps (and several more longer substeps)
PartiGene: Context
PartiGene version 3 is a suite of
software tools that process sequences (typically ESTs) to produce a
"partial genome" of clustered sequences and associated annotations.
It produces outputs that are imported into a relational database, which is then
available for querying via the world wide web.
The concepts behind the software are
(a) a one-stop solution to analysis of partial genome datasets
(b) use of "industry standard" external tools where appropriate
(c) GNU licence software provision
(d) ease of use by non-experts
The software was
written to run on the UNIX-like LINUX operating system, and has been
extensively tested on various "flavours" of LINUX. This installation
offers a version of PartiGene customised for the Darwin UNIX-like operating
system that underpins the Mac OSX system.
As Darwin does not come quite ready for PartiGene, installation involves
downloading and configuring a number of external tools. This in turn involves
* use of the terminal and command-line interaction with the Mac OSX system
* use of the "sudo" system of root user access to the Mac OSX system
If you are unfamiliar with these, do
not worry: it is all quite simple, and if you follow the steps below, it should
be quite painless.
If you have already installed some
of the stand-alone software we use (eg BLAST, phred, phrap) you will have to
customise the paths to these programmes to suit your system.
We have tested the install on a
number of Macintosh computers, running OS X 10.3 and OS X 10.4 (ie G4 and G5
processors).
Please do contact us at nematodebioinf@ed.ac.uk if you
have any problems, stating what sort of machine you are using, which version of
OS X, and also pasting in any error messages you are given...
Please see the read-mes, how-tos
and User Guides for PartiGene and trace2dbest for instructions on the use of
the software.
Have fun
Tam Blaxter and Mark Blaxter, BaNG 09/2005 with
help from Ralf Schmid
Updated MB 06/2006
Notes on the supplied installation of
PartiGene
These installation notes were developed on a Power Mac G5, with dual
1.2GHz CPUs, and 2 GB of memory. It was running Mac OS X 10.4.6. The versions
of the software used were:
FinkCommander v0.5.4;
BLAST v2.2.11;
PostgreSQL v8.0.3;
PartiGene v3.0;
and trace2dbEST v3.0.
We have not yet tested installation of the other components of our
package prot4EST and annot8r; these
will follow in due course. Keep an eye on what is happening by going to
www.nematodes.org and joining our user email list at nematodebioinf@ed.ac.uk. This is also the place to go to
report bugs and problems. We will try to help, but please be patient.
We should point out that
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License Version 2
# as published by the Free Software Foundation.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# See the GNU General Public License for more details.
Before you start: Academic Licence for phred, phrap and consed
You need to get an academic licence (free) for phred, phrap and consed
from the developers of these wonderful programmes. Go to http://www.phrap.org, fill in the application
form, and email it off. The developers will reply (relatively soon) with a link
for you for the downloadable versions of their tools.
Before you start: Root access to your Mac
You will need to be an administrative user for your Mac, and to know the
administrative password. Typing "sudo"
before a command in the Terminal tells the system you want to carry out the
command "as if" you were the root user. Please take care!
Sources of external programmes
This is a guide to installing other people's software on your OS X Mac so
that our software (the PartiGene suite) will run. We have provided in the
distribution a set of source files that work, but these may be updated from
time to time, and it is a good idea to check that you are using the most
up-to-date version.
phred and phrap
Please note you MUST get a licence for phred and phrap to use these programmes: it is free
and keys you in to the announcements of upgrades and bug fixes.
Go to http://www.phrap.org
BLAST
BLAST for Mac OSX (standalone BLAST) from http://www.ncbi.nlm.nih.gov/BLAST/download.shtml
FOR USERS WHO MAY HAVE INSTALLED BLAST PREVIOUSLY.
Different versions of BLAST perform subtly
differently. These differences can leave PartiGene unable to parse the
BLAST output files correctly. We recommend that you remove any older versions
of BLAST on your hard disc, and reinstall the latest release from NCBI (see
below). If you have more than one set of BLAST executables on your system,
things may fail inscrutably.
The recommended place to install BLAST is in
its own directory in /usr/local (called, boringly, blast), and to have the
executable files available in /usr/local/blast/bin. We have put blast version
2.2.11 in this distribution but you should check with the NCBI web site for
updates.
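Since a second set of BLAST executables can cause the failures described above, it may be worth listing every copy that your path can see. The following is a minimal sketch and not part of the BaNG distribution: the function name all_on_path is hypothetical, and blastall is only assumed to be the name of the main executable in your BLAST release (substitute whichever BLAST programme you actually have).

```shell
# Print every executable called $1 found in the directories listed in PATH.
# More than one line of output means more than one copy is installed.
all_on_path() {
    name="$1"
    old_ifs="$IFS"
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
    IFS="$old_ifs"
}

# blastall is an assumed executable name; change it to suit your install.
all_on_path blastall
```

If this prints more than one line, remove the older copies (or take them off your path) before running PartiGene.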
fink
fink and FinkCommander programme from http://fink.sourceforge.net/download/Index.php
(fink helps you to install:
IO-String Perl Module from http://search.cpan.org/~gaas/IO-String-1.06/String.pm
IO-Scalar Perl Module from http://search.cpan.org/~dskoll/IO-stringy-2.110
Mailer Perl Module from http://search.cpan.org/~markov/MailTools-1.67
BioPerl Perl Module from http://search.cpan.org/~craffi/Bundle-BioPerl/)
ReadLine.pm
ReadLine Perl Module from http://search.cpan.org/dist/Term-ReadLine-Gnu/
PostgreSQL
PostgreSQL programme from http://www.postgresql.org/download/
Instructions
Text in purple below is what you should type into the Terminal window at the prompt, and green indicates web addresses. Text in courier is what should appear on the
screen.
Text of the form (Source -> Selfupdate) means select
the option “Selfupdate” from the “Source” menu bar item.
All the commands we suggest you
type are entered into a Terminal window. The Terminal application is in /Applications/Utilities. (You can also
use the X11 application if it is installed.) If you are new to UNIX-like
operating systems, we suggest you follow at least the basics of one of the many
on-line tutorials to learn the core commands used.
Please read through the instructions below BEFORE you start so you
have an overview of the process and so you can prepare any special local
customisations you may need.
Your computer needs to be connected to the
internet. This process will probably take a couple of hours, a lot of which is
“coffee time” (waiting for installations to complete), so stay calm.
I: The BaNG Package
Copy the package "BaNG_Package.tar.gz" to your hard disc (it
is available from http://www.nematodes.org/bioinformatics/).
Either double-click on it to launch the MacOS decompressor, or in a
Terminal window, type
gunzip BaNG_Package.tar.gz
and when this is finished, type
tar -xf BaNG_Package.tar
Then type
cd BaNG_Package
Listing the contents of the BaNG_Package directory (type the command ls -l ./ ) should show something similar to
the following:
drwxr-xr-x  11 yourname admin   374 Sep 22 17:13 BaNG_prepare
drwxrwxrwx   4 yourname admin   136 Sep 21 23:08 CLOBB
drwxrwxrwx  12 yourname admin   408 Sep 22 17:14 Notes
drwxrwxrwx   5 yourname admin   170 Sep 21 23:09 PartiGene
drwxrwxrwx   5 yourname admin   170 Sep 21 23:06 annot8r
drwxrwxrwx   6 yourname admin   204 Sep 21 23:07 blast
-rwxrwxrwx   1 yourname admin  4233 Sep 22 17:14 prepare.pl
drwxrwxrwx   4 yourname admin   136 Sep 21 23:12 prot4EST
-rwxrwxrwx   1 yourname admin 48300 Sep 22 15:05 readmeplease_v2.rtf
drwxrwxrwx   5 yourname admin   170 Sep 21 23:13 trace2dbEST
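The decompress and unpack commands can also be run as a single step. This is a minimal sketch, assuming that the tar on your system accepts the -z (gzip) option; the function name unpack_archive is hypothetical and is not part of the BaNG package.

```shell
# Decompress and unpack a .tar.gz archive in one step.
unpack_archive() {
    archive="$1"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # -x extract, -z decompress with gzip, -f read from the named file
    tar -xzf "$archive"
}

# Example use (run from the directory holding the download):
#   unpack_archive BaNG_Package.tar.gz && cd BaNG_Package
```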
II: Install the basic setup from the BaNG Package distribution
a The “prepare” script
Type
./prepare.pl prepare_dirlist.txt prepare_sourcelist.txt prepare_commandlist.txt
(You may have to enter your password.)
Text will scroll past your terminal "screen". You may see
reports such as "blank source, copy line skipped" and "blank target,
make line skipped" - these can be ignored. You should see lots of
"succesfully"s though. The setup script
will close with the lines "Completed BaNG setup for user
<yourname>. have fun...", but you aren't finished yet.
(NOTE: prepare_sourcelist.txt can be edited to give different sources for
packages; packages must be given in the same order seen in prepare_dirlist.txt; you
must be within the BaNG_Package directory to run this command.)
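Because prepare.pl has to be run from inside the BaNG_Package directory, a quick check of where you are can save a confusing failure. This is a minimal sketch: the function name in_bang_package is hypothetical, and it only looks for two of the entries shown in the directory listing earlier (prepare.pl and BaNG_prepare).

```shell
# Succeed if directory $1 (default: the current directory) looks like
# the unpacked BaNG_Package directory.
in_bang_package() {
    dir="${1:-.}"
    [ -f "$dir/prepare.pl" ] && [ -d "$dir/BaNG_prepare" ]
}

if in_bang_package .; then
    echo "This looks like BaNG_Package: safe to run prepare.pl"
else
    echo "Not in BaNG_Package: cd there before running prepare.pl"
fi
```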
b Using the Terminal and the bash shell
The PartiGene system was written to work in the bash shell. If this
means nothing to you, it's probably a good thing, but have a look at, for
example,
http://www.macdevcenter.com/pub/a/mac/2004/02/24/bash.html
for some information on what shells are and why we might use bash. To
find out what shell you are in in the terminal, type
echo $SHELL
If the reply is /bin/bash, great: nothing needs to be done. If the reply
from the computer is /bin/tcsh you are in the tc shell. This is not what we want.
To change the shell into which you log in using Terminal, open Terminal
> Preferences, and click the radio button next to "Execute this command
(specify complete path)", and enter /bin/bash in the text box below.
Close the Preferences window and restart Terminal.
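The shell check can also be written as a small function that tells you what to do next. This is only an illustrative sketch: the function name check_login_shell and the wording of its messages are our own, not something the BaNG installer provides.

```shell
# Report whether the shell path given as $1 is bash, and if not,
# point to the Terminal preference that changes it.
check_login_shell() {
    case "$1" in
        */bash)
            echo "bash: nothing to change" ;;
        */tcsh|*/csh)
            echo "C shell: set /bin/bash in Terminal > Preferences" ;;
        *)
            echo "other shell ($1): set /bin/bash in Terminal > Preferences" ;;
    esac
}

check_login_shell "${SHELL:-unknown}"
```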
The BaNG package installer will have copied to your home directory two
files, .bash_profile and .bashrc (note the leading dots). These
should tell Terminal to log you in under the bash shell with all the necessary
paths for running the PartiGene suite (and BLAST, phrap, phred and consed too).
NOTE: If you already run the bash shell, and have existing .bashrc
and .bash_profile files, the prepare.pl script will have renamed these to
.bashrc_old and .bash_profile_old. You can then use a text editor to add back
your customised .bashrc and .bash_profile entries.
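To see which of your own entries need adding back, you can list the lines that were in the old file but are missing from the new one. This is a minimal sketch, assuming the renamed files are in your home directory as described above; the function name lines_only_in_old is hypothetical.

```shell
# Print the lines of file $1 (old) that do not appear anywhere in file $2 (new).
lines_only_in_old() {
    old="$1"
    new="$2"
    if [ ! -f "$old" ] || [ ! -f "$new" ]; then
        echo "need two existing files: $old $new" >&2
        return 1
    fi
    # -F fixed strings, -x whole-line matches, -v invert, -f patterns from file.
    # grep exits non-zero when nothing is printed, which is not an error here.
    grep -Fvx -f "$new" "$old" || true
}

# Example use:
#   lines_only_in_old ~/.bashrc_old ~/.bashrc
#   lines_only_in_old ~/.bash_profile_old ~/.bash_profile
```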
III: perl and Xcode tools for your Mac
perl
PartiGene is written in the programming language "perl". While Darwin comes with a
pretty complete installation of perl functions, PartiGene uses a number of
(relatively standard) "modules" which need to be added to the
installed perl on your computer. One major set is “BioPerl”, a massive
collection of pre-written routines for all things bioinformatic.
Installing add-on modules to perl can be problematic (and worrying) as the
add-ons often rely on other add-ons, which rely on yet additional ones. These
reliances are called dependencies, and making sure that all the dependencies of
a new perl module are dealt with sensibly can be tricky. For this reason we
recommend that you use fink, an open-source project for Mac OSX that does two things: (1) it gives
you a “standard”-style unix environment wherein all sorts of new programmes can
be installed to run quite happily on MacOSX and (2) it has an easy-to use
graphical front end for installing new add-ons, packages and programmes, that
deals with the issue of dependencies seamlessly.
The perl installation on Mac OSX 10.4 is version 5.8.6, and this version is the most recent.
FinkCommander (see below) has a “virtual package” in its list called perl586-core which is essentially an alias
pointing to the Apple OSX system’s installation of perl. FinkCommander does not update this
installation of perl but does allow you to add modules to it that extend its functionality.
Xcode
While OSX 10.4 is pretty fully featured, compiling and installing new
programmes (using fink or any other method) requires that the latest version
of the Apple Xcode tools is installed on your Mac. These are available for free from Apple’s
developer site. Installation is painless.
Go to http://www.apple.com/macosx/features/xcode/ for key information about Xcode, and then to http://developer.apple.com/tools/xcode/ to download the latest version of the Xcode suite. You will have to register as an Apple Developer
first (not in itself a bad thing), log in, and then follow the instructions on
screen for downloading the latest Xcode (2.3 at the time of writing).
Once the
file has downloaded to your hard disk (it is quite big), open the disk image,
and launch the installer. Follow instructions on-screen.
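After the Xcode installer has finished, you can confirm that the build tools are visible from the Terminal. This is a minimal sketch: the function name missing_tools is hypothetical, and gcc and make are only assumed to be representative of the command-line tools that Xcode provides.

```shell
# Print the name of each command in the argument list that is NOT on the path.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# No output means both tools were found.
missing_tools gcc make
```

The same function can be reused later for any other command you want to confirm is installed.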
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
If you are installing on OSX versions prior to 10.4, you will need to check
that you have the latest available version of perl. Note
that you do not have to do this on OSX 10.4.
a In FinkCommander, find what “perl” packages are available (Edit > Find > Package Search).
b Select the latest version (perl 5.8.1 as of 20/06/2006, for example)
c Install (Binary > Install)
Follow the on-screen instructions (selecting
the default replies is usually correct here).
Most perl scripts start with #!/usr/bin/perl
which will invoke Apple's /usr/bin/perl. If you have a more recent version of perl in your fink installation than
in the OSX installation, you need to tell your scripts to look at the fink version.
One way of using this version of perl instead
is to overwrite /usr/bin/perl using the following command typed in to a
terminal window (note the use of sudo):
sudo /bin/cp /sw/bin/perl5.8.1 /usr/bin/perl
If you later wish to revert to Apple's perl,
you would type (for example):
sudo /bin/cp /usr/bin/perl5.8.0 /usr/bin/perl
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
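If you do overwrite /usr/bin/perl as described in the box above, keeping a copy of the original first makes it easier to revert. This is a cautious sketch rather than part of the documented procedure: the function name replace_with_backup and the .orig suffix are our own choices.

```shell
# Copy file $1 over file $2, first saving the existing $2 as $2.orig
# (only if no .orig copy exists yet, so the first original is never lost).
replace_with_backup() {
    new="$1"
    target="$2"
    if [ ! -f "$new" ]; then
        echo "no such file: $new" >&2
        return 1
    fi
    if [ -f "$target" ] && [ ! -e "$target.orig" ]; then
        cp -p "$target" "$target.orig"
    fi
    cp "$new" "$target"
}

# Example use (needs a root shell, e.g. after typing sudo sh):
#   replace_with_backup /sw/bin/perl5.8.1 /usr/bin/perl
# To revert later:
#   cp /usr/bin/perl.orig /usr/bin/perl
```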
IV: Install Fink
Fink is a
community system for providing Darwin-ready UNIX packages for your Mac. It has
a vast array of programmes available at the click of a mouse, and makes
installing them very easy.
a Download Fink from
http://fink.sourceforge.net/download/Index.php
(This page also has instructions and troubleshooting for Fink.)
b Install Fink from this package (instructions are on-screen).
c Install FinkCommander (see http://finkcommander.sourceforge.net/). Move the FinkCommander application to your Applications directory
(just drag and drop the directory from the package to your Applications
directory).
d Launch FinkCommander and follow any instructions about updating FinkCommander. Quit
FinkCommander, and perform the update as instructed. Once this is complete,
launch FinkCommander again.
e Use FinkCommander to ensure that you have the latest versions of all the packages
in fink (Source -> Selfupdate). Follow the instructions that pop up from
time to time: responding with an “accept default” is usually correct (except
for the regional location parts).
f Now run FinkCommander’s “update all” command (Source -> UpdateAll). Follow the
instructions that pop up from time to time: responding with an “accept default”
is usually correct (except for the regional location parts).
V: Install the relational database management programme PostgreSQL
While it is possible to install
postgresql using the Fink system (see below), we recommend you install the very
latest version of postgresql “manually” by following the instructions below.
These instructions are reproduced and slightly adjusted from the instructions
available at
http://developer.apple.com/internet/opensource/postgres.html
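The copy, build and data-directory steps listed below (d to l) all name version 8.0.3. If you download a later version, every file and directory name changes with it. As an aid, here is a minimal sketch that only prints those commands for a given version so you can read them before typing anything; it runs nothing itself. The function name pg_build_steps is hypothetical, and the second argument stands for wherever you saved the download.

```shell
# Print (do not run) the copy, build and data-directory commands
# for PostgreSQL version $1, downloaded into directory $2.
pg_build_steps() {
    version="$1"
    download_dir="${2:-.}"
    cat <<EOF
cp $download_dir/postgresql-$version.tar.gz /usr/local/
cd /usr/local/
tar -xzvf postgresql-$version.tar.gz
cd postgresql-$version
./configure --with-includes=/sw/include/ --with-libraries=/sw/lib
make
make install
mkdir /usr/local/pgsql/data
chown -R pgadmin /usr/local/pgsql/data
EOF
}

pg_build_steps 8.0.3 whereverthefileis
```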
a Using the System Preferences programme, create a new user with the name 'PostgreSQL Administrator' [or
similar], the short name pgadmin, and 'Allow User to Administer this computer'
selected (Apple menu > System Preferences > Accounts > New User).
Note: Do NOT use the password 'postgres' for this account.
b Download the latest version of PostgreSQL from
http://www.postgresql.org/download/
Navigate to the link called downloads (FTP...)
and select the postgresql-8.0.3.tar.gz (or later version) file. Download it onto
your user area of the disc.
c Type sudo sh (You may have to enter your password; your prompt will now look like
“sh-2.05b#”)
d Copy the downloaded file to /usr/local/. Type
cp whereverthefileis/postgresql-8.0.3.tar.gz /usr/local/
e Type cd /usr/local/
f Type tar -xzvf postgresql-8.0.3.tar.gz
g Type cd postgresql-8.0.3
h Type ./configure --with-includes=/sw/include/ --with-libraries=/sw/lib (a lot of text will stream past
the terminal window)
i Type make (a lot of text will stream past
the terminal window)
j Type make install (an awful lot of text will stream past the terminal window)
k Type mkdir /usr/local/pgsql/data
l Type chown -R pgadmin /usr/local/pgsql/data
m Type
exit
su -l pgadmin
n Type
cp /Users/your_username/.bash_profile /Users/pgadmin/.bash_profile
cp /Users/your_username/.bashrc /Users/pgadmin/.bashrc
(Where your_username is your
username. If, when running prepare.pl, you chose not to install the recommended
.bash_profile, you will have to copy the bash_profile from the BaNG_Package
directory)
o Type source ~/.bash_profile
p Type initdb -D /usr/local/pgsql/data
The system’s response in the terminal should look something like:
The files belonging to this database system will be owned by user "pgadmin".
This user must also own the server process.
The database cluster will be initialized with locale C.
fixing permissions on existing directory /usr/local/pgsql/data ... ok
creating directory /usr/local/pgsql/data/global ... ok
creating directory /usr/local/pgsql/data/pg_xlog ... ok
creating directory /usr/local/pgsql/data/pg_xlog/archive_status ... ok
creating directory /usr/local/pgsql/data/pg_clog ... ok
creating directory /usr/local/pgsql/data/pg_subtrans ... ok
creating directory /usr/local/pgsql/data/base ... ok
...[snip] ...
vacuuming database template1 ... ok
copying template1 to template0 ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the -A option the
next time you run initdb.
(NOTE: If when running the prepare.pl script you chose not to install the recommended
.bash_profile then you will have to add /usr/local/pgsql/bin to your path)
q To start the PostgreSQL server type
pg_ctl -D /usr/local/pgsql/data -l logfile start
r Type createuser
At the prompt enter your username.
Give yourself permission to create databases (and other users if you want).
s Type exit to exit from the pgadmin user,
and exit again to exit from the shell user.
t To test the PostgreSQL server type psql -l
You should see the following:
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+-----------
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
(2 rows)
Now type createdb test, and then psql -l
You should see the following change:
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+-----------
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
 test      | postgres | SQL_ASCII
(3 rows)
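Rather than reading the table by eye, you can ask whether a named database appears in the psql -l listing. This is a minimal sketch, assuming the listing has the column layout shown above (database name in the first column, columns separated by "|"); the function name db_listed is hypothetical.

```shell
# Read "psql -l" output on standard input and succeed if a database
# called $1 appears in the first (Name) column.
db_listed() {
    awk -F'|' -v name="$1" '
        # Strip leading and trailing blanks from the first column.
        { gsub(/^[ \t]+|[ \t]+$/, "", $1) }
        $1 == name { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example use:
#   psql -l | db_listed test && echo "test database exists"
```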
VI: Use FinkCommander to install BioPerl
a In FinkCommander, find the BioPerl package (Edit > Find > Package Search).
b Select the latest version (bioperl-pm586 at this time, for example)
c Install (Binary > Install)
Follow the on-screen instructions (selecting the default replies is
usually correct here). This installation has a lot of what are known as
“dependencies” - other programmes that BioPerl needs to work with. Fink
downloads these automatically (over 30 different programmes).
VII: Use FinkCommander to install the database interaction module DBI.pm
The DBI interface allows perl to
“talk to” relational databases. We need the dbi package and the specific part to talk to postgresql (dbd-pg-pm).
First, dbi
a In FinkCommander, search for dbi (Edit > Find > Package Search).
b Select the latest version (dbi-pm586 for example if you are working in perl5.8.6)
c Install dbi (Binary > Install)
Then dbd
d In FinkCommander, search for dbd (Edit > Find > Package Search).
e Select the latest version (dbd-pg-unified-pm586 for example if
you are working in perl5.8.6)
f Install dbd (Binary > Install). A truly vast amount of text will scroll past. Have a break once
you have answered the initial questions from FinkCommander.
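Once the BioPerl, dbi and dbd installs have finished, you can ask perl directly whether it can load the new modules. This is a minimal sketch: the function name perl_module_ok is hypothetical, and the module names Bio::Perl, DBI, DBD::Pg and IO::String are our assumption of what the Fink packages provide. It tests whichever perl comes first on your path.

```shell
# Succeed if the perl on the path can load the module named in $1.
perl_module_ok() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Module names are assumptions about what the Fink packages install.
for module in Bio::Perl DBI DBD::Pg IO::String; do
    if perl_module_ok "$module"; then
        echo "$module: found"
    else
        echo "$module: NOT found"
    fi
done
```

A "NOT found" line suggests that the matching package did not install, or that it was installed for a different perl from the one being tested.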
VIII: Use FinkCommander to install wget
wget is a simple interface for retrieving data from the internet.
a In FinkCommander, search for "wget" (Edit > Find > Package Search)
b Select "wget"
c Install wget (Binary > Install)
IX: Install ReadLine and the GNU ReadLine module for perl
For reasons that are not clear to us, we have needed to install the perl
module GNU ReadLine independently from FinkCommander.
a In FinkCommander, search for "readline" (Edit > Find > Package Search)
b Select the most current version of "readline"
c Install readline (Binary menu > Install) if it is not already “current”.
d Using a web browser, download from http://search.cpan.org/dist/Term-ReadLine-Gnu/ the current version of Term-ReadLine-Gnu [current version
at 20/06/2006 is 1.16]. Decompress this archive by double clicking on it in the
Mac Finder (Stuffit should launch and generate a folder called
“Term-ReadLine-Gnu-1.16”). Move this folder to the BaNG Package directory
(either using a Terminal window and the mv command, or by drag and drop in the
Finder).
e Open Terminal (or equivalent), navigate
using cd to the Term-ReadLine-Gnu-1.16 folder, and type
perl Makefile.PL --libdir=/sw/lib/ --includedir=/sw/include/
f Type make
g Type make test (a test mode - you should see lots of “ok”s):
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/callback....ok 4/7 skipped: since Tk is not installed.
t/history.....ok
t/readline....ok Try `/usr/bin/perl -Mblib t/readline.t verbose', if you will.
t/readline....ok
All tests successful, 4 subtests skipped.
Files=3, Tests=193, 1 wallclock secs ( 0.17 cusr + 0.08 csys = 0.25 CPU)
h Type sudo make install
What is next?
The first time you run
PartiGene.pl, please go through the setup and configuration steps.
In particular, you will want to set
your base blast database directory to
/usr/local/blastdb
and your vector.seq file location to
/usr/local/BaNG/lib/
As part of the setup, we installed
a set of files that tell your computer to start up postgresql when the computer is started up, so you don't have
to log in as postgres each time. These files are in /System/Library/StartupItems/PostgreSQL. If you want to stop
postgres from starting up, just remove them (as root, or using sudo).
What has yet to be done
* migrate the PartiGene annot8r modules to Mac OS X Darwin
* migrate the webPartiGene interface building module to Mac OS X Darwin
(this involves installing, in addition, php and the GD libraries)
* migrate the prot4EST program to Mac OS X Darwin
These jobs are STILL “on our to do list”.
Version history for this document
version 3: update by MB Edinburgh June 2006
versions 1 and 2: original versions by MB TB and RS September 2005
