Multiple Sequence Alignment -- Online Resources

General Resources:

PBIL's Tools for Multiple Alignments: http://pbil.univ-lyon1.fr/alignment.html

-- An extensive list of multi-alignment resources, including lists of multiple alignment servers, software, alignment editors, etc. Appears to be updated more regularly than the VSNS page below.

Multiple Alignment Resource WWW Page (VSNS BioComputing Division): http://www.techfak.uni-bielefeld.de/bcd/Curric/MulAli/welcome.html

-- Another extensive list of multiple alignment resources, including on-line tutorials.

Multiple Alignment Servers and Software Packages:

MSA: Lipman DJ, Altschul SF, & Kececioglu JD (1989) PNAS 86:4412-4415. Gupta SK, Kececioglu JD, Schaffer AA (1995) J. Comput. Biol. 2:459-472. (http://www.psc.edu/general/software/packages/msa/manual/manual.html)

PIMA: Smith RF & Smith TF (1992) Protein Engng 5:35-41. (Available from the BCM Search Launcher; see below).

Clustal-W: Thompson JD, Higgins DG, & Gibson TJ (1994) Nucleic Acids Res. 22:4673-4680.
ClustalW WWW Server at EBI: http://www.ebi.ac.uk/clustalw
Clustal-W, Clustal-X software packages (Most platforms): ftp://ftp-igbmc.u-strasbg.fr/pub

MAP: Huang X (1994) CABIOS 10:227-235 (ftp://cs.mtu.edu/pub/huang)

Block Maker: Henikoff S, Henikoff JG, Alford WA, Pietrokovski S (1995) Gene-COMBIS, Gene 163, GC 17-26. (http://blocks.fhcrc.org/blocks)

[MSA, PIMA, Clustal-W, MAP, and Block Maker can be run from the BCM Search Launcher WWW Pages: http://searchlauncher.bcm.tmc.edu]

PRRP/PRRN: Gotoh O (1996). Significant improvements in accuracy of multiple protein sequence alignments by iterative refinements as assessed by reference to structural alignments. J. Mol. Biol. 264:823-838. (http://prrn.ims.u-tokyo.ac.jp/)

DCA: Stoye J (1998). Multiple sequence alignment with the Divide-and-Conquer method. Gene 211:GC45-56 (http://bibiserv.techfak.uni-bielefeld.de/dca/submission.html)

ITERALIGN: Brocchieri L & Karlin S (1998). A symmetric-iterated multiple alignment of protein sequences. J. Mol. Biol. 276:249-264 (http://giotto.stanford.edu/~luciano/iteralign.html)

SAGA: Notredame C, Holm L, Higgins DG (1998). COFFEE: A New Objective Function For Multiple Sequence Alignment. Bioinformatics 14:407-422 (http://igs-server.cnrs-mrs.fr/~cnotred/Projects_home_page/saga_home_page.html)

T-COFFEE: Notredame C, Higgins D, Heringa J (2000). T-Coffee: A novel method for multiple sequence alignments. J. Mol. Biol. 302:205-217. (http://igs-server.cnrs-mrs.fr/~cnotred/Projects_home_page/t_coffee_home_page.html).

SAM-T99: Karplus K, Hu B (2001). Evaluation of protein multiple alignments by SAM-T99 using BALIBASE multiple alignment test set. Bioinformatics 17:713-720. (http://www.cse.ucsc.edu/research/compbio/HMM-apps/T99-query.html).

PCMA: Pei J, Sadreyev R, Grishin NV (2003). PCMA: fast and accurate multiple sequence alignment based on profile consistency. Bioinformatics 19:427-428. (ftp://iole.swmed.edu/pub/PCMA/)

ProAlign: Loytynoja A, Milinkovitch MC (2003). A hidden Markov model for progressive multiple alignment. Bioinformatics 19:1505-1513. (http://evol-linux1.ulb.ac.be/ueg/ProAlign/)

MAVID: Bray N, Pachter L (2004). MAVID: Constrained ancestral alignment of multiple sequences. Genome Research 14:693-699. (http://baboon.math.berkeley.edu/mavid)

MUSCLE: Edgar RC (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32:1792-97. (Home Page: http://www.drive5.com/muscle/; Web Server: http://bpg.berkeley.edu/cgi-bin/muscle/input_muscle.py; Use an alignment editor, e.g., Jalview, to view alignment)

Align-m: Van Walle I, Lasters I, Wyns L (2004). Align-m -- a new algorithm for multiple alignment of highly divergent sequences. Bioinformatics 20:1428-1435. (binaries: http://bioinformatics.vub.ac.be/software/software.html)

ABA: Raphael B, Zhi D, Tang H, Pevzner P (2004). A novel method for multiple alignment of sequences with repeated and shuffled elements. Genome Res. 14:2336-46. (Linux binary: http://nbcr.sdsc.edu/euler)

POA: Grasso C, Lee C (2004). Combining partial order alignment and progressive sequence alignment increases alignment speed and scalability to very large alignment problems. Bioinformatics 20:1546-56. (Note: the "POA Online" server appears to be using the older 2002 version of POA: http://www.bioinformatics.ucla.edu/poa/)

DIALIGN-T: Subramanian AR, Weyer-Menkhoff J, Kaufmann M, Morgenstern B (2005). DIALIGN-T: an improved algorithm for segment-based multiple sequence alignments. BMC Bioinformatics 6:66. (http://dialign-t.gobics.de)

PRALINE-psi: Simossis VA, Kleinjung J, Heringa J (2005). Homology-extended sequence alignment. Nucleic Acids Res. 33:816-24. (http://ibivu.cs.vu.nl/programs/pralinewww/)

MAFFT-5: Katoh K, Kuma K, Toh H, Miyata T (2005). MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33:511-518. (binaries & source: http://www.biophys.kyoto-u.ac.jp/~katoh/programs/align/mafft/)

ProbCons: Do CB, Mahabhashyam MS, Brudno M, Batzoglou S (2005). ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Res. 15:330-40. (http://probcons.stanford.edu)

SPEM: Zhou H, Zhou Y (2005). SPEM: improving multiple sequence alignments with sequence profiles and predicted secondary structures. Bioinformatics 21:3615-21. (Note: the server appears to be down at this time: http://theory.med.buffalo.edu/Softwares-Services_files/software-services.htm)

Multiple Alignment Editors/Viewers/Printing Utilities:

Pfaat: Johnson JM, Mason K, Moallemi C, Xi H, Somaroo S, Huang ES (2003). Protein family annotation in a multiple alignment viewer. Bioinformatics 19:544-545. (http://www.pfizerdtc.com)

QAlign: Sammeth M, Rothganger J, Esser W, Albert J, Stoye J, Harmsen D (2003). QAlign: quality-based multiple alignments with dynamic phylogenetic analysis. Bioinformatics 19:1592-1593. (http://gi.cebitec.uni-bielefeld.de/qalign). Note: This package provides a graphical user interface for a number of multiple alignment programs, including CLUSTALW, DCA, DIALIGN, and T-COFFEE.

Jalview: Clamp M, Cuff J, Searle SM, Barton GJ (2004). The Jalview alignment editor. Bioinformatics 20:426-7. (http://www.jalview.org/index.html).

JAE (Jemboss Alignment Editor): Carver TJ, Mullan LJ (2005). JAE: Jemboss Alignment Editor. Appl. Bioinformatics 4:151-4. (http://emboss.sourceforge.net/Jemboss/).

Also see the list on the PBIL's Tools Page, above.

Misc. Tools To Aid Phylogenetic Analysis:

PhyloBlast: Brinkman F, Wan I, Hancock R, Rose A, Jones S (2001). PhyloBLAST: facilitating phylogenetic analysis of BLAST results. Bioinformatics 17:385-387. (http://www.pathogenomics.bc.ca/phyloBLAST/)

RevTrans: Wernersson R, Pedersen A (2003). RevTrans: multiple alignment of coding DNA from aligned amino acid sequences. Nucleic Acids Research 31:3537-3539. (http://www.cbs.dtu.dk/services/RevTrans/)

RiPE (Retrieval-induced Phylogeny Environment): Fullen G, Spitzer M, Cullen P, Lorkowski S (2003). BLASTing proteomes, yielding phylogenies. In Silico Biology 3:May 24. (http://ifg-izkf.uni-muenster.de/Bioinformatik/veroeffentlichungen/ripe)


CSHL Computational Genomics Course, Nov 2-8, 2005
Randall F. Smith, Bioinformatics, GlaxoSmithKline R&D