Sample sequence datasets from McClure et al. 1994 (see reference below):
Exercise:
Using the kin10.fasta sequence set above, rank the performance of several of the alignment programs listed on the course "Multiple Sequence Alignment Resources" handout (e.g., CLUSTAL-W, PIMA, MSA, DCA, ITERALIGN, SAGA, T_COFFEE, POA, PCMA, PROALIGN, MAVID, MUSCLE, Align-m, DIALIGN-T, PRALINE-psi, PRRN, MAFFT-5, ProbCons) based on their ability to accurately align each of the 8 structurally conserved sites in the kinase catalytic domain (see McClure et al. 1994, Fig. 2, p. 584). To simplify the scoring, score each site as "all-or-none", i.e., count a site as correctly aligned only if every sequence in the set is correctly aligned within that site. A program's score is then the number of sites, out of 8, that it aligns correctly.
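The all-or-none scoring can also be automated. The Python sketch below is one possible approach, not part of the original exercise: it assumes each program's output has been saved as aligned FASTA, and that for each conserved site you have recorded (from McClure et al. 1994, Fig. 2) the 0-based position of the site in each ungapped sequence and the site length. The function names, the toy alignment, and the toy site positions are all hypothetical and stand in for the real kin10 data.

```python
from typing import Dict, List, Tuple

def parse_fasta(text: str) -> Dict[str, str]:
    """Parse aligned FASTA text into {sequence name: gapped sequence}."""
    parts: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}

def residue_columns(gapped: str) -> List[int]:
    """Alignment column of each residue, in ungapped order ('-' and '.' are gaps)."""
    return [col for col, ch in enumerate(gapped) if ch not in "-."]

def site_is_aligned(alignment: Dict[str, str],
                    starts: Dict[str, int], length: int) -> bool:
    """All-or-none: True only if every sequence puts the site in the same columns."""
    expected = None
    for name, gapped in alignment.items():
        cols = residue_columns(gapped)
        placed = cols[starts[name]:starts[name] + length]
        if len(placed) < length:
            return False          # site runs past the end of this sequence
        if expected is None:
            expected = placed     # first sequence defines the reference columns
        elif placed != expected:
            return False          # one misplaced sequence fails the whole site
    return True

def score_alignment(alignment: Dict[str, str],
                    sites: Dict[str, Tuple[Dict[str, int], int]]) -> int:
    """Number of sites that are correctly aligned in all sequences."""
    return sum(site_is_aligned(alignment, starts, length)
               for starts, length in sites.values())

# Toy data (hypothetical, NOT the kin10 set): three short sequences, two "sites".
toy_alignment = parse_fasta(">a\nMK-GXG\n>b\nM-KGXG\n>c\nMKAGXG\n")
toy_sites = {
    "site_K":   ({"a": 1, "b": 1, "c": 1}, 1),  # ungapped start per sequence, length
    "site_GXG": ({"a": 2, "b": 2, "c": 3}, 3),
}
toy_score = score_alignment(toy_alignment, toy_sites)
print(f"{toy_score} of {len(toy_sites)} sites aligned")
```

In the toy data, the GXG site falls in the same three columns in every sequence and counts as correct, while the K site is shifted by one column in sequence b and therefore scores zero for the whole set. For the exercise itself you would replace the toy site table with the 8 kinase sites and run the scoring on each program's output in turn.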
Note: Pre-run results files for some of these programs are provided below:
References: McClure MA, Vasi TK, and Fitch WM (1994). Comparative analysis of multiple protein-sequence alignment methods. Mol. Biol. Evol. 11:571-592. (PDF)