To do these exercises, you must log on to the cerebus Unix server using an Xterm:
| On the PCs: | On the Macs: |
|---|---|
| Double-click the "SSH Secure Shell Client" icon found on the lower left of your desktop. | Use "Terminal" (the black screen icon to the right of Netscape in the dock). |
Once logged on, try a command such as pwd, hostname, or more /ecg/slib/pir1.
HMMER 2 is installed in your path, in /ecg/bin.
The files mentioned in the tutorial are in /ecg/data/hmmer/demos/.
Copy any/all of these files to your home directory.
cp /ecg/data/hmmer/demos/globins* . (<<-- the "." is important)
Use hmmbuild to build an HMM from an alignment (for example, the alignment globins50.msf in demos/):
hmmbuild globins.hmm globins50.msf
Use hmmsearch to search the /ecg/slib/blast/wormpep.lseg and/or /ecg/slib/pir1 database with your globin HMM:
hmmsearch globins.hmm /ecg/slib/pir1 > glob_src1.pir_out
Check to see how many globins the HMM can find in pir1. View the file by typing:
more glob_src1.pir_out
Then do a search against the C. elegans wormpep database:
hmmsearch globins.hmm /ecg/slib/wormpep20 > glob_src1.worm_out
Look at the E() values for the high-scoring worm globins and non-globins.
Use hmmcalibrate to determine some statistics for your new HMM, so that HMMER can estimate E-values fairly accurately in any subsequent searches you do with the HMM:
hmmcalibrate globins.hmm
hmmsearch globins.hmm /ecg/slib/wormpep20 > glob_src2.worm_out
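To see what calibration changed, the two worm searches can be compared side by side. This is a minimal sketch, assuming both output files from the steps above are in your current directory. diff and more are standard Unix tools, not part of HMMER.

```shell
# Output files written by the two wormpep searches above.
before=glob_src1.worm_out   # search with the uncalibrated HMM
after=glob_src2.worm_out    # search with the calibrated HMM

if [ -f "$before" ] && [ -f "$after" ]; then
    # Show only the lines that differ. The E-value column is expected to
    # change, while the bit scores should stay the same.
    diff "$before" "$after" | more
else
    echo "Run both hmmsearch commands first."
fi
```

Because the default reporting cutoff is based on E-values, the list of reported hits may also differ slightly between the two files.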
Can the C. elegans globins (worm_glob.html) found by hmmsearch be identified by a single-sequence search (blast, PSI-blast, fasta, ssearch)? What are the highest-scoring unrelated sequences? Is the worm globin WO1C9.5 found with the globin HMM?
Use hmmalign to align a large set of globins (globins630.fa) using the model you've built from the smaller set of 50.
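The hmmalign step has no command line in the text. A possible invocation is sketched here, assuming the HMMER 2 syntax (hmmalign -o outfile hmmfile seqfile) and that globins.hmm and globins630.fa are in your current directory. The output name globins630.ali is an illustrative choice, not one the exercise prescribes.

```shell
# Inputs from the earlier steps; the output name is an illustrative choice.
hmm=globins.hmm
seqs=globins630.fa
aln=globins630.ali

if command -v hmmalign >/dev/null 2>&1 && [ -f "$hmm" ] && [ -f "$seqs" ]; then
    # -o writes the alignment to a file instead of the screen.
    hmmalign -o "$aln" "$hmm" "$seqs"
    more "$aln"
else
    echo "Need hmmalign in your PATH plus $hmm and $seqs in this directory."
fi
```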
Use hmmsearch to search sequence files. Start with some known globins (demos/globins630.fa), then maybe "parse" a multidomain globin (the Artemia globin is in demos/Artemia.fa):
hmmsearch globins.hmm globins630.fa > globins630.out
pg globins630.out
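The Artemia step is mentioned without a command. A sketch of one way to run it, assuming globins.hmm is in your current directory and reading Artemia.fa straight from the demos directory named earlier. The output name artemia.out is an illustrative choice.

```shell
# Artemia.fa is read from the demos directory named earlier in this handout;
# artemia.out is an illustrative output name.
artemia=/ecg/data/hmmer/demos/Artemia.fa
out=artemia.out

if command -v hmmsearch >/dev/null 2>&1 && [ -f globins.hmm ] && [ -f "$artemia" ]; then
    # Search the single multidomain sequence with the globin HMM.
    hmmsearch globins.hmm "$artemia" > "$out"
    more "$out"
else
    echo "Need hmmsearch in your PATH, globins.hmm here, and $artemia."
fi
```

In HMMER 2 output, the per-domain hits should appear in the section headed "Parsed for domains", which lists each globin domain found in the sequence on its own line.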
We will do the example described by Sean Eddy in the handout entitled "Multiple alignment and multiple sequence based searches".
Copy the library containing sra4_caeel, /ecg/data/eddy/sra.lib, to your Unix directory:
cp /ecg/data/eddy/sra.lib sra.lib
Use clustalw to build a multiple alignment of the sra.lib sequences:
clustalw sra.lib
Alternatively, copy the multiple alignment:
cp /ecg/data/eddy/sra.aln sra.aln
Use hmmbuild to build an HMM from the sra.aln alignment:
hmmbuild sra.hmm sra.aln
Use hmmcalibrate to calibrate sra.hmm:
hmmcalibrate sra.hmm
(This takes a few minutes.)
Use hmmsearch to search SwissProt (this will take a long time; please do it in groups):
hmmsearch sra.hmm /ecg/slib/swissprot > sra_hmm.results
more sra_hmm.results
Compare a few of your results to what you can find at the PFAM website's SwissPFAM repository.
You might also try running PSI-BLAST via the CHAPS interface using the sra.lib files for input. Also,
compare the form of the PSI-BLAST profile (PSSM) with the HMM model
file.