Biochem 503, Fall 2008

Prediction of Protein Secondary Structure

(09/2/2008)
Soluble proteins: Mathews and van Holde, Chapter 6, 194-196
*** Rost, B., Schneider, R., and Sander, C. (1993) Progress in protein structure prediction? Trends in Biochem. Sci. 18:120-123.

Membrane proteins: Mathews and van Holde, Chapter 10, 330-331
*** Fasman, G. D. and Gilbert, W. A. (1990) The prediction of transmembrane protein sequences and their conformation: an evaluation. Trends in Biochem. Sci. 15:89-92.
Chen, C. P., Kernytsky, A., and Rost, B. (2002) Transmembrane helix predictions revisited. Prot. Sci. 11:2774-2791.


Topics
Excellent summary and review of methods, with WWW links: www.pasteur.fr/recherche/unites/neubiomol/secstrpr.html

Quantitation of secondary structure, tertiary structure, and sequence

Kabsch, W. and Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22:2577-2637.

Russell, R. B. and Barton, G. J. (1993) The limits of protein secondary structure prediction accuracy from multiple sequence alignment. J. Mol. Biol. 234:951-957.

Helical Structure in Membrane Proteins

Engelman, D. M., Steitz, T. A., and Goldman, A. (1986) Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins. Ann. Rev. Biophys. Biophys. Chem. 15:321-353.

Kyte, J. and Doolittle, R. F. (1982) A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-132.

"Analytical" approach

Chou, P. Y. and Fasman, G. D. (1974) Prediction of protein conformation. Biochemistry 13:222-245.

Statistical approach

Garnier, J., Osguthorpe, D. J., and Robson, B. (1978) Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120:97-120.

Gibrat, J., Garnier, J., and Robson, B. (1987) Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J. Mol. Biol. 198:425-443.

Neural Nets

Qian, N. and Sejnowski, T. J. (1988) Predicting the secondary structure of globular proteins using neural network models. J. Mol. Biol. 202:865-884.

Rost, B. and Sander, C. (1994) Combining evolutionary information and neural networks to predict protein secondary structure. PROTEINS 19:55-72.

Prediction accuracy

*** Fasman, G. D. and Gilbert, W. A. (1990) The prediction of transmembrane protein sequences and their conformation: an evaluation. Trends in Biochem. Sci. 15:89-92.

Jahnig, F. (1990) Structure predictions of membrane proteins are not that bad. Trends in Biochem. Sci. 15:93-95.

*** Rost, B., Schneider, R., and Sander, C. (1993) Progress in protein structure prediction? Trends in Biochem. Sci. 18:120-123.

Rost, B. and Sander, C. (1996) Bridging the protein sequence-structure gap by structure predictions. Annu. Rev. Biophys. Biomolec. Struct. 25:113-126.

Cuff, J. A. and Barton, G. J. (1999) Evaluation and improvement of multiple sequence methods for protein secondary structure prediction. Proteins 34:508-519.

WWW sites for Protein secondary structure prediction:

Predict-protein server - one of the best: www.predictprotein.org/ and cubic.bioc.columbia.edu/predictprotein/
See also: http://genomic.sanger.ac.uk/pss/pssb.html and http://www.cmpharm.ucsf.edu/~nomi/nnpredict.html
Local Garnier/Osguthorpe/Robson site: http://fasta.bioch.virginia.edu/fasta/garnier.htm
Local Chou-Fasman site: http://fasta.bioch.virginia.edu/fasta/chofas.htm

Membrane helix prediction

Kyte-Doolittle hydropathy plot: http://fasta.bioch.virginia.edu/fasta/grease.htm
TMPred program: http://www.isrec.isb-sib.ch/software/TMPRED_form.html
Classification and Secondary Structure Prediction of Membrane Proteins (SOSUI) http://www.tuat.ac.jp/~mitaku/adv_sosui/
Protein membrane helix predictions

Code        Description                                        Kyte-Doolittle  Image
glpa_human  Glycophorin A                                      9  17           1AFO
bacr_halha  bacteriorhodopsin - Halobacterium halobium         9  17           2BRD
aa2a_human  alpha2a adenosine receptor - human                 9  17           1MMH
trbotr      trypsin - bovine                                   9  17           1TNG
hba_human   Human hemoglobin A                                 9  17           1HBA
rcel_rhovi  photosynthetic reaction center - Rhodopseudomonas  9  17           1PRC
pwhu6       H+-transporting ATP synthase - human               9  17
phoe_ecoli  outer membrane protein E - E. coli                 9  17           1PHO

Exercises

1. Use the Kyte-Doolittle GREASE program http://fasta.bioch.virginia.edu/fasta/grease.htm to identify the transmembrane domain structure of melittin, NADH ubiquinone reductase, and the human voltage-gated sodium channel.
a) Use the link to the Entrez protein sequence database to look up the name of each sequence (hint: melittin is mel_apime).
b) Go to the Kyte-Doolittle Grease page and change "FASTA format" to "Accession/GI number" and enter the sequence identifier from Entrez.
c) Plot the hydropathy using several window sizes. For better resolution, use the GIF/PDF plot option and use Acrobat to magnify the PDF image.
d) What is the general topology of the three proteins? How many transmembrane regions does each have? Does the Kyte-Doolittle hydropathy assignment agree with the protein annotation from Entrez? (Look at the GenPept report.)

2. Compare the secondary structure predictions from Garnier, Chou-Fasman, Predict-Protein, and PSSB to the actual structure of the class-pi glutathione transferase 22GS (gtp_human).

3. Predict the secondary structure of OB_HUMAN and compare it to the known structure 1AX8.
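
For exercises 2 and 3, one common way to put a number on "how well does the prediction agree with the known structure" is the per-residue three-state accuracy (often called Q3): the fraction of residues whose predicted state matches the observed state. The sketch below is only an illustration. It assumes both the prediction and the observed assignment have already been reduced to equal-length strings over H (helix), E (strand), and C (other), and the two example strings are hypothetical, not output from any of the servers listed above.

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed one.

    Both arguments are equal-length strings over H (helix), E (strand),
    and C (coil/other). How an observed structure is reduced to these
    three states is an assumption left to the caller.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 20-residue example (not real data).
observed = "CCHHHHHHHHCCCEEEEECC"
predicted = "CCCHHHHHHCCCCEEEHHCC"

if __name__ == "__main__":
    print(f"Q3 = {q3(predicted, observed):.2f}")
```

A single overall percentage hides where the errors fall (helix ends versus whole missed strands), so it is worth also lining the two strings up by eye.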


Questions on this topic from previous exams

  1. Both prediction of transmembrane helices and some protein secondary structure prediction methods use amino-acid dependent scoring tables and a "window" over which the score is calculated.
    1. How do the membrane helix and general protein secondary structure scoring tables differ?
    2. How are the scoring values determined (where do they come from)?
    3. What is the biophysical justification for using windows when predicting transmembrane helices? How wide should the windows be? What trade-offs are involved in window size selection?
    4. What is the justification for using windows in predicting alpha-helices and beta-strands in soluble proteins?
  2. Pick 5 amino acids: 2 hydrophobic, 2 charged, and 1 uncharged hydrophilic. (a) Name the 5 amino acids, using their full name and either the standard 3-letter or 1-letter code. (b) Give each of the amino acids an approximate hydropathy value, using a range of +2 to -2. (c) Write down a 10-amino-acid sequence using your 5 amino acids and plot a Kyte-Doolittle hydropathy plot using a 3-amino-acid window for the 10-amino-acid sequence.
  3. Name two scales that have been used for transmembrane helix prediction and describe how they were derived. Transmembrane helix prediction is very accurate; does this accuracy support the observation that most soluble proteins have a hydrophobic core? Why or why not?
  4. How were Sander and Rost able to improve the accuracy of their neural-net based secondary structure prediction program from about 65% accuracy to 80% accuracy? Why is the improvement effective?
  5. Transmembrane helix prediction is relatively accurate; does this accuracy support the observation that most soluble proteins have a hydrophobic core? Why or why not?
  6. Assign integer hydrophobicity values to the amino acids Ala, Glu, Thr, Val, and Met. Calculate and draw a hydrophobicity plot for the sequence below using your hydrophobicity scale and a 5-residue window.
    A E T V M V A A V M T E
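
The window calculation that several of the questions above refer to can be sketched in a few lines. This is a minimal illustration, not the GREASE program itself: it averages per-residue scores over an odd-sized window and reports the average at the window's central residue. The table holds the published Kyte-Doolittle (1982) values; the function name, the choice to skip incomplete windows at the sequence ends, and the crude text plot are assumptions made for this example. A different scale (for instance, your own integer values) can be passed through the `scale` argument.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=5, scale=KD):
    """Return (center_position, mean_score) pairs, one per full window.

    Positions are 1-based and refer to the residue at the window center.
    Windows that would run off either end of the sequence are skipped
    (an illustrative choice; other programs may handle the ends differently).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    seq = seq.upper().replace(" ", "")
    half = window // 2
    profile = []
    for i in range(half, len(seq) - half):
        scores = [scale[aa] for aa in seq[i - half:i + half + 1]]
        profile.append((i + 1, sum(scores) / window))
    return profile


if __name__ == "__main__":
    for pos, score in hydropathy_profile("A E T V M V A A V M T E", window=5):
        bar = "#" * max(0, round(score * 4))  # crude text plot, positive values only
        print(f"{pos:3d} {score:6.2f} {bar}")
```

Rerunning with a larger window smooths the profile at the cost of losing more positions at each end, which is the trade-off the window-size question asks about.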
    
