Biochem 503, Fall 2006

Protein Domains/Motifs - structure, evolution, and function

(9/11/2006)

Suggested reading:

Branden and Tooze, 2nd ed. Chap. 2, esp. 29-32.

Defining Protein Domains

***Doolittle, R. F. (1995) The multiplicity of domains in proteins. Annu. Rev. Biochem. 64:287-314.

Protein Domain Databases

Hofmann, K., Bucher, P., Falquet, L., and Bairoch, A. (1999) The PROSITE database, its status in 1999. Nucleic Acids Res. 27:215-219. http://www.expasy.org/prosite/

Bateman, A., Birney, E., Durbin, R., Eddy, S. R., Finn, R. D., and Sonnhammer, E. L. L. (1999) Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins. Nucleic Acids Res. 27:260-262. http://pfam.wustl.edu/

Henikoff, J. G., Henikoff, S., and Pietrokovski, S. (1999) New features of the Blocks Database servers. Nucleic Acids Res. 27:226-228. http://blocks.fhcrc.org/

Protein Domains -> Introns Early?

Gilbert, W., de Souza, S. J., and Long, M. (1997) Origin of genes. Proc. Natl. Acad. Sci. USA 94:7698-7703.

Hurst, L. D. and McVean, G. T. (1996) A difficult phase for introns-early. Molecular evolution. Curr. Biol. 6:533-536.

Stoltzfus, A., Spencer, D. F., Zuker, M., Logsdon, J. M., and Doolittle, W. F. (1994) Testing the intron theory of genes: the evidence from protein structure. Science 265:202-207.

Cho, G. and Doolittle, R. F. (1997) Intron distribution in ancient paralogs supports random insertion and not random loss. J. Mol. Evol. 44:573-84.

Logsdon Jr., J. M. (1998) The recent origins of spliceosomal introns revisited. Curr. Opin. Genet. Dev. 8:637-48.

Protein Domains - inferring interactions

Heringa, J. and Taylor, W. R. (1997) Three-dimensional domain duplication, swapping and stealing. Curr. Opin. Struct. Biol. 7:416-421.

**Marcotte, E. M., Pellegrini, M., Ng, H. L., Rice, D. W., Yeates, T. O., and Eisenberg, D. (1999) Detecting protein function and protein-protein interactions from genome sequences. Science 285:751-753.

Protein Domains on the WWW:

*** Lecture by Sean Eddy on Protein domains and Protein Domain Databases: http://www.people.Virginia.EDU/~wrp/cshl97/domain-lecture.html


(from Doolittle, 1995, Fig. 2)
macrophage scavenger receptor  MSRE_HUMAN   Pfam  InterPro
Collagen VI(a3)                CO6A3_HUMAN  Pfam  InterPro
Collagen XII                   CONA1_HUMAN  Pfam  InterPro
Enterokinase                   ENTK_HUMAN   Pfam  InterPro
Factor XII                     FA12_HUMAN   Pfam  InterPro
Complement C1r                 C1R_HUMAN    Pfam  InterPro
Complement C6                  CO6_HUMAN    Pfam  InterPro


(Doolittle, Table 2)
Name                                          Pfam(US) / Pfam(UK)     len
VWFA (VA)  von Willebrand factor, type A      PF00092 / vwa           174
LAMG (LM)  laminin G-like (A-type module)     PF00054 / laminin_G     134
FA58C (FC) coagulation factor V/VIII, type C  PF00754 / F5_F8_type_C  147
C1Q (CQ)   collagen/complement C1q            PF00386 / C1q           118
CADH       cadherin-like                      PF00028 / cadherin       94
IGSF       immunoglobulin                     PF00047 / ig             65
FN3        fibronectin, type III              PF00041 / fn3            85
HEMOP (HX) hemopexin-like                     PF00045 / hemopexin      45
LDLY (LY)  "YWTD" repeat, LDL-receptor        PF00058 / ldl_recept_b   43
LRP (LR)   leucine-rich (tolloid)             PF00560 / LRR            23

(Doolittle, Table 3)
Name                                                Pfam                    len
VWFB       von Willebrand factor, type B            ??
SOMAB      Somatomedin (vitronectin) B              PF01093 / vwc            44
LDLRA (LA) LDL receptor, type A                     PF00057 / ldl_recept_a   40
FN1 (F1)   Fibronectin, type I                      PF00039 / fn1            37
EGF (EG)   epidermal growth-factor like             PF00008 / EGF            34
FOLL1 (FS) follistatin (ovomucoid)                  ??
PDOM (PD)  P domain (trefoil)                       ??
FN2 (F2)   fibronectin type II                      PF00040 / fn2            41
TSP1 (T1)  thrombospondin, type I                   PF00090 / tsp_1          49
CCP (CP)   complement control protein (sushi, SCR)  PF00084 / sushi          57


Questions on this topic from previous exams

  1. Briefly summarize the difference between the Pfam and PROSITE protein databases. Which database would be more likely to predict domain boundaries accurately? Which database would be more likely to suggest the function of a protein? Why?

  2. Briefly discuss the "Exon Theory of Genes". What evidence supports the theory? What evidence contradicts it? Why is it unlikely that convincing evidence for the theory will ever be found?

  3. GBB2_HUMAN (GTP-binding prot. Gi/GsGtb2) contains 7 WD40 domains (below left). Each WD40 domain is about 40 amino acids. Draw a local similarity plot (dot-plot) of an alignment of GBB2_HUMAN with itself. (A local similarity plot shows lines along the significant alignments. Use the coordinate axes with the identity diagonal provided below right.)

  4. Name two protein domain/motif databases. Of the two domain/motif databases, which would be better for finding homologous domains? Why? Which would be better for finding functional catalytic sites?
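
Question 3 asks for a local similarity plot of a repeat-containing protein against itself. As a rough illustration of how such a plot is built, here is a minimal Python sketch. It is not the method used by any particular alignment program: the function names, the window size, and the toy sequence (a made-up 10-residue unit repeated three times, not GBB2_HUMAN) are all assumptions chosen for illustration, and it scores windows by exact identity rather than with a substitution matrix.

```python
from collections import Counter


def dot_plot(seq_a, seq_b, window=5, min_matches=5):
    """Return the set of (i, j) window starts where seq_a[i:i+window]
    and seq_b[j:j+window] share at least min_matches identical residues."""
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= min_matches:
                hits.add((i, j))
    return hits


def diagonal_offsets(hits):
    """Count hits on each diagonal, keyed by the offset j - i.
    Offset 0 is the identity diagonal."""
    return Counter(j - i for i, j in hits)


def render(hits, len_a, len_b):
    """Draw the plot as text: '*' marks a matching window start."""
    return "\n".join(
        "".join("*" if (i, j) in hits else "." for j in range(len_b))
        for i in range(len_a)
    )


# Hypothetical toy sequence: one 10-residue unit repeated three times.
TOY = "ACDEFGHIKL" * 3

if __name__ == "__main__":
    hits = dot_plot(TOY, TOY)
    print(render(hits, len(TOY), len(TOY)))
    print(sorted(diagonal_offsets(hits)))
```

For the toy sequence, hits fall on the identity diagonal and on parallel diagonals offset by multiples of the repeat length, with the off-diagonal lines getting shorter as the offset grows. Real repeats such as WD40 domains have diverged, so an exact-identity window would miss most of them; a practical plot would score windows with a substitution matrix and a significance threshold instead.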
