Mathematical Methods for Structural Biology
Math 801 / Biochem 729

Julie C. Mitchell
George N. Phillips, Jr.
Stephen J. Wright

Fall 2007


Course Overview

• What is a protein?

• How do we determine protein structures?

• How do proteins fold?

• What is the dynamical behavior of molecules?

• How do proteins interact to perform their functions?

• How do we use mathematics and computer science to model molecular behavior?

Prerequisites

• Calculus through vector calculus is essential.

• Some programming proficiency is required. Any of Matlab, Perl, C, Java, Python, etc., should be fine for the course project.

• Linear algebra, ODEs, and PDEs are helpful but not essential.

Software and WWW

• You may use any programming tools you wish for your class projects.

• Some generally useful software tools include:
    • Matlab/Octave
    • Xcode
    • SwissPDB Viewer
    • VMD

• A course website has been set up using Learn@UW.
    • Log in using your wisc.edu login and password at: http://learnuw.wisc.edu

Homework and Projects

• Homework will be assigned for each topic. It is due the Tuesday following the lectures for that subject.

• Homework is worth 40% of your grade.

• Projects are due Dec. 13 and constitute 60% of your grade.

• Each student will choose a project in collaboration with one of the faculty members or the TA.


And now…

• … let’s learn about proteins

A Brief Biology Lesson

• DNA unravels and codes for RNA (transcription).

• RNA codes for proteins (translation).

• A protein folds into a globular structure.

• Proteins interact to perform biological functions.
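
The first two steps of this lesson, DNA to RNA and RNA to protein, can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the input is taken to be the coding strand (so transcription is simply T to U), the codon table is deliberately partial, and the function names and the example sequence are made up for illustration.

```python
# Partial codon table (mRNA codon -> three-letter amino acid code).
# None marks a stop codon. Only a handful of the 64 codons are listed.
CODON_TABLE = {
    "AUG": "Met", "ACC": "Thr", "UGC": "Cys", "UGG": "Trp",
    "GGC": "Gly", "AAA": "Lys", "GAA": "Glu", "UUU": "Phe",
    "UAA": None, "UAG": None, "UGA": None,
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate codon by codon until a stop codon or the end of the sequence."""
    residues = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in the partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # stop codon: the chain ends here
            break
        residues.append(amino_acid)
    return residues


# Hypothetical example sequence, chosen only to exercise the table above.
mrna = transcribe("ATGACCTGCTAA")
print(mrna)
print(translate(mrna))
```

A real translation routine would use the full 64-codon table; the partial one here just keeps the example short.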

Important Areas of Research

http://bioinfo.mbb.yale.edu

What is a protein?

• A protein is a linear sequence of amino acids that folds into a characteristic three-dimensional shape.

• Proteins consist of a backbone (N-Cα-C) and amino acid sidechain residues (R).

• There are 20 standard amino acids.

• Each amino acid has different biochemical properties.

Protein Data Bank Files

ATOM      1  N   THR     1      -4.965  28.290   7.243  1.00 31.23
ATOM      2  CA  THR     1      -5.255  26.833   7.162  1.00 29.04
ATOM      3  C   THR     1      -4.253  26.110   8.081  1.00 27.06
ATOM      4  O   THR     1      -3.141  26.590   8.274  1.00 27.32
ATOM      5  CB  THR     1      -5.101  26.392   5.694  1.00 29.96
ATOM      6  OG1 THR     1      -5.894  27.267   4.898  1.00 35.08
ATOM      7  CG2 THR     1      -5.599  24.978   5.459  1.00 30.72
ATOM      8  N   MET     2      -4.688  25.016   8.699  1.00 23.88
ATOM      9  CA  MET     2      -3.826  24.080   9.372  1.00 22.38
ATOM     10  C   MET     2      -3.254  23.081   8.384  1.00 24.09
ATOM     11  O   MET     2      -4.008  22.471   7.646  1.00 25.98
ATOM     12  CB  MET     2      -4.600  23.314  10.406  1.00 21.67
ATOM     13  CG  MET     2      -5.204  24.189  11.465  1.00 25.77
ATOM     14  SD  MET     2      -4.040  25.028  12.518  1.00 28.87
ATOM     15  CE  MET     2      -2.980  23.707  13.105  1.00 24.53
ATOM     16  N   CYS     3      -1.943  22.867   8.426  1.00 20.90
ATOM     17  CA  CYS     3      -1.225  22.144   7.374  1.00 20.54
ATOM     18  C   CYS     3      -0.221  21.228   8.054  1.00 18.98
ATOM     19  O   CYS     3       0.370  21.581   9.077  1.00 19.50
ATOM     20  CB  CYS     3      -0.432  23.102   6.473  1.00 19.92
ATOM     21  SG  CYS     3      -1.327  24.485   5.694  1.00 25.34
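
As a quick sanity check on what these numbers mean, the three coordinate fields of each record can be treated as a point in space and used to measure distances. The sketch below copies the alpha-carbon (CA) coordinates from records 2, 9, and 17 above and computes the distance between consecutive residues. The dictionary and function names are illustrative only, and `math.dist` assumes Python 3.8 or newer.

```python
import math

# CA coordinates (x, y, z) copied from ATOM records 2, 9, and 17 above.
CA_COORDS = {
    1: (-5.255, 26.833, 7.162),  # THR 1
    2: (-3.826, 24.080, 9.372),  # MET 2
    3: (-1.225, 22.144, 7.374),  # CYS 3
}


def ca_distance(res_a, res_b):
    """Euclidean distance between the CA atoms of two residues."""
    return math.dist(CA_COORDS[res_a], CA_COORDS[res_b])


for a, b in [(1, 2), (2, 3)]:
    print(f"CA {a} -> CA {b}: {ca_distance(a, b):.2f}")
```

Both distances come out at roughly 3.8, in the same units as the coordinates, which is the spacing usually seen between neighbouring alpha carbons along a protein backbone.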

PDB Format

COLUMNS   DATA TYPE      DEFINITION
1 - 6     String         Type of entry
7 - 11    Integer        Atom serial number
12                       (not used)
13 - 16   String         Atom name
17        Character      Alternate location indicator
18 - 20   Residue name   Residue name
21                       (not used)
22        Character      Chain identifier
23 - 26   Integer        Residue sequence number
27        Character      Code for insertion of residues
28 - 30                  (not used)
31 - 38   Real(8.3)      Orthogonal coordinates for X in Angstroms
39 - 46   Real(8.3)      Orthogonal coordinates for Y in Angstroms
47 - 54   Real(8.3)      Orthogonal coordinates for Z in Angstroms
55 - 60   Real(6.2)      Occupancy
61 - 66   Real(6.2)      Temperature factor
67 - 72                  (not used)
73 - 76   String(4)      Segment identifier, left-justified
77 - 78   String(2)      Element symbol, right-justified
79 - 80   String(2)      Charge on the atom
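
Because the format is column-based, a record is best read by slicing at fixed positions rather than splitting on whitespace. The following is a minimal sketch of such a reader, assuming the column ranges in the table above: a 1-based range a - b becomes the Python slice [a-1:b]. The `Atom` type, the function name, and the exact spacing of the sample line (the first ATOM record from the previous slide, laid out according to the table) are assumptions made for illustration.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float


def parse_atom_line(line):
    """Parse one ATOM record using the fixed column ranges of the PDB format."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return Atom(
        serial=int(line[6:11]),           # columns 7 - 11
        name=line[12:16].strip(),         # columns 13 - 16
        res_name=line[17:20].strip(),     # columns 18 - 20
        chain_id=line[21].strip(),        # column 22 (blank in the sample)
        res_seq=int(line[22:26]),         # columns 23 - 26
        x=float(line[30:38]),             # columns 31 - 38
        y=float(line[38:46]),             # columns 39 - 46
        z=float(line[46:54]),             # columns 47 - 54
        occupancy=float(line[54:60]),     # columns 55 - 60
        temp_factor=float(line[60:66]),   # columns 61 - 66
    )


SAMPLE = "ATOM      1  N   THR     1      -4.965  28.290   7.243  1.00 31.23"
atom = parse_atom_line(SAMPLE)
print(atom.name, atom.res_name, atom.res_seq, (atom.x, atom.y, atom.z))
```

Whitespace splitting happens to work for the short sample shown earlier, but it breaks as soon as adjacent fields fill their columns and touch, which is why the slices are tied to the table instead.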

Hydrophobic Amino Acids

• Alanine        ala  a  CH3-CH(NH2)-COOH
• Isoleucine     ile  i  CH3-CH2-CH(CH3)-CH(NH2)-COOH
• Leucine        leu  l  (CH3)2-CH-CH2-CH(NH2)-COOH
• Methionine     met  m  CH3-S-(CH2)2-CH(NH2)-COOH
• Phenylalanine  phe  f  Ph-CH2-CH(NH2)-COOH
• Proline        pro  p  NH-(CH2)3-CH-COOH
                         |_________|
• Valine         val  v  (CH3)2-CH-CH(NH2)-COOH

Polar Amino Acids

• Asparagine     asn  n  H2N-CO-CH2-CH(NH2)-COOH
• Cysteine       cys  c  HS-CH2-CH(NH2)-COOH
• Glutamine      gln  q  H2N-CO-(CH2)2-CH(NH2)-COOH
• Histidine      his  h  NH-CH=N-CH=C-CH2-CH(NH2)-COOH
                         |__________|
• Serine         ser  s  HO-CH2-CH(NH2)-COOH
• Threonine      thr  t  CH3-CH(OH)-CH(NH2)-COOH
• Tryptophan     trp  w  Ph-NH-CH=C-CH2-CH(NH2)-COOH
                         |_______|
• Tyrosine       tyr  y  HO-p-Ph-CH2-CH(NH2)-COOH

Charged Amino Acids

• Arginine       arg  r  HN=C(NH2)-NH-(CH2)3-CH(NH2)-COOH
• Aspartic acid  asp  d  HOOC-CH2-CH(NH2)-COOH
• Glutamic acid  glu  e  HOOC-(CH2)2-CH(NH2)-COOH
• Lysine         lys  k  H2N-(CH2)4-CH(NH2)-COOH
• Histidine*     his  h  NH-CH=N-CH=C-CH2-CH(NH2)-COOH
                         |__________|

* Charged (protonated) at pH < 6.1

And … then there’s glycine

• Glycine        gly  g  NH2-CH2-COOH
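
The grouping on these three slides amounts to a lookup table keyed by one-letter code. The sketch below encodes it and tallies the classes in a sequence. It follows the slides in listing histidine under both polar and charged and in keeping glycine on its own; the names `CLASSES`, `classes_of`, and `composition` are invented for this example.

```python
# One-letter codes grouped as on the slides. Histidine (H) appears in two
# groups because the slides list it as polar and, below pH 6.1, as charged.
CLASSES = {
    "hydrophobic": set("AILMFPV"),
    "polar": set("NCQHSTWY"),
    "charged": set("RDEKH"),
    "glycine": set("G"),
}


def classes_of(letter):
    """Return the names of every group that contains this one-letter code."""
    letter = letter.upper()
    found = [name for name, members in CLASSES.items() if letter in members]
    if not found:
        raise ValueError(f"{letter!r} is not a standard amino acid code")
    return found


def composition(sequence):
    """Count how many residues of a sequence fall into each group."""
    counts = {name: 0 for name in CLASSES}
    for letter in sequence:
        for name in classes_of(letter):
            counts[name] += 1
    return counts


print(classes_of("h"))
print(composition("TMC"))  # the three residues from the sample PDB records
```

Since histidine is counted twice, the totals returned by `composition` can exceed the sequence length; a stricter scheme would pick one group per residue based on the pH of interest.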

Amino Acid “Tour”

[Figure: molecular models of arginine, phenylalanine, cysteine, and tryptophan. Atom colors: C=gray, H=white, O=red, N=blue, S=yellow]

Protein Structure Hierarchy

[Figure: four panels labeled Primary, Secondary, Tertiary, and Quaternary. The figure includes the example residue sequence below.]

Ala Glu Lys Trp
His Cys Gly Ser
His Pro Cys Gln
Ala Met Arg Asn
Ser His Glu Phe
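
Primary structure is just the ordered list of residues, so it is commonly stored as a string of one-letter codes. As a small illustration, the sketch below converts the three-letter sequence shown with this slide into that form, using the abbreviations from the amino acid slides. It assumes the sequence reads left to right, row by row; the function and constant names are illustrative.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_sequence):
    """Convert a whitespace-separated three-letter sequence to one-letter codes."""
    return "".join(
        THREE_TO_ONE[code.capitalize()] for code in three_letter_sequence.split()
    )


# Sequence from the slide, read row by row (an assumption about reading order).
FIGURE_SEQUENCE = (
    "Ala Glu Lys Trp His Cys Gly Ser His Pro "
    "Cys Gln Ala Met Arg Asn Ser His Glu Phe"
)
print(to_one_letter(FIGURE_SEQUENCE))
```

The `capitalize()` call lets the same function accept the upper-case residue names used in PDB files, for example `to_one_letter("THR MET CYS")`.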

Is it that simple? No!

• Protein function relies on more than protein structure.

• Proteins can have different states (e.g., phosphorylated).

• Proteins can have different environments (e.g., pH, temperature, salt concentration).

• Proteins need small molecules and prosthetic groups (e.g., ATP, chlorophyll, sugars, water).

Week 1 Homework

• Go to http://scholar.google.com and enter simple searches for structural biology topics in this course (e.g., “protein folding” or “molecular dynamics”).

• Choose a high-impact paper and use the glossary to work your way through as much of the paper as possible.

• Turn in the paper, and highlight any words you had to look up in the glossary.

• Also, complete the tutorials that we will start on Thursday.