Function-Information Relationship in Nucleic Acids
Andrej Luptak, aluptak@uci.edu
UNIVERSITY of CALIFORNIA ‧ IRVINE


Information flow in biological systems

In vitro selection

How many solutions are there to a biochemical problem?

In vitro selected RNAs

Aptamers: organic dyes, amino acids, nucleotides, metabolites

Aminoglycosides, peptides, proteins, liposomes

Cells, tissues, single-walled nanotubes

Transition state analogs

Ribozymes: phosphoryl (incl. polymerase), acyl and alkyl transfer

Isomerisation, Diels-Alder, nucleotide synthesis, Michael

Metal insertion into mesoporphyrin

Metal-metal bond formation (palladium nanoparticles)

How many solutions are there to a biochemical problem?

How does one measure complexity?

How does one measure structural complexity?

And what does this have to do with evolution, biosensors and the origin of life?

Informational complexity and functional activity

Hazen et al. PNAS 2007 104

How many solutions are there to a biochemical problem?


Isolation of high-affinity GTP aptamers from partially structured RNA libraries. Jonathan H. Davis and Jack W. Szostak, PNAS 2002, vol. 99, no. 18

How many solutions are there to a biochemical problem?

Informational Complexity and Functional Activity of RNA Structures. James M. Carothers, Stephanie C. Oestreich, Jonathan H. Davis, and Jack W. Szostak, J. Am. Chem. Soc. 2004, 126, 5130

How does one measure structural complexity?


How does one measure informational complexity?


J. Am. Chem. Soc. 2004, 126, 5130 and RNA 2006, 12, 4


Shannon uncertainty:

$H = -\sum_{i} P_i \log_2 P_i, \quad i \in \{A, U, G, C\}$

Information content = maximum information - Shannon uncertainty

Maximum information using 4 bases = 2 bits


Invariant A: P(A) = 0.997, P(C) = 0.001, P(G) = 0.001, P(U) = 0.001

H = -(-0.997*0.00433 - 3*0.001*9.966) = 0.00432 + 0.0299 = 0.0342

IC = 2 - 0.0342 = 1.9658

One position in a base-pair: IC = 1 bit (a base-pair is 2 bits)

Informational complexity and functional activity

Invariant A or G: P(A) = 0.498, P(C) = 0.002, P(G) = 0.498, P(U) = 0.002

H = -(-2*0.498*1.006 - 2*0.002*8.965) = 1.002 + 0.036 = 1.038

IC = 2 - 1.038 = 0.962

One position in a regular or wobble pair: IC = 0.5 bit (1 bit per loose base-pair)
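The two worked examples above can be checked numerically. The following is a minimal sketch; the function names are illustrative choices and do not come from the cited papers. It evaluates the Shannon uncertainty and the information content for a single nucleotide position, using the probabilities given on the slides.

```python
import math

def shannon_uncertainty(probs):
    """Shannon uncertainty H = -sum(p * log2(p)) in bits.

    Terms with p == 0 are skipped, since p * log2(p) tends to 0 as p -> 0.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_content(probs, alphabet_size=4):
    """Information content = maximum information (log2 of alphabet size) minus H."""
    return math.log2(alphabet_size) - shannon_uncertainty(probs)

# Probabilities in the order A, C, G, U, taken from the two slide examples.
invariant_a = [0.997, 0.001, 0.001, 0.001]
a_or_g = [0.498, 0.002, 0.498, 0.002]

ic_invariant_a = information_content(invariant_a)
ic_a_or_g = information_content(a_or_g)

print(f"invariant A: {ic_invariant_a:.3f} bits")
print(f"A or G:      {ic_a_or_g:.3f} bits")
```

An invariant position comes out close to the 2-bit maximum, and a position that tolerates two bases comes out close to 1 bit.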

Another RNA aptamer example: adenosine aptamer

Class II ligase ribozyme

Pitt & Ferré-D’Amaré, J. Am. Chem. Soc., 2009, 131 (10), pp 3532–3540

Class II ligase ribozyme

Rapid Construction of Empirical RNA Fitness Landscapes. Jason N. Pitt and Adrian R. Ferré-D’Amaré, Science 2010, vol. 330, no. 6002, pp. 376-379

Evolution is an adaptive walk through a hypothetical fitness landscape

Fitness landscape shows the relationship between genotypes and the fitness of each corresponding phenotype

Empirical fitness landscape is determined for a catalytic RNA by combining next-generation sequencing, computational analysis, and “serial depletion,” an in vitro selection protocol

Abundance in serially depleted pools correlates with biochemical activity

MS = a4-11 master sequence of the ligase ribozyme


Histogram of correlation coefficients of kobs (n = 135 point mutants) with randomly reassorted mutation frequencies

Correlation of genotype frequency and experimental rate constants
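The idea behind the histogram of correlation coefficients with randomly reassorted mutation frequencies can be sketched as a shuffling test. This is only an illustration under stated assumptions: the Pearson coefficient is used here as one plausible choice (the slides do not say which statistic the authors used), the function names are hypothetical, and the numbers are made-up toy values, not data from the paper.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def shuffled_correlations(freqs, rates, n_shuffles=1000, seed=0):
    """Correlations obtained after randomly reassigning frequencies to mutants."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(freqs)
    results = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        results.append(pearson_r(shuffled, rates))
    return results

# Hypothetical toy data (NOT from Pitt & Ferré-D'Amaré): one frequency and
# one observed rate constant per point mutant.
toy_freqs = [0.30, 0.22, 0.15, 0.12, 0.09, 0.06, 0.04, 0.02]
toy_rates = [1.9, 1.5, 1.1, 0.8, 0.7, 0.4, 0.3, 0.1]

observed_r = pearson_r(toy_freqs, toy_rates)
null_rs = shuffled_correlations(toy_freqs, toy_rates)
fraction_at_least = sum(r >= observed_r for r in null_rs) / len(null_rs)

print(f"observed r = {observed_r:.3f}")
print(f"fraction of shuffles with r >= observed: {fraction_at_least:.3f}")
```

If the observed coefficient lies far out in the tail of the shuffled distribution, the link between genotype frequency and rate constant is unlikely to be a chance pairing.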


Information content per position of the class II ligase ribozyme
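A per-position information profile of this kind could be computed from a set of aligned active sequences roughly as follows. This is a sketch under assumptions: the alignment is a hypothetical toy example rather than ligase data, the function name is illustrative, and no small-sample correction is applied, which a real analysis with few sequences would probably need.

```python
import math
from collections import Counter

def per_position_information(aligned_seqs, alphabet="ACGU"):
    """Information content (bits) at each column of equal-length aligned sequences."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same length")
    max_info = math.log2(len(alphabet))
    profile = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        total = len(column)
        # Observed base frequencies stand in for the probabilities P_i.
        uncertainty = -sum((c / total) * math.log2(c / total) for c in counts.values())
        profile.append(max_info - uncertainty)
    return profile

# Hypothetical toy alignment: column 1 invariant G, column 2 A or G,
# column 3 invariant C, column 4 unconstrained.
toy_alignment = ["GACU", "GGCA", "GACC", "GGCG"]
profile = per_position_information(toy_alignment)

print([round(bits, 3) for bits in profile])
print(f"total: {sum(profile):.3f} bits")
```

Summing the profile gives a total for the whole motif, on the assumption that positions vary independently; paired positions, as noted earlier for base-pairs, would need to be counted jointly.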

In vitro selection of ribozymes

Optimized for single-turnover enzymes


In vitro selected ribozymes

Ligase (Bartel & Szostak, Science 1993)

RNA polymerase (Johnston & Bartel, Science 2001)

Polynucleotide kinase (Lorsch & Szostak, Nature 1994)

Diels-Alderase (Agresti & Griffiths, PNAS 2005)

All of these multiple-turnover ribozymes were converted from single-turnover isolates

Diels-Alderase: ribozyme, protein enzyme (figure panels)

Serganov et al., Nature Structural & Molecular Biology 2005, vol. 12, pp. 218-224

Informational complexity and functional activity: Peptides

Shannon uncertainty:

$H = -\sum_{i} P_i \log_2 P_i, \quad i \in \{\mathrm{Ala}, \ldots, \mathrm{Trp}\}$

Information content = maximum information - Shannon uncertainty

Maximum information using 20 amino acids = 4.3219 bits or 1.301 dits (base 10)


Almost invariant glycine: P(Gly) = 0.9981, P(Ala) = P(Arg) = P(Asn) = ... = P(Val) = 0.0001

H = -(-0.9981*0.002744 - 19*0.0001*13.28) = 0.002739 + 0.02523 = 0.02797

IC = 4.3219 - 0.0280 = 4.2939
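The almost-invariant glycine case can be evaluated directly. This is a minimal sketch using the probabilities stated above; the variable names are illustrative. Summing the two terms, 0.002739 + 0.02523, gives an uncertainty of about 0.028 bits.

```python
import math

AMINO_ACIDS = 20
p_gly = 0.9981
p_other = 0.0001  # each of the 19 remaining amino acids

probs = [p_gly] + [p_other] * (AMINO_ACIDS - 1)
assert abs(sum(probs) - 1.0) < 1e-9  # sanity check: probabilities sum to 1

uncertainty = -sum(p * math.log2(p) for p in probs)
ic_bits = math.log2(AMINO_ACIDS) - uncertainty
ic_dits = ic_bits / math.log2(10)  # same quantity expressed in base-10 units

print(f"H  = {uncertainty:.4f} bits")
print(f"IC = {ic_bits:.4f} bits = {ic_dits:.4f} dits")
```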


Informational complexity and functional activity: Peptides

# possible AAs   Shannon uncertainty (bits)   Information content (bits)
 1               0.0000                       4.3219
 2               1.0000                       3.3219
 3               1.5850                       2.7370
 4               2.0000                       2.3219
 5               2.3219                       2.0000
 6               2.5850                       1.7370
 7               2.8074                       1.5146
 8               3.0000                       1.3219
 9               3.1699                       1.1520
10               3.3219                       1.0000
11               3.4594                       0.8625
12               3.5850                       0.7370
13               3.7004                       0.6215
14               3.8074                       0.5146
15               3.9069                       0.4150
16               4.0000                       0.3219
17               4.0875                       0.2345
18               4.1699                       0.1520
19               4.2479                       0.0740
20               4.3219                       0.0000
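The table follows from a single assumption: a position that accepts n amino acids with equal probability has uncertainty log2(n), so its information content is log2(20) - log2(n). A short sketch that regenerates the rows is given below (the function name is an illustrative choice). Values are computed at full precision and then rounded, so the last digit can differ by one from a table built by subtracting already-rounded numbers.

```python
import math

def equiprobable_row(n_allowed, alphabet_size=20):
    """Return (n, Shannon uncertainty, information content) for n equally likely residues."""
    uncertainty = math.log2(n_allowed)
    return n_allowed, uncertainty, math.log2(alphabet_size) - uncertainty

table = [equiprobable_row(n) for n in range(1, 21)]

print("# possible AAs   H (bits)   IC (bits)")
for n, uncertainty, ic in table:
    print(f"{n:>14d}   {uncertainty:8.4f}   {ic:9.4f}")
```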

Peptide functions to consider:

What’s the information content of a His-tag?

What’s the information content of an HPQ streptavidin tag?

What about two HPQ tags?

A cystine bridge?

What’s the information content of a hydrophobic position?

And charged? What about a salt bridge?

Small domains: zinc finger
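One possible way to start on these questions is to add up per-position information contents. The sketch below rests on strong simplifying assumptions that are not stated on the slides: every position in a tag is treated as strictly invariant (one allowed residue), positions are treated as independent, and the placement of the tag within a longer sequence is ignored. The function name is hypothetical.

```python
import math

MAX_BITS = math.log2(20)  # maximum information per amino acid position

def motif_information(allowed_counts):
    """Sum of per-position information content in bits.

    allowed_counts[k] is the number of equally likely residues tolerated at
    position k (1 = invariant, 20 = unconstrained).
    """
    return sum(MAX_BITS - math.log2(count) for count in allowed_counts)

his6_bits = motif_information([1] * 6)  # six invariant histidines (assumption)
hpq_bits = motif_information([1] * 3)   # invariant His-Pro-Gln (assumption)
two_hpq_bits = 2 * hpq_bits             # two independent HPQ tags (assumption)

print(f"His6:    {his6_bits:.2f} bits")
print(f"HPQ:     {hpq_bits:.2f} bits")
print(f"2 x HPQ: {two_hpq_bits:.2f} bits")
```

Positions that tolerate several residues, such as a hydrophobic or charged position, can be handled by passing a count larger than 1; which residues to include in each class is a judgment call left open here.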

Structure of the model peptide and of the residues incorporated at the guest position

Richardson J. M. et al., PNAS 2005; 102: 1413-1418. Copyright © 2005, The National Academy of Sciences

Comparison of the enthalpy of helix formation Δhα obtained from different peptides

Richardson J. M. et al., PNAS 2005; 102: 1413-1418. Copyright © 2005, The National Academy of Sciences


Informational complexity and functional activity: Peptide secondary structure


Beta-sheet formation propensity (from Minor & Kim, Nature 1994)

High: Thr, Ile, Tyr, Phe, Val, Met, Ser

Medium: Trp, Cys, Leu, Arg

Low: Lys, Gln

Negative propensity (sheet breakers): Gly, Pro
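The propensity classes above can be turned into per-position information estimates. This is a rough sketch, assuming that a position restricted to one class accepts every residue in that class with equal probability; that equal weighting is an assumption made here, not a claim from Minor & Kim. Only the 15 residues listed on the slide are included.

```python
import math

# Residue classes exactly as listed on the slide.
BETA_SHEET_CLASSES = {
    "high": ["Thr", "Ile", "Tyr", "Phe", "Val", "Met", "Ser"],
    "medium": ["Trp", "Cys", "Leu", "Arg"],
    "low": ["Lys", "Gln"],
    "breakers": ["Gly", "Pro"],
}

def class_information(residues, alphabet_size=20):
    """Information content (bits) of a position limited to equally likely `residues`."""
    return math.log2(alphabet_size) - math.log2(len(residues))

class_bits = {name: class_information(res) for name, res in BETA_SHEET_CLASSES.items()}

for name, bits in class_bits.items():
    print(f"{name:>8}: {len(BETA_SHEET_CLASSES[name])} residues, {bits:.4f} bits")
```

Under this assumption, requiring a high-propensity residue constrains a position much less than requiring one specific amino acid.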