Computation and computational thinking in Chemistry

Paul Madden School of Chemistry

The “plan”

• My interest – atomistic, predictive calculations of the properties of materials

• Energy minimization – optimization ideas

• Cutting out the computer – application of optimization strategies in synthesis

Numerous technologies benefit from the capability to model thermodynamic and transport properties accurately & reliably

Pyroprocessing of Nuclear Waste

LiCl/KCl “solvent” – now fluorides

Are the (continuum) models of transport adequate representations of reality?

More principles:

Why simulate?

• interpretation/visualization

• provide data not obtainable by experiment

• answer problems of principle, test theory

Molecular Dynamics simulation:

Follow trajectory of interacting atoms

Newton’s Laws of Motion

Need a “Law of Force” – sometimes “pairwise additive”

(like gravitation, F ∝ 1/r²)
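
A minimal sketch of what "follow the trajectory" means in practice, assuming a Lennard-Jones pair potential as the "Law of Force" and the velocity Verlet integrator (both are illustrative choices, not taken from the slides). All names and parameters (`lj_pair`, `run_md`, reduced units with ε = σ = m = 1) are hypothetical.

```python
def lj_pair(r_vec, epsilon=1.0, sigma=1.0):
    """Return (energy, force on atom i) for one pair, with r_vec = r_i - r_j."""
    r2 = sum(c * c for c in r_vec)
    sr6 = (sigma * sigma / r2) ** 3
    energy = 4.0 * epsilon * (sr6 * sr6 - sr6)
    # F = -dU/dr along r_vec: 24*eps*(2*(s/r)^12 - (s/r)^6) / r^2 * r_vec
    scale = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
    return energy, [scale * c for c in r_vec]


def forces_and_potential(positions):
    """Sum the pair law over all pairs (this is pairwise additivity)."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    potential = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r_vec = [a - b for a, b in zip(positions[i], positions[j])]
            energy, f = lj_pair(r_vec)
            potential += energy
            for k in range(3):
                forces[i][k] += f[k]
                forces[j][k] -= f[k]  # Newton's third law
    return forces, potential


def kinetic_energy(velocities, mass=1.0):
    return 0.5 * mass * sum(c * c for v in velocities for c in v)


def run_md(positions, velocities, dt=0.002, n_steps=2000, mass=1.0):
    """Velocity Verlet integration of Newton's equations of motion."""
    pos = [list(p) for p in positions]
    vel = [list(v) for v in velocities]
    forces = forces_and_potential(pos)[0]
    for _ in range(n_steps):
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k] / mass  # half kick
                pos[i][k] += dt * vel[i][k]                  # drift
        forces = forces_and_potential(pos)[0]
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k] / mass  # second half kick
    return pos, vel


# Three atoms, slightly stretched from their equilibrium triangle, initially at rest.
start_pos = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.6, 1.0, 0.0]]
start_vel = [[0.0, 0.0, 0.0] for _ in start_pos]
e_start = forces_and_potential(start_pos)[1] + kinetic_energy(start_vel)
end_pos, end_vel = run_md(start_pos, start_vel)
e_end = forces_and_potential(end_pos)[1] + kinetic_energy(end_vel)
print(f"total energy: start {e_start:.5f}, end {e_end:.5f}")
```

With a small enough time step the total energy should stay nearly constant along the trajectory, which is the usual sanity check on an integrator of this kind.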

Electron Densities and the “Force Laws”

Covalent Ionic, Non-bonding

Overlap of two spherical, non-bonding charge densities

A stiff spring between bonded atoms

Can model the dependence on interatomic separation

Because these force laws are so simple, can model (atomistically) molecular materials of great complexity
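
As a sketch of how simple such force laws can be, the two pictures above can each be written as a one-line function of the interatomic separation r: a harmonic spring for a bonded pair, and a Coulomb term plus an exponential overlap repulsion (the Born-Mayer form, chosen here as an assumption) for an ionic pair. The parameter values (`k`, `r0`, `a`, `rho`) are made up for illustration; only the Coulomb constant, 14.3996 eV Å, is a physical value.

```python
import math


def bond_energy(r, k=500.0, r0=1.0):
    """Stiff spring between bonded atoms: U = k/2 * (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2


def ionic_pair_energy(r, q1=1.0, q2=-1.0, a=1000.0, rho=0.3):
    """Coulomb interaction plus overlap repulsion of two spherical ions.

    Units assumed: r in Angstrom, charges in units of e, energy in eV.
    """
    coulomb_const = 14.399645  # e^2 / (4 pi eps0) in eV * Angstrom
    return coulomb_const * q1 * q2 / r + a * math.exp(-r / rho)


def numerical_force(energy, r, h=1e-6):
    """Force along r as minus the central-difference derivative of the energy."""
    return -(energy(r + h) - energy(r - h)) / (2.0 * h)


for r in (0.9, 1.0, 1.1):
    print(f"bond: r = {r:.1f}, U = {bond_energy(r):.3f}, F = {numerical_force(bond_energy, r):.3f}")
for r in (0.5, 2.5):
    print(f"ion pair: r = {r:.1f}, U = {ionic_pair_energy(r):.3f} eV")
```

The spring pulls the pair back towards r0 from either side, while the ion pair is repelled at short range (overlap) and attracted at long range (Coulomb).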

Cell membrane

Phospholipid

Can visualise (qualitatively)

"These movies were made by Dr. Aleksei Aksimentiev using VMD and are owned by the Theoretical and Computational Biophysics Group, NIH Resource for Macromolecular Modeling and Bioinformatics, at the Beckman Institute, University of Illinois at Urbana-Champaign."

Ion permeation through α-haemolysin

Electron Densities and the “Force Laws”

Covalent Ionic, Non-bonding

Now “easily” manipulated by chemistry (200 years)

Control the “liaisons” affected by thermal motion

Inhibition of Cyclin Dependent Kinases (CDKs)

CDK2

ATP binding pocket

CDK2 is involved in DNA replication

It is overexpressed in cancer cells => find inhibitors

Inhibition of Cyclin Dependent Kinases (CDKs)

NU2058

NU6027

9d-NU6027

ATP

NU6102

Staurosporine

SU9516

MD simulation:

Follow trajectory of interacting atoms

Newton’s Laws of Motion

Need a “Law of Force” – sometimes pairwise additive – and this makes large-scale simulation possible

But this only works if the electrons are moving “trivially” with the nuclei

Interatomic interactions mediated by local electron density

generally, this depends on instantaneous coordination environment

Electron density for a self-interstitial in Aluminium

Can obtain the forces directly from an electronic structure calculation

“First-Principles”

Such calculations can give accurate binding energies (v.i.)

These forces can be obtained on-the-fly, from an electronic structure calculation at each step of the trajectory

Additional benefit: obtain the electronic structure

E.g. mechanism of oxidation of a silicon surface (M. Payne)

The ab initio MD methods are general and particularly useful when covalent bonds are broken and formed

But they are very expensive, meaning that many issues, requiring large simulations or long runs, are out of reach

Why simulate?

• interpretation/visualization

• provide data not obtainable by experiment

• answer problems of principle, test theory

i.e. quantitative, realistic modelling

Properties of materials under extreme conditions

Mineralogy of the earth’s interior

Phase diagram of H2O -- or is it??

1 GPa ≈ 10,000 atmospheres!!

Direct coexistence simulation – to obtain melting temperature

Determine T & P at which solid and liquid coexist in equilibrium

Size Matters:

Gillan, Alfè

The ab initio MD methods are general and particularly useful when covalent bonds are broken and formed

But they are very expensive, meaning that many issues are out of reach

Maybe we can use a simpler representation of electronic structure in some cases

e.g. in ionic materials simple force laws do not work quantitatively

Ions are not spherical – they are deformed in this environment

Maybe in “ionic” materials: electron density in an AlF3 crystal

Incorporate such ideas into the interaction potential and parameterize ab initio (A-I)

Multiscale modelling

Direct coexistence simulation to determine the melting temperature of MgO

Determine T & P at which equilibration occurs

Melting curve of MgO ab initio model

Many problems may be regarded as optimization

e.g. lowest energy structures of a cluster or a crystal

Finding a global minimum may be easy, or hard

Energy Landscape concept
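
A minimal sketch of "lowest energy structure of a cluster" as an optimization problem, assuming a three-atom Lennard-Jones cluster in reduced units and a simple steepest-descent minimizer with an adaptive step, restarted from several random geometries. The model, the minimizer and all names (`lj_energy`, `minimize`, the box size of 2.0) are illustrative assumptions, not the method used in the talk.

```python
import random


def lj_energy(coords):
    """Total Lennard-Jones energy of a cluster (reduced units)."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            sr6 = 1.0 / r2 ** 3
            total += 4.0 * (sr6 * sr6 - sr6)
    return total


def lj_gradient(coords):
    """Gradient of the total energy with respect to every coordinate."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = [a - b for a, b in zip(coords[i], coords[j])]
            r2 = sum(c * c for c in d)
            sr6 = 1.0 / r2 ** 3
            g = -24.0 * (2.0 * sr6 * sr6 - sr6) / r2
            for k in range(3):
                grad[i][k] += g * d[k]
                grad[j][k] -= g * d[k]
    return grad


def minimize(coords, n_iter=3000, step=0.01, max_move=0.1):
    """Steepest descent: accept a move only if the energy goes down."""
    x = [list(c) for c in coords]
    e = lj_energy(x)
    for _ in range(n_iter):
        grad = lj_gradient(x)
        g_max = max(abs(c) for row in grad for c in row)
        if g_max < 1e-10:
            break
        scale = min(step, max_move / g_max)  # cap the largest displacement
        trial = [[xi - scale * gi for xi, gi in zip(xr, gr)] for xr, gr in zip(x, grad)]
        e_trial = lj_energy(trial)
        if e_trial < e:
            x, e = trial, e_trial
            step = 1.2 * scale   # success: try a longer step next time
        else:
            step = 0.5 * scale   # failure: shorten the step
    return x, e


# Several random starting geometries; each run slides downhill on the landscape.
rng = random.Random(0)
results = []
for _ in range(5):
    start = [[rng.uniform(0.0, 2.0) for _ in range(3)] for _ in range(3)]
    results.append(minimize(start)[1])
print("energies found:", [round(e, 4) for e in results])
print("lowest:", round(min(results), 4))
```

For three atoms the landscape is "easy": the only minimum is the equilateral triangle, with energy -3 in these units. The number of distinct minima grows rapidly with cluster size, which is where downhill searches of this kind start to fail.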

For “hard” problems, non-minimization strategies such as “genetic algorithms” have been adopted

Structures of virus capsids

Hard for minimization

Genetic algorithm

[Figure: two “parent” bit strings, 110001101001001001110 and 00110110101101011100, are cut and recombined by crossover into “offspring”, 1100011010010 1011100 and 0011011010110 1001110; the figure also labels mutation and “fitness”]

Start with a population of “parents” and evolve successive generations, by stochastically selecting moves, to improve fitness
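
The loop described above can be sketched as follows, assuming bit-string individuals, one-point crossover, tournament selection and a toy fitness (the number of 1s in the string). The fitness function, population size, mutation rate and all function names are placeholders chosen for illustration; a real application would substitute its own encoding and fitness.

```python
import random


def fitness(bits):
    """Toy fitness (an assumption): number of 1s in the string."""
    return sum(bits)


def crossover(parent_a, parent_b, rng):
    """One-point crossover: swap the tails of two parents at a random cut."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def select(population, rng, k=3):
    """Tournament selection: the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)


def evolve(n_bits=20, pop_size=30, n_generations=40, mutation_rate=0.02, seed=1):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = [max(fitness(p) for p in population)]
    for _ in range(n_generations):
        next_gen = [max(population, key=fitness)]  # elitism: keep the current best
        while len(next_gen) < pop_size:
            a, b = select(population, rng), select(population, rng)
            for child in crossover(a, b, rng):
                next_gen.append(mutate(child, mutation_rate, rng))
        population = next_gen[:pop_size]
        history.append(max(fitness(p) for p in population))
    return max(population, key=fitness), history


best, history = evolve()
print("best fitness per generation:", history)
print("best individual:", "".join(map(str, best)))
```

Because the best individual is always carried over, the best fitness can never decrease from one generation to the next; without that choice the search is still a GA, but progress is no longer monotonic.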

Representation of problems within GA paradigm

Folding a protein, which should be “hard”, must actually be easy (for nature – simulated annealing works!).
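
A minimal sketch of simulated annealing, assuming a one-dimensional toy landscape (a broad bowl with ripples, standing in for a funnel) and the Metropolis acceptance rule with a geometric cooling schedule. The landscape, schedule and all parameter values are assumptions made for illustration; nothing here models a real protein.

```python
import math
import random


def energy(x):
    """Toy landscape (an assumption): overall bowl plus many local minima."""
    return 0.1 * x * x + math.sin(5.0 * x)


def simulated_annealing(x0, t_start=2.0, t_end=0.01, n_steps=20000, step=0.5, seed=0):
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (n_steps - 1))
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Metropolis rule: always accept downhill, sometimes accept uphill.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = simulated_annealing(8.0)
print(f"start energy {energy(8.0):.3f}, best energy found {best_e:.3f} at x = {best_x:.3f}")
```

At high temperature the walker crosses the ripples freely and feels only the overall slope of the bowl; as it cools it settles into one of the deep minima near the bottom.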

Primary Structure: Sequence

• The primary structure of a protein is the amino acid sequence

• Twenty different amino acids have distinct shapes and properties

Typical protein will contain ~ 200 links

Secondary Structure: α, β, & loops

α helices and β sheets are stabilized by hydrogen bonds between backbone oxygen and hydrogen atoms

Tertiary Structure: A Protein Fold

Proteins only work when properly folded

Levinthal paradox, 1968

• A polypeptide chain of 100 residues (amino acids)

• Each residue has only 2 possible configurations

• 2^100 ~ 10^30 configurations

• 10^-11 second is required to convert one to another

• 10^19 seconds ~ 10^11 years!

• Doubling time for a bacterium is < 30 minutes

• Molten globule (microsecond ~ millisecond)

• Native state (millisecond ~ seconds)
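
The arithmetic behind the paradox can be checked in a few lines, using only the numbers quoted above (100 residues, 2 configurations each, 10^-11 s per conversion). The function name and the length of the year used are incidental choices.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 s


def exhaustive_search_time(n_residues=100, states_per_residue=2, time_per_step=1e-11):
    """Time to visit every configuration once, in seconds and in years."""
    n_configurations = states_per_residue ** n_residues
    seconds = n_configurations * time_per_step
    return n_configurations, seconds, seconds / SECONDS_PER_YEAR


n_configurations, seconds, years = exhaustive_search_time()
print(f"configurations: {n_configurations:.2e}")
print(f"search time:    {seconds:.2e} s = {years:.2e} years")
```

This gives about 1.3 × 10^30 configurations, 1.3 × 10^19 s and 4 × 10^11 years, consistent with the order-of-magnitude figures on the slide.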

Idea of a folding “funnel”

“Foldability” must be encoded in the amino acid sequence

Schematic representation of some of the states accessible to a polypeptide chain following its biosynthesis

We know the amino acid sequence from the genome project

A major objective is to be able to predict the fold from a knowledge of the sequence

The folded structures of some proteins are known from crystallography

Inhibition of Cyclin Dependent Kinases (CDKs)

Binding Energy (eV)

Must go out to large distances to get convergence

Can calculate binding energy of molecule at active site

Identifying drug molecules by direct calculation of energetics is far too slow for practical applications

Instead use QSAR: Quantitative Structure–Activity Relationships

Activity = function(prop1,prop2,prop3,prop4,…)

prop is a readily-determined property of each potential drug mol.

Use a training set of drug mols to “determine” function (neural net)

Search huge databases of mols > 10^6

=> targets for synthesis and testing
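
A minimal sketch of the QSAR workflow above: fit Activity = function(prop1, prop2, …) on a training set, then rank a database by predicted activity. The slides mention a neural net; a linear least-squares model is swapped in here to keep the example short. The descriptors, the "true" activity rule and the database are synthetic and invented for the demo, and all function names are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_qsar(descriptors, activities):
    """Least-squares weights via the normal equations; last entry is the intercept."""
    rows = [list(d) + [1.0] for d in descriptors]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, activities)) for i in range(n)]
    return solve_linear_system(ata, aty)


def predict(weights, descriptor):
    """Predicted activity of one molecule from its descriptors."""
    return sum(w * p for w, p in zip(weights, descriptor)) + weights[-1]


rng = random.Random(42)
true_weights, true_intercept = [0.8, -0.5, 0.3], 1.0  # hidden rule, invented for the demo


def true_activity(d):
    return sum(w * p for w, p in zip(true_weights, d)) + true_intercept


# "Training set": 20 molecules with three descriptors each and known activity.
training = [[rng.random() for _ in range(3)] for _ in range(20)]
weights = fit_qsar(training, [true_activity(d) for d in training])

# "Database": 1000 candidate molecules, ranked by predicted activity.
database = [[rng.random() for _ in range(3)] for _ in range(1000)]
targets = sorted(database, key=lambda d: predict(weights, d), reverse=True)[:5]
print("fitted weights:", [round(w, 3) for w in weights])
print("top candidates:", [[round(p, 2) for p in d] for d in targets])
```

Scoring a molecule is then a handful of multiplications, which is what makes screening databases of more than 10^6 molecules feasible where a direct binding-energy calculation is not.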

However, the “properties” of relevance are defined on 3-d grids

e.g. of the electrostatic potential, or the hydrophobicity of the molecule

- which should match that of the binding site

But, the molecule (& grid) must be aligned with pocket

And, property varies with the conformation of the molecule !

Leads to huge search problems – screen-savers: http://www.bellatrix.ox.ac.uk

Oxide glasses with many components

Step 1: prepare “gene pool” of 54 glasses made up with randomly chosen compositions

Step 2: measure their luminosities – “fitness”

Cutting out the computer!

“Engineering” to produce arrays of such chemicals and to screen them for desirable characteristics is now well-established

“Combinatorial Chemistry” e.g. Prof. Mark Bradley

(Huge arrays possible)

Generate a second generation stochastically and “evolve”

Drug Discovery Today !
