Topological specification of ensembles of molecules as a basis of stereochemical considerations


Journal of Molecular Structure (Theochem) 336 (1995) 209-225 ELSEVIER


Nicole Müller, Stefan Reichelt, Antje Senff, Ivar Ugi*

Organisch-Chemisches Institut, Technische Universität München, Lichtenbergstraße 4, 85747 Garching, Germany

Received 6 September 1994; accepted 6 October 1994

Abstract

A basis for a definite (unambiguous) notation of ensembles of molecules is presented. It also takes dynamic stereochemical and conformational properties into account. Mathematical methods introduced with permutational isomerism allow the consistent description of atoms of any valency. A concept for the modelling of stereochemical reactions is developed.

1. Introduction

Our intention is to resolve complex stereochemical problems in dialogue with a computer. It is therefore necessary to find a suitable way of modelling ensembles of molecules and their reactions, and the stereochemical information must be structured accordingly. This requires a reduction to the essential details of the ensemble, which means that the important parts of the molecule must be found automatically. The resulting reduced molecule data are structured by equivalence relations. To keep the statement general, we do not take physical or metric data into account.

Stereochemical questions mostly concern reactions in which different stereoisomers may be produced. The theory of the chemical identity group [1] serves as a basis for the solution of these problems. It has been established as a universal instrument for the formal description of dynamic chemistry.

For the description of molecules, their constitution as well as their configuration, a definite notation of the molecule data is necessary; that is, there must exist exactly one notation for every single ensemble. No further rules for special cases should be defined. This is the main topic of this work.

No existing algorithm provides a complete definite notation. The Cahn-Ingold-Prelog (CIP) rules [2] fail for atoms of higher valency. CANON is also based on the CIP rules. The Morgan algorithm, developed in the early 1960s and applied by the Chemical Abstracts Service, is also subject to limitations.

For the definite description of ensembles of molecules our algorithm makes use of topological and graph theoretical methods and works step by step. Starting from uncoloured molecular graphs, first the chemical nature of the vertices and finally, by group theoretical means, their stereochemistry are introduced. If the specification of the molecules is not carried out completely, this hierarchical ordering allows us to compare structural properties of chemically different molecules, for example to transfer reaction patterns.

* Dedicated to the profound progress of Howard E. Simmons in chemistry and topology.
* Corresponding author.

Since the work of Sir Arthur Cayley [3], many groups have made successful use of mathematical methods in chemistry, among them H.E. Simmons, R.E. Merrifield, G. Polya, D.H. Rouvray and A.T. Balaban, to mention just a few.

The data produced by graph theoretical and topological considerations are often correlated with physical or physicochemical properties such as boiling point [4] or electrotopological state [5]. The results of these correlations are specific to certain substances or classes of substances. In developing a universal concept, however, we must avoid relying on any such specific properties.

Many chemical and physical phenomena that characterize a molecule are based on stereochemistry. The dynamic structural properties of molecules cannot be described adequately by the rigid geometric properties of models, which suggest symmetries that do not exist. A rigid model takes no account of isomerizations or of motions such as the inner rotations and vibrations of the objects.

The theory of the chemical identity group avoids unjustified assumptions about the objects. It considers the geometric structure and the chemical aspects of molecules simultaneously. The foundation of this theory is the concept of permutational isomerism [6], which permits the group theoretical treatment of stereoisomerism. Thus it is possible to describe the inner and outer motions. Furthermore, the theory considers not only single molecules but also ensembles of molecules, which are important for the modelling of chemical reactions.

In order to apply this theory and to characterize and count the different permutation isomers, equivalent atoms and groups of atoms (ligands) must be found. Therefore the topology of an ensemble of molecules must be analysed. The topology of a molecule results from the combined topology of all its atoms (monocentres).

2. The theory of the chemical identity group as a basis for the solution of stereochemical questions

The chemical identity group provides algebraic solutions to stereochemical questions. A complete introduction is found in Refs. [1] and [7]. The basis of this theory is the concept of permutation isomerism [6], which enables the introduction of group theoretical methods into stereochemistry. Ugi and Dugundji did not consider single rigid molecules but ensembles of molecules with a certain dynamic. Thus they combined the geometrical aspects with the chemical aspects of the ensembles of molecules. A chemical compound may be represented by a set of chemically identical molecules. Chemically identical molecules can differ in their structure, but they are not distinguishable by any chemical or physicochemical method, since they interconvert spontaneously.

In order to solve chemical questions it is sufficient to consider only those parts of a molecule which concern the problem. For that purpose, a molecule must be divided into a skeleton and a set of ligands. If the composition of a ligand is not important to a problem, the ligand is considered as one macro-atom.

The quality of a solution depends on an ingenious partition into skeleton and ligands as well as on the specification of these sets. Computer support is the precondition for an efficient application of this concept. By analysing the molecule, the stereochemically important molecular parts are found, and suitable partitions are then proposed. Through this abstraction, known reactions of similar molecular structures can be taken into account in the solution.

Permutational isomerism is the central part of this theory. Exchanging the ligands at a skeleton produces permutational isomers. For the characterization and the quantitative determination of the permutational isomers, it is necessary to find the equivalent ligands. Therefore the topological properties of molecules are interpreted.

An ensemble of molecules is characterized by its chemical composition, the connectivities, the stereochemistry of the monocentres, the conformation and additional identity preserving permutations. Generally, the identity preserving permutations are defined as permutations that transform an ensemble of molecules into a chemically identical ensemble. The most important identity preserving permutations are outer rotations, but they can also be produced by the identity of ligands, by isomerizations such as tautomerism and fluctuations, or by delocalized electrons.

The new aspects here are the intensive analysis of the chemical graph of molecules and the modelling of stereochemical properties of the ensemble by the stereochemical data of its monocentres. A particular advantage is that the cumbersome creation of the identity preserving permutations of polycentres, such as the whole molecule, is avoided.

The stereochemistry of the monocentres is determined by means of models. Each ligand occupies a topological position of the model. Permutations of ligands mean the exchange of the ligands on these positions. The relative arrangement of the positions (e.g. axial) is fixed. Thus the current relative arrangement of the ligands is determined. These models are not produced from metric data such as bond lengths or bond angles; they are based on idealized structures, for example an octahedron if the monocentre is hexavalent. Deviations from the idealized model which result from different ligands or intermolecular forces are registered by identity preserving permutations. Molecular symmetries are not determined by geometry but exclusively by the relative arrangement of identical ligands.

Considering the skeleton of a model E with n different ligands L, there exist n! permutation isomers; these form the family of permutation isomers P(E). Some of these permutation isomers can be interconverted by identity preserving permutations. If exclusively the outer rotations are taken into account, the following group theoretical representation of permutational isomerism is obtained.

The family of permutation isomers corresponds to the symmetric group Sym(L) with cardinality n!. The identity preserving permutations form a subgroup S(E) of Sym(L). Sym(L) is divided into cosets of S(E), each of which represents a single permutation isomer. The cardinality of each coset equals |S(E)|, the number of outer rotations. Therefore the maximal number of different cosets is n!/|S(E)|.

The ligands are enumerated, and permutations of ligands are noted as permutations of these numbers. The resulting permutations of each co-set are ordered lexically. In the following, the smallest element of each co-set shall represent it. This criterion is fixed arbitrarily; the biggest element could be chosen as well. All that is required is a definite and consistent assignment, in order to avoid cumbersome comparisons of complete cosets. If the complete cosets are nevertheless needed, they can be generated through the identity preserving permutations.

3. Characterizing the topology of different stereoisomers by identity preserving permutations

If the only reason for different chemical or physical properties of a molecule is a difference in the arrangement of the ligands, these atoms or groups of atoms are stereochemically interesting. The theory of the chemical identity group enables an algebraic formulation: at stereochemically important sites there exist permutations of ligands that preserve the connectivities but lead to stereochemically different ensembles of molecules. Of course, the interchange of chemically identical ligands results in chemically identical ensembles. Therefore it is required to find those equivalent ligands.

There is an important difference between permutational isomerism and stereoisomerism: there exist permutation isomers which are not simultaneously stereoisomers and vice versa. “Permutation isomers with a monocentric skeleton are always stereoisomeric, but in the case of permutation isomers with a polycentric skeleton some members of a family of permutation isomers have the same chemical constitution and thus are stereoisomers, while others are only constitutional isomers” [1(b)]. In the following we consider only monocentric permutation isomers, since the monocentres may be combined to polycentric skeletons.

3.1. Specification of monocentres

Let A be a hexavalent octahedral monocentre with six distinguishable ligands. There exist 6! = 720 permutational isomers. The number of all identity preserving permutations is 24. That means 24 outer rotations. They divide the 720 permutational isomers into 30 cosets (stereoisomers).
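The counting argument above can be checked with a small sketch. It is only an illustration under stated assumptions: the numbering of the octahedral positions (1 = +z, 2 = -z, 3 = +x, 4 = +y, 5 = -x, 6 = -y), the two generating quarter turns and all function names are hypothetical choices, not taken from the paper. Each arrangement is a tuple whose entry i is the ligand at position i, and the smallest element of each co-set is used as its representative.

```python
from itertools import permutations

def compose(a, g):
    """Arrangement after the rotation g: position i receives the ligand a[g[i]]."""
    return tuple(a[i - 1] for i in g)

def closure(generators):
    """All permutations that can be built from the generators (a finite group)."""
    identity = tuple(range(1, len(generators[0]) + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in generators:
            q = compose(p, g)
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

# Assumed numbering: 1 = +z, 2 = -z, 3 = +x, 4 = +y, 5 = -x, 6 = -y.
C4_Z = (1, 2, 4, 5, 6, 3)  # quarter turn about z: cycles positions 3, 4, 5, 6
C4_X = (6, 4, 3, 1, 5, 2)  # quarter turn about x: cycles positions 4, 1, 6, 2

ROTATIONS = closure([C4_Z, C4_X])

def representative(a):
    """Lexically smallest member of the co-set of arrangement a."""
    return min(compose(a, g) for g in ROTATIONS)

REPRESENTATIVES = sorted({representative(a) for a in permutations(range(1, 7))})

print(len(ROTATIONS), len(REPRESENTATIVES))
```

With six distinguishable ligands the sketch reproduces the numbers quoted in the text: 24 outer rotations and 720/24 = 30 cosets.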

If some ligands of a monocentre are equivalent, there are additional identity preserving permutations, which describe the interchange of these ligands. As a consequence some cosets are combined, and the number of different stereoisomers decreases. The number and the structure of these stereoisomers and the additional identity preserving permutations can be recognized automatically by analysing the substitutional patterns and the relative arrangements of the ligands.

The algorithm is as follows.

(1) Copy the substitutional pattern to all representatives of the cosets, and put them into a list L.

(2) Take the first element of L and put it in a list L’. Construct its complete co-set by applying the identity preserving permutations to this element.

(3) Compare the co-set with all elements of L.

(4) Unite equivalent cosets, remove them from L, and generate the additional identity preserving permutations.

(5) If L still contains two or more representatives, return to step (2).

(6) Place the last representative in L’.

The number of elements in L’ is the number of different stereoisomers, and the stereoisomers can be visualized by means of the model.

The algorithm thus allows the computation of all cosets that belong to the stereoisomers of a given monocentre, together with the further identity preserving permutations.

The algorithm is illustrated by the following example. The monocentre A is a model with a trigonal bipyramidal skeleton and the substitutional pattern A2BCD. The positions of the trigonal bipyramidal model are enumerated as shown in Fig. 1. The representatives of the cosets and the corresponding substitutional patterns are shown in Table 1. The co-set of the first representative is generated by the identity preserving permutations (idP).

Fig. 1. Enumeration of the trigonal bipyramidal model.

Table 1 Substitutional patterns corresponding to the representatives

idP: [1 2 3 4 5], [1 2 4 5 3], [1 2 5 3 4], [2 1 3 5 4], [2 1 4 3 5], [2 1 5 4 3]

The co-set of the first representative:

[1 2 3 4 5] [A A B C D]
[1 2 4 5 3] [A A C D B]
[1 2 5 3 4] [A A D B C]
[2 1 3 5 4] [A A B D C]
[2 1 4 3 5] [A A C B D]
[2 1 5 4 3] [A A D C B]

Comparing this co-set with the other representatives, the following cosets are combined:

1. [1 2 3 4 5], [1 2 3 5 4]
2. [1 3 2 4 5], [2 3 1 4 5]
3. [1 3 2 5 4], [2 3 1 5 4]
4. [1 4 2 3 5], [2 4 1 3 5]
5. [1 4 2 5 3], [2 4 1 5 3]
6. [1 5 2 3 4], [2 5 1 3 4]
7. [1 5 2 4 3], [2 5 1 4 3]
8. [3 4 1 2 5], [3 4 1 5 2]
9. [3 5 1 2 4], [3 5 1 4 2]
10. [4 5 1 2 3], [4 5 1 3 2]

Additional idP:

1. [1 2 3 5 4]    6. [3 2 1 4 5]
2. [3 2 1 4 5]    7. [3 2 1 4 5]
3. [3 2 1 4 5]    8. [1 2 3 5 4]
4. [3 2 1 4 5]    9. [1 2 3 5 4]
5. [3 2 1 4 5]    10. [1 2 3 5 4]

Fig. 2. The stereoisomer [1 2 3 4 5] with the substitutional pattern A2BCD.

Generally, the number of the united cosets and the additional identity preserving permutations can be different for each new co-set.
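The worked example can be reproduced with a short sketch of steps (1) to (6). The identity preserving permutations and the substitutional pattern are those of Table 1; the function names, the convention that position i receives ligand a[g[i]], and the way the additional idP is computed (the permutation that carries the first representative of a class into the second) are assumptions made for illustration.

```python
from itertools import permutations

# Identity preserving permutations (outer rotations) from Table 1.
IDP = [(1, 2, 3, 4, 5), (1, 2, 4, 5, 3), (1, 2, 5, 3, 4),
       (2, 1, 3, 5, 4), (2, 1, 4, 3, 5), (2, 1, 5, 4, 3)]
# Substitutional pattern A2BCD: ligands 1 and 2 are both of sort A.
SORT = {1: "A", 2: "A", 3: "B", 4: "C", 5: "D"}

def compose(a, g):
    """Position i of the result holds the ligand a[g[i]]."""
    return tuple(a[i - 1] for i in g)

def coset(a):
    return sorted(compose(a, g) for g in IDP)

def pattern(a):
    return "".join(SORT[x] for x in a)

# Step (1): list L of co-set representatives (lexically smallest elements).
REPS = sorted({coset(a)[0] for a in permutations(range(1, 6))})

def unite(reps):
    """Steps (2)-(6): unite cosets that show the same substitutional patterns."""
    remaining, classes = list(reps), []
    while remaining:
        first = remaining.pop(0)
        patterns = {pattern(x) for x in coset(first)}
        same = [r for r in remaining if pattern(r) in patterns]
        for r in same:
            remaining.remove(r)
        classes.append([first] + same)
    return classes

CLASSES = unite(REPS)

def additional_idp(a, b):
    """Permutation h with compose(a, h) == b."""
    return tuple(a.index(x) + 1 for x in b)

EXTRA = [additional_idp(c[0], c[1]) for c in CLASSES if len(c) > 1]

for number, members in enumerate(CLASSES, start=1):
    print(number, members)
```

Under these conventions the sketch yields 20 representatives, 10 united classes in the order of the list above, and the additional idP column of Table 1.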

In Fig. 2 the first stereoisomer is shown.

3.2. Chirality at monocentres

A definition of chirality which also reflects the inner dynamic of an ensemble of molecules is given in Ref. [8]: “A molecule is chemically chiral under given observation conditions, if there is one momentary geometry of that molecule, which cannot be superimposed on its mirror image by using only rotations, translations, and those intramolecular motions, that can occur under the observation conditions”.

One way of analysing chirality is to determine the point group of a molecule. A molecule is chiral if it has no improper rotation axis $S_n$. Chiral molecules thus belong to the point groups $C_n$, $D_n$, I, T and O [9]. A great disadvantage of this method is that the metric information must be known in order to take distortions of a molecule into account. The concept of chiral elements [2] also requires metric data to describe such distortions. Our aim is to determine chirality and other stereochemical properties by using models and a definite notation, the s- and r-vectors.

Chirality depends on the chemical difference and the relative arrangement of the ligands. The higher the inner symmetry, the more different ligands a molecule must have in order to form chiral isomers. Models of the permutations describe the relative arrangement of the ligands.

Table 2 Relations between chirality and numbers of identical ligands

Number of coordination | Type of coordination | Ligands | Substitutional pattern
4 | tetrahedral | 1+1+1+1 | ABCD
5 | trigonal bipyramidal | 2+2+1 | AABBC
7 | pentagonal bipyramidal | 4+2+1 | AAAABBC

In order to determine whether a stereoisomer is chiral, the substitutional pattern must be checked first. For each coordination number and each coordination type, a certain number of different ligands is required for an isomer to be chiral (Table 2). If these criteria are met, the arrangement of the ligands is considered. For example, two equivalent ligands may never both occupy the axial positions, since the interchange of these ligands is allowed; an identical mirror image can then exist (Table 3).

The chiral monocentres of each valency and each coordination type are thus determined. We assign no descriptors such as R or S to the enantiomers, since this nomenclature is only valid for tetrahedral centres. Instead we note the representative of each co-set of the stereoisomer. Enantiomeric pairs can be found by permutations corresponding to a mirror reflection. Two stereoisomers are diastereomers if they cannot be related by these permutations.

Table 3 Non-chiral arrangements with a given number of identical ligands

Coordination type | Number of identical ligands | Relative positions
trigonal bipyramidal | 2 | axial
trigonal bipyramidal | 2 | equatorial
tetragonal pyramidal | 2 | diagonal
tetragonal pyramidal | 3 | in the base
octahedral | 2 | axial
pentagonal bipyramidal | 2 | axial
pentagonal bipyramidal | 4 | equatorial

Fig. 3. Molecule 1.

The resulting stereochemical information of the monocentres is combined with stereochemical properties of the polycentres.

4. Graphs of chemical molecules

Many properties of chemical molecules can be represented by their (chemical) graphs.

Let M be an ensemble of molecules given as a set of atoms $A = \{A_1, A_2, A_3, \ldots, A_n\}$ and a set of bonds $B = \{B_1, B_2, B_3, \ldots, B_m\}$. A graph G may then be built from the set V of vertices, which corresponds to A, and a set E of edges obtained by reducing B: all bonds between two connected atoms are described by only one edge. The nature of the bonds may be characterized by a label such as SINGLE, DOUBLE, AROMATIC, CONJUGATED, etc. This is not absolutely necessary, however, since all information is contained in A and B. It is also possible to describe non-classical bonds such as hydrogen bonds in the graph.

Table 4 Adjacency matrix of 1

Atom | 1 2 3 4 5 6 7 8 9
1 | 0 1 0 0 0 0 0 0 0
2 | 1 0 1 0 0 0 1 0 0
3 | 0 1 0 1 1 0 0 0 0
4 | 0 0 1 0 0 0 0 0 0
5 | 0 0 1 0 0 1 0 1 1
6 | 0 0 0 0 1 0 0 0 0
7 | 0 1 0 0 0 0 0 0 0
8 | 0 0 0 0 1 0 0 0 0
9 | 0 0 0 0 1 0 0 0 0

4.1. The adjacency matrix of a molecule

The topological properties of a given molecule result from the connectivity of the atoms (Figs. 3 and 4). The adjacency matrix A(G) of the graph G(M) reflects the connectivity of the molecule M. For the calculation of this matrix the nodes of G must be enumerated.

The matrix A is defined by

$$a_{ij} = \begin{cases} 1, & \text{if there is a connection between } i \text{ and } j \\ 0, & \text{otherwise} \end{cases}$$

The adjacency matrix is calculated directly from the list of bonds of M. For all atoms which participate in a bond, the corresponding entries are set to the value 1. Bonds with delocalized electrons and multicentre bonds are represented by a corresponding number of edges (Table 4). The sum of all entries in row j is called the degree $d_j$ of the vertex j. Note that $d_j$ may differ from the chemical valence of the corresponding atom j.

• All vertices with $d_j = 1$ form the set $A_T$ of terminal atoms (in Table 4, atoms 1, 4, 6, 7, 8, 9).

• All vertices with $d_j > 1$ form the set $A_I$ of non-terminal atoms (in Table 4, atoms 2, 3, 5).

Fig. 4. The graph of 1 and the arbitrarily enumerated graph

Table 5 Adjacency matrix of the non-terminal atoms of 2

For many problems it is sufficient to examine only the backbone of the molecule, i.e. the atoms in $A_I$. The adjacency matrix of the non-terminal atoms is generated by striking out all rows and columns which contain only a single “1” (Table 5). All lateral chains can also be removed by applying this step iteratively.

• The number of edges closing rings in M follows from the sum of all $d_j$ in A. A linear graph with n vertices has n - 1 edges, and any additional edge closes a ring. The adjacency matrix is symmetric, $a_{ij} = a_{ji}$, so that every edge is counted twice. Therefore, with s the sum of all entries “1”, the number of ring closing edges r of a single molecule is

$$r = \frac{s}{2} - n + 1.$$

Generally, for an ensemble of c molecules the number of ring closing edges is $r = s/2 - n + c$ (Table 6).
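The quantities of this section can be sketched in a few lines. The bond lists are read off the adjacency and distance matrices of molecules 1 and 2 (Tables 4 and 12); the function names are hypothetical, and the number of molecules c is obtained here by a simple component search, which is an implementation choice rather than the authors' procedure.

```python
def adjacency_matrix(n, bonds):
    """Symmetric 0/1 matrix; atoms are numbered from 1 in the bond list."""
    a = [[0] * n for _ in range(n)]
    for i, j in bonds:
        a[i - 1][j - 1] = 1
        a[j - 1][i - 1] = 1
    return a

def count_molecules(a):
    """Number c of connected components of the graph."""
    seen, c = set(), 0
    for start in range(len(a)):
        if start in seen:
            continue
        c += 1
        stack = [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            for w, bonded in enumerate(a[v]):
                if bonded and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return c

def ring_closing_edges(a):
    """r = s/2 - n + c, with s the sum of all entries of the matrix."""
    s = sum(map(sum, a))
    return s // 2 - len(a) + count_molecules(a)

def backbone(a):
    """Strike out rows and columns that contain only a single 1."""
    keep = [i for i, row in enumerate(a) if sum(row) > 1]
    return [i + 1 for i in keep], [[a[i][j] for j in keep] for i in keep]

BONDS_1 = [(1, 2), (2, 3), (2, 7), (3, 4), (3, 5), (5, 6), (5, 8), (5, 9)]
BONDS_2 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 7), (7, 4)]

A1 = adjacency_matrix(9, BONDS_1)
A2 = adjacency_matrix(7, BONDS_2)
DEGREES_1 = [sum(row) for row in A1]
TERMINAL_1 = [j + 1 for j, d in enumerate(DEGREES_1) if d == 1]

print(DEGREES_1, TERMINAL_1, backbone(A1)[0])
print(ring_closing_edges(A1), ring_closing_edges(A2))
```

For molecule 1 the terminal atoms are 1, 4, 6, 7, 8, 9 and no edge closes a ring; the bicyclic skeleton of molecule 2 has 8 edges on 7 atoms and therefore two ring closing edges.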

4.2. The distance matrix of a molecule

Another important matrix for the topological analysis of chemical graphs is the distance matrix D, which contains the distances between all vertices of G:

$$d_{ij} = \begin{cases} 0, & \text{if } i = j \\ \text{the shortest distance between } i \text{ and } j, & \text{otherwise} \end{cases}$$

An algorithm for calculating D is found, for example, in Ref. [10].

Table 6 Number of edges closing rings in some molecules

Table 7 Distance matrix of 1

Atom | 1 2 3 4 5 6 7 8 9
1 | 0 1 2 3 3 4 2 4 4
2 | 1 0 1 2 2 3 1 3 3
3 | 2 1 0 1 1 2 2 2 2
4 | 3 2 1 0 2 3 3 3 3
5 | 3 2 1 2 0 1 3 1 1
6 | 4 3 2 3 1 0 4 2 2
7 | 2 1 2 3 3 4 0 4 4
8 | 4 3 2 3 1 2 4 0 2
9 | 4 3 2 3 1 2 4 2 0

In molecule 1 the distance between atoms 1 and 4 is three bonds (Table 7). Within a single molecule the highest possible value in D is n - 1 (with n the number of atoms in M), which is reached if the molecule is completely linear. The value n means that there is no path between two atoms; such atoms belong to different molecules of the ensemble. Thus it is easy to combine the distance matrices of single molecules into the matrix of the ensemble (Fig. 5, Tables 8 and 9). The combination of these two matrices yields the matrix of the ensemble of two molecules (number of atoms n = 9) (Table 10).

Table 8 Distance matrix of methanol

Atom | 1 2 3 4 5 6
1 | 0 1 1 1 2 1
2 | 1 0 2 2 3 2
3 | 1 2 0 2 3 2
4 | 1 2 2 0 3 2
5 | 2 3 3 3 0 1
6 | 1 2 2 2 1 0

Table 9 Distance matrix of water

Atom | 1 2 3
1 | 0 1 1
2 | 1 0 2
3 | 1 2 0
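As an illustration of how the distance matrix of an ensemble might be computed, the sketch below uses a breadth-first search and fills in the value n wherever no path exists. It is not the algorithm of Ref. [10]; the function name is hypothetical, and the renumbering of the water atoms as 7, 8 and 9 is an assumption suggested by the heading of Table 10.

```python
from collections import deque

def distance_matrix(n, bonds):
    """Shortest bond distances; the value n marks atoms without a connecting path."""
    neighbours = [[] for _ in range(n)]
    for i, j in bonds:
        neighbours[i - 1].append(j - 1)
        neighbours[j - 1].append(i - 1)
    d = [[n] * n for _ in range(n)]
    for start in range(n):
        d[start][start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbours[v]:
                if d[start][w] == n:  # not reached yet
                    d[start][w] = d[start][v] + 1
                    queue.append(w)
    return d

# Methanol as in Table 8: atom 1 = C, atoms 2-4 = H, atom 5 = hydroxyl H, atom 6 = O.
# Water atoms renumbered 7 (O), 8 and 9 (H) for the ensemble.
ENSEMBLE_BONDS = [(1, 2), (1, 3), (1, 4), (1, 6), (5, 6), (7, 8), (7, 9)]
D = distance_matrix(9, ENSEMBLE_BONDS)

for row in D:
    print(*row)
```

The upper left 6 x 6 block reproduces Table 8, the lower right 3 x 3 block reproduces Table 9, and all entries linking the two molecules equal n = 9.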

• Rings cause characteristic patterns in the distance matrix (Fig. 6).

These patterns appear with a characteristic frequency according to the size of the rings (Table 11). Since the distance matrix contains the shortest distances, it always reflects the smallest rings (Fig. 7, Tables 12 and 13). Atoms 1, 4 and 7 belong to a ring of size five, and atoms 2, 3, 5 and 6 to a ring of size six.

Fig. 5. Graphs of methanol and water.

Table 10 Distance matrices of an ensemble consisting of methanol and water

Table 11 Patterns of rings with three to six atoms

Fig. 6. Distances in rings.

Fig. 7. Molecule 2.

Table 12 Distance matrix of 2

Atom | 1 2 3 4 5 6 7
1 | 0 1 2 2 2 1 1
2 | 1 0 1 2 3 2 2
3 | 2 1 0 1 2 3 2
4 | 2 2 1 0 1 2 1
5 | 2 3 2 1 0 1 2
6 | 1 2 3 2 1 0 2
7 | 1 2 2 1 2 2 0

Table 13 Possible smallest rings in 2

Pattern | Max. size
1 1 2 2 | 5
1 1 2 2 3 | 6
1 1 2 2 3 | 6

5. Topology, a basis for structuring chemical graphs

5.1. Topology of the non-coloured chemical graph

Let the topology of an atom be defined by the shortest distances to all other atoms of an ensemble of molecules, expressed as numbers of bonds.

The entries in a single row of the distance matrix belonging to an atom $A_i$ give the number of atoms surrounding $A_i$ at each distance.

Atoms in the same topological surroundings have the same entries in their rows of the distance matrix, although in a different order. In order to find such equivalent atoms, the entries of every single row are sorted (e.g. row (0, 1, 2, 3, 3, 4, 2, 4, 4) is transformed to (0, 1, 2, 2, 3, 3, 4, 4, 4)) (Table 14). The matrix rows are then arranged according to the number of first neighbours; if these are equal, according to the number of second neighbours, and so on.

Table 14 Topologically equivalent atoms in 1: a (1, 7); b (6, 8, 9)

Atom | Sorted row | Group
1 | 0 1 2 2 3 3 4 4 4 | a
2 | 0 1 1 1 2 2 3 3 3 |
3 | 0 1 1 1 2 2 2 2 2 |
4 | 0 1 2 2 3 3 3 3 3 |
5 | 0 1 1 1 1 2 2 3 3 |
6 | 0 1 2 2 2 3 3 4 4 | b
7 | 0 1 2 2 3 3 4 4 4 | a
8 | 0 1 2 2 2 3 3 4 4 | b
9 | 0 1 2 2 2 3 3 4 4 | b

For our example we obtain the matrix given in Table 15. This new order gives temporary priorities to the atoms. If there are groups with the same priority, the priorities of the neighbours are analysed; these are sorted and then used to calculate new priorities. These steps are repeated until the priorities remain constant.

Table 15 Matrix of 1 sorted by rows and the resulting priorities

Table 16 Groups with the same priority in 1

In our example, atoms 6, 8 and 9 are given the priority 4, and atoms 1 and 7 get priority 8. Comparing the priorities of their neighbours does not lead to any changes (Table 16). Some numbers are skipped when the priorities are allocated (1, 2, 3, 4, 4, 4, 7, 8, 8 instead of 1, 2, 3, 4, 4, 4, 5, 6, 6). Thus, if the priorities within a group change, the others do not have to be re-enumerated.

The complete application of this algorithm yields the topological priorities of the non-coloured chemical graph. The criteria for ordering the matrix rows emphasize the degree of the atoms (and thereby, qualitatively, the chemical valence): the smaller the index of an atom, the more neighbours it has. Topologically equivalent atoms obtain the same priority. The equivalence of two atoms is valid for the complete molecule.

Fig. 8. Some examples of topological priorities in molecules.

Fig. 9. Priorities in tartaric acid.

Written as a vector p, the priorities may be taken as an index of the topological symmetry, e.g. p = (1, 2, 3, 4, 4, 4, 7, 8, 8) (Fig. 8).

Molecules like benzene, p = (1, 1, 1, 1, 1, 1, 7, 7, 7, 7, 7, 7), or cubane, p = (1, 1, 1, 1, 1, 1, 1, 1, 9, 9, 9, 9, 9, 9, 9, 9), reflect the topological symmetry well. The vectors provide a simple topological comparison between molecules. The number of topologically equivalent atoms follows from the numbering itself, so it is sufficient to indicate which priorities occur. In benzene (1, 1, 1, 1, 1, 1) is followed by (7, 7, 7, 7, 7, 7), but (1, 7) contains the same information if the number n of atoms in the molecule is known. This number is noted as an index. For example, the short notation for benzene is $p_{12} = (1, 7)$.
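The priority scheme for the non-coloured graph can be sketched as follows. This is a hedged reading of the procedure, not the authors' code: rows of the distance matrix are sorted and compared lexically (which puts atoms with more first neighbours, then more second neighbours, first), equal rows share a priority and the following numbers are skipped, and ties are then refined with the sorted priorities of the neighbours until nothing changes. All function names are hypothetical; the bond list of molecule 1 is read off Table 4.

```python
from collections import deque

def distance_matrix(n, bonds):
    """Shortest bond distances by breadth-first search (n marks 'no path')."""
    nb = [[] for _ in range(n)]
    for i, j in bonds:
        nb[i - 1].append(j - 1)
        nb[j - 1].append(i - 1)
    d = [[n] * n for _ in range(n)]
    for s in range(n):
        d[s][s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in nb[v]:
                if d[s][w] == n:
                    d[s][w] = d[s][v] + 1
                    queue.append(w)
    return d, nb

def rank(keys):
    """Equal keys share a number; the numbers that follow are skipped."""
    return [1 + sum(other < key for other in keys) for key in keys]

def topological_priorities(n, bonds):
    d, nb = distance_matrix(n, bonds)
    prio = rank([tuple(sorted(row)) for row in d])
    while True:
        keys = [(prio[i], tuple(sorted(prio[j] for j in nb[i]))) for i in range(n)]
        refined = rank(keys)
        if refined == prio:
            return prio
        prio = refined

def short_notation(prio):
    return tuple(sorted(set(prio)))

BONDS_1 = [(1, 2), (2, 3), (2, 7), (3, 4), (3, 5), (5, 6), (5, 8), (5, 9)]
BENZENE = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
           (1, 7), (2, 8), (3, 9), (4, 10), (5, 11), (6, 12)]

P1 = topological_priorities(9, BONDS_1)
P_BENZENE = topological_priorities(12, BENZENE)
print(P1, sorted(P1))
print(short_notation(P_BENZENE))
```

For molecule 1 the sorted result is (1, 2, 3, 4, 4, 4, 7, 8, 8), with atoms 6, 8 and 9 at priority 4 and atoms 1 and 7 at priority 8; for benzene the short notation is (1, 7).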

For the examples in Fig. 8, the short notations are

$$p = (1,1,1,1,1,1,1,1,9,9,9,9,9,9,9,9) \Rightarrow p_{16} = (1, 9)$$

$$p = (1,1,1,4,4,4,7,7,7,10,10,10) \Rightarrow p_{12} = (1,4,7,10)$$

$$p = (1,2,2,4,5,6,7,8,8,8,8,12,12,14,15) \Rightarrow p_{15} = (1,2,4,5,6,7,8,12,14,15)$$

The length of these vectors is a qualitative measure of the inner topological symmetry in molecules.

Fig. 10. Molecule 3.

Stereochemical aspects

The topological priorities provide an ordering of the atoms. Atoms with equivalent indices may be interchanged freely. This may be interpreted as free rotations (e.g. CH3 groups) or symmetries (ethene, benzene, tartaric acid, etc.) (Fig. 9).

Important stereochemical aspects of these non-coloured graphs are demonstrated by tartaric acid. An atom (1) with four topologically different ligands (1, 3, 5, 9) may be chiral. The existence of an equivalent atom in the same molecule results here in an intramolecular symmetry, which reduces the number of possible stereoisomers.

5.2. Topology of the coloured chemical graph

Up to this point the chemical nature of the atoms has been suppressed, which has the following advantages:

• The knowledge of graph theory is fully applicable.

• Molecules with similar skeletons but different chemical composition are comparable.

In the following, the chemical nature of the vertices is introduced as an additional differentiation (Figs. 10 and 11). The purely topological ordering of 3 gives the same priority to the vertices 1, 5 and 6 (Tables 17 and 18). The different atom sorts H and Cl lead to new priorities: the atom with the higher atomic number is given the higher priority (the smaller number) (Table 19). Therefore all equal priorities must be recalculated.

Table 17 Distance matrix of 3

Atom | 1 2 3 4 5 6
1 | 0 1 2 3 2 2
2 | 1 0 1 2 1 1
3 | 2 1 0 1 2 2
4 | 3 2 1 0 3 3
5 | 2 1 2 3 0 2
6 | 2 1 2 3 2 0

Table 18 Topological priorities in 3

These steps are repeated until the priorities no longer change. This guarantees that the equivalence of nodes is valid all over the molecule. The following points are important:

• The group with the highest priority must be recalculated first.

• Within one step only the priorities of a single group of equivalent indices are changed.

• Atoms within a single group of equivalent indices and with the same sort of atoms are not distinguishable.
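A compact sketch of the coloured step is given below. It simplifies the procedure described above: instead of recalculating one group at a time, all ties in the topological priorities are broken at once by atomic number (higher atomic number first) and the result is refined with the neighbours' priorities. The bond list of 3 is read off Table 17; Table 19 names only the atom sorts of atoms 1, 5 and 6 (H, Cl, H), so the atomic numbers used for atoms 2, 3 and 4 are placeholders, and all function names are hypothetical.

```python
from collections import deque

def rank(keys):
    """Equal keys share a number; the numbers that follow are skipped."""
    return [1 + sum(other < key for other in keys) for key in keys]

def refine(prio, nb):
    """Repeat the neighbour comparison until the priorities stay constant."""
    while True:
        keys = [(prio[i], tuple(sorted(prio[j] for j in nb[i])))
                for i in range(len(prio))]
        new = rank(keys)
        if new == prio:
            return prio
        prio = new

def priorities(n, bonds, atomic_numbers=None):
    nb = [[] for _ in range(n)]
    for i, j in bonds:
        nb[i - 1].append(j - 1)
        nb[j - 1].append(i - 1)
    rows = []
    for s in range(n):  # sorted row of the distance matrix for atom s
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in nb[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        rows.append(tuple(sorted(dist.get(t, n) for t in range(n))))
    prio = refine(rank(rows), nb)
    if atomic_numbers is not None:
        # Higher atomic number gives the higher priority (the smaller number).
        coloured = rank([(p, -z) for p, z in zip(prio, atomic_numbers)])
        prio = refine(coloured, nb)
    return prio

BONDS_3 = [(1, 2), (2, 3), (2, 5), (2, 6), (3, 4)]
Z_3 = [1, 6, 8, 1, 17, 1]  # atoms 1, 5, 6 = H, Cl, H; atoms 2-4 are assumed values

print(priorities(6, BONDS_3))
print(priorities(6, BONDS_3, Z_3))
```

Without colours atoms 1, 5 and 6 share priority 3; with colours the chlorine atom 5 keeps priority 3 and the hydrogen atoms 1 and 6 move to priority 4, as in Table 19.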

Stereochemical aspects

Certain symmetries are broken by the chemical distinction of the vertices (Fig. 12). The symmetry in tartaric acid (A) is broken by replacing a single atom (15).

Table 19 Chemical-topological priorities in 3

Atom | Sorted row | Atom sort | Priority
1 | 0 1 2 2 2 3 | H | 4
2 | 0 1 1 1 1 2 | | 1
3 | 0 1 1 2 2 2 | | 2
4 | 0 1 2 3 3 3 | | 6
5 | 0 1 2 2 2 3 | Cl | 3
6 | 0 1 2 2 2 3 | H | 4

Fig. 11. Graph of 3 and the arbitrarily enumerated graph

Fig. 12. Priorities in tartaric acid (A) and a derivative (B).

6. Algebra of the s- and r-vectors

The quadratic complexity of the xbe- and xr-matrices [11] has been reduced to s- and r-vectors (stereochemical vectors and reaction vectors) [12] with linear complexity. In analogy to the xbe-matrices, the s-vectors describe the connection of positions. These positions are the topological positions of an ensemble of molecules. An affiliation vector z assigns each position to the corresponding atom. Reactions are represented by the r-vector, which contains permutations of these positions. The s- and r-vectors can be used within the theory of the chemical identity group and the concept of permutational isomers.

Fig. 13. Reaction between methane and chlorine (the molecules are arbitrarily enumerated).

Bonds between atoms are described by permutations of the topological positions. A transposition, i.e. a disjoint exchange of two different positions, represents a covalent bond. A position which is mapped onto itself is non-bonded. Permutations with more than two elements can also be interpreted, for example as delocalized electron systems or multicentre bonds.

The reaction in Fig. 13 illustrates the structure and the algebra of the s- and r-vectors. The r-vector r describes the reaction. Applied to the s-vector $s_r$ of the reactants, it yields the s-vector $s_p$ of the products, $r \cdot s_r = s_p$ (Table 20).

Table 20 The s- and r-vectors for the reaction in Fig. 13

Number | 1 2 3 4 5 6 7
Atom sort | C H H H H Cl Cl

The new s-vector must be normalized: all topological priorities are recalculated. The result is a new sequence in the affiliation vector (Table 21). In this example, only the constitution of the ensemble of molecules is needed. We now show how to create an s-vector if the stereochemistry of the ensemble is also considered.

Table 21 The normalized s-vector
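The relation between r-vector and s-vectors can be made concrete with a small sketch for the chlorination of methane. The numbering of the ten positions (1 to 4 on carbon, 5 to 8 on the hydrogens, 9 and 10 on the chlorines), the reactant s-vector, the atom labels in z and the function names are assumptions made for illustration; the resulting product s-vector agrees with the "New s-vector" row that survives in Table 20.

```python
# Affiliation vector z: the atom that owns each of the ten positions (assumed labels).
Z = ("C", "C", "C", "C", "H1", "H2", "H3", "H4", "Cl1", "Cl2")

# Reactant s-vector: transpositions (1 5)(2 6)(3 7)(4 8) are the C-H bonds,
# (9 10) is the Cl-Cl bond. Entry i is the position bonded to position i.
S_REACTANTS = (5, 6, 7, 8, 1, 2, 3, 4, 10, 9)

# r-vector: a permutation of positions that re-routes positions 4, 8, 9 and 10.
R = (1, 2, 3, 10, 5, 6, 7, 9, 8, 4)

def apply(r, s):
    """(r . s)(i) = r(s(i)), with positions numbered from 1."""
    return tuple(r[x - 1] for x in s)

def bonds(s, z):
    """Atom pairs joined by a transposition; fixed points are non-bonded."""
    pairs = set()
    for i, partner in enumerate(s, start=1):
        if partner != i:
            pairs.add(frozenset((z[i - 1], z[partner - 1])))
    return pairs

S_PRODUCTS = apply(R, S_REACTANTS)
print(S_PRODUCTS)
print(sorted(sorted(pair) for pair in bonds(S_PRODUCTS, Z)))
```

In the product vector the carbon position 4 is paired with a chlorine position and the released hydrogen position 8 with the other chlorine, i.e. CH3Cl and HCl. The normalization step (recalculating the priorities and reordering z) is not included in this sketch.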

6.1. Creating s-vectors of molecules

In order to create an s-vector some information is needed:

• the number of positions in the current monocentre;

• the topology of the centre expressed by identity preserving permutations;

• the representatives of all cosets;

• the topological priorities of all centres, which are used to distinguish stereochemical isomers;

• the stereochemical influence of the atoms.

If only constitutional aspects are taken into account, all positions A(A) of an atom A are interchangeable, and it is of no interest which one participates in a given bond. For a definite specification, however, the order of the positions as well as an unambiguous relation between positions and ligands is decisive. Enumerating the positions of the monocentre gives their definite order in the vector. Because of the identity preserving permutations, this determines only the relative arrangement of the ligands: for every identity preserving permutation there exists an arrangement of ligands which represents the same molecule (Fig. 14).

r-vector 12 31(

New s-vector 5 6 7 9 1 2 3 10 4 8

Fig. 14. Model of tetrahedral topology and the sequence of the positions in the s-vector.


For a definite representation, the order of the ligands is transformed so that it corresponds to the representative of the coset of the current stereoisomer. If there exist prochiral centres, it must be checked whether there is a chiral influence (e.g. two chiral ligands with the same chemical composition but an inverse stereochemistry attached to the monocentre).

Let A be a monocentre with the positions A(A), which are a part of the s-vector. The stereochemical configuration of A is represented by the sequence of bonds connecting the neighbouring atoms. The topological priorities of the neighbours are used in order to create an absolute description. These priorities are transformed by their value into a permutation, e.g. the priorities (4, 8, 12, 1) are transformed to $P_I$ = [2 3 4 1]. With the identity preserving permutations of the topology of A, the representative of the current coset is found. The permutation [2 3 4 1] is a member of the coset with the representative $P_R$ = [1 2 4 3]. The sequence of bonds must be permuted so that this sequence corresponds to $P_R$. The transforming permutation is found by

$$P_{trans} = P_R^{-1} \cdot P_I$$
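The example with the priorities (4, 8, 12, 1) can be retraced with a short Python sketch. Several points are assumptions made only for illustration: the identity preserving permutations of a tetrahedral centre are taken to be the twelve even permutations of four positions, the representative of a coset is taken to be its lexicographically smallest member, and the product in the formula is read as function composition, (p · q)(i) = p(q(i)). All function names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Return p after q: (p o q)(i) = p(q(i)), one-line notation, 1-based."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def inverse(p):
    """Inverse permutation in one-line notation."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)


def is_even(p):
    """Parity of a permutation via its inversion count."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0


def rank_permutation(priorities):
    """Replace each (distinct) priority by its rank, 1 = smallest value."""
    order = sorted(priorities)
    return tuple(order.index(v) + 1 for v in priorities)


# Assumption: rotations of the tetrahedral skeleton = even permutations.
TETRAHEDRAL = [p for p in permutations(range(1, 5)) if is_even(p)]


def coset_representative(p, group=TETRAHEDRAL):
    """Lexicographically smallest member of the coset of p."""
    return min(compose(g, p) for g in group)


p_i = rank_permutation((4, 8, 12, 1))
p_r = coset_representative(p_i)
p_trans = compose(inverse(p_r), p_i)
print(p_i, p_r, p_trans)
```

With these conventions the transforming permutation is itself an identity preserving permutation, so reordering the bonds with it changes the notation but not the molecule.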

Monocentres with the same priority but different representatives can now be distinguished: the smaller representative leads to the higher priority.

If there still exist groups with equivalent priorities, we have to check for such centres whether there is a chiral influence. This is done by accumulation of priorities (Fig. 15).

Let Z = (A, B, C) be a set of atoms in the molecule M with an equivalent topological priority of 1.


Fig. 16. New topological priorities in 4.

Analysing the stereochemical conformation gives

A: $1_C\ 4\ 1_B\ 7$

B: $4\ 7\ 1_A\ 1_C$

C: $4\ 1_A\ 7\ 1_B$

Forming the smallest representative of all cosets [1 2 3 4]:

A: $[1_C\ 3\ 2_B\ 4] \Rightarrow C = 1,\ B = 2$

B: $[3\ 4\ 1_A\ 2_C] \Rightarrow A = 1,\ C = 2$

C: $[3\ 1_A\ 4\ 2_B] \Rightarrow A = 1,\ B = 2$

Summing up these values gives new ranks, A = 2, B = 4, C = 3, and therewith new priorities: A = 1, B = 3, C = 2.

This also gives a new order of indices for the ligands (Fig. 16).
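The accumulation step for the three equivalent atoms A, B and C amounts to a small piece of bookkeeping, sketched here in Python. The data structure and the function names are hypothetical; the ranks are those each centre assigns to its two equivalent neighbours in the example above, and ties between accumulated ranks are not resolved in this sketch.

```python
from collections import defaultdict

# Ranks that each centre assigns to its equivalent neighbours.
ranks_seen_by = {
    "A": {"C": 1, "B": 2},
    "B": {"A": 1, "C": 2},
    "C": {"A": 1, "B": 2},
}


def accumulate(ranks):
    """Sum the ranks each atom receives from the other centres."""
    totals = defaultdict(int)
    for assigned in ranks.values():
        for atom, rank in assigned.items():
            totals[atom] += rank
    return dict(totals)


def to_priorities(totals):
    """Smallest accumulated rank gets priority 1."""
    ordered = sorted(totals, key=totals.get)
    return {atom: i + 1 for i, atom in enumerate(ordered)}


totals = accumulate(ranks_seen_by)
priorities = to_priorities(totals)
print(totals, priorities)
```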

From all necessary information the complete s-vector is created. The set of atoms is sorted by their priorities. The number of required positions is derived from the list of bonds. The current bonds are denoted as permutations of the positions. The stereochemical centres are transformed to the representation of the current stereoisomer. If there exist no stereoisomers of a monocentre, any permutation of its positions leads to the identical molecule, so its positions are sorted simply by the topological priorities of the connected atoms. This is shown by molecule 5 (Fig. 17, Table 22).
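The construction can be illustrated with a minimal Python sketch. It is an assumption-laden illustration rather than the authors' procedure: the function name `build_s_vector` is hypothetical, only single bonds are handled, the ligand order at each centre is assumed to be already transformed to the current representative, and the neighbour lists are our reading of the topological indices given for molecule 5 in Table 22.

```python
def build_s_vector(neighbours):
    """Build an s-vector from ordered neighbour lists.

    neighbours[a] lists, in position order, the atoms bonded to atom a.
    Atoms are numbered by topological priority, starting at 1.
    """
    # Enumerate the positions atom by atom.
    first_position = {}
    next_free = 1
    for atom in sorted(neighbours):
        first_position[atom] = next_free
        next_free += len(neighbours[atom])

    s_vector = []
    for atom in sorted(neighbours):
        for partner in neighbours[atom]:
            # Position on the partner atom that points back to this atom.
            offset = neighbours[partner].index(atom)
            s_vector.append(first_position[partner] + offset)
    return s_vector


# Neighbour lists assumed for molecule 5.
molecule_5 = {
    1: [2, 3, 7, 4], 2: [1, 3, 5, 8], 3: [1, 2, 6, 9],
    4: [1, 10], 5: [2, 11], 6: [3, 12],
    7: [1], 8: [2], 9: [3], 10: [4], 11: [5], 12: [6],
}

s = build_s_vector(molecule_5)
print(s)
```

Because every bond is entered from both of its ends, the resulting vector maps each position to its partner and back again, which gives a simple consistency check for a hand-written s-vector.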

6.2. Specification of the conformation

By analogy to the constitutional description, the conformational information is denoted as a vector. For this purpose, the meaning of "conformational isomers" has to be understood in a broader sense. Let P(B) be the set of permutations describing all internal rotations about a bond B in the molecule M. The subset U(B) ⊆ P(B) contains all permutations which lead to non-identical molecules. If U(B) is not empty, then M has conformational isomers. The isomers are distinguished by a different spatial arrangement of two connected monocentres.

Fig. 17. Molecule 4 (left), arbitrarily enumerated (middle), and showing topological priorities (right).

Fig. 18. Priorities in molecule 5.

The permutations in U(B) may be interpreted as follows:

• the ligands are voluminous;

• B is part of a ring;

• B is, for example, a double bond.

The topology of two connected monocentres A and B is known. For an arbitrarily selected ligand L of atom A, any conformational isomer is characterized by noting those ligands of B which delimit the rotational area of L. For a definite description the ligand with the highest topological priority is chosen (Fig. 18).

Two ligands are fixed to each other. They are denoted as a transposition of the corresponding positions. Permutations with more than two elements represent the restricted torsion between two centres. These contain also the positions the ligand may pass.

In molecule 5 we will note the positions of 1 and 15 (Fig. 18).

In molecule 6 the ligand F may not pass C or D, so only a torsion about the bond A-B is possible. There are two conformational isomers: ligand F is directly between C and D or it is not; in other words, the area is (C D) or (D E C). The small numbers in Fig. 19 are the numbers of the positions according to the s-vector. The conformational vector (c-vector) is noted as permutations of the positions:

(C D): (1 7 8)

(D E C): (1 8 6 7)

Table 22

Creating the s-vector of molecule 5

Topological index of the neighbour   2  3  7  4  1  3  5  8  1  2  6  9  1 10  2 11  3 12  1  2  3  4  5  6
Enumeration of the positions         1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
s-vector                             5  9 19 13  1 10 15 20  2  6 17 21  4 22  7 23 11 24  3  8 12 14 16 18

Fig. 20. Combination of the family of permutational isomers with a tetrahedral skeleton and the family of permutational isomers with a trigonal bipyramidal skeleton via the $S_N2$ reaction type.

The c-vector contains all necessary data about the conformational properties. This combines the monocentres to the polycentres of molecules. Some molecules may even have several c-vectors. It is possible to describe adducts of molecules in ensembles of molecules. Therefore, this vector is also used for the modelling of reactions when molecules have to build a definite constellation.

6.3. Reaction vectors

Chemical reactions can be separated into elementary steps. New bonds are made (addition of ligands), existing bonds are broken (elimination of ligands) or the topological arrangement of ligands on the skeleton changes (e.g. Berry pseudorotation). Reaction mechanisms combine elementary steps and give information about the topological arrangement of the ligands on the skeleton.

Reaction mechanisms contain, among other things, information about

• the stereochemical model of the reactants,

• the sequence of the elementary steps,

• the stereochemical models of the intermediates,

• the stereochemical models of the products.

Fig. 21. Combination of the topological positions.

With this information the stereochemical progress of the reaction can be followed. Elementary steps combine families of permutational isomers. A family of permutational isomers is characterized by a number n of ligands and a skeleton with a defined topology, represented by the identity preserving permutations. In some cases there exist different ways of combining families of permutational isomers, which lead to different products. Whether there are different ways is determined by the reaction mechanism.

Each topological position of the skeleton of the reactant model is related to a topological position of the skeleton of the product model. This relation is expressed by a permutation of topological positions, the reaction vector (r-vector). An r-vector is independent of the type of ligands. Elementary steps are described by one or in some cases several r-vectors (Figs. 20 and 21).

In most cases a chemical reaction is a sequence of elementary steps. Each step maps one family of permutational isomers onto another family of permutational isomers. The s-vector of the reactant model is combined with the r-vectors of this step. The results of the assignment are the s-vectors of the products. If another elementary step follows, the corresponding r-vectors are applied to the resulting s-vectors of the intermediate. This procedure is continued until the end of the reaction. The results are the s-vectors of the possible product isomers. The r-vector of the complete reaction is a combination of the r-vectors of the elementary steps. An illustration of this is given in the example in Fig. 22 [13].


Fig. 22. Reaction of 2-bromobutane to 2-iodobutane.

Fig. 23. First elementary step.

This reaction describes in two elementary steps the substitution of the leaving group $L_3$ by the entering group $L_5$ during an $S_N2$-type reaction. The ensemble of the reactants is represented by the s-vector shown in Table 23 on the left side.

For modelling the reaction this s-vector can be used in its reduced form, shown in Table 23 on the right side. The first elementary step is the addition of the ligand $L_5$. The r-vector $r_1$ of this step is [1 5 2 3 4]. This r-vector $r_1$ is applied to the reduced s-vector $s_R$ of the reactants. The result is the s-vector $s_I$ of the intermediate. This s-vector describes a model of the family of permutational isomers with a trigonal bipyramidal skeleton (Figs. 23 and 24).

Table 23

S-vectors of the ensemble of the reactants

In the second elementary step the ligand $L_3$ is eliminated. The r-vector $r_2$ of this step is [2 3 5 4 1]. The resulting s-vector $s_p$ of this step is the s-vector of the product model (Figs. 25 and 26).

Fig. 24. Combination of the reduced s-vectors.

Fig. 25. Second elementary step.

Fig. 26. Combination of the reduced s-vectors.

The combination of the r-vectors of the elementary steps forms the r-vector $r_{com}$ of the complete reaction. In this example $r_{com}$ is the permutation [5 2 4 3 1]. The resulting isomer is the isomer represented by [1 2 4 3] of the family of permutational isomers with a tetrahedral skeleton. The reactant model is the isomer [1 2 3 4]. The product model is the isomer [1 2 4 3] and the result is an inversion on the skeleton atom.
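How the r-vectors of the two elementary steps combine can be checked with a short Python sketch. The convention used here is an assumption: an r-vector is read as a rearrangement in which entry i names the old position whose content moves to the new position i. It was chosen because it reproduces the combined r-vector quoted in the text; the function names `apply_r` and `compose` are hypothetical.

```python
def apply_r(r, sequence):
    """Rearrange a sequence with an r-vector: new[i] = old[r[i]] (1-based)."""
    return [sequence[k - 1] for k in r]


def compose(r_first, r_second):
    """Single r-vector equivalent to applying r_first, then r_second."""
    return apply_r(r_second, r_first)


r_1 = [1, 5, 2, 3, 4]  # first step: addition of the entering ligand
r_2 = [2, 3, 5, 4, 1]  # second step: elimination of the leaving ligand
r_com = compose(r_1, r_2)
print(r_com)  # [5, 2, 4, 3, 1]

# Stepwise and combined application agree for any labelling of positions.
labels = ["a", "b", "c", "d", "e"]
stepwise = apply_r(r_2, apply_r(r_1, labels))
combined = apply_r(r_com, labels)
print(stepwise == combined)  # True
```

Since the r-vectors act on positions only, the same combined vector can be reused for any set of ligands, in line with the statement that an r-vector is independent of the type of ligands.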

7. Conclusion

This concept seems to be an ideal approach for the definite description of dynamic chemistry. For a general statement we do not take into account physical or metric data. The modelling of complete molecular structures is avoided by reducing the problem to a set of models of monocentres. Any number of ligands of a monocentre and their topology are described uniformly. A further advantage is the possibility of applying single reaction patterns to similar sets of monocentric models. For the modelling of reactions the conformational relations between the reactants must be considered. This arrangement is reflected by the conformation vector.

References

[1] (a) I. Ugi, J. Dugundji, R. Kopp and D. Marquarding, Lecture Notes in Chemistry, Vol. 36, Springer, Berlin, 1984. (b) Ref. [1], p. 20.

[2] R.S. Cahn, C.K. Ingold and V. Prelog, Angew. Chem., 78 (1966) 413.

[3] A. Cayley, Chem. Ber., 8 (1857) 1056.

[4] D.A. Morales and O. Araujo, J. Math. Chem., 13 (1993) 95.

[5] A. Voelkel, Comput. Chem., 18 (1994) 1.

[6] I. Ugi, D. Marquarding, H. Klusacek, G. Gokel and P. Gillespie, Angew. Chem., 82 (1970) 741-771.

[7] I. Ugi, J. Bauer, K. Bley, A. Dengler, A. Dietz, E. Fontain, B. Gruber, R. Herges, M. Knauer, K. Reitsam and N. Stein, Angew. Chem., Int. Ed. Engl., 32 (1993) 201.

[8] J. Dugundji, R. Kopp, D. Marquarding and I. Ugi, Top. Curr. Chem., 75 (1978) 165.

[9] V. Sokolov, Chirality and Optical Activity in Organometallic Compounds, Gordon and Breach, New York, 1990.

[10] W.R. Müller, K. Szymanski and J.V. Knop, J. Comput. Chem., 8 (1987) 170-173.

[11] B. Gruber, Algebraische Modellierung der Stereochemie. Doctoral thesis, Technische Universität München, 1992.

[12] I. Ugi, J. Bauer, C. Blomberger, J. Brandt, A. Dietz, E. Fontain, B. Gruber, A. v. Scholley-Pfab, A. Senff and N. Stein, J. Chem. Inf. Comput. Sci., 34 (1994) 3.

[13] P.C.K. Vollhardt, Organic Chemistry, W.H. Freeman, San Francisco, 1987, p. 200.
