CHAPTER 9

Random Matrix Theory and Maximum Entropy Models for Disordered Conductors

A. Douglas STONE
Department of Applied Physics and Center for Theoretical Physics, Yale University, P.O. Box 2157, New Haven, CT 06520, USA

Pier A. MELLO*
Department of Physics, University of Washington, Seattle, WA 98195, USA

Khandker A. MUTTALIB
Department of Physics, University of Florida, Gainesville, FL 32611, USA

Jean-Louis PICHARD
CEA Saclay, 91191 Gif-sur-Yvette Cedex, France

* On leave from Instituto de Fisica, Universidad Autonoma de Mexico, D.F. Mexico

Mesoscopic Phenomena in Solids
Edited by B.L. Altshuler, P.A. Lee and R.A. Webb
© Elsevier Science Publishers B.V., 1991

[Modern Problems in Condensed Matter Sciences] Mesoscopic Phenomena in Solids Volume 30 || Random Matrix Theory and Maximum Entropy Models for Disordered Conductors


Contents

1. Introduction
2. The scattering approach to disordered conductors
3. Transfer matrix formulation
3.1. Symmetries
3.2. Polar decomposition and eigenparameters of M
3.3. Relation of the transfer matrix to the conductance
3.4. Active transmission channels, UCF and Imry's conjecture
4. Maximum-entropy probability densities
4.1. Information content of a discrete probability distribution
4.2. Continuous probability densities and constraints
4.3. Multivariate densities and the integration measure
5. Maximum-entropy matrix ensembles and level repulsion
5.1. Level repulsion in the 2 × 2 orthogonal ensemble
5.2. Level repulsion in the 2 × 2 unitary ensemble
5.3. Gaussian orthogonal and unitary ensembles
5.4. Joint probability densities
5.5. The GOE and GUE as maximum-entropy distributions
6. Properties of standard random matrix ensembles
6.1. Coulomb gas analogy
6.2. Method of orthogonal polynomials for GUE
6.3. Correlation functions of GUE and spectral rigidity
7. Invariant measure for Q and X matrices
7.1. Definition of the invariant measure
7.2. Uniqueness of the invariant measure
7.3. Derivation of dμ(Q)
8. Global maximum-entropy hypothesis
8.1. Constraint of given eigenvalue density
8.2. Justification for the choice of ρ(λ)
8.3. Properties of the Coulomb gas for the charges {ν_n}
9. Metallic regime: The high-density phase
9.1. Fluctuation measures in the metallic regime
9.2. Confirmation of image charge effect
9.3. Statement of UCF in the global approach
9.4. Proof that var g ~ 1
9.5. Numerical results for var g
9.6. Dependence of var g on symmetry parameter β
10. Localized regime: The low-density phase
10.1. Effective Hamiltonian in the localized regime
10.2. p(g) in the localized regime
10.3. Dependence of localization length on β
11. The local maximum-entropy approach
11.1. Generalities
11.2. The ensemble of transfer matrices and combination law
11.3. Local maximum-entropy ansatz and diffusion equation
11.4. Predictions of the local approach
12. Central-limit theorem for local approach
12.1. Why do maximum-entropy approaches work?
12.2. CLT in weak-disorder limit
13. Compatibility of global and local approaches
14. Summary and conclusions
References

1. Introduction

Underlying the scaling approach to disordered conductors (Edwards and Thouless 1972, Abrahams et al. 1979) is the assumption that the quantum transport properties of systems probed at length scales much longer than the elastic mean free path l should be insensitive to the microscopic origin of the disorder which causes the elastic scattering. Since the conductance of a sample for a fixed configuration of disorder is a random variable which depends (in a complex manner) on the many random variables specifying the particular realization of the disordered potential, this assumption can only be correct if the probability distribution of the conductance approaches a universal distribution which has 'lost the statistical information' which distinguishes various random microscopic Hamiltonians. Instead this distribution, P(g), should depend only on a few parameters, such as the Fermi energy $\varepsilon_F$, the 'strength of the disorder' as represented by the average conductance or by l, and the size and shape of the sample. In fact, if the one-parameter scaling theory of localization holds for all statistical properties of the conductance, then the distribution can depend on even these parameters only through a single parameter (Shapiro 1986, 1987) such as the localization length, ξ.

It is now known that the conductance exhibits important fluctuation phenomena both in the localized (Anderson et al. 1980, Anderson 1981, Erdos and Herndon 1982) and metallic regime (Al'tshuler 1985, Lee and Stone 1985), as the system size L → ∞ at T = 0. Non-zero temperature defines a length scale below which these quantum fluctuation phenomena can be directly observed in transport measurements. Typically these length scales are of the order of a few microns at liquid-helium temperature; thus recent technological advances which have made possible the systematic study of conductors on this size scale have greatly contributed to the recent surge of interest in the basic quantum transport properties of these systems. Because such systems are intermediate in size and behavior between single atoms and bulk solids, they have been termed 'mesoscopic' (Imry 1986a, Stone 1985). However, we shall see below that the maximum-entropy approach described here yields important insights into low-temperature quantum transport phenomena in conductors not restricted to the 'mesoscopic' size range.


Nonetheless it was the discovery of reproducible random structure in the transport coefficients of small conductors as a function of external parameters such as magnetic field or electron density which highlighted the crucial role of fluctuation phenomena in mesoscopic transport (Washburn and Webb 1986, Aronov and Sharvin 1987). In the strongly localized regime at T = 0 the typical conductance depends exponentially on the localization length, $g_{\mathrm{eff}} \sim e^{-2L/\xi}$, and it is well known, at least in one dimension, that $\xi^{-1}$ has an approximately normal distribution (Anderson et al. 1980, Anderson 1981, Erdos and Herndon 1982), so P(g) is approximately lognormal. This leads to enormous relative resistance fluctuations which actually increase exponentially with L; and fluctuation phenomena related to this behavior are observable in the hopping conductivity of small semiconducting samples (Webb et al. 1985, Lee 1984). In the metallic regime ($\xi \gg L \gg l$), conductance fluctuations are also observed (Washburn and Webb 1986, Aronov and Sharvin 1987); in this case they are found to obey the striking theoretical prediction (Al'tshuler 1985, Lee and Stone 1985, Al'tshuler and Khmel'nitskii 1986, Al'tshuler and Shklovskii 1986, Lee et al. 1987, Imry 1986b)

$$\mathrm{var}\, g = \langle (g - \langle g \rangle)^2 \rangle \sim 1, \tag{1.1}$$

independent of the system size (at T = 0) and of the average conductance ⟨g⟩ (caveats to this statement will be discussed below). This theoretical prediction was first obtained by diagrammatic perturbation theory and by exact numerical calculations (Al'tshuler 1985, Lee and Stone 1985). Although the physical origin of the effect in random interference of diffusive electron trajectories was immediately understood (Stone 1985, Imry 1986a), initially no simple explanation for the universal amplitude of the fluctuations was given. However, the precise quantitative agreement found between calculations of var g in different microscopic models (Lee and Stone 1985) strongly supported the idea that there is indeed a universal distribution P(g) describing disordered conductors, which is insensitive to detailed microscopic assumptions.
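For contrast with eq. (1.1), the exponential growth of the relative fluctuations in the localized regime can be made explicit with a short worked calculation. This is a sketch under stated assumptions, not a result quoted from the references: $\ln g$ is taken to be exactly Gaussian with mean $-2L/\xi$ and a variance $\sigma^2 = cL/\xi$ that grows linearly with L, where the constant c is hypothetical.

```latex
% Assumption: \ln g \sim \mathcal{N}(-2L/\xi,\ \sigma^2) with \sigma^2 = cL/\xi (c hypothetical).
\langle g^k \rangle = \exp\!\left(-\frac{2kL}{\xi} + \frac{k^2\sigma^2}{2}\right)
\quad\Longrightarrow\quad
\frac{\mathrm{var}\, g}{\langle g \rangle^2}
  = \frac{\langle g^2 \rangle}{\langle g \rangle^2} - 1
  = e^{\sigma^2} - 1
  = e^{cL/\xi} - 1 .
```

Because $\ln(1/g) = -\ln g$ has the same variance, the same ratio is obtained for the resistance, whereas in the metallic regime var g stays of order unity.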

The occurrence of these fluctuation phenomena, which dominate the transport properties of mesoscopic conductors at low temperatures, implies that the universal conductance distribution P(g) cannot be self-averaging (i.e. P(g) does not in general approach $\delta(g - g_0)$ as L → ∞), even in the weakly disordered (metallic) regime. This naturally raises the question: what is the nature of this statistical distribution and what principle determines these universal but non-trivial fluctuation properties? Also, what causes the qualitative change between the fluctuation properties in the localized and metallic regime, and can they both be explained within a unified framework?

In this chapter we describe two related approaches which provide a partial answer to these questions based on a theory of the ensemble of random transfer matrices which determine the conductance through the two-probe Landauer formula (Landauer 1970, Fisher and Lee 1981, Imry 1986a). Both approaches start from the ansatz that this random matrix ensemble is as random as possible (i.e. its probability distribution is that of maximum information entropy) given the symmetry properties of the matrices, and a constraint which characterizes the macroscopic properties of the ensemble of conductors. We shall denote these two approaches as global and local maximum-entropy models for reasons which will become evident below, where the notion of a maximum-entropy distribution will be introduced and discussed in detail. We will show that many of the statistical properties of the conductance distribution can be understood and even derived quantitatively from these maximum-entropy approaches. The material in this chapter is primarily based on Imry (1986b), Muttalib et al. (1987), Pichard et al. (1990a,b), Mello et al. (1988a), Mello (1988), Mello et al. (1988b), Mello and Shapiro (1988), Mello and Pichard (1989), Mello and Stone (1990), with somewhat more space devoted to the global approach (Muttalib et al. 1987, Pichard et al. 1990a,b) than the local one. Mello et al. (1988a), Mello (1988), Mello et al. (1988b), Mello and Shapiro (1988), Mello and Stone (1990) give a detailed mathematical development of the local approach; in particular, Mello et al. (1988a) give a full derivation of the diffusion equation fundamental to this approach. Much of the material in sections 9 and 10 is new and unpublished (Pichard et al. 1990b, Stone and Jalabert 1990) at the time of this writing.

In the metallic regime we find that the ensemble of random transfer matrices exhibits statistical behavior very similar to that predicted by the classic theory of random matrices first developed for applications to nuclear physics (Brody et al. 1981, Mehta 1967, Porter 1965, Wigner 1953, 1955, 1957, 1958, Dyson 1962a,b,c, Dyson and Mehta 1963), and now widely employed to describe other physical systems, such as quantum systems that are classically chaotic (Bohigas et al. 1984, Berry 1985). In particular the 'universal conductance fluctuations' of eq. (1.1) can be seen as arising from the well-known 'spectral rigidity' (suppression of eigenvalue fluctuations) characteristic of the classic random matrix ensembles, as first proposed by Imry (1986b). Thus these approaches establish a close relationship between the novel fluctuation phenomena discovered in mesoscopic conductors, and the well-developed area of statistical analysis of nuclear and atomic level spectra. This relationship illustrates strikingly that mesoscopic electronic systems of size $10^{-6}$ m can exhibit the same quantum statistical phenomena as microscopic systems nine orders of magnitude smaller.

However, it must be emphasized that in the limits of physical interest, the ensemble of random transfer matrices is not identical to the standard random matrix ensembles (e.g. the Gaussian orthogonal ensemble). First, we are interested in the properties of a specific statistic, the conductance, which is rather different from those typically considered in the standard theories. We shall see that this difference requires an extension of the standard results in order to derive universal conductance fluctuations, in contrast to earlier claims.


Second, and more fundamentally, this ensemble differs from the standard ensembles because of the multiplicative combination law of the transfer matrix, i.e. each member of the ensemble is a random matrix product, and the statistical behavior of the ensemble depends upon the relationship between N, the dimensionality of the matrix, and $N_L$, the number of terms in the product (the former being proportional to the width of the conductor, and the latter to its length, L). This leads to eigenvalue correlations whose character changes as L increases for fixed N. In general the 'eigenvalues'* of the transfer matrix become less and less densely distributed and their correlations become less and less important, so that eventually the assumptions concerning the nature of those correlations in the standard theories break down (even locally), and the ensemble exhibits qualitatively different statistical behavior. It is shown below, by using a novel Coulomb gas analogy, that this transition corresponds to g ≈ 1, i.e. to the crossover to strong localization. In this limit the ensemble is completely different from the standard ensembles, the universal conductance fluctuations are lost, and instead we recover the lognormal and non-universal conductance fluctuations expected for a strongly localized system. Finally, the fact that our ensemble consists of random matrix products means that a kind of central limit theorem can be derived for the joint probability density which justifies to some extent the use of a maximum-entropy ansatz. Thus we wish to draw attention to the fact that the ensemble of random transfer matrices (or more generally, random matrix products) represents a new and richer ensemble which merits further and more rigorous mathematical study.

2. The scattering approach to disordered conductors

The classic theory of random matrices was applied to statistical analysis of nuclear level spectra, either by considering the resonant levels as arising from an ensemble of random Hamiltonians following Wigner (1953, 1955, 1957, 1958) or from an ensemble of random scattering matrices (S-matrices) following Dyson (1962a,b,c). In applying this approach to the case of disordered conductors we must begin by finding a simple ensemble of random matrices whose eigenvalues determine the conductance. We can do this by using a conceptual approach pioneered by Landauer (1970), in which the conductance can be expressed solely in terms of the total transmission coefficients of the sample, considered as a single, complex scattering center. We summarize here briefly the arguments leading to the simplest expression of this type, which applies to the ideal two-probe conductance measurement (Imry 1986a) described below.

* Here we use the term eigenvalues loosely; later we show that the transfer matrix is naturally characterized by N real numbers which are not its eigenvalues, but are the eigenvalues of a closely related matrix which determines the conductance.


We note that any derivation of this type is to some extent heuristic, because no theorist has attempted to model the full complex interaction Hamiltonian corresponding to an experimental resistance measurement.

An ideal two-probe measurement is one in which the sample is attached between two perfect reservoirs with electrochemical potentials $\mu_1$ and $\mu_2 = \mu_1 + eV$ respectively (where V is the applied voltage), and these reservoirs serve both as current source and sink and as voltage terminals. A perfect reservoir is defined to have the following properties:

(1) It is initially in equilibrium at electrochemical potential μ and this equilibrium is negligibly disturbed by the current flow.

(2) Particles entering the reservoirs never return without loss of phase memory.

(3) The connection between the reservoir and the sample generates no additional resistance.

Because of the imbalance of chemical potentials, current will flow from reservoir 1 to 2. The total current which flows can be obtained from a 'counting argument'. In the energy interval eV between $\mu_2$ and $\mu_1$, electrons are injected into right-going states emerging from reservoir 1, but none are injected into left-going states emerging from reservoir 2. Thus there is a net right-going current proportional to the number of states in the interval $\mu_1 - \mu_2$, given by

$$I = e \sum_{i=1}^{N} v_i \frac{dn_i}{d\varepsilon}\, eV \sum_{j=1}^{N} T_{ij} = \frac{e^2}{h} \sum_{i,j=1}^{N} T_{ij}\, V, \tag{2.1}$$

where N is the number of propagating channels in the sample, $v_i$ is the longitudinal velocity for the ith momentum channel at the Fermi surface, $T_{ij}$ is the transmission probability from j to i, and we have used the fact that for a quasi-1D density of states, $dn_i/d\varepsilon = 1/(h v_i)$.

If we assume that the conductance is measured by dividing the induced current by the chemical potential difference between two regions deep within the reservoirs (where by assumption equilibrium is negligibly disturbed), then eq. (2.1) yields an expression for the two-probe conductance, g (measured in units of $e^2/h$),

$$g = \sum_{i,j=1}^{N} T_{ij} = \mathrm{Tr}(t t^\dagger), \tag{2.2}$$

where t is the transmission matrix, a sub-matrix of the S-matrix whose standard definition is given below. Thus the two-probe conductance is expressible solely in terms of the eigenvalues of the matrix $t t^\dagger$, but note that these eigenvalues are not simply related to those of the complete S-matrix (which are of modulus unity and mix up reflecting and transmitting channels).

Equation (2.2) can also be obtained from a more formal derivation from quantum linear response theory (Fisher and Lee 1981, Stone and Szafer 1988, Baranger and Stone 1989), except that the heuristic construction of perfect reservoirs is hidden in the choice of boundary conditions: the reservoirs are replaced by semi-infinite perfect leads at fixed potentials. A useful intermediate result in such a derivation is that the two-probe conductance is given by the flux of the conductivity tensor through the two surfaces connecting the sample to the leads, $g = \int dS_1 \cdot \sigma(x_1, x_2) \cdot dS_2$; this expression can be transformed using the integral equation of scattering theory to yield eq. (2.2). Since the conductivity tensor for a non-interacting system is given by well-known expressions involving the energies and current matrix elements of the exact eigenstates (Baranger and Stone 1989), this emphasizes that the conductance is not determined solely by the eigenvalues of the Hamiltonian. Therefore we find the approach developed below more natural and potentially more powerful than arguments applying random matrix theory to the ensemble of disordered Hamiltonians. Nonetheless, Al'tshuler and Shklovskii (1986) have certainly given further insight into the origin of the universal conductance fluctuations of eq. (1.1) using the latter approach combined with microscopic calculations.
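The two-probe formula can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; a Haar-random unitary matrix is used purely as a stand-in for the S-matrix of a sample (it is not a model of a disordered conductor), the block layout S = [[r, t'], [t, r']] is assumed, and the helper names `random_unitary` and `landauer_conductance` are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """Haar-distributed n x n unitary via QR of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    # Rescale each column by the phase of R's diagonal so Q is Haar-distributed.
    return q * (d / np.abs(d))

def landauer_conductance(t):
    """Two-probe conductance in units of e^2/h (per spin): g = Tr(t t^dagger)."""
    return float(np.real(np.trace(t @ t.conj().T)))

rng = np.random.default_rng(0)
N = 4                                  # number of propagating channels
S = random_unitary(2 * N, rng)         # stand-in S-matrix (assumption)
r, t = S[:N, :N], S[N:, :N]            # assumed layout S = [[r, t'], [t, r']]

T_ij = np.abs(t) ** 2                  # transmission probabilities T_ij = |t_ij|^2
g = landauer_conductance(t)            # equals the double sum over T_ij
reflected = float(np.real(np.trace(r.conj().T @ r)))
```

By unitarity of S, the transmitted and reflected parts add up to the number of channels, so g can never exceed N in these units.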

The two-probe conductance measurement described by eq. (2.2) does correspond reasonably well to one common type of experimental measurement, even in the limit of zero disorder in which it predicts a 'quantized contact resistance' (Imry 1986a) very similar to that recently observed experimentally (Van Wees et al. 1988, Wharam et al. 1988). However, the majority of experiments performed on mesoscopic conductors are so-called four-probe measurements in which a current is injected from a source into a sink, and the voltage induced 'across the sample' is measured by separate voltage probes. It is now clear that the behavior in this situation can differ substantially from that predicted by eq. (1.1), particularly when the voltage probes are spaced much less than an inelastic scattering length apart, and an adequate extension of the theory (Stone and Szafer 1988), based upon a multi-probe Landauer formula (Büttiker 1986), has recently been developed.

However, for the general questions addressed here, relating to the minimal physical assumptions necessary to generate the universal statistical properties of disordered conductors, we do not need to consider the additional complications introduced by the multi-probe theory. Moreover, in the more general theory, the fundamental statistical quantities are still moments of the transmission matrix, so that the information obtained from the two-probe theory should still be of relevance in that context, although extensions of our approach to the multi-probe situation have not yet been made. We do, however, remind the reader of the necessity of using the multi-probe theory if one wishes to obtain a quantitative description of (or even, in some cases, qualitative agreement with) many experiments on mesoscopic conductors.


3. Transfer matrix formulation

3.1. Symmetries

We have just seen that the calculation of the quantum-mechanical conductance is equivalent to the calculation of the transmission matrix t through a disordered medium. The disordered medium generates multiple scattering, and the full S-matrix must describe these complex scattering processes. Unfortunately, the S-matrix itself does not satisfy a simple composition rule which allows one to easily determine the full S-matrix if that describing a single scattering event is known. Therefore we consider a different but related matrix, the transfer matrix, M, which has a simple multiplicative composition rule. We shall see below that this approach has both computational and conceptual advantages. We now define these quantities carefully for the system of interest, disordered conductors with only elastic scattering from the random potential.

Employing the two-probe model described above, we imagine the disordered system of interest to be placed between two semi-infinite perfectly conducting leads of finite width. We assume the existence of boundary conditions at the transverse surfaces which quantize the energy of the transverse part of the wavefunction; the theory is not sensitive to the detailed nature of the boundary conditions and so we take them to be infinite hard walls for definiteness. Then in the perfect conductors, the scattering states at the Fermi energy satisfy the relation $k_F^2 = k^2 + k_n^2$, where $k_F$ is the Fermi momentum, k the longitudinal momentum and $k_n$ the quantized transverse momentum. The various $k_n$ (n = 1, ..., N) which satisfy this relation such that $k^2 > 0$ define the N channels. Since each channel can carry two waves traveling in opposite directions, the wave function on either side of the disordered region is specified by a 2N-component vector: the first N components are the amplitudes of the waves propagating to the right, and the remaining N components are the amplitudes of the waves traveling to the left. The scattering matrix S relates the incoming flux to the outgoing flux,

$$S \begin{pmatrix} I \\ I' \end{pmatrix} = \begin{pmatrix} O \\ O' \end{pmatrix}, \tag{3.1}$$

where I, O, I′, O′ are the N-component vectors describing the wave amplitudes on left and right, respectively. In this quasi-one-dimensional geometry, the S-matrix is a 2N × 2N matrix of the form

$$S = \begin{pmatrix} r & t' \\ t & r' \end{pmatrix}, \tag{3.2}$$

where t is the N × N transmission matrix which yields the conductance in eq. (2.2) and r is the N × N reflection matrix. (Throughout this work we shall frequently represent 2N × 2N matrices in terms of their N × N blocks; hence all 2 × 2 matrices with boldface entries will represent N × N block decompositions of 2N × 2N matrices.) Current conservation implies that

$$|I|^2 + |I'|^2 = |O|^2 + |O'|^2, \tag{3.3}$$

which is equivalent to the unitarity of the S-matrix.

Although the S-matrix determines the conductance through eq. (2.2), it does not satisfy a simple composition rule suitable for introducing a scaling approach. Therefore, we instead consider the transfer matrix, which contains the same information in a different form.

By definition, the 2N × 2N transfer matrix M relates the flux amplitudes on the left-hand side of the disordered region to those on the right,

$$M \begin{pmatrix} I \\ O \end{pmatrix} = \begin{pmatrix} O' \\ I' \end{pmatrix}. \tag{3.4}$$

Just as with S, we can write M in terms of four N × N blocks,

$$M = \begin{pmatrix} m_1 & m_2 \\ m_3 & m_4 \end{pmatrix}, \tag{3.5}$$

and from the definitions (3.1), (3.2) and (3.4) one finds the relations

$$m_1 = (t^\dagger)^{-1}, \quad m_2 = r'(t')^{-1}, \quad m_3 = -(t')^{-1} r, \quad m_4 = (t')^{-1}, \tag{3.6}$$

$$t = (m_1^\dagger)^{-1}, \quad r = -m_4^{-1} m_3. \tag{3.7}$$
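The conversion between S and M, and the multiplicative composition rule that motivates the transfer matrix, can be checked numerically. The sketch below assumes NumPy, the block layout S = [[r, t'], [t, r']], and the convention that M maps the left amplitudes (I, O) to the right amplitudes (O', I'), so that two scatterers in series combine as M = M2 M1 with scatterer 1 on the left. Haar-random unitaries stand in for the S-matrices, and the helper names (`random_unitary`, `blocks`, `s_to_m`, `m_to_s`) are hypothetical. The comparison formula for the total transmission, t = t2 (1 - r1' r2)^(-1) t1, is the standard sum over multiple reflections between the two scatterers.

```python
import numpy as np

def random_unitary(n, rng):
    """Haar-distributed unitary, used here only as a stand-in S-matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def blocks(a):
    """Return the four N x N blocks (top-left, top-right, bottom-left, bottom-right)."""
    n = a.shape[0] // 2
    return a[:n, :n], a[:n, n:], a[n:, :n], a[n:, n:]

def s_to_m(S):
    """Transfer matrix from S = [[r, t'], [t, r']] via the relations of eq. (3.6)."""
    r, tp, t, rp = blocks(S)
    tp_inv = np.linalg.inv(tp)
    m1 = np.linalg.inv(t.conj().T)
    return np.block([[m1, rp @ tp_inv], [-tp_inv @ r, tp_inv]])

def m_to_s(M):
    """Inverse map; t and r follow eq. (3.7), t' and r' from m4 and m2."""
    m1, m2, m3, m4 = blocks(M)
    m4_inv = np.linalg.inv(m4)
    t = np.linalg.inv(m1.conj().T)
    r = -m4_inv @ m3
    return np.block([[r, m4_inv], [t, m2 @ m4_inv]])

rng = np.random.default_rng(1)
N = 3
S1, S2 = random_unitary(2 * N, rng), random_unitary(2 * N, rng)

# Scatterer 1 on the left, scatterer 2 on the right: transfer matrices multiply.
M_total = s_to_m(S2) @ s_to_m(S1)
S_total = m_to_s(M_total)

# Independent check from summing multiple reflections between the scatterers.
r1, t1p, t1, r1p = blocks(S1)
r2, t2p, t2, r2p = blocks(S2)
t_series = t2 @ np.linalg.inv(np.eye(N) - r1p @ r2) @ t1
```

The point of the design is that the S-matrices combine through a nonlinear formula, while the corresponding transfer matrices combine by a plain matrix product.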

The flux conservation constraint on the transfer matrix is easily found by reexpressing eq. (3.3) in the form

$$|I|^2 - |O|^2 = |O'|^2 - |I'|^2, \tag{3.8}$$

which means from eq. (3.4) that M preserves the hyperbolic norm of the vector $\begin{pmatrix} I \\ O \end{pmatrix}$. This defines a U(N, N) (pseudo-unitary) matrix. An alternative way of expressing the flux conservation constraint on M is

$$M^\dagger \Sigma_z M = \Sigma_z, \tag{3.9}$$

or equivalently,

$$M \Sigma_z M^\dagger = \Sigma_z. \tag{3.10}$$

Here $\Sigma_z$ denotes the matrix

$$\Sigma_z = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}. \tag{3.11}$$
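The pseudo-unitarity conditions can be verified numerically for a transfer matrix obtained from a unitary S-matrix. This is a minimal sketch assuming NumPy and the block layout S = [[r, t'], [t, r']]; the random unitary is only a stand-in for a physical S-matrix and the helper names are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """Haar-distributed unitary, used as a stand-in S-matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def s_to_m(S):
    """Transfer matrix built block by block from the relations of eq. (3.6)."""
    n = S.shape[0] // 2
    r, tp, t, rp = S[:n, :n], S[:n, n:], S[n:, :n], S[n:, n:]
    tp_inv = np.linalg.inv(tp)
    return np.block([[np.linalg.inv(t.conj().T), rp @ tp_inv],
                     [-tp_inv @ r, tp_inv]])

rng = np.random.default_rng(2)
N = 3
M = s_to_m(random_unitary(2 * N, rng))
Sigma_z = np.kron(np.diag([1.0, -1.0]), np.eye(N))   # diag(I, -I)

lhs_39 = M.conj().T @ Sigma_z @ M      # should reproduce Sigma_z, eq. (3.9)
lhs_310 = M @ Sigma_z @ M.conj().T     # should reproduce Sigma_z, eq. (3.10)

# Hyperbolic norm |I|^2 - |O|^2 of an arbitrary amplitude vector (I, O).
v = rng.normal(size=2 * N) + 1j * rng.normal(size=2 * N)
w = M @ v                              # amplitudes (O', I') on the right
norm_in = float(np.real(np.vdot(v, Sigma_z @ v)))
norm_out = float(np.real(np.vdot(w, Sigma_z @ w)))
```

Note that M itself is not unitary: its entries can be large when the transmission is small, and only the indefinite norm defined by $\Sigma_z$ is conserved.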


Here and below I designates the unit matrix, which will be either 2N × 2N or N × N depending on context, and $\Sigma_i$, i = x, y, z, denote the 2N × 2N analogs of the Pauli matrices, satisfying $\varepsilon_{ijk} \Sigma_i \Sigma_j = i \Sigma_k$, $\Sigma_i^2 = I$. We see that the requirement of flux conservation implies that the transfer matrices M form a pseudo-unitary group U(N, N); $l_2 = (2N)^2$ is the number of independent parameters specifying such a matrix, just as for the more familiar unitary group U(2N). If the Hamiltonian governing the system is invariant under the operation of time-reversal and we neglect spin (as we shall do throughout), then a second solution to eq. (3.4) is obtained by interchanging incoming and outgoing channels and complex conjugating the wave function amplitudes. This implies that our transfer matrices M must also satisfy the requirement

$$M^* = \Sigma_x M \Sigma_x, \tag{3.12}$$

where

$$\Sigma_x = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix}. \tag{3.13}$$

To implement this further constraint, we could perform a unitary transformation on each M of the form

(3.14)

(3.15)

It is easy to show that the new matrices P satisfy the relation

$$P^\dagger J P = J, \tag{3.16}$$

where

$$J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}. \tag{3.17}$$

This means that the matrices P are symplectic. This mapping from U(N, N) becomes useful in the presence of time-reversal symmetry since it can be checked that the condition (3.12) on M implies that the P matrices of eq. (3.14) must be real, P = P*, i.e. they form a real symplectic group Sp(2N, R), which is specified by $l_1 = N(2N + 1)$ parameters. This mapping to the real symplectic group was used in Muttalib et al. (1987), Mello et al. (1988a) and has the advantage of simplifying the time-reversal symmetry constraint used in the derivation of the relevant invariant measures (see below), but the disadvantage of introducing a new set of matrices with no simple physical interpretation*. We shall not use this mapping in the present work. Transfer matrices M with time-reversal symmetry, i.e. satisfying both eq. (3.9) and eq. (3.12), are the relevant quantities to study when the sample is not subject to a magnetic field, and the scattering is spin-independent. This case is analogous to the well-studied orthogonal ensembles (Dyson 1962a,b,c, Mehta 1967, Brody et al. 1981), and is characterized by a symmetry parameter β = 1. If a magnetic field is present, time-reversal symmetry is broken and our transfer matrices satisfy the requirement (3.9) alone; this case is analogous to the unitary (β = 2) ensembles (Dyson 1962a,b,c). The origin of this terminology and of the parameter β which measures the strength of the statistical correlations (level repulsion) in the ensemble (Dyson 1962a,b,c) will be explained shortly, when we review the properties of the standard ensembles. Throughout this work we will assume exact spin degeneracy of the channels, and therefore only mention briefly the effect of spin-orbit scattering which leads to the third and final universality class, the symplectic ensemble (β = 4). For the ensemble of disordered conductors, this case has been studied thoroughly by Zanon and Pichard (1988), and it shows the behavior expected by the standard extrapolation from the orthogonal and unitary cases discussed here.
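The time-reversal condition and the mapping to real symplectic matrices can be explored numerically. The sketch below assumes NumPy and makes two labeled assumptions. First, a time-reversal-invariant spinless S-matrix is modeled as a symmetric unitary matrix, S = W W^T with W Haar-random. Second, the unitary matrix `U0` is one hypothetical choice that maps U(N, N) matrices obeying (3.12) onto real symplectic ones; it is not claimed to be the transformation of eqs. (3.14) and (3.15), and J is taken as the block matrix [[0, I], [-I, 0]].

```python
import numpy as np

def random_unitary(n, rng):
    """Haar-distributed unitary matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def s_to_m(S):
    """Transfer matrix from S = [[r, t'], [t, r']] via the relations of eq. (3.6)."""
    n = S.shape[0] // 2
    r, tp, t, rp = S[:n, :n], S[:n, n:], S[n:, :n], S[n:, n:]
    tp_inv = np.linalg.inv(tp)
    return np.block([[np.linalg.inv(t.conj().T), rp @ tp_inv],
                     [-tp_inv @ r, tp_inv]])

rng = np.random.default_rng(3)
N = 3
W = random_unitary(2 * N, rng)
S = W @ W.T                     # symmetric unitary: assumed model of time-reversal invariance
M = s_to_m(S)

eye = np.eye(N)
Sigma_x = np.kron(np.array([[0, 1], [1, 0]]), eye)
J = np.kron(np.array([[0, 1], [-1, 0]]), eye)
# Hypothetical transformation matrix (an assumption, see the text above).
U0 = np.kron(np.array([[1, 1], [-1j, 1j]]) / np.sqrt(2), eye)

P = U0 @ M @ U0.conj().T        # candidate symplectic representative of M
trs_residual = np.max(np.abs(M.conj() - Sigma_x @ M @ Sigma_x))
imag_residual = np.max(np.abs(P.imag))
```

With this choice, the imaginary part of P vanishes to rounding error exactly when M satisfies the time-reversal condition, which is the simplification referred to in the text.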

3.2. Polar decomposition and eigenparameters of M

Any matrix satisfying the U(N, N) (flux-conservation) condition (3.9) can be parameterized as (Mello et al. 1988a)

$$M = \begin{pmatrix} u^{(1)} & 0 \\ 0 & u^{(3)} \end{pmatrix} \begin{pmatrix} (I + \lambda)^{1/2} & \lambda^{1/2} \\ \lambda^{1/2} & (I + \lambda)^{1/2} \end{pmatrix} \begin{pmatrix} u^{(2)} & 0 \\ 0 & u^{(4)} \end{pmatrix} \equiv U \Gamma V, \tag{3.18}$$

where $u^{(i)}$ (i = 1, ..., 4) are arbitrary N × N unitary matrices and λ is a real, diagonal matrix with non-negative elements $\lambda_1, \ldots, \lambda_N$. We refer to this as the polar decomposition of M because the unitary matrices play the role of angular coordinates, while the $\{\lambda_n\}$ play the role of radial or 'magnitude' coordinates. It is easy to check that a matrix product of this form satisfies (3.9); we will thus use this parameterization for our transfer matrices. We will discuss the uniqueness of the parameterization below.
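The claim that a product of this form is automatically flux-conserving is easy to confirm numerically. The following is a minimal sketch assuming NumPy; the function name `polar_transfer_matrix` and the sample values of λ are hypothetical. The second construction imposes the relations u3 = conj(u1), u4 = conj(u2), which is the choice that makes the product satisfy the time-reversal condition (3.12).

```python
import numpy as np

def random_unitary(n, rng):
    """Haar-distributed unitary matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def polar_transfer_matrix(lam, u1, u2, u3, u4):
    """Assemble M = U Gamma V from the polar form with diagonal lambda >= 0."""
    n = len(lam)
    zero = np.zeros((n, n))
    c = np.diag(np.sqrt(1.0 + lam))      # (I + lambda)^(1/2)
    s = np.diag(np.sqrt(lam))            # lambda^(1/2)
    U = np.block([[u1, zero], [zero, u3]])
    Gamma = np.block([[c, s], [s, c]])
    V = np.block([[u2, zero], [zero, u4]])
    return U @ Gamma @ V

rng = np.random.default_rng(4)
N = 3
lam = np.array([0.2, 1.5, 4.0])          # illustrative eigenparameters
u1, u2, u3, u4 = (random_unitary(N, rng) for _ in range(4))

Sigma_z = np.kron(np.diag([1.0, -1.0]), np.eye(N))
Sigma_x = np.kron(np.array([[0.0, 1.0], [1.0, 0.0]]), np.eye(N))

M_unitary_class = polar_transfer_matrix(lam, u1, u2, u3, u4)
M_time_reversal = polar_transfer_matrix(lam, u1, u2, u1.conj(), u2.conj())
```

Because no matrix inversion is involved, this construction is also a numerically convenient way to generate test transfer matrices with prescribed eigenparameters.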

* Note, the matrix P here is not to be confused with the real-space transfer matrix which connects the real-space wavefunction amplitudes on either side of the disordered region. The latter matrix has a dimension equal to the number of lattice sites in the transverse direction, whereas the former has a dimension equal to N, the number of propagating channels. In general these two dimensions are not equal.

Note that the N real parameters $\{\lambda_n\}$ are not the eigenvalues of the transfer matrix; however, it will be seen that they characterize the 'magnitude' of a U(N, N) matrix in much the same way as a Hermitian matrix is characterized by its N real eigenvalues. An alternative interpretation of the eigenparameters $\{\lambda_n\}$ is as the N doubly degenerate eigenvalues of the 2N × 2N matrix $X = \tfrac{1}{4}[M^\dagger M + (M^\dagger M)^{-1} - 2I]$. This relation will be derived below and then used to establish the fundamental connection between the $\{\lambda_n\}$ and the conductance (Pichard 1984). We shall refer to the $\{\lambda_n\}$ as eigenparameters of M, or X, to have a term of sufficient generality. We shall see that the parameters $\{\lambda_n\}$ are the natural quantities appearing in the invariant measures for both the set of M and X matrices.
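The interpretation of the eigenparameters as doubly degenerate eigenvalues can be checked numerically. This sketch assumes NumPy, builds M from the polar form with arbitrarily chosen λ values, and takes X to be one quarter of (M†M + (M†M)^(-1) - 2I); the helper names are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """Haar-distributed unitary matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

def polar_transfer_matrix(lam, u1, u2, u3, u4):
    """M = U Gamma V with Gamma built from sqrt(1 + lambda) and sqrt(lambda)."""
    n = len(lam)
    zero = np.zeros((n, n))
    c, s = np.diag(np.sqrt(1.0 + lam)), np.diag(np.sqrt(lam))
    U = np.block([[u1, zero], [zero, u3]])
    V = np.block([[u2, zero], [zero, u4]])
    return U @ np.block([[c, s], [s, c]]) @ V

rng = np.random.default_rng(5)
N = 3
lam = np.array([0.2, 1.5, 4.0])                  # illustrative eigenparameters
M = polar_transfer_matrix(lam, *(random_unitary(N, rng) for _ in range(4)))

Q = M.conj().T @ M
X = 0.25 * (Q + np.linalg.inv(Q) - 2.0 * np.eye(2 * N))
X = 0.5 * (X + X.conj().T)                       # remove rounding asymmetry before eigvalsh
x_eigs = np.sort(np.linalg.eigvalsh(X))
expected = np.sort(np.repeat(lam, 2))            # each lambda_n appears twice
```

The unitary factors drop out of the spectrum of X entirely, which is why the {λ_n} can serve as coordinates that are independent of the 'angular' variables.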

The time-reversal symmetry requirement (3.12) imposes the additional constraints

$$u^{(3)} = (u^{(1)})^*, \quad u^{(4)} = (u^{(2)})^*. \tag{3.19}$$

In this latter case, $u^{(1)}$ and $u^{(2)}$ give rise to $N^2$ parameters each, and λ to N additional ones, in agreement with the expected number of parameters, $l_1 = N(2N + 1)$. The parameterization (3.18) is unique, except for a set of zero measure. However, in the absence of time-reversal symmetry, eq. (3.18) contains $4N^2 + N$ parameters, i.e. N more than are needed to specify an arbitrary pseudo-unitary matrix. This redundancy of the parameterization arises because an M-matrix parameterized as in eq. (3.18) is unchanged if the unitary matrices appearing in the parameterization are subject to the transformation

$$U \Rightarrow U U_\eta, \quad V \Rightarrow U_\eta^\dagger V, \tag{3.20}$$

where $U_\eta$ is a diagonal unitary matrix with identical N × N blocks (hence specified by N real phase variables η). This redundancy of the parameterization (3.18) is easily handled as discussed by Mello and Stone (1990) and in section 11 below, and does not add any significant complications to our use of the above parameterization in the unitary case.
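The redundancy in the unitary case can be made concrete with a short numerical check. The sketch assumes NumPy and the polar form M = U Γ V with block-diagonal U and V; the phase matrix called `U_eta` below is block diagonal with two identical diagonal blocks of phases, and all variable names are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """Haar-distributed unitary matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))

rng = np.random.default_rng(6)
N = 3
lam = np.array([0.3, 0.9, 2.5])                      # illustrative eigenparameters
zero = np.zeros((N, N))
u1, u2, u3, u4 = (random_unitary(N, rng) for _ in range(4))

U = np.block([[u1, zero], [zero, u3]])
V = np.block([[u2, zero], [zero, u4]])
c, s = np.diag(np.sqrt(1.0 + lam)), np.diag(np.sqrt(lam))
Gamma = np.block([[c, s], [s, c]])

eta = rng.uniform(0.0, 2.0 * np.pi, size=N)          # N real phases
d = np.diag(np.exp(1j * eta))
U_eta = np.block([[d, zero], [zero, d]])             # identical diagonal blocks

M = U @ Gamma @ V
M_shifted = (U @ U_eta) @ Gamma @ (U_eta.conj().T @ V)

# Parameter count: four unitaries plus lambda, minus the N redundant phases.
n_params = 4 * N**2 + N - N
```

The invariance holds because `U_eta` commutes with Γ: every block of Γ is diagonal, and the two phase blocks are the same.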

3.3. Relation of the transfer matrix to the conductance

Having established our canonical parameterizations, we can express the transmission and reflection matrices in terms of these parameters using eqs. (3.6) and (3.7). One then finds for the transmission matrices

$$t' = m_4^{-1} = (u^{(4)})^\dagger (I + \lambda)^{-1/2} (u^{(3)})^\dagger, \tag{3.21}$$

$$t = (m_1^\dagger)^{-1} = u^{(1)} (I + \lambda)^{-1/2} u^{(2)}, \tag{3.22}$$

and similar relations for the reflection matrices. From eqs. (3.21) and (3.22) we find

$$t^\dagger t = (u^{(2)})^\dagger (I + \lambda)^{-1} u^{(2)}, \quad t' t'^\dagger = (u^{(4)})^\dagger (I + \lambda)^{-1} u^{(4)}, \tag{3.23}$$

from which we see that these two matrices have the same eigenvalues and are related by a unitary transformation. Thus the conductance, g, of eq. (2.2) is given by twice the trace of either matrix (the factor of two coming from spin degeneracy which we will assume throughout).


We can now establish the general relationship between the transfer matrix and the conductance which underlies this work. To the best of our knowledge, this relationship was first obtained by Pichard (1984), for the many-channel case. Using eq. (3.18) above we express the matrix $Q = M^\dagger M$ as

$$Q = \begin{pmatrix} u^{(2)\dagger}(1+2\lambda)\,u^{(2)} & 2\,u^{(2)\dagger}\sqrt{\lambda(1+\lambda)}\,u^{(4)} \\ 2\,u^{(4)\dagger}\sqrt{\lambda(1+\lambda)}\,u^{(2)} & u^{(4)\dagger}(1+2\lambda)\,u^{(4)} \end{pmatrix}. \tag{3.24}$$

$Q$ is Hermitian positive and has the U($N$, $N$) symmetry expressed by eqs. (3.9) and (3.10). From these symmetries it follows that if $v_n$ is an eigenvector of $Q$ with eigenvalue $q_n$, then $\Sigma_z v_n$ is also an eigenvector, with eigenvalue $q_n^{-1}$. Indeed, the symmetries give $\Sigma_z Q \Sigma_z = Q^{-1}$, so if $Q v_n = q_n v_n$ then $Q\,\Sigma_z v_n = \Sigma_z Q^{-1} v_n = q_n^{-1}\,\Sigma_z v_n$, which implies the above statement. Thus the eigenvalues of $Q$ are real positive numbers coming in inverse pairs:

$$q_n = e^{\nu_n}, \qquad q_n^{-1} = e^{-\nu_n}. \tag{3.25}$$

These eigenvalues will be of interest below, because the logarithm of $Q$ satisfies Oseledec's theorem for random matrix products (Pichard and Sarma 1981). If $N_L$ is the number of terms in the random matrix product $M$, then the quantities

$$\alpha_n = \frac{1}{2N_L}\,\nu_n \tag{3.26}$$

are self-averaging and define the Liapunov exponents of the ensemble. We can explicitly evaluate $Q^{-1}$ using our parameterization (3.18),

$$Q^{-1} = \Sigma_z Q \Sigma_z = \begin{pmatrix} u^{(2)\dagger}(1+2\lambda)\,u^{(2)} & -2\,u^{(2)\dagger}\sqrt{(1+\lambda)\lambda}\,u^{(4)} \\ -2\,u^{(4)\dagger}\sqrt{(1+\lambda)\lambda}\,u^{(2)} & u^{(4)\dagger}(1+2\lambda)\,u^{(4)} \end{pmatrix}. \tag{3.27}$$

Hence

$$X = \tfrac{1}{4}\,(Q + Q^{-1} - 2) = \begin{pmatrix} u^{(2)\dagger}\lambda\,u^{(2)} & 0 \\ 0 & u^{(4)\dagger}\lambda\,u^{(4)} \end{pmatrix}. \tag{3.28}$$

We see that $X$ is a $2N \times 2N$ block-diagonal matrix with $N$ doubly-degenerate real positive eigenvalues $\{\lambda_n\}$ (if one includes spin degrees of freedom and assumes time-reversal symmetry, then $X$ is $4N \times 4N$ with a quadruply degenerate spectrum). Using eqs. (3.23), (3.24) and (3.28) above we find the matrix identity

$$(1 + X)^{-1} = \begin{pmatrix} t^\dagger t & 0 \\ 0 & t' t'^\dagger \end{pmatrix}, \tag{3.29}$$

and finally by taking the trace and assuming spin degeneracy

$$g = \sum_{n=1}^{N} \frac{2}{1+\lambda_n} = \sum_{n=1}^{N} \frac{4}{1+\cosh\nu_n}, \tag{3.30}$$

where we have expressed $g$ both in terms of the eigenparameters $\{\lambda_n\}$ and in terms of the eigenparameters $\{\nu_n = \cosh^{-1}(2\lambda_n + 1)\}$, which are related to the eigenvalues of $Q$.

Equations (3.18) and (3.30) express the fundamental relation between the conductance and the eigenparameters $\{\lambda_n\}$ of the transfer matrix (or equivalently the eigenvalues of the matrix $X$). Note that this relationship holds even in the absence of time-reversal symmetry and spin degeneracy (exactly the same derivation works in the absence of spin degeneracy, with the modification that $t$ is now a $2N \times 2N$ matrix connecting all momentum and spin channels). Note also that the conductance depends only on the eigenparameters $\{\lambda_n\}$ and is independent of the matrix elements of the unitary matrices $u^{(i)}$. Thus all $M$ or $X$ matrices with the same values of $\{\lambda_n\}$ have the same value of $g$. Since only $g$ is physically observable (in disordered conductors), this naturally suggests that we consider probability distributions for these matrices in which matrices with the same $\{\lambda_n\}$ are equiprobable. The appropriate way to define such a random matrix ensemble in terms of the invariant measure on the matrix space is discussed in the next sections. Before introducing this more formal discussion, let us examine qualitatively some implications of eq. (3.30) which were important in the initial stages of the development of this theory, and which motivate the subsequent rather mathematical sections.
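The chain of identities in this subsection can be checked numerically. The sketch below is only an illustration: it assumes that the polar decomposition of eq. (3.18), which is not reproduced in this section, has the form $M = \mathrm{diag}(u^{(1)}, u^{(3)})\,\Lambda\,\mathrm{diag}(u^{(2)}, u^{(4)})$ with $\Lambda$ built from $\sqrt{1+\lambda}$ and $\sqrt{\lambda}$ blocks, and the helper names (`random_unitary`, `transfer_matrix`, `conductance_check`) are hypothetical.

```python
import numpy as np

def random_unitary(n, rng):
    """Random n x n unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))  # rescale columns by unit-modulus phases

def transfer_matrix(u1, u2, u3, u4, lam):
    """Assumed polar form of M: diag(u1, u3) @ Lambda @ diag(u2, u4)."""
    n = len(lam)
    a = np.diag(np.sqrt(1.0 + lam))
    b = np.diag(np.sqrt(lam))
    zero = np.zeros((n, n))
    left = np.block([[u1, zero], [zero, u3]])
    middle = np.block([[a, b], [b, a]])
    right = np.block([[u2, zero], [zero, u4]])
    return left @ middle @ right

def conductance_check(n=4, seed=0):
    rng = np.random.default_rng(seed)
    lam = rng.uniform(0.1, 5.0, size=n)          # eigenparameters lambda_n > 0
    u1, u2, u3, u4 = (random_unitary(n, rng) for _ in range(4))
    M = transfer_matrix(u1, u2, u3, u4, lam)
    t = np.linalg.inv(M[:n, :n].conj().T)        # t = (m_1^dagger)^{-1}
    Q = M.conj().T @ M
    X = 0.25 * (Q + np.linalg.inv(Q) - 2.0 * np.eye(2 * n))
    g_from_t = 2.0 * np.trace(t.conj().T @ t).real
    g_from_lambda = 2.0 * np.sum(1.0 / (1.0 + lam))
    x_eigs = np.linalg.eigvalsh(0.5 * (X + X.conj().T))  # sorted ascending
    return g_from_t, g_from_lambda, x_eigs, np.sort(lam)
```

Under these assumptions, the two conductance values should agree to numerical precision, and the spectrum of $X$ should consist of each $\lambda_n$ repeated twice, independently of the unitary matrices drawn.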

3.4. Active transmission channels, UCF and Imry's conjecture

The implication of eq. (3.26) is that typically the parameters $\{\nu_n\}$ increase linearly with the length $L$ of the conductor, and hence the parameters $\{\lambda_n\}$ in eq. (3.30) grow exponentially with $L$. Thus if one fixes the width of the conductor and simply increases the length, fewer and fewer of the $N$ eigenparameters in eq. (3.30) will contribute significantly to $g$. Imry (1986b) introduced the notion that there are $N_{\rm eff}$ 'active transmission channels' for which $\nu_n < 1$, and that to a good approximation

$$g \approx N_{\rm eff}. \tag{3.31}$$

It should be emphasized that these 'channels' refer to the $N$ eigenchannels of $Q$, and are not to be confused with the original momentum-space channels in terms of which the transfer matrix is defined. The fluctuations in $N_{\rm eff}$ determine the expected fluctuations in $g$, and from this point of view the UCF initially appear anomalously small, since

$$\mathrm{var}\, g \approx \mathrm{var}\, N_{\rm eff} \approx O(1) \ll N_{\rm eff}. \tag{3.32}$$


The naive expectation that the variance in the number of eigenparameters $\lambda_n$ should be proportional to the average number comes from implicitly treating $\{\lambda_n\}$ as uncorrelated random variables. Imry pointed out that as the $\{\lambda_n\}$ are eigenvalues of random matrices, we should not expect them to be statistically uncorrelated, but rather to show much reduced fluctuations due to 'level repulsion'. Imry conjectured that the $\{\nu_n\}$ have statistical behavior identical to the eigenvalues of the standard random matrix ensembles (to be discussed in detail below), and then invoked a theorem concerning these ensembles due to Dyson and Mehta (1963) in order to argue for the existence of universal conductance fluctuations.

This argument (along with independent arguments by Altshuler and Shklovskii (1986)) suggested a fundamental explanation for the UCF: that they were a manifestation of the long-range suppression of eigenvalue fluctuations arising from level repulsion in random matrix ensembles. As such it freed the explanation from particular microscopic models, and gave much insight into the universality of the phenomenon.

However, the Imry argument left many open questions before a complete theoretical explanation along these lines could be achieved. First, what is the nature of the random matrix ensemble describing disordered conductors, and is it really identical to the standard ensembles? (We have already indicated in the introduction that the answer to the latter question is in general negative.) If not, do UCF follow simply from the Dyson-Mehta theorem? Second, should we regard the $\{\lambda_n\}$ or the $\{\nu_n\}$ as the fundamental objects in the theory, and does this make a difference? Third, if the ensemble of disordered conductors is completely described by standard random matrix theory, how can it ever fail to exhibit UCF, as we know that it does in the localized regime? Clearly, in order to answer these questions one needs to subject the ensemble of transfer matrices to further and more rigorous mathematical analysis to determine its properties and relationship to the standard ensembles. We begin to develop the necessary mathematical concepts for this analysis in the next sections.

4. Maximum-entropy probability densities

In the following section we introduce the concept of the information content of a probability distribution and hence define the term maximum-entropy hypothesis. We generalize the notion to treat probability distributions of matrices and illustrate how the integration measure can give rise to the phenomenon of 'level repulsion'. We then briefly review the properties of the standard random matrix ensembles for completeness. Since much of this material is outside of the standard repertoire of condensed matter physicists, the level of the discussion is intended to be elementary and pedagogical, and may be skipped by expert readers.


4.1. Information content of a discrete probability distribution

Consider a probability distribution for a random variable $d$ which can take $N$ values $d_n$ with probabilities $\{p_n\}$. For definiteness one may think of rolling dice with $N$ sides, but without assuming at the outset that the dice are fair, i.e. that all sides are equivalent. It is intuitively clear that if we have no further information about the dice or the rolling procedure, then the hypothesis that best represents our state of knowledge of the system is that all results are equiprobable, i.e. that $p_n = 1/N$, $\forall n$. This merely states that the rolling procedure has $N$ possible outcomes and exactly one must occur each time ($\sum_n p_n = 1$). Any other choice would state that at least one of the outcomes is more probable than another. This would clearly indicate that we have some further information about the system, even though that information is only of a probabilistic nature and does not tell us that a certain side will or will not come up in any given roll. Thus the probability distribution $p_n^0 = 1/N$, $\forall n$, clearly corresponds to the distribution which contains the minimum information about our system. Since all values $d_n$ are equiprobable, it is also the least biased distribution in the sense described above. Furthermore, it is clear that a distribution in which two of the values are chosen to deviate from each other would (in general) have less information than one in which three are chosen to deviate. This suggests that one can define a quantitative measure of the information content in a discrete probability distribution; the standard definition is

$$I(\{p_n\}) = \sum_{n=1}^{N} p_n \ln(p_n). \tag{4.1}$$

The distribution which we expect to have minimum information, $\{p_n^0\}$, yields $I_0 = -\ln(N)$. It is easy to check that if one adds an arbitrary small deviation to each $p_n$, $p_n = 1/N + \delta_n$, subject to the normalization constraint ($\sum_n \delta_n = 0$), the change in $I$ is positive and of order $\delta^2$, confirming that $I_0$ is a minimum (one can also show that it is an absolute minimum). Of course many functions of $\{p_n\}$ could be chosen which are minimized for $\{p_n = p_n^0\}$. The choice $I$ has the further advantage of being additive. Consider a second set of dice, with $M$ possible outcomes $d'_m$ and probabilities $\{p'_m\}$. The composite system has $MN$ possible outcomes with probabilities $\{P_{mn} = p'_m p_n\}$, and it is again easy to check that $I(\{P_{mn}\}) = I(\{p'_m\}) + I(\{p_n\})$. From the definition of $I$ we see that the information content is negative or zero, zero corresponding to the maximum information distribution, in which one of the $p_n = 1$ and all others are zero. One can then, if one chooses, define the information entropy as $S = -I$; hence $\{p_n^0\}$ is the maximum-entropy probability distribution. The term entropy is used because of the obvious analogy between the definition (4.1) and the definition of the thermodynamic entropy of a physical system. In the latter case one expects a system to maximize its thermodynamic entropy due to ergodicity. There is no general physical principle requiring that the information entropy of an ensemble of physical systems with quenched randomness tends to a maximum. However, if the random variables in question obey a central limit theorem, then there may be in certain cases a deeper mathematical justification for assuming the ensemble has maximum information entropy, beyond merely expressing our ignorance of details of the distribution. This point will be discussed further below, and we are able in section 12 to derive a kind of central limit theorem justifying a maximum-entropy hypothesis for the ensemble of random transfer matrices.
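The properties of the information measure claimed above (minimum at the uniform distribution, zero for a certain outcome, additivity for independent systems) are easy to verify directly. The following is a minimal sketch; the six-sided example distributions and the function names are illustrative choices, not taken from the text.

```python
import math

def information(p):
    """I({p_n}) = sum_n p_n ln p_n, with the convention 0 ln 0 = 0."""
    return sum(x * math.log(x) for x in p if x > 0.0)

def product_distribution(p_prime, p):
    """Joint probabilities P_mn = p'_m p_n of two independent systems."""
    return [a * b for a in p_prime for b in p]

# Three six-sided dice: fair, slightly loaded, and completely determined.
uniform = [1.0 / 6.0] * 6
loaded = [0.25, 0.15, 0.15, 0.15, 0.15, 0.15]
certain = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# A second, independent system with M = 3 outcomes.
coin_like = [0.5, 0.3, 0.2]
joint = product_distribution(coin_like, loaded)
```

Here `information(uniform)` equals $-\ln 6$, any deviation such as `loaded` gives a larger (less negative) value, and `information(joint)` is the sum of the information of the two factors.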

Nonetheless, even in the absence of such a justification, from the above discussion of the system of dice, it is clear that without any further information, the simplest and least biased hypothesis is that the system is described by the distribution of maximum entropy. We refer to such an ansatz as a maximum-entropy hypothesis. Such an ansatz will then make definite statistical predictions which can be compared with sampling data. From this point of view we might choose the maximum-entropy distribution $\{p_n^0\}$ in our above example of the system of dice simply because of our ignorance: the dice might be biased, and $\{p_n^0\}$ may be a poor choice, but in the absence of any further information it is our best guess. A second possibility is that we have examined the dice and seen that they are symmetric with respect to the different faces, and therefore have good reason to assume that all faces should have the same probability of coming up. This symmetry property then immediately leads to the distribution $\{p_n^0\}$.

This illustrates the point that if the possible values of a random variable are related by a symmetry operation, then in general this reduces the information content of the probability distribution. In such a case, where one has some justification for assuming a symmetry exists, the choice of the minimum information distribution can again be justified, and is not just a choice necessitated by our ignorance. Thus determining or imposing the appropriate symmetries on the set of random variables under consideration is essential to obtain the correct statistical description.

4.2. Continuous probability densities and constraints

Consider now a continuous random variable $x$, $a \le x \le b$, with probability density $p(x)$. We can generalize the notion of information entropy by defining

$$S[p(x)] = -\int_a^b dx\, p(x) \ln[p(x)], \tag{4.2}$$

and then extremize this functional, subject to whatever constraints appropriately represent our state of knowledge of the system. In general the constraints will be of the form $\langle f_i(x) \rangle = c_i$, where the angle brackets denote the expectation value, the $c_i$ are constants, and the number of functions $f_i(x)$ is equal to the number of constraints. One always has the constraint of normalization of $p(x)$, corresponding to the choice $f_1 = 1$, $c_1 = 1$. In order to extremize $S$ subject to the imposed constraints one must introduce Lagrange multipliers $l_i$, and then take the functional derivative of the functional $S'[p] = S[p] + \sum_i l_i \langle f_i(x) \rangle$:

$$\delta S' = -\int_a^b dx\, \delta p(x) \left\{ \ln[p(x)] + 1 - \sum_i l_i f_i(x) \right\} = 0, \tag{4.3}$$

which implies

$$p(x) = \exp\left[ \sum_i l_i f_i(x) \right], \tag{4.4}$$

where the constants $l_i$ must be adjusted to give the correct values for the $c_i$ (the term unity in eq. (4.3) can be absorbed into the definition of $l_1$). It is trivial to confirm that the maximum-entropy distribution when the only constraint is normalization is just a uniform probability density $p(x) = (b - a)^{-1}$. Note that this becomes indeterminate if the interval on which $p(x)$ is defined becomes infinite, and a second constraint must be imposed. The simplest choice is to fix the variance of $p(x)$, and we work this example explicitly because of its importance.

We find the maximum-entropy density $p_0(x)$ on the real line subject to the two constraints:

(1) $\langle 1 \rangle = c_1 = 1$; (2) $\langle x^2 \rangle = c_2 = \sigma^2$.

From eq. (4.3) we have $p(x) = e^{l_1 + l_2 x^2}$. Obviously $l_2$ must be negative, and assuming this we compute the normalization integral, which yields the condition $e^{l_1}\sqrt{\pi/|l_2|} = 1$, relating $l_1$ to $l_2$. Then using this relation one finds $\langle x^2 \rangle = 1/(2|l_2|) = \sigma^2$, which determines $l_2 = -1/(2\sigma^2)$, $l_1 = -\tfrac{1}{2}\ln(2\pi\sigma^2)$, and

$$p_0(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/2\sigma^2}. \tag{4.5}$$

Thus the maximum-entropy distribution for a real variable on the interval $-\infty$ to $\infty$ with a fixed variance is a Gaussian. In fact, once the variance is fixed, $p_0(x)$ would be Gaussian on any interval (only the normalization would change). This illustrates that imposing a constraint typically changes the form of $p_0(x)$. For a finite interval, normalization alone would yield a uniform density for $p_0$, whereas the additional constraint of fixed variance changes this to a Gaussian.
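As a rough numerical illustration of this result, one can compare the entropy functional of eq. (4.2) for a few zero-mean densities that share the same variance. The sketch below uses a simple Riemann sum on a fixed grid; the choice of comparison densities (Laplace and uniform) and the grid parameters are illustrative assumptions.

```python
import numpy as np

def differential_entropy(p, dx):
    """S[p] = -integral of p ln p, approximated by a Riemann sum on a uniform grid."""
    mask = p > 0.0
    return -np.sum(p[mask] * np.log(p[mask])) * dx

sigma = 1.0
x = np.linspace(-20.0, 20.0, 400001)
dx = x[1] - x[0]

# Three zero-mean densities, all with variance sigma^2.
gauss = np.exp(-x**2 / (2.0 * sigma**2)) / np.sqrt(2.0 * np.pi * sigma**2)

b = sigma / np.sqrt(2.0)                 # Laplace scale: variance = 2 b^2
laplace = np.exp(-np.abs(x) / b) / (2.0 * b)

a = sigma * np.sqrt(3.0)                 # uniform half-width: variance = a^2 / 3
uniform = np.where(np.abs(x) <= a, 1.0 / (2.0 * a), 0.0)

S_gauss = differential_entropy(gauss, dx)
S_laplace = differential_entropy(laplace, dx)
S_uniform = differential_entropy(uniform, dx)
```

The Gaussian value should reproduce the closed form $\tfrac{1}{2}\ln(2\pi e \sigma^2)$ and exceed the entropies of the other two densities, as expected if the Gaussian is the maximum-entropy density at fixed variance.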

This simple example also illustrates how a maximum-entropy hypothesis can be justified by a central-limit theorem. Obviously if the variable $x$ in the above example is actually the sum of many random variables with a well-behaved distribution, then the central-limit theorem requires that this distribution (appropriately scaled) approaches a Gaussian; so the distribution of maximum information entropy is actually created by the composition procedure of summation of random variables. Exactly the same situation will arise below, except that our random variables are matrices, and the composition procedure is matrix multiplication (although we have not been able to prove a central-limit theorem of equivalent generality in this case).

4.3. Multivariate densities and the integration measure

An example relating to vector random variables illustrates the role of the integration measure and symmetry in defining the maximum-entropy distribution. Consider a real random two-dimensional vector variable $\mathbf{r}$. All points in the space of possible values of $\mathbf{r}$ can be labelled by Cartesian coordinates $(x, y)$ and we can define a probability density $p(x, y)$, and expectation values as $\langle f(\mathbf{r}) \rangle = \langle f(x, y) \rangle = \int dx\, dy\, p(x, y) f(x, y)$. However, all points in two-dimensional space can also be labelled by polar coordinates $(r, \theta)$, so what prevents us from defining $\langle f(\mathbf{r}) \rangle = \int dr\, d\theta\, p(r, \theta) f(r, \theta)$, which clearly gives a different value for $\langle f \rangle$? The reason that we know the former definition is 'right' and the latter is 'wrong' is that we are accustomed to the normal Euclidean metric $ds^2 = dx^2 + dy^2$, and we automatically assume that the appropriate integration measure for our two-dimensional probability space is the one derived from that metric, i.e. $dx\, dy = r\, dr\, d\theta$.

Often this assumption is justified by the nature of the random process. Consider the example of a fountain which shoots out a stream of particles in all directions at once, and suppose that these particles travel a variable distance which would be described by a probability density $p_1(r)$ if the fountain only sprayed in one direction. Clearly the correct two-dimensional probability density $p(\mathbf{r})$ will have a rotational symmetry which is most simply represented in polar coordinates, $p(r, \theta) = p(r)$. Then the probability of finding a particle in any given annulus of area $2\pi r\, dr$ is given by the product $2\pi r\, p_1(r)\, dr$; i.e. it consists of two factors, the probability of traveling a distance $r$ from the fountain, and the measure of the region of space a distance $r$ from the origin.

If we ask, using the formalism described above, what is the maximum-entropy probability density in two dimensions with a fixed value of $\langle \mathbf{r} \cdot \mathbf{r} \rangle = \sigma^2$, we immediately obtain (with respect to the measure $dr\, d\theta$) the result

$$p_0(r, \theta) = \left( \frac{1}{\pi\sigma^2} \right) r\, e^{-r^2/\sigma^2}. \tag{4.6}$$

Since the density is independent of angle, it is natural to define a reduced probability density $p_0(r) = \int_0^{2\pi} d\theta\, p_0(r, \theta)$ such that the probability of finding a particle in the interval $(r, r + dr)$ (independent of direction) is given by $p_0(r)\, dr$,

$$p_0(r) = \frac{2r}{\sigma^2}\, e^{-r^2/\sigma^2}, \tag{4.7}$$

so that $\langle f(r) \rangle = \int_0^{2\pi} d\theta \int_0^\infty dr\, p_0(r, \theta) f(r) = \int_0^\infty dr\, p_0(r) f(r)$.

The important point to notice is that unlike the maximum entropy distribution for a single real variable (which is maximum at the origin, see eq. (4.5) above), $p_0(r)$ vanishes as $r \to 0$. We see that this effective 'repulsion' of the particles from the origin is actually entirely due to the underlying two-dimensional character of the integration measure for the space: the measure of the points near the origin is zero, and as long as the probability density per unit area is smooth in this neighborhood, $p_0(r)$ will vanish. Trivial as this example is, we shall see that 'eigenvalue repulsion' in the random matrix ensembles arises in exactly the same manner.
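The 'repulsion from the origin' produced by the measure alone can be seen by direct sampling. The sketch below assumes the isotropic Gaussian with $\langle \mathbf{r} \cdot \mathbf{r} \rangle = \sigma^2$ (so each Cartesian component has variance $\sigma^2/2$) and the corresponding radial density $(2r/\sigma^2)\,e^{-r^2/\sigma^2}$; the function names are illustrative.

```python
import numpy as np

def sample_radii(sigma, n_samples, rng):
    """Radii of points drawn from an isotropic 2D Gaussian with <r.r> = sigma^2."""
    # Each Cartesian component has variance sigma^2 / 2.
    xy = rng.normal(0.0, sigma / np.sqrt(2.0), size=(n_samples, 2))
    return np.hypot(xy[:, 0], xy[:, 1])

def radial_density(r, sigma):
    """Reduced density (2 r / sigma^2) exp(-r^2 / sigma^2); vanishes linearly at r = 0."""
    return 2.0 * r / sigma**2 * np.exp(-r**2 / sigma**2)

def prob_within(eps, sigma):
    """Integral of the radial density from 0 to eps."""
    return 1.0 - np.exp(-eps**2 / sigma**2)

rng = np.random.default_rng(1)
radii = sample_radii(1.0, 200_000, rng)
fraction_near_origin = np.mean(radii < 0.1)   # compare with prob_within(0.1, 1.0)
```

Although the Cartesian density is largest at the origin, the fraction of samples with $r < \varepsilon$ scales as $\varepsilon^2$ for small $\varepsilon$, which is the linear vanishing of the radial density discussed above.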

5. Maximum-entropy matrix ensembles and level repulsion

5.1. Level repulsion in the 2 x 2 orthogonal ensemble

As a final pedagogical example, let us generalize the above ideas to probability densities of random matrices by considering an ensemble of $2 \times 2$ real symmetric random matrices,

$$H = \begin{pmatrix} h_{11} & h_{12} \\ h_{12} & h_{22} \end{pmatrix}, \tag{5.1}$$

where the $h_{ij}$ are independent real random variables between $-\infty$ and $\infty$. It is natural to then regard the $h_{ij}$ as 'Cartesian coordinates' and propose an integration measure for the matrix space of the form $dH = dh_{11}\, dh_{22}\, dh_{12}$.

More formally, we can define a metric on the matrix space based on the independent infinitesimal variations allowed by the symmetry of $H$ (Balian 1968):

$$ds^2 = dh_{11}^2 + dh_{22}^2 + 2\, dh_{12}^2 \equiv \sum_{\alpha,\beta} g_{\alpha\beta}\, dh_\alpha\, dh_\beta. \tag{5.2}$$

The measure, or volume element $dH$, is (up to a constant factor) the one derived from this metric by the usual rule $dV = \prod_\alpha \sqrt{g_{\alpha\alpha}}\, dh_\alpha$. The factor of two multiplying $dh_{12}^2$ comes from the fact that when $h_{12}$ is varied by an amount $dh_{12}$, $h_{21}$ varies by the same amount. Note that the metric is defined in a vector space of dimension equal to the number of independent elements of $H$ allowed by symmetry. The index $\alpha$ in eq. (5.2) runs over all independent pairs of matrix indices (not over the dimension of the matrix), e.g. for the case of $N \times N$ real symmetric matrices $\alpha = 1, \ldots, N(N+1)/2$. This method for defining the integration measure will be used in the new and more complex case of the matrices $Q$ considered below.

Returning to our simple example, we shall not initially consider the maximum-entropy distribution for $H$, but a more general case in which the diagonal elements are specified by a density $p(h_{ii})$ and the off-diagonal elements by $p'(h_{12})$. To understand the origin of level repulsion, we wish to analyze the probability distribution of the eigenvalue spacing, $s = (1/\sqrt{2})(E_1 - E_2) = (1/\sqrt{2})(E_+ - E_-)$, where by finding the roots of the characteristic polynomial of $H$ we have $E_\pm = (1/\sqrt{2})[\sigma \pm \sqrt{\Delta^2 + 2h_{12}^2}]$, with $\sigma = (1/\sqrt{2})(h_{11} + h_{22})$ and $\Delta = (1/\sqrt{2})(h_{11} - h_{22})$. One immediately sees that $s = \sqrt{\Delta^2 + 2h_{12}^2}$ and that in order for the spacing to vanish we must have $h_{11} = h_{22}$, $h_{12} = 0$; i.e. the eigenvalue spacing only vanishes along a line in the three-dimensional matrix space defined by the coordinates $h_{ij}$. In our previous example the set corresponding to $r = 0$ was a point in a two-dimensional space and we found the measure vanished linearly at that point; here, where the set of interest is a line in three-dimensional space, we shall also find linear level repulsion.

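Since the transformation from the variables $h_{11}$, $h_{22}$ to the variables $\sigma$, $\Delta$ has Jacobian unity, the eigenvalue spacing distribution can be obtained from the integral

$$p(s) = \frac{1}{\sqrt{2}} \int_{-\infty}^{\infty} d\sigma \int_{-\infty}^{\infty} d\Delta \int_{-\infty}^{\infty} dh\; p\!\left(\frac{\sigma + \Delta}{\sqrt{2}}\right) p\!\left(\frac{\sigma - \Delta}{\sqrt{2}}\right) p'\!\left(\frac{h}{\sqrt{2}}\right) \delta\!\left(s - \sqrt{\Delta^2 + h^2}\right), \tag{5.3}$$

where we have rescaled $h_{12}$, $h = \sqrt{2}\, h_{12}$. We wish to expand $p(s)$ to obtain the leading behavior for small $s$. The small-$s$ behavior is determined by the region of integration where $\Delta$ is small, so we can expand the factors

$$p\!\left(\frac{\sigma + \Delta}{\sqrt{2}}\right) p\!\left(\frac{\sigma - \Delta}{\sqrt{2}}\right) \approx \left[ p\!\left(\frac{\sigma}{\sqrt{2}}\right) \right]^2 + \ldots \tag{5.4}$$

Note that since $h_{11}$ and $h_{22}$ are uncorrelated random variables, in general there should be no special or singular behavior of this product, regarded as a function of $\Delta$, along the line $\Delta = 0$, and we can make such an expansion and truncate at lowest order. Then the integral over $\sigma$ just yields a trivial constant which can be absorbed into the normalization of $p(s)$. We then can switch to polar coordinates in the $\Delta$-$h$ plane, $\int d\Delta\, dh \to \int s'\, ds'\, d\theta$ ($s'^2 = \Delta^2 + h^2$); the radial integration eliminates the $\delta$ function, yielding:

$$p(s) \propto s \int_0^{2\pi} d\theta\; p'\!\left(\frac{s \cos\theta}{\sqrt{2}}\right) \equiv s\, \bar{p}(s) \sim s, \qquad s \to 0, \tag{5.5}$$

where again we have assumed that for typical probability densities $p'(h_{12})$, the function $\bar{p}(s)$ defined above will be regular as $s \to 0$.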

Equation (5.5) establishes that the probability of two eigenvalues being very close vanishes linearly with their separation under quite general conditions for ensembles of random real symmetric matrices. In the above argument we used nothing but the statistical independence of the matrix elements of $H$ and the regularity of their densities on the line $h_{11} = h_{22}$. Even if the $\{h_{ij}\}$ were correlated, as long as that correlation did not cause any singular behavior along the line $h_{11} = h_{22}$ (e.g. an $s^{-1}$ divergence of the probability density), the linear level repulsion would survive at short spacings. Level repulsion has this robust and universal character because it arises from the nature of the measure on the matrix space: the set of points on which the spacing vanishes is so small that unless a very special distribution of matrix elements is chosen to concentrate the probability density on this special set of points, the probability density of finding equal eigenvalues will vanish.
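The generality of this argument can be probed by sampling matrix elements from a deliberately non-Gaussian density. In the sketch below all three independent elements are drawn uniformly on $[-1, 1]$ (an arbitrary illustrative choice), and the spacing is computed as $s = \sqrt{\Delta^2 + 2h_{12}^2}$ with $\Delta = (h_{11} - h_{22})/\sqrt{2}$. If $p(s) \sim s$ at small $s$, the cumulative probability $P(s < \varepsilon)$ should scale as $\varepsilon^2$, so doubling $\varepsilon$ should multiply it by roughly four.

```python
import numpy as np

def spacings_real_symmetric(n_samples, rng):
    """Spacing s = sqrt(Delta^2 + 2 h12^2) for 2x2 real symmetric matrices
    whose independent elements are uniform on [-1, 1] (illustrative choice)."""
    h11 = rng.uniform(-1.0, 1.0, n_samples)
    h22 = rng.uniform(-1.0, 1.0, n_samples)
    h12 = rng.uniform(-1.0, 1.0, n_samples)
    delta = (h11 - h22) / np.sqrt(2.0)
    return np.sqrt(delta**2 + 2.0 * h12**2)

def doubling_ratio(s, eps):
    """P(s < 2 eps) / P(s < eps); close to 2**(beta + 1) when p(s) ~ s**beta."""
    return np.mean(s < 2.0 * eps) / np.mean(s < eps)

rng = np.random.default_rng(2)
s_orth = spacings_real_symmetric(400_000, rng)
ratio_orth = doubling_ratio(s_orth, 0.1)
```

The ratio comes out near 4 (slightly below, because the element densities are not exactly flat over the sampled window), consistent with linear level repulsion even though no Gaussian assumption was made.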

5.2. Level repulsion in the 2 x 2 unitary ensemble

The above arguments should also make it clear why the strength of the level repulsion depends on the symmetry of the matrices. If instead of considering real symmetric matrices we had considered Hermitian matrices, the only change would be that now the off-diagonal element of $H$ is complex: $h_{12} = \mathrm{Re}\, h_{12} + i\, \mathrm{Im}\, h_{12}$. Hence the number of independent parameters is now four, leading to a four-dimensional integration measure for the matrix space, $dH = dh_{11}\, dh_{22}\, d\,\mathrm{Re}\, h_{12}\, d\,\mathrm{Im}\, h_{12}$. However, the only change in the expression for the eigenvalues $E_\pm$ is the replacement $h_{12}^2 \to |h_{12}|^2 = (\mathrm{Re}\, h_{12})^2 + (\mathrm{Im}\, h_{12})^2$. Thus the spacing still only vanishes on a one-dimensional subspace of the enlarged matrix space. One immediately sees that the only change in eq. (5.5) for the spacing distribution is that now there is a three-dimensional Cartesian integration over $\mathrm{Re}\, h_{12}$, $\mathrm{Im}\, h_{12}$, $\Delta$ and two density factors $p'(\mathrm{Re}\, h_{12})$, $p''(\mathrm{Im}\, h_{12})$. Changing to three-dimensional spherical coordinates, a trivial extension of the earlier argument yields $p(s) \sim s^2$ at small spacings, i.e. quadratic rather than linear level repulsion. This type of level repulsion is characteristic of the unitary random matrix ensembles, describing Hamiltonians for systems lacking time-reversal symmetry.
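A sampling check analogous to the real symmetric case can be sketched for Hermitian matrices. Again the element densities are an arbitrary illustrative choice (uniform on $[-1, 1]$ for the diagonal elements and for the real and imaginary parts of $h_{12}$). With $p(s) \sim s^2$, the cumulative probability $P(s < \varepsilon)$ should scale as $\varepsilon^3$, so doubling $\varepsilon$ should give a factor of roughly eight.

```python
import numpy as np

def spacings_hermitian(n_samples, rng):
    """Spacing s = sqrt(Delta^2 + 2 |h12|^2) for 2x2 Hermitian matrices with
    h11, h22, Re h12, Im h12 independent and uniform on [-1, 1]."""
    h11 = rng.uniform(-1.0, 1.0, n_samples)
    h22 = rng.uniform(-1.0, 1.0, n_samples)
    re12 = rng.uniform(-1.0, 1.0, n_samples)
    im12 = rng.uniform(-1.0, 1.0, n_samples)
    delta = (h11 - h22) / np.sqrt(2.0)
    return np.sqrt(delta**2 + 2.0 * (re12**2 + im12**2))

def doubling_ratio(s, eps):
    """P(s < 2 eps) / P(s < eps); close to 2**(beta + 1) when p(s) ~ s**beta."""
    return np.mean(s < 2.0 * eps) / np.mean(s < eps)

rng = np.random.default_rng(3)
s_unit = spacings_hermitian(1_000_000, rng)
ratio_unit = doubling_ratio(s_unit, 0.2)
```

The ratio is close to 8 rather than 4, reflecting the extra dimension of the space transverse to the degeneracy line.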

5.3. Gaussian orthogonal and unitary ensembles

The above argument for the behavior of the spacing distribution of the orthogonal and unitary ensembles at small separations is quite general. If we make a special choice for the densities $p(h_{ii})$, $p'(\mathrm{Re}\, h_{ij})$, $p''(\mathrm{Im}\, h_{ij})$, taking all densities to be Gaussian with zero mean, and the variance of the diagonal elements, $\mathrm{var}\, h_{ii} = w^2$, to be twice that of the off-diagonal elements, we can compute straightforwardly the spacing distribution for all $s$ for either the symmetric or Hermitian case:

$$p_\beta(s) = c_\beta\, s^\beta\, e^{-s^2/2w^2}, \tag{5.6}$$

where the value $\beta = 1$ corresponds to the Gaussian orthogonal ensemble (GOE), $\beta = 2$ corresponds to the Gaussian unitary ensemble (GUE), and $c_\beta$ is a normalization constant. Equation (5.6) is the famous 'Wigner surmise' (Brody et al. 1981) for the spacing distribution.
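The surmise can be compared with direct diagonalization of sampled $2 \times 2$ matrices. The sketch below assumes the spacing variable $s = (E_+ - E_-)/\sqrt{2}$ used above, diagonal variance $w^2$, and, as an interpretation of the text for the Hermitian case, variance $w^2/2$ for each of $\mathrm{Re}\, h_{12}$ and $\mathrm{Im}\, h_{12}$. Under these assumptions the spacing density is proportional to $s^\beta e^{-s^2/2w^2}$, whose mean is $w\sqrt{\pi/2}$ for $\beta = 1$ and $2w\sqrt{2/\pi}$ for $\beta = 2$.

```python
import numpy as np

def sample_spacings(beta, w, n_samples, rng):
    """Spacings s = (E+ - E-)/sqrt(2) of Gaussian 2x2 matrices.
    beta = 1: real symmetric; beta = 2: Hermitian."""
    off_sd = w / np.sqrt(2.0)                    # off-diagonal variance w^2 / 2
    h12 = rng.normal(0.0, off_sd, n_samples).astype(complex)
    if beta == 2:
        h12 = h12 + 1j * rng.normal(0.0, off_sd, n_samples)
    H = np.empty((n_samples, 2, 2), dtype=complex)
    H[:, 0, 0] = rng.normal(0.0, w, n_samples)
    H[:, 1, 1] = rng.normal(0.0, w, n_samples)
    H[:, 0, 1] = h12
    H[:, 1, 0] = np.conj(h12)
    E = np.linalg.eigvalsh(H)                    # eigenvalues in ascending order
    return (E[:, 1] - E[:, 0]) / np.sqrt(2.0)

def surmise_mean(beta, w):
    """Mean of the density proportional to s**beta * exp(-s**2 / (2 w**2))."""
    if beta == 1:
        return w * np.sqrt(np.pi / 2.0)
    if beta == 2:
        return 2.0 * w * np.sqrt(2.0 / np.pi)
    raise ValueError("beta must be 1 or 2")

rng = np.random.default_rng(4)
mean_goe = np.mean(sample_spacings(1, 1.0, 200_000, rng))
mean_gue = np.mean(sample_spacings(2, 1.0, 200_000, rng))
```

For $2 \times 2$ matrices the agreement is exact up to sampling noise; for larger matrices the surmise is only an approximation, as noted below.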

5.4. Joint probability densities

If we consider the special cases of the GOE and GUE, we can obtain the entire joint probability density (JPD) for the eigenvalues simply by omitting the integration over $\sigma$ in eq. (5.3). One finds

$$P(E_+, E_-) = c_\beta\, (E_+ - E_-)^\beta \exp\left[ -\frac{E_+^2 + E_-^2}{2w^2} \right], \tag{5.7}$$

where again the linear or quadratic level repulsion at short distances comes simply from the integration measure. By more involved arguments one can show that this result generalizes to similarly defined ensembles of $N \times N$ real symmetric or Hermitian matrices (Brody et al. 1981, Mehta 1967):

$$P(\{E_n\}) = c_{N,\beta} \prod_{m<n} |E_m - E_n|^\beta \prod_n \exp\left[ -\frac{E_n^2}{2w^2} \right], \tag{5.8}$$

which are the joint probability densities for the GOE ($\beta = 1$) and the GUE ($\beta = 2$). It has also been shown (Mehta 1967) that for the $N \times N$ case, eq. (5.6) is not exactly correct at all spacings, but is a remarkably good approximation. The basic reason that eq. (5.7) generalizes to the $N \times N$ case is that the probability of more than two eigenvalues being near one another is so small that the phase-space argument in the $N \times N$ case reduces to that in the $2 \times 2$ case.

5.5. The GOE and GUE as maximum-entropy distributions

From the above discussion it might appear that the GOE and GUE correspond to rather special choices, but if we return to our earlier discussion of maximum-entropy densities and ask what is the most random density of the matrices $H$ consistent with a fixed value of $\langle \mathrm{Tr}\, H H^\dagger \rangle$, we immediately find

$$p_0(h_{11}, h_{22}, h_{12})\, dh_{11}\, dh_{22}\, dh_{12} \propto \exp\left[ -\frac{1}{2w^2} \left( h_{11}^2 + h_{22}^2 + 2h_{12}^2 \right) \right] dh_{11}\, dh_{22}\, dh_{12}, \tag{5.9}$$

where $w^2$ is set by the value of the constraint, and the factor of two multiplying $h_{12}^2$ in the exponent comes from the fact that the terms $h_{12}^2$, $h_{21}^2$ are equal in the trace. One sees that this corresponds exactly to the 'special' choice made above, and thus upon change of variables and integration leads to eqs. (5.6) and (5.7) for the spacing and eigenvalue distributions. Hence the Gaussian ensembles are the maximum-entropy ensembles consistent with a fixed value for the expectation value of $\mathrm{Tr}\, H H^\dagger$ (Balian 1968). This result trivially generalizes to the $N \times N$ case. The property of the trace constraint, that it is unchanged by orthogonal transformations in the real symmetric case and by unitary transformations in the Hermitian case, is the origin of the terms GOE and GUE. The significance of this invariance is that all matrices in the ensemble with the same eigenvalues will have the same probability of occurrence. This is a natural symmetry to expect under many circumstances and motivates the choice of the constraint. However, it is also clear that many more general choices could be made with the same invariance (e.g. one could fix the expectation value of any function of the trace), and this would lead to a different form of the second factor in eq. (5.8). It is only the first factor in eq. (5.8), which comes from the integration measure and which determines the strength of the level repulsion, that is truly universal (excluding anomalous correlations as discussed above).

6. Properties of standard random matrix ensembles

6.1. Coulomb gas analogy

Dyson (1962a,b,c) was the first to point out that in order to get some intuitive feeling for the JPD of eq. (5.8) one could formally interpret it as describing the Boltzmann weight for a classical statistical mechanics problem. If one rewrites

$$P(\{E_n\}) = c_{N,\beta} \exp\left[ -\beta \left\{ -\sum_{m<n} \ln |E_n - E_m| + \sum_n \frac{E_n^2}{2\beta w^2} \right\} \right], \tag{6.1}$$

one sees that this corresponds to the Boltzmann weight for a gas of 'charges' constrained to move in one dimension (with positions given by the locations of the eigenvalues), interacting with the two-dimensional (logarithmic) Coulomb repulsion, and subject to a quadratic potential centered at the origin. The effective temperature of the gas is $kT = \beta^{-1} = 1, \tfrac{1}{2}, \tfrac{1}{4}$, depending on the symmetry of the ensemble. The quadratic potential confines the gas to the region around the origin, preventing it from flying apart due to the repulsive interaction, and determining the charge (eigenvalue) density distribution. The density is not uniform, since the problem is not translationally invariant; instead we shall see below that the density function is to a very good approximation a semi-circle centered at the origin. Dyson (Dyson 1962a,b,c, Dyson and Mehta 1963) considered the circular ensembles describing random $S$-matrices, in which the eigenvalues are complex numbers of modulus unity, and showed that their maximum-entropy JPD is given by eq. (5.8) without the Gaussian factors (no confining potential), so in this case the ensemble is translationally invariant and the density is uniform. Often in studying the GOE, GUE, and related ensembles one considers the limit of very large matrices ($N \to \infty$), and intervals of the spectrum over which the density variation is negligible (Mehta 1967). Then the ensemble has 'local' translational invariance, and many statistical properties can be calculated exactly analytically. Moreover the calculated properties will be 'universal' across different ensembles, because only the integration measure for the space is relevant. We shall see that in our ensemble describing disordered conductors, this is not the only physically relevant limit, and one must analyze more general situations. In particular we will be interested in regions of the spectrum which do not satisfy the condition of local translational invariance. In order to discuss such cases it is useful to review briefly a more general approach due to Mehta (Mehta 1967, Mehta and Gaudin 1960), called the method of orthogonal polynomials, which is particularly simple when treating ensembles with $\beta = 2$ (unitary case). We note that the method can be adapted to the cases $\beta = 1, 4$, which leads to substantial mathematical complications, but qualitatively similar results for the correlation functions.
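The Coulomb gas picture can be made concrete by looking at the mechanical equilibrium of the charges. The sketch below assumes the JPD in the form $\prod_{m<n}|E_m - E_n|^\beta \prod_n \exp[-E_n^2/2w^2]$ and uses the classical result (due to Stieltjes) that the zeros $x_n$ of the Hermite polynomial $H_N$ satisfy $\sum_{m \ne n} 1/(x_n - x_m) = x_n$. Under that assumption the most probable configuration is $E_n = w\sqrt{\beta}\, x_n$; the function names are illustrative.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def coulomb_gas_force(E, beta, w):
    """Gradient of ln P with respect to each E_n, for
    ln P = beta * sum_{m<n} ln|E_n - E_m| - sum_n E_n^2 / (2 w^2) + const."""
    E = np.asarray(E, dtype=float)
    diff = E[:, None] - E[None, :]
    np.fill_diagonal(diff, np.inf)               # 1/inf = 0 removes the self-term
    repulsion = beta * np.sum(1.0 / diff, axis=1)
    confinement = -E / w**2
    return repulsion + confinement

def equilibrium_positions(n, beta, w):
    """Most probable eigenvalue configuration: rescaled zeros of the
    physicists' Hermite polynomial H_n."""
    zeros, _ = hermgauss(n)                      # Gauss-Hermite nodes = zeros of H_n
    return w * np.sqrt(beta) * zeros

E_eq = equilibrium_positions(6, beta=2, w=1.0)
net_force = coulomb_gas_force(E_eq, beta=2, w=1.0)
```

At these positions the logarithmic repulsion exactly balances the harmonic confinement, so `net_force` vanishes to numerical precision, while the same positions are not an equilibrium for a different value of $\beta$.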

6.2. Method of orthogonal polynomials for GUE

Consider the factor $\prod_{m<n} |E_n - E_m|$ in eq. (5.8), whose square will appear in the JPD for the case $\beta = 2$. It is well known that such a factor can be written as a determinant whose $(m, n)$ entry is $(E_m)^{n-1}$ (the so-called Vandermonde determinant). The Gaussian factors in the JPD,

$$\prod_n \exp[-E_n^2] = \prod_n \exp[-\tfrac{1}{2}E_n^2]\, \prod_n \exp[-\tfrac{1}{2}E_n^2],$$

can be divided into square roots as shown (we take $2w^2 = 1$ here for simplicity), and each square root absorbed into one of the determinants by multiplying the $m$th row by $\exp[-\tfrac{1}{2}E_m^2]$. Each row then has entries $(E_m)^{n-1}\exp[-\tfrac{1}{2}E_m^2]$, so simply by adding lower columns to the $n$th column and multiplying each column by appropriate constants, the $(m, n)$ entry of each determinant becomes $\phi_{n-1}(E_m)$, where $\phi_p(x) = H_p(x)\exp[-\tfrac{1}{2}x^2]$ is the $p$th wavefunction of the simple one-dimensional harmonic oscillator. The JPD of the eigenvalues $E_n$ is then proportional to the square of a determinant of such wavefunctions, i.e. it is in exact formal analogy to the probability density of the ground-state (Slater determinant) wavefunction of $N$ non-interacting fermions with coordinates $E_n$, in the external potential of a simple harmonic oscillator with $m = \omega = \hbar = 1$. Because of this exact analogy it is trivial to evaluate expectation values of functions of the eigenvalues in terms of these 'wavefunctions', using the well-known properties of Slater determinants which lead to the rules for second quantization of fermion operators. For example the eigenvalue density is given by

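$$\rho(E) = N\int_{-\infty}^{\infty} dE_2 \cdots dE_N\, P(E, E_2, \ldots, E_N) = \Big\langle \sum_{n=1}^{N} \delta(E - E_n) \Big\rangle = \sum_{n=0}^{N-1} \phi_n^2(E), \tag{6.2}$$

and the two-point correlation function (the analogue of the density-density correlation function for the fermions) is

$$R_2(E, E') = \Big\langle \sum_{m \neq n} \delta(E - E_m)\,\delta(E' - E_n) \Big\rangle = \sum_{m,n=0}^{N-1} \phi_m^2(E')\,\phi_n^2(E) - \sum_{m,n=0}^{N-1} \phi_m(E')\,\phi_m(E)\,\phi_n(E')\,\phi_n(E). \tag{6.3}$$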

These relations are exact and do not require that $R_2(E, E') = R_2(E - E')$. Note also that the particular form of the fictitious 'external potential', in this case a harmonic well, was determined by the Gaussian weight functions in eq. (5.8), which led us to construct Hermite polynomials in the Vandermonde determinant. A different weight function would have required constructing a different set of orthogonal polynomials, corresponding to the solution of a different one-dimensional Schrödinger equation. We shall see below that the disordered metallic conductors naturally generate an ensemble whose correlation functions are related to Laguerre polynomials.

6.3. Correlation functions of GUE and spectral rigidity

Using this formal analogy it is possible to make a simple physical argument to evaluate the mean density (it is possible to do so more rigorously using the asymptotic properties of Hermite polynomials). If we look at the limit $N \gg 1$, the quantities $\phi_n^2(E)$, $n \gg 1$, in eq. (6.2) will approach closely the classical probability density for a harmonic oscillator with the energy $n + \tfrac{1}{2}$ (in these units; remember that $E$ is now playing the role of the coordinate $x$). The classical density is just proportional to the time spent in the interval $[E, E + dE]$, hence $\rho_n(E)\,dE \propto dt(E) = dE/v_n(E)$, where $v_n(E)$ is the classical velocity at position $E$. By 'energy' conservation, $n = \tfrac{1}{2}v_n^2 + \tfrac{1}{2}E^2 \Rightarrow v_n = \sqrt{2n - E^2}$ (where we have dropped the zero-point energy because $n \gg 1$; the density vanishes for $E^2 > 2n$). Normalizing over one period ($2\pi$ in these units) gives $\rho_n(E) = 1/[\pi v_n(E)]$. The levels of a one-dimensional harmonic oscillator are equally spaced in the energy $n$, hence converting the sum in eq. (6.2) to an integral yields

$$\rho(E) = \sum_{n=0}^{N-1} \phi_n^2(E) \simeq \int_{E^2/2}^{N} dn\, \frac{1}{\pi\sqrt{2n - E^2}} = \frac{1}{\pi}\sqrt{2N - E^2}, \qquad E^2 < 2N. \tag{6.4}$$
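The semi-circle estimate can be compared directly with the exact sum over oscillator wavefunctions in eq. (6.2). The following is a minimal numerical sketch, not part of the original derivation: it assumes normalized wavefunctions generated by the standard three-term recurrence, and the names `oscillator_density` and `semicircle`, as well as the values of `N` and `E`, are illustrative choices.

```python
import numpy as np

def oscillator_density(E, N):
    """Sum of the first N squared normalized oscillator wavefunctions at the points E.

    The normalized functions phi_n(x) = H_n(x) exp(-x^2/2) / sqrt(2^n n! sqrt(pi))
    are built with the stable recurrence
    phi_{n+1} = sqrt(2/(n+1)) x phi_n - sqrt(n/(n+1)) phi_{n-1}.
    """
    E = np.asarray(E, dtype=float)
    phi_prev = np.zeros_like(E)                       # phi_{-1} = 0
    phi = np.pi ** -0.25 * np.exp(-0.5 * E ** 2)      # phi_0
    rho = phi ** 2
    for n in range(N - 1):
        phi_next = np.sqrt(2.0 / (n + 1)) * E * phi - np.sqrt(n / (n + 1)) * phi_prev
        phi_prev, phi = phi, phi_next
        rho += phi ** 2
    return rho

def semicircle(E, N):
    """Semi-circle density (1/pi) sqrt(2N - E^2), zero outside E^2 < 2N."""
    E = np.asarray(E, dtype=float)
    return np.sqrt(np.clip(2.0 * N - E ** 2, 0.0, None)) / np.pi

N = 100                                  # illustrative matrix size
E = np.linspace(-5.0, 5.0, 11)           # points well inside the band edge sqrt(2N)
exact = oscillator_density(E, N)
approx = semicircle(E, N)
max_rel_err = np.max(np.abs(exact - approx) / approx)
print("largest relative deviation from the semi-circle:", max_rel_err)
```

Deep inside the spectrum the two curves agree closely; the remaining difference is the small oscillatory correction that the classical argument neglects.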

This is the famous Wigner semi-circle law (Wigner 1953, 1955, 1957, 1958) for the eigenvalue density. Note that for $N \gg 1$ there is a large interval around the origin in which the density is approximately constant, and it is in this interval that fluctuation measures are usually calculated and expected to be universal. In this case, in which one can ignore the variation in the eigenvalue density, Dyson and Mehta (Dyson 1962a,b,c, Dyson and Mehta 1963) showed that the 'Coulomb gas' we are considering has remarkably small density fluctuations due to the long-range logarithmic interaction. This can be illustrated by considering the reduced two-point correlation function,

$$T_2(E, E') = \rho(E)\rho(E') - R_2(E, E') = \Big[\sum_{n=0}^{N-1} \phi_n(E')\,\phi_n(E)\Big]^2 \simeq \frac{\sin^2[\sqrt{2N}\,(E - E')]}{[\pi(E - E')]^2}, \tag{6.5}$$

where the last identity can be obtained from the asymptotic properties of the Hermite polynomials (Erdelyi 1953) for $N \gg 1$ and $E, E' \ll N$. Note the dependence of $T_2$ only on $E - E'$, reflecting the translational invariance of the ensemble in a region of uniform density. The rapidly oscillatory nature of this function is not surprising, because for small argument and large $N$ the harmonic oscillator wavefunctions should behave roughly like plane waves with $k \sim \sqrt{N}$. However, the slow power law decay has some important implications.

For example, Dyson and Mehta (1963) considered the behavior of linear statistics on an interval of the spectrum of size $E_0$, where $N \gg E_0/\Delta E \gg 1$ and $\Delta E$ is the level spacing (i.e. the interval $E_0$ on average contains many eigenvalues, but many fewer than the total number, $N$). A linear statistic $A$ was defined as

$$A = \sum_{n}^{N} f(E_n)\,\theta_0(E_n), \tag{6.6}$$

where $f$ is a given function of $E_n$ and $\theta_0(E_n)$ is unity if $E_n$ is in the interval $E_0$, and zero otherwise. The variance of any linear statistic is calculable from $\rho(E)$ and $T_2(E, E')$ (the exact expression will be given in section 9 below).

The choice $f = 1$ gives the number of eigenvalues in the interval ($A = n$). Obviously, $\langle n \rangle = E_0/\Delta E = n_0$. Because of the strong correlations between eigenvalues we expect number fluctuations to be strongly suppressed. In fact, the large peak in $T_2$ at $E = E'$ cancels the normal linear fluctuations; however, the variance of $n$ involves an integral over all $E, E'$ in the interval. Because of the slow decay of $T_2(E - E') \sim (E - E')^{-2}$, this integral depends logarithmically on the size of the interval, i.e. one finds

$$\mathrm{var}\, n \propto \ln n_0, \tag{6.7}$$

in both the orthogonal and unitary ensembles.

This is a very significant result: instead of $\mathrm{var}\, n \propto n_0$, as one would expect for uncorrelated (or short-range correlated) charges, the long-range interaction leads to an almost total suppression of charge (eigenvalue) density fluctuations. Typically, these charges form an almost equally-spaced (crystalline) array over many intercharge spacings! This phenomenon is referred to as spectral rigidity and is easily visible to the eye if one compares a level sequence generated according to GOE statistics with one generated by Poisson (uncorrelated) statistics (Brody et al. 1981). In fact, this phenomenon of spectral rigidity is the long-range manifestation of the eigenvalue repulsion, and it is this long-range manifestation which is relevant to the phenomenon of universal conductance fluctuations.
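The suppression of number fluctuations is easy to see in a small simulation. The sketch below is only illustrative and makes its own assumptions: it samples random Hermitian matrices with Gaussian entries (a GUE-type ensemble with an arbitrary overall energy scale), counts the levels in a window near the band centre where the density is nearly flat, and compares the variance of that count with the variance obtained for the same number of uncorrelated levels. The sizes `N`, `samples` and `half_width` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def gue_levels(N, rng):
    """Eigenvalues of one N x N random Hermitian matrix with Gaussian entries."""
    A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    return np.linalg.eigvalsh((A + A.conj().T) / 2.0)

N, samples, half_width = 200, 200, 2.2    # window |E| < half_width near the band centre

# Number of eigenvalues in the window for each member of the ensemble
counts_gue = np.array([
    np.count_nonzero(np.abs(gue_levels(N, rng)) < half_width)
    for _ in range(samples)
])
n0 = counts_gue.mean()

# Uncorrelated reference: N independent levels, uniform on an interval chosen
# so that the mean number of levels in the same window is also n0
box = N * half_width / n0                 # half-length of the reference interval
counts_uncorrelated = np.array([
    np.count_nonzero(np.abs(rng.uniform(-box, box, size=N)) < half_width)
    for _ in range(samples)
])

var_gue = counts_gue.var(ddof=1)
var_uncorrelated = counts_uncorrelated.var(ddof=1)
print(f"n0 = {n0:.1f}, var(GUE) = {var_gue:.2f}, var(uncorrelated) = {var_uncorrelated:.2f}")
```

For the uncorrelated levels the variance is comparable to $n_0$ itself, while for the correlated spectrum it is smaller by more than an order of magnitude, in line with the logarithmic law (6.7).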

A standard measure of this long-range rigidity of the spectrum considered by Dyson and Mehta is the $\Delta_3$ statistic, which measures the mean-square deviation of the staircase function $N(E) = \sum_n \theta(E_n - E)$, $(-\tfrac{1}{2}E_0 < E < \tfrac{1}{2}E_0)$, from the best-fit straight line. Since this statistic is still essentially measuring the fluctuations in the number of eigenvalues in an interval, one expects to find again a logarithmic dependence on the size of the interval. This expectation is confirmed; for the GOE the result (Dyson and Mehta 1963) is

$$\langle \Delta_3(E_0) \rangle = (1/\pi^2)\,[\ln(E_0/\Delta E) - 0.0687], \tag{6.8}$$

and a similar formula can be calculated for the GUE.

However, an even stronger consequence of spectral rigidity proved by Dyson and Mehta (1963) is that if one considers the variance of linear statistics where $f(E_n)$ is not constant, but decays sufficiently rapidly at the endpoints of the interval, then even the logarithmic dependence on the size of the interval disappears. In this case

$$\mathrm{var}\, A \propto 1, \tag{6.9}$$

i.e. it is a universal number independent of the average number of eigenvalues in the interval, $n_0$. It is this result which we refer to as the Dyson-Mehta theorem. Clearly the result is highly reminiscent of universal conductance fluctuations, and from eq. (3.30) above we see that the conductance is a linear statistic on the eigenvalues of $X$ (or $Q$), except that it sums over all eigenvalues and not just those in a restricted region of the spectrum. Earlier work of Imry (1986) and Muttalib et al. (1987) simply invoked the Dyson-Mehta theorem to derive UCF; however, we shall see below that because of the breakdown of translational invariance near the origin in the ensemble of disordered conductors, the conditions of the Dyson-Mehta theorem are not satisfied. Below we shall derive UCF from a correct treatment, which does not assume translational invariance of the ensemble.

7. Invariant measure for Q and X matrices

7.1. Definition of the invariant measure

The previous discussion has made it clear that the first order of business in analyzing the random matrix ensembles which characterize disordered conduc­tors is to derive the integration measure which we expect under general condi­tions to determine the nature and strength of the level repulsion in the ensembles. After we have derived this measure, we are in a position to make a maximum-entropy hypothesis which will fix the statistical properties of the ensembles.

Our goal in this section is to express the volume element in the relevant matrix space in terms of the parameters $\{\lambda_n\}$ (or $\{\nu_n\}$) and the matrix elements of some auxiliary matrices which are needed to fully specify a point in the matrix space, but do not enter eq. (3.30) for the conductance. We shall then assume that the probability density characterizing the ensemble is independent of these auxiliary matrices, and integrate them out using the appropriate measure to obtain a reduced probability density for the parameters $\{\lambda_n\}$ alone. The resulting probability density, $P_\beta(\{\lambda_n\})$, will consist of two factors, exactly as did the reduced density $p_0(r)$ in section 4,

$$P_\beta(\{\lambda_n\}) = w(\{\lambda_n\})\, J_\beta(\{\lambda_n\}), \tag{7.1}$$

where $w(\{\lambda_n\}) \prod_n d\lambda_n$ gives the probability that the set $\{\lambda_n\}$ lies in an interval $\{[\lambda_n, \lambda_n + d\lambda_n]\}$ (i.e. it is the probability density corresponding to an $N$-dimensional space), and the Jacobian $J_\beta(\{\lambda_n\})$ gives the measure of each $\{\lambda_n\}$ in the much larger matrix space. We should mention that since the matrix space of interest is non-compact, the integration over the volume of that space will diverge, unless some weight function $w(\{\lambda_n\})$ is introduced to reduce the probability of very large values of $\lambda_n$. This is exactly analogous to the situation in the GOE (GUE), where the Gaussian factors in eq. (5.8) are necessary to define a normalizable reduced probability density.

We saw in section 2 that each transfer matrix $M$ can be parameterized by $I_\beta$ real parameters, with $I_2 = 4N^2$ and $I_1 = 2N^2 + N$; in the former case the ensemble of disordered conductors is the full group $U(N, N)$, in the latter case it can be shown (Mello et al. 1988, Muttalib et al. 1987) that the set is isomorphic to the group $Sp(2N, R)$ of $2N \times 2N$ real symplectic matrices. One natural approach, taken in Mello et al. (1988), is to derive the invariant measure for these groups and express it in terms of the parameterization (3.18). Then integration over the matrices $u^{(i)}$ according to the invariant measure of the group $U(N)$ yields $J_\beta(\{\lambda_n\})$. This is necessary if one wishes to analyze the statistics of such quantities as the transmission coefficient of a particular channel, which depend on the $u^{(i)}$ explicitly. However, in this work we shall focus on the statistics of the conductance, which is independent of the $u^{(i)}$, and so we shall directly derive the measure for the set of matrices $Q = M^\dagger M$. This approach has some convenient technical features, and allows us to follow very closely the logic of Dyson's famous derivation of the relevant measures for the circular ensembles (Dyson 1962a,b,c). The derivation given below is an expanded and slightly generalized version of that given in Muttalib et al. (1987).

Each transfer matrix $M$ corresponds to a point in an $I_\beta$-dimensional matrix space; an infinitesimal neighborhood $dM$ of that point can be assigned a measure $\mu(dM) \equiv d\mu(M)$. The invariant measure for the group $U(N, N)$ is defined (Hammermesh 1962) as the measure invariant under left or right multiplication by any fixed element $M'$ of the group: $d\mu(M) = d\mu(M'M) = d\mu(MM')$; this definition corresponds to the intuitive notion of giving each element of the group equal weight when integrating over the matrix space (i.e. over all elements of the group). We will define a measure for the set of matrices $Q = M^\dagger M$ (which do not form a group), which has the corresponding invariance property. Under the transformation $M \to MM'$, $Q \to M'^\dagger Q M'$; so we impose the condition on the measure $d\mu(Q)$,

$$d\mu(Q) = d\mu(M'^\dagger Q M'), \tag{7.2}$$

where $M'$ is any fixed element of $U(N, N)$. We factorize

$$Q + dQ = M^\dagger (1 + dV) M, \tag{7.3}$$

and define

$$d\mu(Q) = \prod_{m,n}^{\mathrm{ind}} dV_{mn}, \tag{7.4}$$

where, as in section 4, the product is only over the independent elements of $dV$. We must first show that the measure defined by eqs. (7.3) and (7.4) has the invariance (7.2).

7.2. Uniqueness of the invariant measure

Our proof of eq. (7.2) proceeds in two steps:

(1) We show that if the definition (7.4) is unique in the sense that it is independent of the choice of the matrix $M$ factorizing $Q$, then $d\mu(Q)$ has the invariance (7.2).

(2) We show that $d\mu(Q)$ is unique in this sense.

To verify statement (1), consider a matrix $Q' = M'^\dagger Q M'$, so that $dQ' = M'^\dagger\, dQ\, M' = M'^\dagger M^\dagger\, dV\, M M' = \tilde{M}^\dagger\, dV\, \tilde{M}$, with $\tilde{M} = MM'$. Thus if we are free to factorize $Q'$ in terms of any transfer matrix without changing its measure as defined by eq. (7.4), then we can choose to factorize it in terms of $\tilde{M}$. Hence the matrix $dV'$ determining $d\mu(Q')$ is simply equal to the $dV$ which determines $d\mu(Q)$, and trivially $d\mu(Q') = d\mu(Q)$. This proves statement (1).

To prove that $d\mu(Q)$ is unique we must analyze the symmetries of $dV$ which follow from those of $Q$. $Q$ is Hermitian, pseudo-unitary [$U(N, N)$] and positive. Since $Q + dQ$ must also be Hermitian, $dQ = M^\dagger\, dV\, M$ and hence $dV$ are also Hermitian:

$$dV^\dagger = dV. \tag{7.5}$$

Since $Q + dQ$ must also have $U(N, N)$ symmetry, we have $(Q + dQ)\Sigma_z(Q + dQ) = \Sigma_z$, which implies

$$0 = dQ\,\Sigma_z Q + Q\,\Sigma_z\, dQ = M^\dagger (dV\, \Sigma_z + \Sigma_z\, dV) M \;\Rightarrow\; \{dV, \Sigma_z\}_+ = 0, \tag{7.6}$$

where $\{\ \}_+$ denotes the anticommutator. If the time-reversal symmetry constraint (3.12) is applied to $Q + dQ$, one arrives at the additional condition

$$\Sigma_x\, dV\, \Sigma_x = dV^*. \tag{7.7}$$

Writing $dV$ in terms of $N \times N$ blocks, and imposing the conditions (7.5) and (7.6), one finds that $dV$ is off-block-diagonal and of the form

$$dV = \begin{pmatrix} 0 & dv \\ dv^\dagger & 0 \end{pmatrix}. \tag{7.8}$$

Thus $dV$ is specified by one $N \times N$ complex matrix $dv$, i.e. $2N^2$ real parameters, and consistent with the definition (7.4), we find

$$d\mu(Q) = \prod_{m,n=1}^{N} dv^1_{mn}\, dv^2_{mn}, \tag{7.9}$$

where we introduce the notation $dv^1_{mn} = \mathrm{Re}[dv_{mn}]$, $dv^2_{mn} = \mathrm{Im}[dv_{mn}]$ (henceforth superscripts 1 and 2 on matrix elements will always denote their real and imaginary parts).

If we also assume time-reversal symmetry (eq. 3.12), then eq. (7.7) gives an additional constraint on $dv$,

$$dv = dv^T, \tag{7.10}$$

so $dv$ is symmetric and the product in eq. (7.9) would only involve the $N(N + 1)$ independent elements.

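Suppose we have two factorizations, $Q + dQ = M^\dagger(1 + dV)M = \tilde{M}^\dagger(1 + d\tilde{V})\tilde{M}$; we now wish to prove that $d\mu(Q) = d\tilde{\mu}(Q)$, where $d\tilde{\mu}(Q)$ is the measure derived from $d\tilde{V}$. First we note that $\tilde{M} = (\tilde{M}^\dagger)^{-1} M^\dagger M \equiv \tilde{U} M$, where we now show that $\tilde{U}$ has both $U(N, N)$ and $U(2N)$ symmetry.

Since $(\tilde{M}^\dagger)^{-1} = \Sigma_z \tilde{M} \Sigma_z$, we have that $\tilde{U} = \Sigma_z \tilde{M} \Sigma_z M^\dagger$, $\tilde{U}^\dagger = M \Sigma_z \tilde{M}^\dagger \Sigma_z$, and

$$\tilde{U}^\dagger \tilde{U} = M \Sigma_z \tilde{M}^\dagger \tilde{M} \Sigma_z M^\dagger = M \Sigma_z M^\dagger M \Sigma_z M^\dagger = \Sigma_z^2 = I, \tag{7.11}$$

so $\tilde{U}$ is unitary. In addition,

$$\Sigma_z \tilde{U}^\dagger \Sigma_z = \Sigma_z M \Sigma_z \tilde{M}^\dagger = (M^\dagger)^{-1} \tilde{M}^\dagger = \tilde{U}^{-1}, \tag{7.12}$$

so $\tilde{U}$ is also pseudounitary. It thus follows from eq. (7.3) that

$$\tilde{U}\, dV\, \tilde{U}^\dagger = d\tilde{V}, \tag{7.13}$$

where $\tilde{U}$ is both unitary and pseudounitary, and $d\tilde{V}$ is of the form (7.8). We thus simply need to prove that the measure defined by eq. (7.9) is invariant under the transformation (7.13).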

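The transformation (7.13) can be built up by a succession of infinitesimal transformations $\delta\tilde{U} = I + i\varepsilon H$, where $H$ is Hermitian due to the $U(2N)$ symmetry and satisfies the additional condition $[\Sigma_z, H] = 0$ due to the $U(N, N)$ symmetry. Hence

$$H = \begin{pmatrix} h & 0 \\ 0 & h' \end{pmatrix}, \tag{7.14}$$

with $h$, $h'$ Hermitian. We thus have $d\tilde{V} = \delta\tilde{U}\, dV\, \delta\tilde{U}^\dagger$, which when evaluated explicitly,

$$\begin{pmatrix} 0 & d\tilde{v} \\ d\tilde{v}^\dagger & 0 \end{pmatrix} = \begin{pmatrix} 1 + i\varepsilon h & 0 \\ 0 & 1 + i\varepsilon h' \end{pmatrix} \begin{pmatrix} 0 & dv \\ dv^\dagger & 0 \end{pmatrix} \begin{pmatrix} 1 - i\varepsilon h & 0 \\ 0 & 1 - i\varepsilon h' \end{pmatrix}, \tag{7.15}$$

yields to order $\varepsilon$,

$$d\tilde{v} = dv + i\varepsilon\,(h\, dv - dv\, h'). \tag{7.16}$$

Equation (7.16) defines a (linear) coordinate transformation between the $2N^2$ variables $dv^{1,2}_{mn}$ and $d\tilde{v}^{1,2}_{mn}$; to prove that the measure (7.9) is unchanged by this transformation we need only prove that the Jacobian of this transformation is unity. The Jacobian $J$ is of the form $J = \det(1 + \varepsilon K)$, where $K$ is a constant matrix (independent of $dv$); hence

$$J = \exp[\mathrm{Tr}\{\ln(1 + \varepsilon K)\}] \approx e^{\varepsilon\, \mathrm{Tr}\{K\}}, \tag{7.17}$$

so we need only prove that $K$ is traceless to prove $J = 1$. The diagonal elements of the matrix $I + \varepsilon K$ are

$$\frac{\partial\, d\tilde{v}^{\,j}_{mn}}{\partial\, dv^{\,j}_{mn}} = 1 - \varepsilon\,[h^2_{mm} - h'^2_{nn}] = 1, \qquad j = 1, 2, \tag{7.18}$$

where the final identity follows because $h$, $h'$ are Hermitian and thus must have real diagonal elements (so the imaginary parts $h^2_{mm}$, $h'^2_{nn}$ vanish). We see from eq. (7.18) that $K$ is traceless, hence the Jacobian $J$ is equal to unity.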

If we impose the time-reversal symmetry constraint (3.12), which gives the additional constraint (7.10) on $dv$, this leads to the relation $h' = -h^*$. The linear transformation (7.16) must now be regarded as relating only the $N(N + 1)$ independent elements of $d\tilde{v}$, $dv$. One can then prove that the relevant Jacobian is still unity in exactly the same manner as above, as again only the diagonal elements of a Hermitian matrix appear in the analog of eq. (7.18).

We conclude that in either case ($\beta = 1, 2$) the measure defined by eqs. (7.3), (7.4) and (7.9) is independent of the factorization of $Q$ in eq. (7.3). Since the measure is unique, it is invariant under the $U(N, N)$ transformations defined in eq. (7.2). We thus propose the term pseudo-unitary ensembles for the ensemble of $Q$ matrices, since their measure is invariant under $U(N, N)$ transformations just as the orthogonal or unitary ensembles have measures invariant under the corresponding transformations. It must be remembered however that for the time-reversal symmetric case ($\beta = 1$), the $U(N, N)$ matrix $M'$ in eq. (7.2) is not arbitrary, but must satisfy eq. (3.12).
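The trace argument for the unit Jacobian can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes an infinitesimal map of the type in eq. (7.16), written here as $dv \to dv + i\varepsilon(h\,dv - dv\,h')$ with random Hermitian $h$, $h'$ (the trace argument does not depend on the ordering convention), builds its real $2N^2 \times 2N^2$ matrix in the coordinates $(\mathrm{Re}\,dv_{mn}, \mathrm{Im}\,dv_{mn})$, and evaluates the trace of $K$ and the determinant. The helper names and the values of `N` and `eps` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(N, rng):
    """Random N x N Hermitian matrix (its diagonal is exactly real)."""
    A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    return (A + A.conj().T) / 2.0

def real_matrix_of_map(linear_map, N):
    """Real 2N^2 x 2N^2 matrix of a real-linear map acting on N x N complex
    matrices, in the coordinates (all real parts, then all imaginary parts)."""
    dim = 2 * N * N
    T = np.zeros((dim, dim))
    for k in range(dim):
        basis = np.zeros(dim)
        basis[k] = 1.0
        dv = (basis[: N * N] + 1j * basis[N * N:]).reshape(N, N)
        out = linear_map(dv).reshape(-1)
        T[:, k] = np.concatenate([out.real, out.imag])
    return T

N, eps = 3, 1e-3
h, h_prime = random_hermitian(N, rng), random_hermitian(N, rng)

# K is the O(eps) part of the map: dv -> i (h dv - dv h')
K = real_matrix_of_map(lambda dv: 1j * (h @ dv - dv @ h_prime), N)
trace_K = np.trace(K)

# Jacobian of the full infinitesimal transformation dv -> dv + eps * K dv
J = np.linalg.det(np.eye(2 * N * N) + eps * K)
print("Tr K =", trace_K, " J - 1 =", J - 1.0)
```

The trace vanishes to rounding error, and the determinant differs from unity only at second order in `eps`, as expected from eq. (7.17).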

7.3. Derivation of dμ(Q)

We now proceed to derive the unique, invariant measure defined by eqs. (7.2)-(7.4) in the form needed for determining $J_\beta(\{\lambda_n\})$. To do this we use the diagonalizing equation for $Q$ to express the infinitesimal matrix $dV$ (whose independent elements determine the measure) in terms of $\{\lambda_n\}$.

Since $Q$ is Hermitian it is diagonalized by a unitary matrix,

$$Q = U D U^\dagger, \tag{7.19}$$

where the diagonal matrix $D$ has real positive elements $D_{mn} = q_n \delta_{mn}$, $n = 1, \ldots, 2N$, and the unitary matrix $U$ has column vectors $v_n$ given by the eigenvectors of $Q$. As proved in section 2, the eigenvalues $q_n = e^{\pm\nu_n}$ are real, positive and come in inverse pairs. Since all $q_n > 0$, we divide them into two groups, those greater than unity and those less; we choose the ordering $q_1 > q_2 > \cdots > q_N > 1$; $q_{N+n} = q_n^{-1}$. Defining an $N \times N$ diagonal matrix $q$ containing the elements $q_n > 1$, we have

$$D = \begin{pmatrix} q & 0 \\ 0 & q^{-1} \end{pmatrix}. \tag{7.20}$$

This choice determines $U$ up to a set of overall phase factors $e^{i\phi_n}$ which can multiply each of the column vectors (eigenvectors) $v_n$. This means the matrix $U$ is still undetermined up to the transformation $U \Rightarrow U U_\phi$, where $(U_\phi)_{mn} = e^{i\phi_n} \delta_{mn}$. We shall shortly fix $U_\phi$ to eliminate this freedom.
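The inverse-pair structure of the spectrum of $Q = M^\dagger M$ can be verified on a randomly generated pseudo-unitary matrix. The sketch below is illustrative only: it assumes a polar form $M = \mathrm{diag}(u^{(1)}, u^{(2)})\,\Gamma\,\mathrm{diag}(u^{(3)}, u^{(4)})$ with $\Gamma$ built from $\sqrt{1+\lambda}$ and $\sqrt{\lambda}$, which is a standard parameterization of this type and is used here as an assumption about the form of (3.18); the values of $\lambda_n$ and all helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_unitary(N, rng):
    """Random N x N unitary from the QR decomposition of a complex Gaussian matrix."""
    A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    Qmat, _ = np.linalg.qr(A)
    return Qmat

def block_diag2(a, b):
    """2N x 2N block-diagonal matrix diag(a, b)."""
    Z = np.zeros_like(a)
    return np.block([[a, Z], [Z, b]])

N = 4
lam = rng.uniform(0.1, 5.0, size=N)                  # hypothetical values lambda_n >= 0
c, s = np.diag(np.sqrt(1.0 + lam)), np.diag(np.sqrt(lam))
Gamma = np.block([[c, s], [s, c]]).astype(complex)

M = (block_diag2(random_unitary(N, rng), random_unitary(N, rng))
     @ Gamma
     @ block_diag2(random_unitary(N, rng), random_unitary(N, rng)))

# Pseudo-unitarity: M Sigma_z M^dagger = Sigma_z
Sigma_z = np.diag(np.concatenate([np.ones(N), -np.ones(N)]))
flux_error = np.max(np.abs(M @ Sigma_z @ M.conj().T - Sigma_z))

# Eigenvalues of Q = M^dagger M in descending order: q_1 > ... > q_N > 1 > ...
Q = M.conj().T @ M
q = np.sort(np.linalg.eigvalsh(Q))[::-1]
pair_error = np.max(np.abs(q[:N] * q[::-1][:N] - 1.0))   # largest times smallest, etc.

# Recover lambda_n from q_n + 1/q_n = 4 lambda_n + 2
lam_from_q = np.sort((q[:N] + 1.0 / q[:N] - 2.0) / 4.0)
lam_error = np.max(np.abs(lam_from_q - np.sort(lam)))
print(flux_error, pair_error, lam_error)
```

All three residuals are at the level of rounding error, confirming that the $2N$ eigenvalues split into $N$ values greater than unity and their inverses.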

Since $Q$ is pseudo-unitary, we can derive a further constraint on $U$ from $\Sigma_z Q \Sigma_z = \Sigma_z U D U^\dagger \Sigma_z = Q^{-1} = U D^{-1} U^\dagger$, which implies

$$(U^\dagger \Sigma_z U)\, D\, (U^\dagger \Sigma_z U) = U_0 D U_0 = D^{-1}, \tag{7.21}$$

where the matrix $U_0 \equiv U^\dagger \Sigma_z U$ is unitary and Hermitian, so $U_0^2 = I$. It then follows from eq. (7.21) that

$$D U_0 D = U_0. \tag{7.22}$$

By explicit multiplication of $N \times N$ blocks, using eqs. (7.19) and (7.20) for $D$, and imposing $U_0^2 = I$, one finds that

$$U_0 = \begin{pmatrix} 0 & u \\ u^\dagger & 0 \end{pmatrix}, \tag{7.23}$$

with $u_{mn} = e^{i\phi_n} \delta_{mn}$.

Now by making a specific choice of $U$ and fixing the phase freedom mentioned above, we can simplify the constraint imposed on $U$ by the pseudo-unitary symmetry of $Q$. Assume we have made an arbitrary choice of $U$ consistent with the choice we have made for $D$; then the diagonal matrix $u$ in eq. (7.23) is characterized by $N$ real phase variables with arbitrary values (this is another incarnation of the $N$-fold phase redundancy discussed in connection with the parameterization (3.18) above). As discussed above, we are then free to choose a matrix $U_\phi$ of the form given above, and $U' = U U_\phi$ will also diagonalize $Q$ and be consistent with our ordering of $D$. We make the choice

$$U_\phi = \begin{pmatrix} u^{1/2} & 0 \\ 0 & (u^*)^{1/2} \end{pmatrix}, \tag{7.24}$$

and then find $U'^\dagger \Sigma_z U' = U_\phi^\dagger U_0 U_\phi = \Sigma_x$. With this choice (dropping the primes on $U$ for simplicity of notation), we now have that $U$ in eq. (7.19) satisfies the simple constraint

$$U^\dagger \Sigma_z U = \Sigma_x. \tag{7.25}$$

We need not impose the constraint on U due to time-reversal symmetry at this point.

By differentiating the diagonalizing equation for Q, we find

$$dQ = dU\, D\, U^\dagger + U D\, dU^\dagger + U\, dD\, U^\dagger. \tag{7.26}$$

Define $dU = U\, dA$; then from the unitarity of $U + dU$ it follows that $dA$ is anti-Hermitian, $dA = -dA^\dagger$. If we now impose the constraint (7.25), $(U^\dagger + dU^\dagger)\Sigma_z(U + dU) = \Sigma_x$, we find that $[\Sigma_x, dA] = 0$. These two constraints imply that

$$dA = \begin{pmatrix} da & da' \\ da' & da \end{pmatrix}, \tag{7.27}$$

with $da$, $da'$ anti-Hermitian. If we compare eq. (7.26) with the definition (7.3), we find

$$dQ = M^\dagger\, dV\, M = U[dA\, D + D\, dA^\dagger + dD]U^\dagger = U[dA\, D - D\, dA + dD]U^\dagger. \tag{7.28}$$

This equation expresses a relationship between $dV$ and the eigenvalues $q_n = e^{\nu_n}$, where $\cosh(\nu_n) = 2\lambda_n + 1$, so we need only express this relationship explicitly in terms of the matrix elements of $dV$, $dA$, and $dD$ to calculate the Jacobian of the transformation.

In order to do this, the crucial step is to invoke the uniqueness of the measure $d\mu(Q)$ with respect to the factorizing matrix $M$ (proven above), and then choose a convenient matrix $M$ for simplifying eq. (7.28). A convenient choice is

$$M = O D^{1/2} U^\dagger, \qquad O = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, \tag{7.29}$$

where the matrix $O$ is a $2N \times 2N$ real orthogonal matrix (each entry standing for a multiple of the $N \times N$ unit matrix). To check that this is an acceptable choice of $M$, first note that $M^\dagger M = U D U^\dagger = Q$, as required. Next we prove that this $M$ is pseudo-unitary, using the condition (7.25) on $U$:

$$M \Sigma_z M^\dagger = O D^{1/2} (U^\dagger \Sigma_z U) D^{1/2} O^T = O D^{1/2} \Sigma_x D^{1/2} O^T = O \Sigma_x O^T = \Sigma_z, \tag{7.30}$$

where the last equality follows from the algebra of the Pauli matrices. Substituting this choice of $M$ into eq. (7.28) above yields

$$dV = O\,[D^{-1/2}\, dA\, D^{1/2} - D^{1/2}\, dA\, D^{-1/2} + D^{-1}\, dD]\,O^T. \tag{7.31}$$

We know from the analysis leading to eq. (7.8) that $dV$ is off-block-diagonal and completely specified by the $N \times N$ complex matrix $dv$. It is straightforward to check that the same property holds for the matrix product appearing on the right-hand side of eq. (7.31), and to obtain from direct matrix multiplication

$$dv = q^{-1/2}\, da'\, q^{-1/2} - q^{1/2}\, da'\, q^{1/2} + q^{1/2}\, da\, q^{-1/2} - q^{-1/2}\, da\, q^{1/2} - q^{-1}\, dq, \tag{7.32}$$

which when evaluated gives the fundamental relation

$$dv_{mn} = \big[(q_m q_n)^{-1/2} - (q_m q_n)^{1/2}\big]\, da'_{mn} + \big[(q_m/q_n)^{1/2} - (q_n/q_m)^{1/2}\big]\, da_{mn} - \delta_{mn}\, \frac{dq_n}{q_n}, \tag{7.33}$$

where $q_n > 1$, i.e. $n = 1, 2, \ldots, N$. This equation gives the transformation between the infinitesimal displacements in the two 'coordinate systems': in the first system the neighborhood of a point in the matrix space $Q$ is specified by the $2N^2$ independent parameters $dv^1_{mn}$, $dv^2_{mn}$; in the second system by the independent parameters of the matrices $da$, $da'$, $dq$. Since $da$ and $da'$ are anti-Hermitian they give rise to $2N^2$ independent parameters alone, and when the additional $N$ parameters $q_n$ are included, there might appear to be an inconsistency between the left and right-hand sides of eq. (7.33). This is immediately resolved by noting that the coefficient of the diagonal element of $da$ vanishes identically, so $dv$ actually does not depend on the $N$ parameters $\mathrm{Im}[da_{nn}]$ ($\mathrm{Re}[da_{nn}] = 0$ because $da$ is anti-Hermitian). We thus can choose for the independent parameters on the right-hand side of eq. (7.33) the set $\{da'^2_{mn},\ m = n;\ da^1_{mn}, da^2_{mn}, da'^1_{mn}, da'^2_{mn},\ m < n;\ q_n\}$ with $n, m = 1, 2, \ldots, N$; and eq. (7.33) does represent an infinitesimal neighborhood of the point $Q$ in two different $2N^2$-dimensional coordinate systems. The volume element in the new coordinate system, i.e. that specified by $da$, $da'$, $dq$, is just the $2N^2 \times 2N^2$ Jacobian of the transformation (7.33).

We can see immediately that the sector of this Jacobian matrix involving only the diagonal elements of $dv$ is diagonal: $dv^2_{nn}$ only depends on $da'^2_{nn}$, and $dv^1_{nn}$ only depends on $dq_n$; by inspection this diagonal sector contributes a factor $\prod_n^N (1 - q_n^{-2})\, dq_n$ to the total Jacobian. We can also see immediately that the real and imaginary parts of the off-diagonal elements $dv_{mn}$, $dv_{nm}$ only depend on the real and imaginary parts, respectively, of $da_{mn}$, $da'_{mn}$ (and not on other matrix elements of $da$, $da'$). Therefore, the remaining sector of the Jacobian matrix separates into $2 \times 2$ blocks. The block involving the real parts is of the form

$$\begin{pmatrix} \partial dv^1_{mn}/\partial da^1_{mn} & \partial dv^1_{mn}/\partial da'^1_{mn} \\ \partial dv^1_{nm}/\partial da^1_{mn} & \partial dv^1_{nm}/\partial da'^1_{mn} \end{pmatrix}, \tag{7.34}$$

and that involving the imaginary parts is of exactly the same form with the imaginary parts replacing the real parts everywhere. The partial derivatives involved are easily calculated from eq. (7.33) and each Jacobian has the same value: $2[(q_m q_n)^{-1/2} - (q_m q_n)^{1/2}][(q_m/q_n)^{1/2} - (q_n/q_m)^{1/2}]$. There are two such blocks for each element above the diagonal of $dv_{mn}$, hence this entire sector of the Jacobian contributes a factor $\prod_{m<n}^N 4[(q_m + q_m^{-1}) - (q_n + q_n^{-1})]^2$. Collecting all the factors to arrive at the total Jacobian in the absence of time-reversal symmetry, we find using eq. (7.4)

$$d\mu(Q) = \prod_{m<n}^{N} 4\big[(q_m + q_m^{-1}) - (q_n + q_n^{-1})\big]^2\, da^1_{mn}\, da'^1_{mn}\, da^2_{mn}\, da'^2_{mn} \times \prod_n^N (1 - q_n^{-2})\, dq_n\, da'^2_{nn}. \tag{7.35}$$
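The step from the value of each $2 \times 2$ block to the factor appearing in the measure is a one-line identity. As a worked check, with $A_{mn}$ and $B_{mn}$ introduced here only as shorthand for the two square brackets quoted in the text:

```latex
A_{mn} \equiv (q_m q_n)^{-1/2} - (q_m q_n)^{1/2}, \qquad
B_{mn} \equiv (q_m/q_n)^{1/2} - (q_n/q_m)^{1/2},
\\[4pt]
A_{mn} B_{mn} = q_n^{-1} - q_m^{-1} - q_m + q_n
             = (q_n + q_n^{-1}) - (q_m + q_m^{-1}),
\\[4pt]
\big(2 A_{mn} B_{mn}\big)^2 = 4\,\big[(q_m + q_m^{-1}) - (q_n + q_n^{-1})\big]^2 .
```

The square arises because the real-part block and the imaginary-part block each contribute one factor $2A_{mn}B_{mn}$ (up to sign) for every pair $m < n$.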

Let us now briefly consider the case in which there is time-reversal symmetry. As noted above, in this case $dv$ is symmetric; if we impose the additional constraint $dv_{mn} = dv_{nm}$ in eq. (7.33), this immediately implies $da'^1_{mn} = 0$, $da^2_{mn} = 0$. This causes no change in the sector of the Jacobian matrix involving only the diagonal elements $dv_{nn}$, but now the sector involving the off-diagonal elements of $dv_{mn}$ is simply diagonal, and just gives rise to the factor $\prod_{m<n}^N (\partial dv^1_{mn}/\partial da^1_{mn})(\partial dv^2_{mn}/\partial da'^2_{mn}) = \prod_{m<n}^N [(q_m + q_m^{-1}) - (q_n + q_n^{-1})]$. In this case one finds

$$d\mu(Q) = \prod_{m<n}^{N} \big[(q_m + q_m^{-1}) - (q_n + q_n^{-1})\big]\, da^1_{mn}\, da'^2_{mn} \times \prod_n^N (1 - q_n^{-2})\, dq_n\, da'^2_{nn}. \tag{7.36}$$

We are interested in considering probability densities for the $Q$ and $X$ matrices which only depend on their eigenvalues, hence in the reduced probability density for the eigenvalues, independent of the values of the auxiliary matrices $da$, $da'$. Therefore, we integrate over the auxiliary variables to arrive at the Jacobian factor of interest

$$J_\beta(\{q_n\}) \propto \prod_{m<n}^{N} \big|(q_m + q_m^{-1}) - (q_n + q_n^{-1})\big|^\beta\, \prod_n^N (1 - q_n^{-2}), \tag{7.37}$$

where as usual $\beta = 1$ in the presence of time-reversal symmetry, and $\beta = 2$ in its absence.

This factor describes the eigenvalue repulsion in the pseudo-unitary ensemble of $Q$ matrices; if the eigenvalues $q_n$ are considered as the fundamental objects, then it differs in only a minor way from the familiar GOE (GUE) forms discussed in section 6. Indeed, if we now consider the matrices $X$ with eigenvalues $\lambda_n = \tfrac{1}{4}(q_n + q_n^{-1} - 2)$, a simple change of variables in eq. (7.37) yields the result

$$J_\beta(\{\lambda_n\}) \propto \prod_{m<n}^{N} |\lambda_m - \lambda_n|^\beta, \tag{7.38}$$

which has exactly the same form as for the eigenvalues in the Gaussian ensembles. It is thus very tempting to conclude that the $\{\lambda_n\}$ are the simple and natural variables to treat statistically, with the hope that one can carry over to the present case the known results for the Gaussian ensembles without change. In fact, we shall see that for our purposes the $\{\nu_n\}$ are the natural variables in the problem, and the statistical level repulsion for these variables has the unfamiliar form given in eq. (7.37), which leads to novel statistical behavior not encountered in the Gaussian ensembles. These issues will be addressed in the next section, where we introduce a global maximum-entropy hypothesis for the ensemble of disordered conductors, and discuss the behavior of the eigenvalue density.

8. Global maximum-entropy hypothesis

8.1. Constraint of given eigenvalue density

We are now in a position to propose a maximum-entropy hypothesis to fix the joint probability distribution $P_\beta(\{\lambda_n\})$ which formally determines the distribution function of the conductance $P(g)$,

$$P(g) = \int \prod_{n=1}^{N} d\lambda_n\, P_\beta(\{\lambda_n\})\, \delta\Big(g - 2\sum_{m=1}^{N} \frac{1}{1 + \lambda_m}\Big). \tag{8.1}$$
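The relations between the three sets of variables $\{q_n\}$, $\{\nu_n\}$ and $\{\lambda_n\}$, and the conductance that enters the delta function in eq. (8.1), can be collected in a few lines. This is a minimal sketch with hypothetical input values; the function names `lam_from_nu` and `conductance` are illustrative and not taken from the text.

```python
import numpy as np

def lam_from_nu(nu):
    """lambda_n from nu_n via cosh(nu_n) = 2 lambda_n + 1."""
    return (np.cosh(nu) - 1.0) / 2.0

def conductance(lam):
    """Dimensionless conductance g = 2 sum_n 1/(1 + lambda_n), as in the delta function of eq. (8.1)."""
    return 2.0 * np.sum(1.0 / (1.0 + lam))

nu = np.array([0.3, 1.1, 2.5, 6.0])        # hypothetical values of nu_n
q = np.exp(nu)                              # eigenvalues q_n > 1 of Q
lam = lam_from_nu(nu)

# Same lambda_n obtained directly from q_n: lambda = (q + 1/q - 2) / 4
lam_from_q = (q + 1.0 / q - 2.0) / 4.0

# Pairwise repulsion factors in the q and lambda variables differ by a constant 4
upper = np.triu_indices(len(nu), k=1)
s = q + 1.0 / q
repulsion_q = (s[:, None] - s[None, :])[upper]
repulsion_lam = (lam[:, None] - lam[None, :])[upper]
ratio = repulsion_q / repulsion_lam

# Each channel contributes 1/(1 + lambda_n) = 1/cosh^2(nu_n / 2)
g = conductance(lam)
g_from_nu = 2.0 * np.sum(1.0 / np.cosh(nu / 2.0) ** 2)
print(g, g_from_nu)
```

The constant ratio shows why the change of variables from $q_n$ to $\lambda_n$ turns the repulsion factor into the simple Gaussian-ensemble form, and the last two lines show that channels with $\nu_n \gg 1$ contribute exponentially little to $g$.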

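It is natural to assume that $P_\beta(\{\lambda_n\})$ corresponds to the reduced probability density obtained from the probability density for the matrices $Q$, assuming that the probability density $\Phi(Q)$ depends only on $\{\lambda_n\}$, $\Phi(Q) = w(\{\lambda_n\})$,

$$P_\beta(\{\lambda_n\}) = \int \Phi(Q)\, J_\beta(\{\lambda_n\})\, da\, da' = w(\{\lambda_n\}) \int J_\beta(\{\lambda_n\})\, da\, da' = C_\beta \prod_{m<n}^{N} |\lambda_m - \lambda_n|^\beta\, w(\{\lambda_n\}). \tag{8.2}$$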

The weight function $w(\{\lambda_n\})$ is not determined by the integration measure and cannot be chosen to be unity, since in this case the resulting $P_\beta(\{\lambda_n\})$ would not be normalizable. We are therefore required to impose a further constraint on $P_\beta(\{\lambda_n\})$ to fix $w(\{\lambda_n\})$. The 'interaction' term $\prod_{m<n}^N |\lambda_m - \lambda_n|^\beta$ provides the expected strong correlation of the eigenvalues, but leaves the eigenvalue density $\rho(\lambda)$ completely undetermined. Moreover, both exact microscopic calculations and analytic arguments to be presented below indicate that this density differs dramatically from the Wigner semi-circle law characteristic of the Gaussian ensembles, and instead has a behavior characteristic of the eigenvalues of random matrix products. This observation motivates us to propose the following ansatz:

$P_\beta(\{\lambda_n\})$ is the maximum-entropy probability density consistent with a fixed eigenvalue density $\rho(\lambda)$.

Below we shall specify the form of the density, based on microscopic calculations. Constraints of this form were discussed by Balian (Balian 1968), building on earlier work of Wigner; they can be implemented by the following procedure. Since we can write the eigenvalue density (normalized to the number of eigenvalues $N$) as

$$\rho(\lambda) = \int \prod_n d\lambda_n\, P_\beta(\{\lambda_n\}) \sum_m \delta(\lambda_m - \lambda) = \Big\langle \sum_{m}^{N} \delta(\lambda_m - \lambda) \Big\rangle, \tag{8.3}$$

this corresponds to an infinite number of constraints of the type discussed in section 4 above, and therefore this infinite set of Lagrange multipliers is specified by a function $f(\lambda)$. The generalization of eq. (4.4) becomes

$$w(\{\lambda_n\}) = \exp\Big[-\int_0^\infty d\lambda\, f(\lambda) \sum_{n=1}^{N} \delta(\lambda - \lambda_n)\Big] \equiv \exp\Big[-\sum_{n=1}^{N} V(\lambda_n)\Big], \tag{8.4}$$

so we have

$$P_\beta(\{\lambda_n\}) = C_{\beta,N} \exp[-\beta H(\{\lambda_n\})], \tag{8.5}$$

with

$$H(\{\lambda_n\}) = -\sum_{m<n}^{N} \ln|\lambda_m - \lambda_n| + \sum_n^{N} V(\lambda_n), \tag{8.6}$$

where we now need to choose the function $V(\lambda)$ so as to give the desired charge density $\rho(\lambda)$ (and for convenience we have absorbed the constant $1/\beta$ into the definition of $V(\lambda_n)$).

A simple argument due originally to Wigner (1953, 1955, 1957, 1958) gives an approximate solution for $V(\lambda)$. At any 'temperature' $\beta^{-1}$ the most probable configuration of $\{\lambda_n\}$ is obtained from the condition $\partial H/\partial\lambda_n = 0$, $n = 1, \ldots, N$,

$$\frac{\partial H}{\partial \lambda_n} = \frac{\partial V(\lambda_n)}{\partial \lambda_n} - \sum_{m \neq n} \frac{1}{\lambda_n - \lambda_m} = 0. \tag{8.7}$$

This is simply the condition for mechanical equilibrium of the 'gas' interacting logarithmically in the presence of the external 'force' generated by the potential $V$. If we can ignore the self-energy of each 'charge', and the difference between the average and most probable configurations is small, as one expects, then we can replace the sum over $\lambda_m \neq \lambda_n$ in the second term of eq. (8.7) by an integral over the density $\rho(\lambda)$. Integrating the resulting equation yields, up to an irrelevant additive constant,

$$V(\lambda) = \int_0^\infty d\lambda'\, \rho(\lambda') \ln|\lambda - \lambda'|. \tag{8.8}$$

This result has an obvious interpretation in the context of the physical analogy discussed above: if the 'charge' density $\rho(\lambda)$ and the form of the two-body interaction are assumed given, then the external potential must be such as to cancel the total force on a charge at $\lambda$ due to all the other charges (otherwise charges would move, and the density would change). Hence we can think of $V(\lambda)$ as the potential arising from a 'neutralizing background' of fixed charges of opposite sign. This simple argument is consistent, e.g., with the Wigner semi-circle law for the GOE. Thus our global maximum-entropy hypothesis (Muttalib et al. 1987, Pichard et al. 1990a) states that $P_\beta(\{\lambda_n\})$ is given by eqs. (8.5) and (8.6), with $V(\lambda)$ given in terms of the charge density $\rho(\lambda)$ by eq. (8.8). The properties of the ensemble of disordered conductors will then follow from our choice of $\rho(\lambda)$.
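The consistency of this mean-field argument with the Gaussian case can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the semi-circle density $\rho(E) = \pi^{-1}\sqrt{2N - E^2}$ on the whole real axis (rather than a density on $\lambda \geq 0$), evaluates the logarithmic integral of the form of eq. (8.8) with a simple midpoint rule, and compares the result with the quadratic potential $E^2/2$ that corresponds to the Gaussian weight $\exp[-E^2]$ at $\beta = 2$. The grid size and sample points are arbitrary choices.

```python
import numpy as np

def mean_field_potential(x, rho, grid, h):
    """V(x) = integral of rho(x') ln|x - x'| dx', by the midpoint rule on `grid`."""
    return np.sum(rho * np.log(np.abs(x - grid))) * h

N = 50
R = np.sqrt(2.0 * N)                       # band edge of the semi-circle
n_cells = 200_000
h = 2.0 * R / n_cells
grid = -R + (np.arange(n_cells) + 0.5) * h  # cell midpoints; the sample points below are cell edges
rho = np.sqrt(R ** 2 - grid ** 2) / np.pi   # semi-circle density, normalized to N

points = np.array([1.0, 3.0, 5.0])
V0 = mean_field_potential(0.0, rho, grid, h)
deviation = np.array([
    mean_field_potential(x, rho, grid, h) - V0 - 0.5 * x ** 2
    for x in points
])
print("V(E) - V(0) - E^2/2 at the sample points:", deviation)
```

The deviations are small compared with $E^2/2$, so the 'neutralizing background' of a semi-circular charge distribution does reproduce the harmonic confining potential inside the band.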

8.2. Justification for the choice of ρ(λ)

The global approach can proceed no further without specifying the density $\rho(\lambda)$. Here we must return to the microscopic description of the ensemble of disordered conductors for guidance. As discussed in section 2, the transfer matrix $M$, which gives us $Q = M^\dagger M$ and also $X$ according to eq. (3.28), is a product of $N_L$ random matrices, where the total sample length is $L = aN_L$ and $a$ is a microscopic length, e.g. the lattice spacing in a tight-binding model, which shall henceforth be set equal to unity. Given such a random matrix product $M$, as discussed after eq. (3.25), Oseledec's theorem states that the eigenvalues of the matrix $(1/2N_L)\ln[Q]$ are self-averaging. With our definitions the self-averaging quantities are the Liapunov exponents $\alpha_n = \nu_n/2L$: the $\{\alpha_n\}$ approach the same set of $N$ numbers for each realization of the random matrix product as $N_L \to \infty$. When $N$, the number of eigenvalues (or dimension) of the transfer matrix, becomes large it is natural to define a density function $dN(\alpha)$ which describes the density of Liapunov exponents in a given interval. This density function is also self-averaging as $N \to \infty$, and has been well studied both for random matrix products and for linearized chaotic maps (Pichard and Andre 1986, Paladin and Vulpiani 1986, Livi et al. 1986, Eckmann and Wayne 1988). The density of positive Liapunov exponents in almost all cases is found to be approximately uniform.
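The self-averaging property can be illustrated with a small numerical experiment. The following is a minimal sketch under stated assumptions: the factors are independent real Gaussian matrices (a hypothetical stand-in, not the transfer matrices of a disordered conductor, which obey additional symmetry constraints), and the exponents are estimated by the standard repeated-QR method. The function name and parameter values are illustrative.

```python
import numpy as np

def lyapunov_spectrum(dim: int, n_steps: int, seed: int):
    """Liapunov exponents of a product of n_steps random dim x dim matrices.

    Repeated QR re-orthogonalization: the exponents are the long-run averages
    of ln|R_kk|.  Also returns the average of ln|det M_k| over the factors,
    which must equal the sum of the exponents.
    """
    rng = np.random.default_rng(seed)
    q = np.eye(dim)
    log_r = np.zeros(dim)
    log_det = 0.0
    for _ in range(n_steps):
        m = rng.standard_normal((dim, dim))   # stand-in random factor (assumption)
        q, r = np.linalg.qr(m @ q)
        log_r += np.log(np.abs(np.diag(r)))
        log_det += np.linalg.slogdet(m)[1]    # ln|det m|
    return log_r / n_steps, log_det / n_steps

# Two independent realizations of the same random product.
exps_a, mean_logdet_a = lyapunov_spectrum(dim=4, n_steps=4000, seed=1)
exps_b, _ = lyapunov_spectrum(dim=4, n_steps=4000, seed=2)
max_realization_gap = float(np.max(np.abs(exps_a - exps_b)))
```

The two realizations give nearly the same set of exponents, and the gap shrinks as `n_steps` grows, which is the content of the theorem for this toy product.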

We would like to know the densities $\rho(\lambda)$ or $\sigma(\nu) = \sigma(2\alpha L)$ for finite $N$ and $L$, so we cannot invoke this rigorous theorem as a proof of the correct choice. However, we expect the fluctuations from the limiting density to be small for finite $L > l$, both because of the large eigenvalue correlations and on the basis of numerical calculations. This expectation is confirmed by the data obtained from microscopic calculations (to be described below) shown in fig. 1, where we compare three quantities: $\alpha_n(L \to \infty)$ (the true Liapunov exponent); $\langle\alpha_n(L)\rangle$, the ensemble-averaged value of $\alpha_n$ for a square of side $L < \xi$; and the actual (fluctuating) values of $\alpha_n(L)$ for a given realization of the random potential. One sees immediately that all three quantities agree to within about 20%, and the plot is quite linear for the small $\alpha_n$, implying a constant density. Similar results are obtained for the $\{\nu_n = 2\alpha_n L\}$ (with the vertical axis scaled of course by the factor $2L$). There is a substantial curvature of the three graphs for the large values of $\alpha_n$, but these contribute little to the conductance as given by eq. (3.30), since the large values of $\alpha_n$ correspond to the 'closed channels' in Imry's terminology. These observations motivate us to impose the constraint that the density $\sigma(\nu)$ be taken uniform between zero and some upper bound. The appropriate constant value for the density, $\sigma_0$, can then be determined by the requirement that the average conductance satisfy Ohm's law in the metallic regime, $\langle g\rangle = 2Nl/L = 2g_0$:

$$\langle g\rangle = \int_0^{\nu_{\max}} d\nu\, \frac{4\sigma_0}{1 + \cosh\nu} = 4\sigma_0 \tanh(\tfrac{1}{2}\nu_{\max}) \approx 4\sigma_0. \tag{8.10}$$

Fig. 1. $\alpha_i(L) = (1/2L)\nu_i$ as a function of the index $i$, arranged in order of increasing magnitude, for a two-dimensional square disordered tight-binding model with $L = 25$ (vertical axis $\alpha_i$, 0.0 to 0.4; horizontal axis $i$, 0 to 30). The crosses are the results for a given realization of the disordered potential; the diamonds are the ensemble average $\langle\alpha_i\rangle$; and the squares are the Liapunov exponents obtained by taking the limit $L \to \infty$ at fixed width. A linear region corresponds to a uniform density $\sigma(\alpha)$.

The final approximate equality is valid to order $\exp(-\nu_{\max})$, and we must have $\nu_{\max} \gg 1$ in order to satisfy $\langle g\rangle \ll N$ (which is the condition for being in the metallic, as opposed to the ballistic, regime). Equation (8.10) yields the result $\sigma_0 = \tfrac{1}{2}g_0$; thus the appropriate choice for the density becomes

$$\sigma(\nu) = \begin{cases} Nl/2L = \tfrac{1}{2}g_0, & 0 < \nu < 2L/l, \\ 0, & \nu > 2L/l. \end{cases} \tag{8.11}$$

From the relation $\sigma(\nu)\,d\nu = \rho(\lambda)\,d\lambda$, this uniform density for the variable $\nu$ implies that our original variable $\lambda = \tfrac{1}{2}(\cosh\nu - 1)$ has a density of the form

$$\rho(\lambda) = \frac{g_0}{2\sqrt{\lambda(1+\lambda)}}. \tag{8.12}$$

This density is rapidly varying for small $\lambda$ (the region of interest). Since essentially all the results for the fluctuation properties of the standard ensembles (e.g. the two-point correlation functions) are calculated assuming that the variation in the density of eigenvalues is negligible over the interval of interest, we have no justification for applying any of this accumulated wisdom to the Coulomb gas described by the Hamiltonian $H(\{\lambda_n\})$. We also have very little intuition about the behavior of a system with such a rapidly varying charge density. Therefore, it is natural to reformulate the problem in terms of $\{\nu_n\}$, which have an approximately uniform density.
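The two steps of this argument, fixing $\sigma_0$ from Ohm's law and transforming the uniform density in $\nu$ into a density in $\lambda$, are easy to verify numerically. The sketch below is illustrative: the function names and the parameter values ($g_0 = 10$, $L/l = 5$) are our own choices, and the single-channel weight $4/(1+\cosh\nu)$ is the function $f$ used later in the chapter.

```python
import numpy as np

def f(nu):
    """Conductance contribution of one eigenparameter, f(nu) = 4/(1 + cosh nu)."""
    return 4.0 / (1.0 + np.cosh(nu))

def mean_conductance(g0: float, length_over_mfp: float, n_grid: int = 200_001) -> float:
    """Trapezoid-rule value of int_0^{nu_max} sigma0 f(nu) dnu with sigma0 = g0/2."""
    nu_max = 2.0 * length_over_mfp
    nu = np.linspace(0.0, nu_max, n_grid)
    y = 0.5 * g0 * f(nu)
    h = nu[1] - nu[0]
    return float(h * (y.sum() - 0.5 * (y[0] + y[-1])))

def rho(lam, g0: float):
    """Density of lambda implied by a uniform sigma(nu) = g0/2."""
    return g0 / (2.0 * np.sqrt(lam * (1.0 + lam)))

g0, L_over_l = 10.0, 5.0
g_avg = mean_conductance(g0, L_over_l)
g_closed_form = 2.0 * g0 * np.tanh(L_over_l)    # 4 * sigma0 * tanh(nu_max / 2)

# Change of variables: rho(lambda) * dlambda/dnu should return sigma0 = g0/2.
nu_test = np.array([0.1, 1.0, 3.0])
lam_test = 0.5 * (np.cosh(nu_test) - 1.0)
sigma_back = rho(lam_test, g0) * 0.5 * np.sinh(nu_test)
```

For $L/l = 5$ the numerical average differs from $2g_0$ only by the exponentially small $\tanh$ correction, and `sigma_back` returns $g_0/2$ at every test point.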

8.3. Properties of the Coulomb gas for the charges {vn}

We now focus on the joint probability density (JPD) for the $\{\nu_n\}$. Throughout this section we will utilize the Coulomb gas analogy extensively and will consistently refer to the eigenparameters $\nu_n$ as 'charges' at positions $\nu_n$; we give the reader one final warning that this terminology comes simply from exploiting a mathematical correspondence, and has nothing to do with the actual physical charges that are present in the disordered conductor. In this language, the JPD is

$$P_\beta(\{\nu_n\}) = C_{\beta,N}\,\exp[-\beta H(\{\nu_n\})], \tag{8.13}$$

with

$$H(\{\nu_n\}) = -\sum_{m<n} \ln(|\cosh\nu_n - \cosh\nu_m|) + \sum_n \left[U_1(\nu_n) + U_2(\nu_n)\right], \tag{8.14}$$

where the 'external potential' now has two terms. The term

$$U_1(\nu) = -\frac{1}{\beta}\ln(|2\sinh\nu|) \tag{8.15}$$

comes from the change of variables in the integration measure, and hence has a relative factor of $\beta^{-1}$ with respect to the interaction term. The presence of this new term indicates formally that $P_\beta(\{\nu_n\}) \sim \nu_n \to 0$ as $\nu_n \to 0$, $\forall n$, i.e. the eigenvalue density vanishes linearly at the origin, independent of the symmetry of the ensemble. This implies that the actual density $\sigma(\nu)$ as calculated from $P_\beta(\{\nu_n\})$ will deviate substantially from our assumed constant density very near the origin. It is thus better to think of the potential $U_2(\nu)$ as the potential of a fixed neutralizing background of jellium, while the actual mobile charges $\{\nu_n\}$ arrange themselves to have a uniform density over scales large compared to the interparticle spacing, but with deviations from uniformity on the scale of the interparticle spacing. We shall see below that microscopic calculations indicate that such a repulsion of the $\nu_n$ from the origin exists and that the correct charge density $\sigma(\nu)$ does vanish linearly as $\nu \to 0$.

Hence we maintain our choice for the confining potential $U_2(\nu)$ as given by eqs. (8.8) and (8.11),

$$U_2(\nu) = \tfrac{1}{2}g_0 \int_0^{2L/l} \ln[|\cosh\nu - \cosh\mu|]\,d\mu. \tag{8.16}$$

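This integral can be done exactly (Pichard et al. 1990a) and then expanded in different limits to get the leading behavior; however it is also possible to get the leading behavior from the following approximation:

$$\ln[|\cosh\nu - \cosh\mu|] \approx \begin{cases} \nu - \ln 2, & \nu > \mu, \\ \mu - \ln 2, & \nu < \mu. \end{cases} \tag{8.17}$$

Direct evaluation of the integral is now trivial and yields

$$U_2(\nu) \approx \tfrac{1}{4}g_0\nu^2 + \text{const}, \qquad \nu < 2L/l \tag{8.18}$$

($U_2 \sim \nu$ for $\nu > 2L/l$, but this region is unimportant), hence

$$H(\{\nu_n\}) \approx -\sum_{m<n}\ln(|\cosh\nu_n - \cosh\nu_m|) - \frac{1}{\beta}\sum_n \ln(|2\sinh\nu_n|) + \frac{g_0}{4}\sum_n \nu_n^2. \tag{8.19}$$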

Note that our maximum-entropy hypothesis implies a quadratic external potential for the charges $\{\nu_n\}$, just as in the GOE, as well as the short-range repulsion from the origin discussed above. However, we emphasize the crucial fact that the effective interaction term between the charges $\{\nu_n\}$ is not logarithmic at long distances as in the GOE. This analysis reveals why the ensemble of disordered conductors does not map onto the standard random matrix ensembles: the charges $\{\lambda_n\}$ have the standard logarithmic interaction but a rapidly varying density in the region of interest, whereas the charges $\{\nu_n\}$ have an approximately uniform density but a non-standard interaction. Note also the dependence of the potential $U_1(\nu)$ on the symmetry parameter $\beta$ which defines the effective 'temperature' of the gas. The dependence of $U_1(\nu)$ on $\beta$ has dramatic consequences for the behavior of the localization length as a function of magnetic field and of the spin-orbit coupling, which were only understood very recently by Pichard et al. (1990b). We shall discuss these briefly in section 10 below.

Let us study further the nature of this novel interaction between the charges $\{\nu_n\}$. Consider the interaction energy $E_{nm}$ between two charges at positions $\nu_n > \nu_m$, with $\nu_n - \nu_m \gg 1$; in this case, since $\nu_m \geq 0$, $\cosh\nu_n \approx \tfrac{1}{2}e^{\nu_n}$, and

$$E_{nm} = -\ln[\cosh\nu_n - \cosh\nu_m] \approx -\nu_n + \ln 2 + 2e^{-\nu_n}\cosh\nu_m \approx -\nu_n + \ln 2. \tag{8.20}$$

This means that the interaction between any two widely spaced charges is very asymmetric: the charge further from the origin feels a constant force (a potential linear in its own position) pushing it away from the charge nearer the origin, whereas the nearer charge feels essentially zero force due to the further charge. As long as the charges remain widely spaced neither force depends on position. Obviously a gas interacting with such a force law will behave very differently from the logarithmically interacting charges in the standard ensembles such as the GOE.
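The asymmetry of the two-body force can be made concrete by evaluating the pair term of eq. (8.14) and its derivatives directly. The sketch below is illustrative; the function names and the sample positions are our own choices.

```python
import numpy as np

def pair_energy(nu_outer: float, nu_inner: float) -> float:
    """E = -ln|cosh(nu_outer) - cosh(nu_inner)|, the two-body term of eq. (8.14)."""
    return float(-np.log(np.abs(np.cosh(nu_outer) - np.cosh(nu_inner))))

def pair_forces(nu_outer: float, nu_inner: float):
    """Analytic forces -dE/dnu on the outer charge and on the inner charge."""
    denom = np.cosh(nu_outer) - np.cosh(nu_inner)
    return float(np.sinh(nu_outer) / denom), float(-np.sinh(nu_inner) / denom)

# Widely separated pair: nu_outer - nu_inner >> 1.
far_outer, far_inner = pair_forces(12.0, 2.0)
far_energy = pair_energy(12.0, 2.0)

# Close pair: the interaction is logarithmic in the separation, so the two
# forces are nearly equal and opposite and scale as 1/separation (here 1/0.01).
near_outer, near_inner = pair_forces(1.005, 0.995)
```

For the widely separated pair the force on the outer charge has magnitude close to one while the force on the inner charge is exponentially small; for the close pair the two forces are nearly equal and opposite, as for an ordinary logarithmic interaction.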


On the other hand, as we shall see in detail below, two charges at positions such that $|\nu_n - \nu_m| < 1$ do still have a symmetric logarithmic interaction, and should exhibit correlations similar to those of the standard ensembles. Remembering that the charges $\nu_n$ are distributed approximately uniformly with mean spacing $d = 2/g_0$, we expect that the gas should exhibit a 'high-density phase' for $d < 1$, in which each charge interacts logarithmically with of order $g_0 \gg 1$ neighbors, and a 'low-density phase' in which $d > 1$ and a given charge typically has no near neighbors, and only interacts in the very asymmetric manner described above with the charges closer to the origin. The transition region between these two 'phases' is at $g_0 = Nl/L \approx 1$, i.e. precisely at the transition point between the metallic and strongly-localized regimes! Thus it appears that this model can in principle account for the very different statistical behavior of the conductance in these two regimes. We now proceed to demonstrate this explicitly.

9. Metallic regime: The high-density phase

9.1. Fluctuation measures in the metallic regime

In the metallic regime we have $\langle g\rangle \approx 2g_0 \gg 1$. Since the average spacing of the charges $\nu_n$ is $2/g_0$, each charge has of order $g_0$ neighbors satisfying $|\nu_n - \nu_m| < 1$, and it is the charges such that $0 < \nu_n < 1$ ('open channels') which dominate the behavior of the conductance $g$. In the next two sections, for convenience, let us assume that the charges are in a configuration such that $\nu_n > \nu_m$ for $n > m$; the behavior derived will be the same in all other sectors of probability space by permutation symmetry.

A subtle point of focusing only on the open channels is that our derivation of the confining potential $U_2(\nu)$ in eq. (8.18) above has to be reexamined, since it is based on taking $\nu > 1$. If one instead expands $\cosh\nu \approx 1 + \tfrac{1}{2}\nu^2$ and performs the relevant integral one still obtains a quadratic confining potential for $\nu < 2L/l$ but with a renormalized coefficient, $U_{\rm open}(\nu) \approx (g_0 l/4L)\nu^2$. If we use this form and expand the interaction term around $\nu_n = 0$, we obtain, up to an additive constant,

$$H_{\rm open}(\{\nu_n\}) = -\sum_{m<n}\ln|\nu_n - \nu_m| - \sum_{m \leq n}\left[1 - \left(1 - \frac{1}{\beta}\right)\delta_{mn}\right]\ln[\nu_n - (-\nu_m)] + \frac{g_0 l}{4L}\sum_n \nu_n^2. \tag{9.1}$$

The first and third terms correspond exactly to the interaction Hamiltonian of the Gaussian ensembles, except for the implicit constraint that the $\nu_n \geq 0$, which is absent in these ensembles. However, the second term describes a new feature of this ensemble which arises directly from the derived measure; it represents the interaction of the charge $\nu_n$ with an exactly symmetric distribution of 'image charges' at positions $-\nu_m$. Note that the interaction of $\nu_n$ with its own image charge at $-\nu_n$ arises from the expansion of the term $\ln[2\sinh\nu_n]$ in eq. (8.19), and therefore depends on $\beta$ in the manner indicated. Note also that for charges located near positions $\nu_0 > 1$, we should expand $\cosh\nu_n - \cosh\nu_m$ around $\nu_0$; since this expansion will have a linear term, we will only obtain a term of the standard GOE (GUE) form (for $n \neq m$). Thus the 'closed' channels will have GOE (GUE) statistics in the metallic regime, without any significant image-charge effect (see fig. 2).
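The origin of the image-charge term is the factorization $\cosh\nu_n - \cosh\nu_m \approx \tfrac{1}{2}(\nu_n - \nu_m)(\nu_n + \nu_m)$ for small arguments. The sketch below compares the exact pair term with the sum of a direct logarithm and an image logarithm; it is an illustration with hypothetical function names and arbitrarily chosen sample positions.

```python
import numpy as np

def exact_pair(nu_n: float, nu_m: float) -> float:
    """Exact two-body term -ln|cosh(nu_n) - cosh(nu_m)|."""
    return float(-np.log(np.abs(np.cosh(nu_n) - np.cosh(nu_m))))

def image_pair(nu_n: float, nu_m: float) -> float:
    """Direct term plus image term: -ln|nu_n - nu_m| - ln(nu_n + nu_m) + ln 2."""
    return float(-np.log(np.abs(nu_n - nu_m)) - np.log(nu_n + nu_m) + np.log(2.0))

# Open channels (both charges well inside nu < 1): the two forms agree closely.
err_small = abs(exact_pair(0.30, 0.10) - image_pair(0.30, 0.10))

# Charges outside the open-channel region: the small-nu expansion fails.
err_large = abs(exact_pair(3.0, 1.0) - image_pair(3.0, 1.0))
```

The small-argument error is below one percent of a unit of energy, while for charges at $\nu \gtrsim 1$ the two expressions differ by an amount of order one, consistent with the statement that the image-charge picture applies to the open channels only.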

Hence we expect the probability density of the interparticle spacing (eigenvalue separation) $P(S)$, averaged over the whole spectrum, to be quantitatively identical to that of the appropriate Gaussian ensemble,

$$P_\beta(S) = C_\beta S^\beta \exp(-S^2/W_\beta^2), \tag{9.2}$$

where the normalization constants for each ensemble (value of $\beta$) are $C_1 = \tfrac{1}{2}\pi$, $C_2 = 32/\pi^2$, $C_4 = 4^9/(3^6\pi^3)$, and the width parameters are $W_1^2 = 4/\pi$, $W_2^2 = \pi/4$, $W_4^2 = 9\pi/64$. Here we quote the values for the Wigner surmise (which, as noted above, is not quite exact) for the normalized spacing $S = s/\langle s\rangle$, which is why the form differs slightly from eq. (5.6) above. In addition we expect the $\Delta_3$ statistic, averaged over the spectrum, to be that of the GOE.

We have calculated (Muttalib et al. 1987, Pichard et al. 1990a) the eigenvalues of $Q$ numerically using the random one-band tight-binding model (Anderson model), and hence are able to study the statistics of the parameters $\{\nu_n\}$ from an exact microscopic calculation. The calculations are critical as a test of the validity of our maximum-entropy ansatz. By performing the calculations for the system in the absence and presence of a magnetic field we can obtain the behavior for the values $\beta = 1, 2$ respectively, and in Zanon and Pichard (1988) the behavior for $\beta = 4$ was studied by introducing spin-dependent scattering. The details of the computational technique have been discussed elsewhere (Stone 1985) and will not be given here. In fig. 3 we show the spacing distribution for the charges $\{\nu_n\}$, $P(S_\nu)$, as obtained from these microscopic calculations* for the cases $\beta = 1, 2$, as compared to the Wigner surmise expected for the standard ensembles. We also show (fig. 4) the $\Delta_3$ statistic for the cases $\beta = 1, 2$ compared to the GOE and GUE predictions. The results are for an ensemble of conductors in the metallic regime, averaged over the entire spectrum. The agreement is extremely good for $g \gg 1$, as our above argument led us to expect; similar agreement was found for the case $\beta = 4$ in Zanon and Pichard (1988). In the case of the $\Delta_3$ statistic, which measures long-range spectral rigidity, we would expect some deviations from the GOE (GUE) behavior even for $g \gg 1$ at distances greater than $g_0$ (in units of the eigenparameter spacing), where the interaction is no longer logarithmic. There are some small indications of such deviations in fig. 4. However, more strikingly, we find substantial deviations for $g \approx 1$, where the mean spacing of eigenparameters is comparable to the range of the logarithmic interaction. This is the beginning of the cross-over to the different statistical behavior described in the next section. In contrast to the standard GOE (GUE) behavior of $\{\nu_n\}$ in the metallic regime, analogous calculations for the charges $\{\lambda_n\}$ show a strong disagreement with the Wigner surmise, particularly for the $\lambda_n \lesssim 1$, demonstrating the important point stated above: although the $\{\lambda_n\}$ have a measure of the GOE form, they do not exhibit standard GOE behavior, owing to the rapid variation in the density $\rho(\lambda)$.

*In Muttalib et al. (1987) the spacing distribution and $\Delta_3$ statistic were calculated by numerically transforming from $\{\lambda_n\}$ to variables with uniform density over the whole spectrum (unfolding). Since $\{\nu_n\}$ are such variables to a good approximation, we do not distinguish this procedure from the direct study of $\{\nu_n\}$ performed in Pichard et al. (1990a).

Fig. 2. Schematic illustrating the Coulomb gas analogy appropriate to the JPD for $\{\nu_n\}$ in the metallic regime. (Labels in the figure: open channels, $N_{\rm eff}$; closed channels, $N - N_{\rm eff}$; Dyson's Coulomb gas with uniform positive jellium; levels in symmetric repulsion with $O(N_{\rm eff})$ neighbors.)

Fig. 3. Plot of the spacing distribution of $\{\nu_n\}$ in the metallic regime for a two-dimensional Anderson model with $L = 40$, disorder parameter $W = 1$, 400 realizations, averaged over the whole spectrum (horizontal axis $S$, 0.0 to 2.5). Circles are numerical results with zero magnetic field ($\beta = 1$); squares are with a magnetic field ($\beta = 2$). Solid lines represent the Wigner surmise for $P(S)$ for the appropriate value of $\beta$.

Fig. 4. Mean value of $\Delta_3(E_0)$ determined numerically without a magnetic field (circles) and with a field (squares) for the same parameters as in fig. 3 ($E_0$ is distance in units of the eigenvalue spacing). Solid lines are the GOE and GUE results. For these parameters the sample is metallic, $\langle g\rangle \approx 10$. The dashed line is the linear (Poisson) behavior expected for an uncorrelated spectrum. Squares are numerical results for $\langle g\rangle \approx 1$ ($W = 4$), showing significant deviations which we ascribe to the cross-over to the low-density phase of the Coulomb gas discussed in section 10.
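For readers who wish to overlay the Wigner surmise on their own spacing histograms, the following sketch tabulates the standard surmise in the form $C_\beta S^\beta \exp(-S^2/W_\beta^2)$ and checks that each curve has unit area and unit mean spacing. The constants are the standard textbook values for $\beta = 1, 2, 4$; the dictionary and function names are hypothetical.

```python
import math
import numpy as np

# (C_beta, W_beta^2) for P_beta(S) = C * S**beta * exp(-S**2 / W2),
# normalized to unit area and unit mean spacing.
WIGNER = {
    1: (math.pi / 2.0, 4.0 / math.pi),
    2: (32.0 / math.pi**2, math.pi / 4.0),
    4: (2.0**18 / (3.0**6 * math.pi**3), 9.0 * math.pi / 64.0),
}

def wigner_surmise(s, beta: int):
    """Wigner surmise for the normalized spacing s and symmetry index beta."""
    c, w2 = WIGNER[beta]
    return c * s**beta * np.exp(-s**2 / w2)

def moments(beta: int, s_max: float = 12.0, n_grid: int = 240_001):
    """Trapezoid-rule values of (area, mean spacing) for the given beta."""
    s = np.linspace(0.0, s_max, n_grid)
    p = wigner_surmise(s, beta)
    h = s[1] - s[0]

    def trap(y):
        return float(h * (y.sum() - 0.5 * (y[0] + y[-1])))

    return trap(p), trap(s * p)

checks = {beta: moments(beta) for beta in (1, 2, 4)}
```

Each entry of `checks` is a pair (area, mean) that should be (1, 1) to numerical accuracy, which is a quick way to catch a mistyped constant before comparing with data.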

9.2. Confirmation of image-charge effect

As noted above, the presence of the 'image' term in eq. (9.1) differentiates the statistics of $\{\nu_n\}$ in the metallic regime from standard GOE variables. If this is correct, these 'image charges' should affect the short-distance fluctuations of the smallest eigenvalues in the metallic regime. In particular, if we look at the probability density (not the spacing distribution) of the charge nearest the origin, $\nu_1$, we expect it to vanish linearly as $\nu_1 \to 0$ as discussed above, and since it is being repelled from the origin by its image charge, we expect the distribution to be approximately the Wigner surmise for $\beta = 1$. In fig. 5 we compare this distribution as calculated numerically for the case $\beta = 1$ to $P_1(S)$ and find excellent agreement; this confirms the presence in a microscopic model of the image-charge effect predicted by our maximum-entropy hypothesis. An interesting further prediction of our ansatz is that the distribution $p(\nu_1)$ should always have linear repulsion from the origin, even when the ensemble is characterized by $\beta = 2, 4$, since, as mentioned above, the image 'interaction' between the charges at $\pm\nu_n$ comes from the change in variables between $\lambda_n$ and $\nu_n$ and not from the Jacobian factor $\prod_{m<n}|\lambda_m - \lambda_n|^\beta$, and does not contain the usual dependence on $\beta$. This prediction was also recently confirmed (Pichard 1990), and is also shown in fig. 5; even for $\beta = 2$, we still obtain approximately $P_1(S)$ for the distribution $p(\nu_1)$. In addition we have calculated the distributions $p(\nu_2)$, $p(\nu_3)$, etc. for $\beta = 1$ and confirmed that they approach a symmetric Gaussian form, indicating that the image-charge effect is screened out as the distance from the origin increases. We have thus confirmed the predictions of our maximum-entropy ansatz, specialized to the metallic regime. The joint probability density is essentially that of the Gaussian ensembles for charges far from the origin, but as the origin is approached a novel 'image-charge' effect appears. Since the conductance is sensitive to the eigenparameters near the origin, it will turn out to be essential to include the 'image-charge' effect correctly in order to obtain universal conductance fluctuations.

Fig. 5. Probability distribution for the value of the smallest eigenparameter $\nu_1$, normalized to its average value (horizontal axis $\nu_1/\langle\nu_1\rangle$, 0 to 4), for a two-dimensional metallic square for zero (squares) and non-zero magnetic field (circles). The solid lines represent the Wigner surmise for $\beta = 1$ (smaller maximum) and $\beta = 2$ (larger maximum). Note that the data in a field agree with the curve for $\beta = 1$, confirming the special nature of the image-charge effect discussed in the text.

9.3. Statement of UCF in the global approach

In this section we will prove that the joint probability density $P_\beta(\{\nu_n\}) = C_{\beta,N}\exp[-\beta H(\{\nu_n\})]$ for the charges $\{\nu_n\}$, in the approximation of replacing $H(\{\nu_n\}) \to H_{\rm open}(\{\nu_n\})$ (appropriate for describing the small eigenparameters in the metallic regime), yields universal conductance fluctuations. We will also point out an important gap in previous arguments based on the global approach, which has only been bridged very recently. We will only treat the mathematically more tractable case of $\beta = 2$ (unitary ensemble), which is quite relevant for UCF experiments, since they are typically performed using a magnetic field. In order to obtain this result it is useful to rewrite the JPD in the form

$$P_2(\{\nu_n\}) = C_N \exp[-2H_{\rm open}(\{\nu_n\})] = C_N \prod_{m<n}(\nu_m^2 - \nu_n^2)^2 \prod_n \nu_n\, e^{-2U_{\rm open}(\nu_n)}. \tag{9.3}$$

From this we see that the natural variables are $\mu_n = (g_0 l/4L)\nu_n^2$, in terms of which the JPD is

$$P_2(\{\mu_n\}) = C'_N \prod_{m<n}(\mu_m - \mu_n)^2 \prod_n e^{-\mu_n}, \tag{9.4}$$

where $0 \leq \mu_n < \infty$. This JPD is already well known in random matrix theory and it describes the Laguerre ensemble, so called because of the presence of the exponential weight function which defines the relevant orthogonal polynomials to be of the Laguerre type.
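A Laguerre ensemble of this form can be sampled directly, which is convenient for checking any of the statements that follow. The sketch below relies on the standard result that the eigenvalues of $X^\dagger X$, with $X$ a square complex Gaussian matrix with $\langle|X_{ij}|^2\rangle = 1$, are distributed with a $\beta = 2$ Laguerre JPD with weight $e^{-\mu}$. This is a sampling device we introduce for illustration, not the method used in the original work, and the function name is hypothetical.

```python
import numpy as np

def sample_laguerre_unitary(n: int, n_samples: int, seed: int = 0) -> np.ndarray:
    """Eigenvalues mu_1..mu_n of X^dagger X for n x n complex Gaussian X.

    With E|X_ij|^2 = 1 the eigenvalues follow the beta = 2 Laguerre JPD
    prod_{m<n} (mu_m - mu_n)^2 * prod_n exp(-mu_n) on mu_n >= 0.
    """
    rng = np.random.default_rng(seed)
    x = (rng.standard_normal((n_samples, n, n))
         + 1j * rng.standard_normal((n_samples, n, n))) / np.sqrt(2.0)
    w = np.conj(np.swapaxes(x, 1, 2)) @ x       # stack of Hermitian matrices
    return np.linalg.eigvalsh(w)                # shape (n_samples, n), ascending

mu = sample_laguerre_unitary(n=8, n_samples=400, seed=0)
mean_trace = float(mu.sum(axis=1).mean())       # ensemble average is n**2 = 64
```

The sampled eigenvalues are non-negative, as required by the constraint $0 \leq \mu_n < \infty$, and the average trace fluctuates around $N^2$.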

As described in section 6 above, it is straightforward to calculate the exact eigenvalue correlation functions $R_p(\mu_1, \ldots, \mu_p)$ for such a JPD in the case $\beta = 2$, in terms of sums over these orthogonal polynomials. The calculation is formally identical to calculating the $p$th-order density correlation function for a gas of $N$ non-interacting fermions in a one-dimensional potential whose form is determined by the weight function. In particular the one- and two-point correlation functions (i.e. the eigenvalue density and density-density correlation function) are needed in order to calculate the variance of any linear statistic $A = \sum_{n=1}^{N} f(\mu_n)$ (the analog of a one-body operator in the fermion problem). The conductance as expressed in eq. (3.30) is such a statistic. Using the general theory outlined in section 6, one finds in this case

$$R_1(\mu) = \rho(\mu) = K_N(\mu, \mu) = \sum_{n=0}^{N-1} L_n^2(\mu)\, e^{-\mu}, \tag{9.5}$$

$$R_2(\mu, \mu') = R_1(\mu)R_1(\mu') - [K_N(\mu, \mu')]^2 = R_1(\mu)R_1(\mu') - T_2(\mu, \mu'), \tag{9.6}$$

where

$$K_N(\mu, \mu') = \sum_{n=0}^{N-1} L_n(\mu)L_n(\mu')\, e^{-(\mu+\mu')/2}. \tag{9.7}$$

In terms of these correlation functions we have the exact relation

$$\operatorname{var} g = \int_0^\infty d\mu\, f^2\!\left(\sqrt{4N\mu/g_0^2}\right)\rho(\mu) - \int_0^\infty d\mu \int_0^\infty d\mu'\, f\!\left(\sqrt{4N\mu/g_0^2}\right) f\!\left(\sqrt{4N\mu'/g_0^2}\right) T_2(\mu, \mu'), \tag{9.8}$$

where

$$f(x) = \frac{4}{1 + \cosh x}. \tag{9.9}$$

We need to show that in the limit $N \gg g_0 \gg 1$, corresponding to the metallic regime, the difference of these two integrals is independent of $N$ and $g_0$ to leading order in these large parameters. We remind the reader that $N$, the number of channels, is related to the width of the conductor divided by the Fermi wavelength, and $g_0 = Nl/L$ is related to the length and the elastic mean free path; hence this is equivalent to proving that var $g$ is independent of these parameters.

9.4. Proof that var g ~ 1

Although it is rather difficult to calculate the value of the conductance fluctuations from eqs. (9.5)-(9.9), it is possible to prove that they are of order unity. In order to do this we must first obtain an analytic expression for the kernel $K_N(\mu, \mu')$ needed for the integrals of eq. (9.8). The sums over Laguerre polynomials in the definition of $K_N$ can be performed exactly using a relation for orthogonal polynomials called the Christoffel-Darboux formula (Erdelyi 1953), which yields

$$K_N(\mu, \mu') = N e^{-(\mu+\mu')/2}\, \frac{L_{N-1}(\mu)L_N(\mu') - L_N(\mu)L_{N-1}(\mu')}{\mu - \mu'}. \tag{9.10}$$

For large $N$ (Erdelyi 1953), $L_N(\mu) \approx [\exp(\tfrac{1}{2}\mu)/(\pi^2 N\mu)^{1/4}]\cos[2\sqrt{N\mu} - \tfrac{1}{4}\pi]$, where this relation is only valid for $\mu \gg 1/N$; inserting this asymptotic expansion into eq. (9.10) and expanding for large $N$ yields

$$K_N(\mu, \mu') = \frac{1}{2\pi(\mu\mu')^{1/4}} \left\{ \frac{\sin[2\sqrt{N}(\sqrt{\mu} - \sqrt{\mu'})]}{\sqrt{\mu} - \sqrt{\mu'}} - \frac{\cos[2\sqrt{N}(\sqrt{\mu} + \sqrt{\mu'})]}{\sqrt{\mu} + \sqrt{\mu'}} \right\}. \tag{9.11}$$
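Both the Christoffel-Darboux identity and the quality of the large-$N$ approximation can be checked with a few lines of code. In the sketch below the Laguerre polynomials are generated by their three-term recurrence; the asymptotic density $\sqrt{N/\mu}/\pi - \cos(4\sqrt{N\mu})/(4\pi\mu)$ is our own evaluation of the coincident-point limit of the asymptotic kernel and should be read as an assumption of the example. Function names and the test values ($N = 50$, $\mu = 1$, $\mu' = 2.5$) are illustrative.

```python
import numpy as np

def laguerre_table(n_max: int, x: float) -> np.ndarray:
    """L_0(x)..L_{n_max}(x) from (k+1) L_{k+1} = (2k+1-x) L_k - k L_{k-1}."""
    vals = np.empty(n_max + 1)
    vals[0] = 1.0
    if n_max >= 1:
        vals[1] = 1.0 - x
    for k in range(1, n_max):
        vals[k + 1] = ((2 * k + 1 - x) * vals[k] - k * vals[k - 1]) / (k + 1)
    return vals

def kernel_sum(n: int, mu: float, mu_p: float) -> float:
    """K_N as the direct sum over the first N Laguerre polynomials."""
    a, b = laguerre_table(n - 1, mu), laguerre_table(n - 1, mu_p)
    return float(np.dot(a, b) * np.exp(-0.5 * (mu + mu_p)))

def kernel_cd(n: int, mu: float, mu_p: float) -> float:
    """K_N from the Christoffel-Darboux form (requires mu != mu_p)."""
    a, b = laguerre_table(n, mu), laguerre_table(n, mu_p)
    bracket = a[n - 1] * b[n] - a[n] * b[n - 1]
    return float(n * np.exp(-0.5 * (mu + mu_p)) * bracket / (mu - mu_p))

def density_asymptotic(n: int, mu: float) -> float:
    """Large-N density for 1/N << mu << N: smooth part plus oscillating part."""
    return float(np.sqrt(n / mu) / np.pi
                 - np.cos(4.0 * np.sqrt(n * mu)) / (4.0 * np.pi * mu))

N = 50
k_direct = kernel_sum(N, 1.0, 2.5)
k_cd = kernel_cd(N, 1.0, 2.5)
rho_exact = kernel_sum(N, 1.0, 1.0)      # density rho(mu) = K_N(mu, mu)
rho_asym = density_asymptotic(N, 1.0)
```

The direct sum and the Christoffel-Darboux form agree to rounding error, while the asymptotic density tracks the exact one only approximately, as expected for a formula valid when $1/N \ll \mu \ll N$.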

Because our approximate expression for the kernel $K_N(\mu, \mu')$ is only valid for $\mu \gg 1/N$, we adopt the following procedure:

(1) All integrals are cut off at a lower limit $\zeta/N$, $\zeta \sim 1$.

(2) We show that the region of integration below this limit can at most give a contribution of order unity when the exact kernel is used.

(3) We show that the approximate kernel of eq. (9.11), when integrated over the remaining region, gives a contribution of order unity.

Hence the exact result of eq. (9.8) can be at most of order unity.

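The presence of the variables $\sqrt{\mu}$, $\sqrt{\mu'}$ in eq. (9.11) now makes it convenient to shift back to the variables $\nu = \sqrt{4L\mu/g_0 l}$ when performing the integrals in eq. (9.8). In terms of these variables we may write

$$\operatorname{var} g = I_1 - I_2, \tag{9.12}$$

with

$$I_1 = \frac{g_0}{\pi}\int_{\zeta/g_0}^{\infty} d\nu\, f^2(\nu), \tag{9.13}$$

$$I_2 = \frac{1}{\pi^2}\int_{\zeta/g_0}^{\infty} d\nu \int_{\zeta/g_0}^{\infty} d\nu'\, f(\nu)f(\nu') \left\{ \frac{\sin[g_0(\nu - \nu')]}{\nu - \nu'} - \frac{\cos[g_0(\nu + \nu')]}{\nu + \nu'} \right\}^2. \tag{9.14}$$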

The crucial feature to notice about the expression for $I_2$ is the presence of the argument $\nu + \nu'$ in the correlation function $T_2$. Since the density $\sigma(\nu)$ is constant, far from the origin the system is translationally invariant, and the correlation function can only depend on the difference $\nu - \nu'$. Indeed, one sees that the terms depending on $\nu + \nu'$ do decay for $\nu, \nu' > 1$. The remaining term is precisely the correlation function of the standard ensembles. However, we shall see below that if one keeps only this term, evaluating eq. (9.8) yields var $g \sim \ln g_0$; i.e. ignoring the breakdown of translational invariance near the origin predicts non-universal conductance fluctuations! In Imry's original work (Imry 1986) he employed the correlation functions of the standard ensembles, but ignored the positivity constraint on the eigenparameters $\nu$. This then restored translational invariance, and yielded universal conductance fluctuations, but with a value approximately two times too large. In later work of Muttalib et al. (1987) this important but subtle point was also overlooked. It was simply assumed that if the bulk spectrum was GOE-like, then this implied UCF. We will now show that using the full correlation function allows a correct derivation of universal conductance fluctuations, which observes the positivity constraint. Finally, we show that numerical evaluation of eq. (9.8) yields a value very close to the expected one.

$I_1$ just involves the density $\sigma(\nu)$, which we have taken to be constant based on our ansatz. To test the consistency of this we can evaluate the density from eqs. (9.5) and (9.11), and find

$$\sigma(\nu) = \frac{g_0}{\pi} - \frac{\cos(2g_0\nu)}{2\pi\nu}, \tag{9.15}$$

which for $g_0 \gg 1$ yields a uniform density $\sigma(\nu) \sim g_0$ as expected for consistency with our ansatz. We also see that there is a lower-order rapidly oscillating term which diverges at the origin, reflecting the breakdown of our approximation for the kernel $K_N$ mentioned above. In fact from the exact expression (9.5), and the properties of $L_n(\mu \to 0)$, it is simple to show that $\sigma(\nu \to 0) \sim \nu$, i.e. the density vanishes linearly as it must from the JPD of eq. (9.3). Since $f(\nu) = 4/(1 + \cosh\nu)$ is bounded above by 2, it follows that we make an error of at most order unity if we assume a uniform density down to $\nu = 0$ in the first integral of eq. (9.8). Thus

$$I_1 = \frac{g_0}{\pi}\int_0^\infty d\nu\, f^2(\nu) \equiv \frac{g_0 I_0}{\pi}. \tag{9.16}$$

The difficulty is in evaluating $I_2$; anticipating future developments we assume an expansion of $I_2$ in $g_0$ of the form

$$I_2(g_0) = c_1 g_0 + c_2 \ln g_0 + c_3 + c_4/g_0 + \cdots \tag{9.17}$$

and seek to show that $c_1 = I_0/\pi$, $c_2 = 0$, which implies that var $g \sim 1$, from eq. (9.12).

$I_2$ consists of three integrals arising from the square of the term in brackets; let us denote them by $I_2(g_0) = J_1(g_0) + J_2(g_0) + J_3(g_0)$, where $J_1$ involves the integral of $\sin^2[g_0(\nu - \nu')]$, $J_2$ involves $\cos^2[g_0(\nu + \nu')]$, and $J_3$ is the cross-term. $J_1$ is the only term that arises if one uses the correlation function of the GUE, as did Imry. The integrand of $J_1$ is non-singular for all $\nu$, $\nu'$ and it is easily shown that one makes an error of at most order unity by extending the lower limit to zero. Changing integration variables to $x = \nu - \nu'$, $y = \nu + \nu'$, we have

$$J_1 = \frac{1}{2\pi^2}\int_0^\infty dy \int_{-y}^{y} dx\, f\!\left(\frac{y+x}{2}\right) f\!\left(\frac{y-x}{2}\right) \frac{\sin^2(g_0 x)}{x^2}. \tag{9.18}$$

To extract the leading behavior as $g_0 \to \infty$ we use the identity $\lim_{g_0\to\infty}(g_0/\pi)[\sin(g_0 x)/g_0 x]^2 = \delta(x)$, which immediately yields $c_1 = I_0/\pi$ as desired. To obtain the sub-leading dependence on $g_0$ is more involved. First, we note that the only remaining possibility of a singular dependence on $g_0$ comes from the region of integration $x, y > 1$; hence we can replace $f(\nu)$ by its asymptotic form $8e^{-\nu}$, giving

$$J_1 \approx (8/\pi)^2 \int_0^\infty dy\, e^{-y} \int_0^y dx\, \frac{\sin^2(g_0 x)}{x^2}. \tag{9.19}$$

If we make the change of variables $x \to g_0 x$, $y \to g_0 y$ and define $r_0 = 1/g_0$, $F(y) = \int_0^y dx\, \sin^2 x/x^2$, we can write

$$J_1(r_0) = (8/\pi)^2 \int_0^\infty dy\, e^{-r_0 y}F(y) = (8/\pi)^2\, \frac{1}{r_0}\left[\int_0^\infty dy\, e^{-r_0 y}\, \frac{\sin^2 y}{y^2}\right], \tag{9.20}$$

where the second equality is obtained from integrating by parts. If we define the integral in brackets as $K_1(r_0)$, by taking two derivatives we obtain

$$\frac{\partial^2 K_1}{\partial r_0^2} = \int_0^\infty dy\, e^{-r_0 y}\sin^2 y = \frac{1}{2}\int_0^\infty dy\, e^{-r_0 y}(1 - \cos 2y). \tag{9.21}$$

This integral can be computed exactly, but the leading behavior as $r_0 \to 0$ comes from neglecting the factor $\cos 2y$, which then immediately yields $1/2r_0$. Integrating the resulting differential equation twice and using eq. (9.20) it follows that

$$J_1(g_0) = c_1 g_0 - \tfrac{1}{2}(8/\pi)^2 \ln g_0 + c_3. \tag{9.22}$$

The constant $c_1$ has already been determined above, and was found to cancel the term of order $g_0$ arising from $I_1$; however, we now see explicitly the logarithmic dependence of var $g$ on $g_0$ that would arise from using the correlation function of the GUE in eq. (9.8). In order to see that this log term is cancelled we now examine $J_2(g_0)$.

Since the integrand of $J_2$ is independent of $x$, the $x$ integration can be done trivially to yield

$$J_2(r_0) = (8/\pi)^2 \int_\zeta^\infty dy\, e^{-r_0 y}\, \frac{\cos^2 y}{y}, \tag{9.23}$$

where we have made the same change of variables as that leading to $K_1$ above. In this case we can differentiate $J_2(r_0)$ once with respect to $r_0$ to obtain

$$\frac{\partial J_2}{\partial r_0} = -(8/\pi)^2 \int_\zeta^\infty dy\, e^{-r_0 y}\, \frac{1 + \cos 2y}{2} \approx -\frac{1}{2r_0}(8/\pi)^2, \tag{9.24}$$

where the final equality only holds to leading order in $r_0^{-1} = g_0$ (independent of the value of the constant $\zeta$) and can be obtained by neglecting the factor $\cos 2y$ in the integrand. Integrating this equation once yields

$$J_2(g_0) = +\tfrac{1}{2}(8/\pi)^2 \ln g_0 + c_4, \tag{9.25}$$

and we see that the logarithmic dependence on $g_0$ arising from $J_1$ is exactly cancelled by that of $J_2$. The cross-term $J_3$ in eq. (9.14) is of the same form as $J_2$ except that the factor $\cos^2 y$ is replaced by $\mathrm{Si}(y)\cos y$, where $\mathrm{Si}(y)$ is the sine integral. Hence there is no constant term multiplying the factor $e^{-r_0 y}$ and one does not expect a logarithmic dependence on $r_0$. This can be confirmed rigorously by using the asymptotic expansion of $\mathrm{Si}(y)$. Thus we have shown by an analytic argument that var $g \sim 1$; the JPD for the open channels does indeed imply universal conductance fluctuations.
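The step that generates the logarithm, namely that $\int_0^\infty dy\, e^{-r_0 y}\sin^2 y$ behaves as $1/2r_0$ for small $r_0$, can be checked against the exact value of the integral. The sketch below is illustrative: the closed form follows from $\sin^2 y = (1 - \cos 2y)/2$, the upper cutoff at $40/r_0$ is a numerical convenience, and the function names and the value $r_0 = 0.05$ are our own choices.

```python
import numpy as np

def laplace_sin2(r0: float, n_per_unit: int = 200) -> float:
    """Trapezoid estimate of int_0^inf exp(-r0*y) sin(y)**2 dy (cut off at 40/r0)."""
    y_max = 40.0 / r0
    n = int(y_max * n_per_unit) + 1
    y = np.linspace(0.0, y_max, n)
    g = np.exp(-r0 * y) * np.sin(y) ** 2
    h = y[1] - y[0]
    return float(h * (g.sum() - 0.5 * (g[0] + g[-1])))

def closed_form(r0: float) -> float:
    """(1/2) * [1/r0 - r0/(r0**2 + 4)], from sin^2 y = (1 - cos 2y)/2."""
    return 0.5 * (1.0 / r0 - r0 / (r0 ** 2 + 4.0))

r0 = 0.05                       # r0 = 1/g0, i.e. g0 = 20
numeric = laplace_sin2(r0)
exact = closed_form(r0)
leading = 1.0 / (2.0 * r0)      # the term whose double integral gives ln(g0)
```

At $g_0 = 20$ the leading term already reproduces the exact integral to better than one part in a thousand; the neglected $\cos 2y$ piece contributes only terms that stay finite as $r_0 \to 0$.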

9.5. Numerical results for var g

Based on the results of the local approach, to be discussed below, we would expect the numerical value of var $g$ in this ensemble to be that obtained for a sample much longer than it is wide (quasi-one-dimensional limit), which is var $g = 4/15 \approx 0.267$ (in a magnetic field, assuming spin degeneracy (Lee and Stone 1985)). This value is difficult to obtain analytically, as no simple analytic expression for $K_N(\mu, \mu')$ valid for all $\mu$ exists in this case. Hence we have evaluated the integrals for var $g$ in eq. (9.8) numerically using the exact expression for $K_N$ in terms of sums over Laguerre polynomials; the results are given in fig. 6. We see, in confirmation of our analytic argument, that var $g$ is indeed independent of $g_0$ over more than an order of magnitude variation, and is very close but not identical to the expected quasi-one-dimensional value. Interestingly enough, if we use $f(\nu) = 4/(1 + \cosh\nu)$ in eq. (9.8), the value obtained, var $g \approx 0.296$, agrees very well with one half the value of the variance of this statistic in the GUE*. However, if we expand $f(\nu) \approx 4/(2 + \tfrac{1}{2}\nu^2)$ to be consistent with our approximation for $H_{\rm open}$, and evaluate the resulting integrals, we find var $g \approx 0.280$, which only differs from the known quasi-one-dimensional result by 5%. Nonetheless, this residual discrepancy appears to be real, and not the result of error in the numerical integrations, so we can only ascribe it to the approximation of replacing $H(\{\nu_n\})$ by $H_{\rm open}(\{\nu_n\})$, which presumably introduces some error.
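An independent, if cruder, way to look at the same quantity is a Monte Carlo estimate over sampled Laguerre eigenvalues. The sketch below is not the numerical integration of eq. (9.8) described in the text; it assumes that the $\beta = 2$ Laguerre ensemble can be sampled as the eigenvalues of $X^\dagger X$ for a complex Gaussian $X$, and it uses the mapping $\nu = \sqrt{4N\mu}/g_0$ that appears in the argument of $f$ in eq. (9.8). The function names, the matrix size, the value of $g_0$, and the number of samples are all illustrative choices, and the estimate carries a statistical error of a few percent.

```python
import numpy as np

def f(nu):
    """f(nu) = 4/(1 + cosh nu), rewritten to avoid overflow at large nu."""
    e = np.exp(-nu)
    return 8.0 * e / (1.0 + e) ** 2

def var_g_monte_carlo(n: int, g0: float, n_samples: int, seed: int = 0):
    """Sample mean and variance of g = sum_n f(nu_n) over the Laguerre ensemble."""
    rng = np.random.default_rng(seed)
    x = (rng.standard_normal((n_samples, n, n))
         + 1j * rng.standard_normal((n_samples, n, n))) / np.sqrt(2.0)
    mu = np.linalg.eigvalsh(np.conj(np.swapaxes(x, 1, 2)) @ x)
    mu = np.clip(mu, 0.0, None)              # guard against tiny negative rounding
    nu = np.sqrt(4.0 * n * mu) / g0          # nu = sqrt(4 N mu) / g0
    g = f(nu).sum(axis=1)
    return float(g.mean()), float(g.var(ddof=1))

mean_g, var_g = var_g_monte_carlo(n=30, g0=5.0, n_samples=2000, seed=0)
```

With these parameters the mean conductance is several units while the variance remains of order unity; repeating the run with a different `g0` (keeping `n` well above `g0`) is a simple way to probe the claimed insensitivity of the variance to $g_0$.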

9.6. Dependence of var g on symmetry parameter β

The actual value of fluctuation measures in the metallic regime depends strongly and in a universal manner on the symmetry of the ensemble, and specifically on the symmetry parameter, the inverse 'temperature' β. The lower the symmetry, the larger β, and as β increases the fluctuations of any statistic decrease because of the stronger correlation of eigenparameters. In the Coulomb gas language, fluctuations are suppressed as the temperature is reduced. In the

*This is exactly one half the value that Imry calculated by simply extending the integrals from −∞ to ∞ and assuming GUE statistics for the {ν_n}. However, we have just shown that Imry's approach integrated from 0 to ∞ yields a ln g0 term, and not a constant for var g; so the origin of this relationship is quite subtle (P.A. Mello and J.-L. Pichard, unpublished).


[Figure 6: plot of var g (vertical axis, ticks at 0.25, 0.30, 0.35) against g0 (horizontal axis).]

Fig. 6. The results of numerical evaluation of eq. (9.8) using the exact expressions (9.5) and (9.7). The circles and squares come from using eq. (9.9) and correspond to different choices of N, g0; the diamonds are the result after expanding (9.9) for small x. The solid line is the quasi-one-dimensional UCF value; the dashed line is one half the value obtained for the GUE, as discussed in the text.

standard ensembles Dyson and Mehta (1963) showed that the dependence of the variance of any smooth linear statistic on β is simply

$$\mathrm{var}\,A \propto \frac{1}{\beta}, \tag{9.26}$$

which predicts universal reduction factors for the variance as the symmetry is reduced. Unfortunately, we have not yet extended the arguments given above to establish such a universal result for var g in the ensemble defined by our


global maximum-entropy hypothesis. Nonetheless the validity of this relation has been established in the local approach to be presented below, and in numerical and analytic calculations on microscopic models (Lee et al. 1987, Imry 1986). Hence it is of great interest to see if these simple and universal reduction factors are observed in the experimental systems which are being modeled on the basis of this maximum entropy approach.
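As a worked instance of eq. (9.26), offered here as an illustrative sketch rather than a result quoted from the text, the ratio of variances between two symmetry classes depends only on the ratio of their β values. The assignment β = 1, 2, 4 to the orthogonal, unitary and symplectic classes is the standard one and is assumed here; any accompanying change in spin degeneracy is left out of this simple ratio.

```latex
\frac{\operatorname{var} A\,(\beta = 2)}{\operatorname{var} A\,(\beta = 1)} = \frac{1}{2},
\qquad
\frac{\operatorname{var} A\,(\beta = 2)}{\operatorname{var} A\,(\beta = 4)} = 2 .
```

Under these assumptions, breaking time-reversal symmetry in a β = 1 sample is expected to halve the variance of a smooth linear statistic.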

The reduction factors are difficult to observe directly in the time-independent magnetoconductance fluctuations, since one is using the magnetic field as the means of observing the fluctuations and cannot directly observe the zero-field fluctuations. It was thus proposed (Stone 1989) that measurements of the effect of a field on time-dependent conductance noise, generated by the UCF-based interference mechanism of Feng et al. (1986), would be a good means of observing this universal effect of symmetry breaking. A beautiful experiment by Birge et al. (1989) on the low-frequency noise in Bi films clearly observed such an effect (see fig. 7). Several closely related experiments on GaAs conductors (Sanquer et al. 1989, Debray et al. 1989) also gave good agreement with

[Figure 7: two panels showing the normalized low-frequency noise amplitude relative to its zero-field value (vertical axis, 0 to 1.0) against magnetic field H in tesla (horizontal axis, 0 to 8).]

Fig. 7. Plot of experimentally determined low-frequency noise amplitude (appropriately normalized) as a function of magnetic field, divided by the value at zero field for Bi films. All curves approach 1/2 asymptotically as the field increases, as predicted by eq. (9.26). Different curves correspond to different temperatures, since the size of the necessary field is expected to decrease with temperature. Top: diamonds (20 K), squares (10 K), triangles (4 K). Bottom: triangles (4 K), squares (1.5 K), diamonds (0.5 K). The solid lines are fits to the theory of Stone (1989). (Reprinted from Birge et al. (1989) with permission of the authors.)


the predicted reduction factors (after the effect of breaking spin-degeneracy is also included).

We shall see that even more striking universal dependences on the symmetry parameter arise in the localized regime within the global approach, and have very recently been observed experimentally (Pichard et al. 1990b). Moreover, it may even be possible to understand the weak localization effects on the average conductance from this point of view. In the original applications of random matrix theory to nuclear physics it was impossible to observe the transition between ensembles, since the magnetic field scale for breaking time-reversal symmetry of the nuclear Hamiltonian would be enormous; thus the application of the theory to disordered conductors brings these universal effects of symmetry-breaking into an experimentally accessible regime for essentially the first time.

In summary, the ensemble defined by our global maximum-entropy hypothesis has fluctuation properties of the bulk spectrum equivalent to the Gaussian ensembles, but the behavior near the origin is unique to this ensemble and requires special treatment. A correct treatment of this behavior does yield universal conductance fluctuations in reasonable quantitative agreement with existing results. A universal proportionality of the fluctuation amplitude to the effective temperature β^{-1} is expected and has been observed experimentally by the application of a magnetic field.

10. Localized regime: The low-density phase

10.1. Effective Hamiltonian in the localized regime

Equations (6.13) and (6.19) define the joint probability density which, if our maximum-entropy hypothesis is correct, determines the statistics of the conductance both in the metallic (g0 ≫ 1) and localized (g0 ≪ 1) regimes. It describes an interacting gas of N charges with uniform density ½g0 over the interval 2L/l; clearly if g0 < 1 then the typical spacing of the charges becomes greater than unity. If there are no charges near the origin, then from eq. (3.30) we expect the typical conductance to be exponentially small, and not to scale according to Ohm's law. In addition, the fluctuations in the conductance are altered when g0 < 1. Since the range of the logarithmic interaction in eq. (8.19) is unity, this implies that configurations of the charges in which their interaction is logarithmic are very improbable; instead one typically has the very asymmetric interaction mentioned at the end of section 7. Because of this non-standard interaction, the spectrum is no longer even approximately described by the standard logarithmically interacting random matrix ensembles. Thus the characteristic fluctuation properties change and the ensemble no longer exhibits universal conductance fluctuations. This cross-over to a different statistical behavior of


the conductance in the localized regime follows naturally from our maximum entropy hypothesis. In the Coulomb gas analogy, the system is now in a low-density phase, in which it is the long-range behavior of the effective interaction that is important.

To describe the low-density limit we expand eq. (8.19) assuming that ν_n ≫ ν_m for n > m to obtain

$$H_{\mathrm{loc}} = \sum_{n=1}^{N}\left[-\left(n-1+\frac{1}{\beta}\right)\nu_n + \frac{1}{4}\,g_0\,\nu_n^{2}\right]. \tag{10.1}$$

One sees, as discussed below, that the interaction energy of the nth charge due to all charges nearer to the origin is independent of the position of these charges and just proportional to their number (except for the force due to the 'image charge' at −ν_n, which depends on β). Hence, excluding rare configurations in which charges approach one another, the partition function for the gas is described by a single-body quadratic effective Hamiltonian,

$$P_{\mathrm{loc}}(\{\nu_n\}) = C_{\mathrm{loc}}\prod_{n=1}^{N}\exp\left[-\frac{\beta g_0}{4}\left(\nu_n - \frac{2\,(n-1+1/\beta)}{g_0}\right)^{2}\right]. \tag{10.2}$$

From this equation one immediately sees that the mean position of the nth charge is

$$\nu_n^{0} = \frac{2\,(n-1+1/\beta)}{g_0}. \tag{10.3}$$

Hence the average positions of the charges form a one-dimensional 'lattice' with spacing d = 2/g0. From the Gaussian form of the effective Hamiltonian we can read off the fluctuations of the charges around these equilibrium positions,

$$\bigl\langle(\nu_n^{0}-\nu_n)^{2}\bigr\rangle^{1/2} = \left(\frac{2}{\beta g_0}\right)^{1/2} \sim \sqrt{d}. \tag{10.4}$$

Since the rms position fluctuation is much less than the mean spacing, we see that the charges are tightly bound to their lattice positions.
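The lattice picture can be made concrete with a few numbers. The following is a minimal sketch, assuming the mean positions 2(n − 1 + 1/β)/g0, the spacing d = 2/g0 and the rms fluctuation (2/βg0)^{1/2} read off from the Gaussian effective Hamiltonian; the function name and the sample values g0 = 0.1, β = 1 are illustrative choices, not taken from the text.

```python
import math

def localized_lattice(g0, beta, n_charges):
    """Mean positions, lattice spacing d and rms fluctuation in the localized regime."""
    # Mean position of the nth charge: 2 (n - 1 + 1/beta) / g0
    means = [2.0 * (n - 1 + 1.0 / beta) / g0 for n in range(1, n_charges + 1)]
    spacing = 2.0 / g0                       # d = 2 / g0
    rms = math.sqrt(2.0 / (beta * g0))       # width of each Gaussian factor
    return means, spacing, rms

means, d, rms = localized_lattice(g0=0.1, beta=1, n_charges=3)
print(means)      # mean positions of the first three charges
print(rms / d)    # rms fluctuation relative to the lattice spacing
```

The ratio rms/d equals (g0/2β)^{1/2}, so it shrinks as g0 decreases, which is the sense in which the charges become tightly bound deep in the localized regime.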

10.2. p(g) in the localized regime

This has important consequences for the statistical behavior of the conductance in the localized regime. Since ν_n ≫ 1 for all n, we have (1 + cosh ν_n)^{-1} ≈ 2 exp(−ν_n); if the fluctuation in the position of each ν_n is small compared to their mean spacing, then we can assume that configurations in which more than one charge contributes to the conductance have negligible weight and

$$g \approx 8\sum_{n} e^{-\nu_n} \approx 8\,e^{-\nu_1}, \tag{10.5}$$

where we remind the reader that we assume ν_1 < ν_2 < ··· < ν_N, and the statistical


behavior in other orderings is the same by permutation symmetry of the joint probability density. It follows that ln g ≈ −ν_1 + ln 8 ≈ −ν_1, and by integrating out all other variables in eq. (10.2) we obtain

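$$p(\ln g) \approx \left(\frac{\beta g_0}{4\pi}\right)^{1/2}\exp\left[-\frac{\beta g_0}{4}\left(\ln g + \frac{2}{\beta g_0}\right)^{2}\right]. \tag{10.6}$$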

Hence we immediately obtain the result, found in many microscopic calculations, that the logarithm of the conductance is normally distributed in the localized regime. In our model the mean and variance are exactly equal in magnitude,

$$-\langle \ln g\rangle = \mathrm{var}(\ln g) = \frac{2}{\beta g_0}. \tag{10.7}$$
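The log-normal prediction can be illustrated by direct sampling. The sketch below assumes that each ν_n is an independent Gaussian with mean 2(n − 1 + 1/β)/g0 and variance 2/(βg0), as in the single-body form above, that the ordering constraint can be ignored, and that g = Σ_n 4/(1 + cosh ν_n), the form of f(ν) quoted in section 9. The function names and the parameter values are hypothetical choices for illustration only.

```python
import math
import random

def sample_ln_g(g0, beta, n_charges, rng):
    """Draw one configuration of independent Gaussian charges and return ln g."""
    sigma = math.sqrt(2.0 / (beta * g0))
    g = 0.0
    for n in range(1, n_charges + 1):
        mean = 2.0 * (n - 1 + 1.0 / beta) / g0
        nu = rng.gauss(mean, sigma)
        g += 4.0 / (1.0 + math.cosh(nu))   # assumed form g = sum_n f(nu_n)
    return math.log(g)

def ln_g_statistics(g0, beta, n_charges=3, samples=20000, seed=0):
    """Sample mean and sample variance of ln g."""
    rng = random.Random(seed)
    values = [sample_ln_g(g0, beta, n_charges, rng) for _ in range(samples)]
    mean = sum(values) / samples
    var = sum((v - mean) ** 2 for v in values) / (samples - 1)
    return mean, var

mean, var = ln_g_statistics(g0=0.1, beta=1)
print(-mean, var, 2.0 / (1 * 0.1))   # compare with 2 / (beta * g0)
```

In such a run the variance should come out close to 2/(βg0), while −⟨ln g⟩ is smaller than 2/(βg0) by about ln 8, the constant that is dropped in the approximation ln g ≈ −ν_1.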

In fig. 8 we show numerical calculations (Pichard et al. 1990a) for an Anderson model in the strongly localized limit, in which the probability density of −ln g is indeed Gaussian with variance approximately equal to its mean, confirming semi-quantitatively this prediction of the model. Note that here, as in the metallic regime, we see the simple dependence of the fluctuation amplitude on the symmetry parameter β, except that now the universal reduction factors are predicted for var(ln g) and not var g. However, an even more striking prediction

Fig. 8. Probability distribution of −ln g fit to a Gaussian with −⟨ln g⟩ = 6.5, var(−ln g) = 7.0 for a two-dimensional square with L = 10, W = 12.


of the model is a strong dependence of ⟨ln g⟩ (or equivalently, the localization length ξ) on this parameter.

10.3. Dependence of localization length on β

We have calculated here the probability density of ln g at zero temperature in the strongly localized limit. Unlike the metallic limit, in which temperature-dependent corrections can be small, in the localized limit Mott variable-range hopping usually dominates transport (Lee 1984), since it provides a much more efficient conduction mechanism (even in mesoscopic samples). Nonetheless, a crucial parameter of that theory, the localization length, ξ = [−(1/2L)⟨ln g⟩]^{-1}, is a zero-temperature quantity, which we can obtain immediately from our theory,

$$\xi = \beta\,\xi_{\beta=1} = \beta N l. \tag{10.8}$$

The result ξ = Nl, where N is the number of channels and l is the elastic mean free path, in the case β = 1 is well-known from previous work (Anderson et al. 1980). The simple and exact proportionality of ξ to the symmetry parameter β has only been derived previously in one-dimensional models (Efetov and Larkin 1983, Dorokhov 1983). Its recent derivation from our global approach (Pichard et al. 1990b) is much more general, but its range of validity requires further study. The theory makes the following very striking qualitative prediction. Anderson-type insulators with negligible spin-orbit interaction (corresponding to β = 1 in the absence of a magnetic field) will have a positive hopping magnetoconductance, since ξ → 2ξ upon application of a weak field. Insulators with strong spin-orbit scattering (corresponding to β = 4 in the absence of a field) will have negative magnetoconductance, since ξ → ½ξ. Very recently experimental measurements of hopping conductivity of three-dimensional insulators were performed to test this prediction and observed this effect, with enhancement or reduction factors in reasonable (10-30%) agreement with the theory (Pichard et al. 1990b).
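The two field-induced factors follow directly from the proportionality of ξ to β. A minimal sketch, assuming ξ = βNl and that a weak field always drives the ensemble to β = 2; the function names and the default values of N and l are illustrative placeholders, not quantities from the text:

```python
def localization_length(beta, n_channels, mean_free_path):
    """Localization length xi = beta * N * l."""
    return beta * n_channels * mean_free_path

def field_factor(beta_zero_field, n_channels=10, mean_free_path=1.0):
    """Ratio xi(field) / xi(zero field) once a weak field sets beta = 2."""
    xi_zero = localization_length(beta_zero_field, n_channels, mean_free_path)
    xi_field = localization_length(2, n_channels, mean_free_path)
    return xi_field / xi_zero

print(field_factor(1))   # negligible spin-orbit interaction at zero field
print(field_factor(4))   # strong spin-orbit scattering at zero field
```

The ratio is independent of N and l, which is why the prediction is a pure number for each symmetry class.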

Because of the remarkable simplicity of the result eq. (10.8) it is useful to summarize our current understanding of its range of validity. We have assumed that the effect of a magnetic field is simply to break time-reversal symmetry, without changing at all the average density of the eigenparameters ν_n. For example, in two-dimensional quantum Hall conductors one expects this density to be very strongly affected by the magnetic field, e.g. due to the appearance of edge-states and bulk Landau levels. The very interesting localization behavior in this limit is presumably described by a very different theory. Similarly, even in insulators with sufficiently strong disorder to be in the variable-range hopping regime, the zero-temperature localization length will always begin to decrease with sufficiently large field, due to magnetic confinement of the wavefunctions. Indeed the non-monotonic behavior suggested by this argument was observed in the recent experiments of Pichard et al. (1990b). Furthermore, there was no strong a priori argument, except for numerical studies on two-dimensional systems (Pichard et al. 1990a,b), that our theory should apply to two or three-dimensional systems, as the isotropy assumption introduced earlier would seem to be natural only for quasi-one-dimensional systems. Thus the apparent success of the theory in predicting experiments on three-dimensional insulators indicates a higher degree of universality than expected.

Another point emphasized by Pichard et al. (1990b) is that the appearance of the parameter β in the effective Hamiltonian for the charges {ν_n} may also illuminate a rather mysterious feature of weak localization theory, that metallic conductors with weak spin-orbit scattering exhibit positive magnetoconductance (weak localization), whereas those with strong spin-orbit scattering exhibit negative magnetoconductance (weak anti-localization). We have just seen that in the insulating phase, the charges ν_n form a lattice and the position of the first site in the lattice is determined by the image-charge repulsion from the origin, whose strength is proportional to β^{-1}. Increasing β (as in the orthogonal to unitary transition) decreases this repulsion, decreasing ν_1 and increasing the conductance; conversely decreasing β (as in the symplectic to unitary transition) increases this repulsion and decreases the conductance. Since the unitary transition always corresponds to application of a magnetic field, in the former case one gets positive magnetoconductance and in the latter negative.

In the metallic phase the charges form a dense gas, ⟨g⟩ ≫ 1, and many charges contribute to the conductance; nonetheless the image-charge repulsion is still present and pushes the gas a certain distance away from the origin which depends on β. The fact that the distribution of ν_1 in the metallic regime obeys the Wigner surmise means that the image charge is roughly equivalent to having an extra charge at the origin. As β is changed the strength of this charge is changed by order unity, hence one would expect the whole charge density to shift in or out by an amount of order the spacing of a single charge. A change in the charge density near the origin by one charge should give, roughly speaking, a change in ⟨g⟩ of order unity according to Imry's N_eff argument. Again applying a field can cause β either to decrease or to increase, depending on whether the zero-field sample has weak or strong spin-orbit scattering. Hence this qualitative argument predicts a positive magnetoconductance of order unity in the former case and a negative one of order unity in the latter, exactly as occurs in microscopic models. Since ⟨g⟩ ≫ 1, the symmetry-breaking in the metallic regime has only a small relative effect on the conductance. In the metallic regime the argument just given is simply suggestive and needs to be made quantitative; however, we have just seen from the previous argument that the effect of the image-charge repulsion becomes exponentially large in the localized regime, and can be predicted quantitatively from our model.


11. The local maximum-entropy approach

11.1. Generalities

In the preceding section, the joint probability distribution P({λ_n}) for the parameters λ_n of the transfer matrix M was obtained from the distribution of maximum entropy for the Hermitian matrix Q = M†M, constrained by a given local density ρ(λ). The resulting P({λ_n}) and some of its consequences for the distribution of the conductance were found to be consistent with numerical simulations and perturbative microscopic calculations. This is very significant, because it tells us that, at least in the cases that were examined there, the physics is contained in the symmetries of the problem - a general physical requirement - and in a global property like the level density - a more specific physical feature. The question whether such a picture has a greater generality is, in our view, of great importance and deserves a serious study. It is rather in a different direction that the above philosophy - of realizing that the physics of the problem is contained in a small set of parameters - will be further developed in this and subsequent sections. We shall see that, under appropriate physical conditions, one can reduce the number of parameters still further: it turns out that in addition to the number N of channels and the length L of the system, only the elastic mean free path l is necessary to provide a good description of quasi-one-dimensional systems. The statistical description of a (mesoscopically) thin slice of length δL (thereby the name local approach) is done in terms of a maximum-entropy distribution constrained by a given value of l; the distribution for arbitrary lengths is then found to obey a Fokker-Planck or diffusion equation in N dimensions. This is reviewed in the present section, whereas in the next one we indicate the predictions of the model. One of the predictions is the nature of the level density, which is not an input, but can be calculated from the diffusion equation in the local approach.
In section 13 we show that if the correct choice of density is made, the global ansatz (eqs. (8.5) and (8.6)) is a solution of the diffusion equation, and at least in the limit N → ∞, L → ∞, the correct choice is the uniform density σ(ν) made in section 8 above. Hence, in addition to making the quantitative predictions given below, the local approach plays an important role in justifying the assumptions of the global approach.

A very important question that we ask in section 12 is why these maximum-entropy approaches work. We find the answer in the multiplicative nature of our transfer matrices (i.e. M is the product of the transfer matrices associated with smaller slices of the sample), and the existence of a limiting distribution (central-limit theorem (CLT)) for such a product. In the weak-scattering limit, we shall find that out of the various parameters that define the individual distributions, only one - associated with l - survives the convolution process that constructs the final distribution, whereas all the others disappear in that


process. The limiting distribution described by this CLT is found to coincide with the maximum-entropy distribution described in the present section.

11.2. The ensemble of transfer matrices and combination law

In the local approach the fundamental quantity is the probability density of the transfer matrix M, rather than the derived quantity Q = M†M. It should by now be clear that an essential concept is the invariant measure dμ(M) on our group of M-matrices. By definition, dμ(M) remains invariant when all the M's are multiplied by a fixed one, M0. Care has to be taken due to the fact, mentioned in section 3 above, that the parametrization (3.18) is not unique in the unitary case β = 2, as there are N arbitrary phases η_a (a = 1, ..., N) still appearing in the analysis. This point can be taken care of by working, for β = 2, with the new measure $d\tilde{\mu}(M) = d\mu(M)\,d\mu(G)$, where $d\mu(G) = \prod_{a=1}^{N}(d\eta_a/2\pi)$ describes the extra freedom. Using the parametrization (3.18), it has been shown (Mello et al. 1988a, Mello and Stone 1990) that

$$d\tilde{\mu}(M) = d\mu(M)\,[d\mu(G)]^{\beta-1} = J_\beta(\lambda)\prod_{a} d\lambda_a \prod_{i=1}^{2\beta} d\mu(u^{(i)}), \tag{11.1}$$

where J_β(λ) is the Jacobian defined in section 5 and dμ(u) is the invariant measure associated with the unitary matrices u; notice that u^(1), ..., u^(4) appear in eq. (11.1) for β = 2 and only u^(1), u^(2) for β = 1.

Our ensemble of disordered conductors of length L will now be described by an ensemble of transfer matrices M, defined by the differential probability dP_L(M). Just as above, it will be convenient to work with the modified differential probability $d\tilde{P}_L(M) = dP_L(M)\,[d\mu(G)]^{\beta-1}$, which we split as

$$d\tilde{P}_L(M) = p_L(M)\,d\tilde{\mu}(M). \tag{11.2}$$

The probability density p_L(M) cannot depend on L in an arbitrary way, since it must satisfy the following combination law. Suppose that we put together two wires (with the same cross section) of lengths L′, L″ and transfer matrices M′, M″; the resulting length and transfer matrix are L = L″ + L′ and M = M″M′, respectively. If we designate by p_{L′}(M′) and p_{L″}(M″) the two respective probability densities and assume M′ and M″ to be statistically independent, the resulting probability density is given by the 'convolution'

$$\begin{aligned} p_{L''+L'}(M) &= \int p_{L''}(M M'^{-1})\,p_{L'}(M')\,d\tilde{\mu}(M') &\qquad (11.3)\\ &= p_{L''} * p_{L'}. &\qquad (11.4) \end{aligned}$$

The simple structure of the combination law (11.3) is a consequence of having split the differential probability in terms of the invariant measure, as in eq. (11.2). Equation (11.3) is a very severe requirement on the set of p_L(M)'s for different


values of L, because the convolution of two pL's must be another pL, whose index is the sum of the two indices.

Suppose we break up our system into subsystems of length δL; eq. (11.3) implies that, if we are given the 'building block' p_{δL}, we can construct the distribution p_L for the full system as the multiple convolution

$$p_L = p_{\delta L} * p_{\delta L} * \cdots * p_{\delta L}. \tag{11.5}$$

11.3. Local maximum entropy ansatz and diffusion equation

In what follows, we shall propose an ansatz for p_{δL}, assuming that δL is a small, but still macroscopic, length. Out of all possible distributions p_{δL}(M) that are normalized and correspond to a fixed value of the mean free path l, i.e.

$$\frac{1}{N}\Bigl\langle \sum_{n=1}^{N}\lambda_n \Bigr\rangle_{\delta L} = \frac{\delta L}{l}, \tag{11.6}$$

we choose the one whose entropy,

$$S[p_{\delta L}] = -\int p_{\delta L}(M)\,\ln p_{\delta L}(M)\,d\tilde{\mu}(M), \tag{11.7}$$

is a maximum. The result is

$$p_{\delta L}(M) = \exp\Bigl[-a - b\sum_{n=1}^{N}\lambda_n\Bigr], \tag{11.8}$$

where a and b are Lagrange multipliers. It represents, for small lengths, an ensemble of M-matrices that is as random as is allowed by the constraint (11.6) and the normalization condition.

The building block (11.8) is isotropic, i.e. independent of the unitary matrices u^(i) of eq. (3.18). One can prove that the resulting convolution (11.4) is again isotropic (Mello et al. 1988a); i.e. it is a function of λ = (λ_1, ..., λ_N) only, p_L(λ). An isotropic distribution implies, by eq. (3.21), that flux incident in one channel is transmitted with the same probability and with random phases into any channel. Intuitively, this seems reasonable if the system is long compared with the mean free path l and the width W. We shall indeed see that the present model yields reasonable results in the diffusive regime (L ≫ l) and for quasi-one-dimensional systems (L ≫ W).

It is actually more practical to describe the 'evolution' (11.4) in terms of a differential equation. In order to derive such an equation we set L″ = L and L′ = δL in eq. (11.3); we then expand both sides in powers of δL and take the limit δL → 0. This procedure is similar to that followed in the Brownian-motion problem, in which the time evolution is described by a Smoluchowski equation


which, when it is expanded in powers of δt and the limit δt → 0 is taken, gives rise to the Fokker-Planck or diffusion equation.

The details of this procedure are given by Mello et al. (1988a), where this equation was first derived. Here we outline very briefly the key ideas used in the derivation. Equation (11.3) now relates p_{L+δL} to p_L and p_{δL} by integration over the measure describing the distribution of transfer matrices M′ of the small additional slice δL. This measure is completely determined by the invariant measure of the unitary group and the ansatz leading to eq. (11.8), and the integration can be performed if we express the transfer matrices in terms of our basic parameterization, eq. (3.18). These three probability densities depend only on the parameters λ(L + δL), λ(L) and λ(δL), respectively. We can regard the transfer matrix M′ of the additional slice as a small perturbation on λ(L), which generates a small increment δλ. Since the λ are eigenvalues of a Hermitian matrix (X), we can use ordinary perturbation theory to expand the change in each λ_n to appropriate order in δλ. We can then perform the averages over δλ implied in eq. (11.3) using the ansatz (11.8) to generate a power series in δL. Finally, we must average over all the unitary matrices appearing on both sides of eq. (11.3), to obtain an equation relating the reduced probability density of λ(L + δL) to that of λ(L), which in the limit δL → 0 becomes a differential equation. The procedure outlined here yields for the probability density

$$w_s^{(\beta)}(\lambda) = p_s(\lambda)\,J_\beta(\lambda) \tag{11.9}$$

of the variable λ, the diffusion equation

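$$\frac{\partial w_s^{(\beta)}(\lambda)}{\partial s} = \frac{2}{\beta N + 2 - \beta}\sum_{a=1}^{N}\frac{\partial}{\partial\lambda_a}\left[\lambda_a(1+\lambda_a)\,J_\beta(\lambda)\,\frac{\partial}{\partial\lambda_a}\frac{w_s^{(\beta)}(\lambda)}{J_\beta(\lambda)}\right], \tag{11.10}$$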

where s = L/l and the initial condition is

$$w_{s=0}^{(\beta)}(\lambda) = \delta(\lambda). \tag{11.11}$$
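The combination law and its diffusion limit can be illustrated in the simplest case of a single channel, N = 1. The sketch below is not the authors' calculation: it assumes the standard 2×2 flux-conserving form M = [[a, b], [b*, a*]] with |a|² = 1 + λ and |b|² = λ (rather than quoting eq. (3.18)), draws each thin slice with λ exponentially distributed with mean δs = δL/l and uniformly random phases, and multiplies the slices together. For N = 1 the prefactor 2/(βN + 2 − β) equals 1, and a diffusion equation of this type then gives ⟨ln(1 + λ)⟩ = s, i.e. ⟨−ln T⟩ = L/l; all function names are hypothetical.

```python
import cmath
import math
import random

def thin_slice(ds, rng):
    """Random single-channel slice: lambda exponential with mean ds, uniform phases."""
    lam = rng.expovariate(1.0 / ds)
    a = math.sqrt(1.0 + lam) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    b = math.sqrt(lam) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return a, b   # stands for M = [[a, b], [conj(b), conj(a)]], |a|^2 - |b|^2 = 1

def compose(outer, inner):
    """Matrix product M = M_outer M_inner in the (a, b) representation."""
    a2, b2 = outer
    a1, b1 = inner
    return a2 * a1 + b2 * b1.conjugate(), a2 * b1 + b2 * a1.conjugate()

def mean_log_one_plus_lambda(s, ds, samples, seed=0):
    """Monte Carlo estimate of <ln(1 + lambda)> for a wire of length s = L/l."""
    rng = random.Random(seed)
    steps = round(s / ds)
    total = 0.0
    for _ in range(samples):
        m = (1.0 + 0.0j, 0.0j)              # identity matrix: wire of zero length
        for _ in range(steps):
            m = compose(thin_slice(ds, rng), m)
        total += math.log(abs(m[0]) ** 2)   # |a|^2 = 1 + lambda = 1/T
    return total / samples

print(mean_log_one_plus_lambda(s=2.0, ds=0.02, samples=1500))
```

With a finite slice thickness the estimate is expected to fall slightly below s, by a relative amount of order δs, in addition to the statistical scatter of the sample.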

11.4. Predictions of the local approach

The diffusion equation presented above depends directly on the particular choice made in our maximum-entropy ansatz, eq. (11.8). However, by expressing quantities of interest in terms of our canonical parameterization eq. (3.18), and averaging over the unitary matrices which appear using the invariant measure of the unitary group (Gaudin and Mello 1981), one can obtain certain results which follow simply from the isotropy assumption. Thus we first indicate some results that are a direct consequence of the isotropy of the transfer-matrix distribution.


One can, for example, relate the average reflection coefficient R_aa back into the same channel to the average reflection coefficient R_ab into a different channel, as

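$$\frac{\langle R_{aa}\rangle}{\langle R_{ab}\rangle} = \begin{cases} 2, & (\beta = 1),\\ 1, & (\beta = 2), \end{cases} \qquad a \neq b. \tag{11.12}$$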

This result means that, if T-symmetry holds (β = 1), backward scattering to the same channel is enhanced by a factor 2, as compared to any other channel. This is precisely the prediction of weak-localization theory (Bergmann 1984), where the argument is that the various paths contribute with random phases, except for a path and its time-reversed one which, if T-symmetry holds, contribute coherently and give rise to a factor 2 in the backward direction. It is striking that this result is apparently so general that it follows merely from the isotropy assumption.

Another exact result that follows directly from isotropy is the structure of the covariance function for pairs of transmission coefficients (Mello et al. 1988b, Mello and Stone 1990)

$$C_{ab,a'b'} = \langle T_{ab}T_{a'b'}\rangle - \langle T_{ab}\rangle\langle T_{a'b'}\rangle, \tag{11.13}$$

for which one finds

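$$C_{ab,a'b'} = \bigl[A_N\langle T^2\rangle - B_N\langle T_2\rangle\bigr]\,\delta_{aa'}\delta_{bb'} + \bigl[A_N\langle T_2\rangle - B_N\langle T^2\rangle\bigr](\delta_{aa'} + \delta_{bb'}) + \bigl[A_N\langle T^2\rangle - B_N\langle T_2\rangle - C_N\langle T\rangle^{2}\bigr]. \tag{11.14}$$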

In eq. (11.14) we have defined

$$T_k = \sum_{a=1}^{N}\frac{1}{(1+\lambda_a)^{k}}, \qquad A_N = \frac{N^2+1}{N^2(N^2-1)^2}, \qquad B_N = \frac{2}{N(N^2-1)^2}, \qquad C_N = \frac{1}{N^4}. \tag{11.15}$$

Equation (11.14) is exact. As a check, we can easily verify that the sum of eq. (11.14) over a, b, a′, b′ gives precisely var T.
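The sum rule can be checked mechanically with exact rational arithmetic. The sketch below assumes the three-term structure of eq. (11.14) with coefficients A_N = (N² + 1)/[N²(N² − 1)²], B_N = 2/[N(N² − 1)²] and C_N = 1/N⁴; the numerical values fed in for ⟨T²⟩, ⟨T_2⟩ and ⟨T⟩ are arbitrary stand-ins, not physical data, since the identity should hold for any values.

```python
from fractions import Fraction
from itertools import product

def coefficients(n):
    """A_N, B_N, C_N as exact fractions (assumed forms, see lead-in)."""
    a_n = Fraction(n**2 + 1, n**2 * (n**2 - 1) ** 2)
    b_n = Fraction(2, n * (n**2 - 1) ** 2)
    c_n = Fraction(1, n**4)
    return a_n, b_n, c_n

def covariance(a, b, a2, b2, n, avg_T_sq, avg_T2, avg_T):
    """Three-term covariance for one index quadruple; avg_T_sq = <T^2>, avg_T2 = <T_2>."""
    a_n, b_n, c_n = coefficients(n)
    d_a = 1 if a == a2 else 0     # Kronecker delta for the first pair of indices
    d_b = 1 if b == b2 else 0     # Kronecker delta for the second pair of indices
    return ((a_n * avg_T_sq - b_n * avg_T2) * d_a * d_b
            + (a_n * avg_T2 - b_n * avg_T_sq) * (d_a + d_b)
            + (a_n * avg_T_sq - b_n * avg_T2 - c_n * avg_T**2))

def summed_covariance(n, avg_T_sq, avg_T2, avg_T):
    """Sum of the covariance over all four channel indices."""
    idx = range(n)
    return sum(covariance(a, b, a2, b2, n, avg_T_sq, avg_T2, avg_T)
               for a, b, a2, b2 in product(idx, repeat=4))

# Arbitrary rational stand-ins for the three averages (not physical data).
avg_T_sq, avg_T2, avg_T = Fraction(7, 2), Fraction(5, 4), Fraction(3, 2)
print(summed_covariance(3, avg_T_sq, avg_T2, avg_T) == avg_T_sq - avg_T**2)
```

The ⟨T_2⟩ contributions cancel in the sum and the ⟨T²⟩ contributions add up to unit weight, leaving ⟨T²⟩ − ⟨T⟩² = var T.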

The structure of eq. (11.14) is the same for β = 1, 2 although the explicit value of the coefficients of the δ-functions in eq. (4.10) does depend on β.


In Feng et al. (1988, eq. 3), three terms of this type are also obtained from rather involved applications of weak-disorder perturbation theory; if we assume W ≪ L (quasi-one-dimensional systems), they are seen to have essentially the structure provided by the δ-functions of our eq. (11.14). The difference is that our Kronecker deltas (which we can write as δ_{aa′} = δ_{Δq_a, 0}, with Δq_a = |q_a − q_{a′}| and q_a being the transverse wave vector labeling the channel (the eigenmode a)) are replaced by functions which are peaked at the wavevectors which satisfy the appropriate Kronecker deltas in our calculation, but decay over some distance in momentum space (this distance becomes shorter as the system becomes more one-dimensional). We can describe in similar terms the difference between our result (11.12) and that of weak-localization theory.

Although we have not reproduced the details here, the conceptual simplicity of the calculation leading to eqs. (11.12) and (11.14) is striking when compared to the complexity of the weak-localization perturbative calculations whose results are exactly reproduced in the extreme quasi-one-dimensional limit. Apparently, as noted above, the isotropy assumption restricts us to this limit; an improvement of the present model must thus involve a better treatment of the statistical properties of the matrices u^(i) of eq. (3.18).

In order to calculate the size dependence of expectation values of physical quantities, we must go back to the diffusion equation (11.10), which governs the 'length evolution' of the probability density $w_s^{(\beta)}(\lambda)$.

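Let F(λ) be any function of interest (e.g. the conductance or its square). Multiplying both sides of eq. (11.10) by F(λ) and integrating over λ we obtain, for the expectation value

$$\langle F\rangle_s^{(\beta)} = \int F(\lambda)\,w_s^{(\beta)}(\lambda)\prod_{a=1}^{N}d\lambda_a, \tag{11.16}$$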

the evolution equation

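$$(\beta N + 2 - \beta)\,\frac{\partial\langle F\rangle_s^{(\beta)}}{\partial s} = \left\langle 2\sum_{a}\frac{\partial}{\partial\lambda_a}\left[\lambda_a(1+\lambda_a)\frac{\partial F}{\partial\lambda_a}\right] + \beta\sum_{a\neq b}\frac{\lambda_a(1+\lambda_a)\,\partial F/\partial\lambda_a - \lambda_b(1+\lambda_b)\,\partial F/\partial\lambda_b}{\lambda_a - \lambda_b}\right\rangle_s^{(\beta)}. \tag{11.17}$$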

The total transmission coefficient T = ½g and its moments are the basic quantities that we now consider. We thus set F = T^p in eq. (11.17) and obtain, for the evolution of the pth moment of T, the equation


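$$(2 - \beta + \beta N)\,\frac{\partial\langle T^{p}\rangle_s^{(\beta)}}{\partial s} = \bigl\langle -\beta p\,T^{p+1} - (2-\beta)\,p\,T^{p-1}T_2 + 2p(p-1)\,T^{p-2}(T_2 - T_3)\bigr\rangle_s^{(\beta)}, \tag{11.18}$$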

where T_k was defined in eq. (11.15). In the limit

$$\sqrt{N} \ll s \ll N, \tag{11.19}$$

one can show from eq. (11.18) that $\langle T^{p}\rangle_s^{(\beta)}$ can be expanded as a series of descending powers of the ratio N/s = Nl/L. In this regime we thus have a model described by the single parameter Nl/L. The right-hand side of the inequality (11.19) corresponds to L ≪ Nl, i.e. lengths smaller than a characteristic length of the order of the quasi-one-dimensional localization length, ξ. The left-hand side of eq. (11.19) implies, first of all, s ≫ 1, which means L ≫ l. Therefore, to satisfy eq. (11.19) we must be in the metallic, or diffusive, regime. But since N ~ (k_F W)^{d−1}, s ≫ √N restricts L/W to be ≫ 1, i.e. a quasi-one-dimensional system, for d ≤ 3.
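As a consistency check on this expansion (a sketch under stated assumptions, not a step taken from the text), one can set p = 1 in the moment equation, assuming it has the form in which the middle term is proportional to T^{p−1}T_2, and insert the leading Ohmic behavior ⟨T⟩ ≈ N/s and ⟨T²⟩ ≈ N²/s². Since each term of T_2 is no larger than the corresponding term of T, ⟨T_2⟩ is at most of order N/s.

```latex
(2-\beta+\beta N)\,\frac{\partial \langle T\rangle}{\partial s}
  = -\beta\,\langle T^{2}\rangle - (2-\beta)\,\langle T_{2}\rangle ,
\qquad
\beta N\,\frac{\partial}{\partial s}\!\left(\frac{N}{s}\right)
  = -\beta\,\frac{N^{2}}{s^{2}} .
```

Both sides agree at order N²/s². The terms left out are smaller by a factor of order s/N (the ⟨T_2⟩ term) or 1/N (the 2 − β on the left), and they feed the next terms of the series in descending powers of N/s.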

For the average transmission coefficient one finds

$$\langle T\rangle_s^{(\beta)} = \frac{Nl}{L} - \frac{1}{3}\,\delta_{\beta 1} + \cdots, \tag{11.20}$$

The first term of eq. (11.20) is Ohm's law, while the second term is a negative correction which is present only when the system has T-symmetry, just as found in weak-localization theory. In fact, the weak-localization correction to ⟨T⟩, calculated by diagrammatic techniques (Mello and Stone 1990), gives precisely the value −1/3 found in eq. (11.20), in the limit L/W ≫ 1 and for arbitrary d. That for d = 2 eq. (11.20) gives the same correction even for a square is a weakness of the present model, since it is well known that the weak-localization correction in two dimensions depends logarithmically on the system size.

For the variance of T, one finds that the first two terms of the series cancel exactly, giving the leading term

$$\mathrm{var}\,T = \frac{2}{15\beta}, \tag{11.21}$$

which is independent of the number N of channels, the length L of the conductor and the mean free path l. We thus get for the conductance the universal rms value

$$\mathrm{rms}\,g = \begin{cases} \sqrt{8/15} = 0.730, & \beta = 1,\\ \sqrt{4/15} = 0.516, & \beta = 2; \end{cases} \tag{11.22}$$


these are the same numerical values found in Lee and Stone (1985), Al'tshuler and Shklovskii (1986), for the quasi-one-dimensional case, with the use of microscopic Green function techniques. Thus as stated in section 9 above, it is possible to derive quantitatively the known microscopic values for UCF from the local approach, and to explicitly obtain the universal dependence on the symmetry parameter β, as seen in eq. (11.22).
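The arithmetic connecting the variance of T to the quoted rms values is short enough to spell out. A minimal sketch, assuming var T = 2/(15β), the value implied by the rms figures above together with T = g/2; the function name is an illustrative choice:

```python
import math

def rms_g(beta):
    """rms conductance from var T = 2 / (15 * beta), using g = 2T so var g = 4 var T."""
    var_T = 2.0 / (15.0 * beta)
    return math.sqrt(4.0 * var_T)

for beta in (1, 2):
    print(beta, round(rms_g(beta), 3))
```

The two values differ by a factor √2, which is the rms form of the 1/β reduction of the variance.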

One can similarly calculate <T 2> and thus complete the coefficients needed in front of the δ-functions in eq. (11.14). They are found to have the same structure as those obtained in Feng et al. (1988) using a microscopic theory.

12. Central-limit theorem for local approach

12.1. Why do maximum-entropy approaches work?

In this article, the theory of disordered conductors was approached by introducing a statistical distribution that fulfills a small number of constraints and, aside from that, is 'as random as possible'. The intuitive notion behind such an approach is that the physical properties of interest depend only on those few constraints and are insensitive to the microscopic details of the system.

We recall an elementary case where a situation like the one we just mentioned occurs: the central-limit theorem (CLT) of statistics. Let $x_i$, $i = 1, \ldots, n$ be $n$ statistically independent variables, each one described by the statistical distribution $p_1(x_i)$; the CLT studies the distribution $p_n(x)$ of their sum $x = \sum_i x_i$. The result is that if $n$ is large, $p_n(x)$ is approximately Gaussian, with centroid $\bar{x} = n\bar{x}_1$ and variance $\sigma^2 = n\sigma_1^2$; $\bar{x}_1$ and $\sigma_1^2$ are the centroid and variance of the 'microscopic' distribution $p_1(x_1)$, its other moments being irrelevant if $n \gg 1$. Alternatively, we could say that successive convolutions of $p_1(x_i)$ 'wash out' its details, except for its centroid and variance, which are the only ones that survive the convolution process. The microscopic information is certainly not lost: it is in the tails of the resulting distribution $p_n(x)$, farther and farther away from the body of $p_n(x)$ as $n$ increases. We recall from section 4 that a Gaussian distribution is a distribution of maximum entropy for that centroid and variance. We can thus rephrase the CLT saying that for a given centroid $\bar{x}_1$ and variance $\sigma_1^2$ of $p_1(x)$, $p_n(x)$ approaches, as $n$ increases, a distribution of maximum entropy, constrained by the value $n\bar{x}_1$ of the centroid and $n\sigma_1^2$ of the variance.
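The 'washing out' of microscopic details can be seen in a small numerical experiment. The sketch below is only an illustration under assumed choices: the microscopic distribution is taken to be exponential with unit mean (centroid 1, variance 1, skewness 2), and the function names are hypothetical. The centroid and variance of the sum grow like $n$, while the skewness, a 'microscopic detail', decays like $2/\sqrt{n}$.

```python
import math
import random
import statistics


def sum_of_exponentials(n, trials, rng):
    """Draw `trials` samples of x = x_1 + ... + x_n with x_i ~ Exp(1)."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) for _ in range(trials)]


def centroid_variance_skewness(samples):
    """Sample centroid, variance and standardized third moment."""
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    sd = math.sqrt(var)
    skew = sum(((x - mean) / sd) ** 3 for x in samples) / len(samples)
    return mean, var, skew


rng = random.Random(0)
for n in (1, 4, 16, 64):
    mean, var, skew = centroid_variance_skewness(sum_of_exponentials(n, 5000, rng))
    # Centroid and variance track n; skewness shrinks roughly as 2/sqrt(n).
    print(f"n = {n:3d}  centroid = {mean:7.3f}  variance = {var:7.3f}  skewness = {skew:6.3f}")
```

Any other microscopic distribution with the same centroid and variance would give the same body of $p_n(x)$ at large $n$; only the rate at which the higher moments fade would differ.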

12.2. CLT in weak-disorder limit

The purpose of this section is to exhibit a similar situation in the problem of disordered conductors (see Mello and Shapiro 1988). When we describe the system in terms of its transfer matrix $M$, the essential feature of the problem is the multiplicative nature of $M$: i.e. $M$ is the product

$$M = M_n \cdots M_1 \qquad (12.1)$$

of the transfer matrices of $n$ microscopic units. Suppose we choose a statistical distribution for the individual $M_i$, that we designate by $p_1(M_i)$, and assume all the $M_i$ to be statistically independent. The question is whether the resulting distribution for the $M$ of eq. (12.1), that we call $p_n(M)$, tends to a limiting universal form as $n$ grows: if this occurs, we shall say that we have a CLT, in analogy with the familiar CLT reviewed above.
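The multiplicative composition law of eq. (12.1) can be made concrete in the simplest case. The following is a minimal sketch under stated assumptions, not the $N$-channel construction of the text: it takes a single channel with time-reversal invariance, for which a transfer matrix can be written as $M = \begin{pmatrix} a & b \\ b^* & a^* \end{pmatrix}$ with $|a|^2 - |b|^2 = 1$, radial parameter $\lambda = |b|^2$ and transmission $T = 1/(1+\lambda)$. The function names and the choice of uniformly random phases are illustrative.

```python
import cmath
import math
import random


def unit(lam, rng):
    """One-channel, T-invariant transfer matrix with radial parameter lam.

    Returned as the first row (a, b) of M = [[a, b], [b*, a*]];
    the two phases are drawn uniformly (an 'isotropic' unit).
    """
    a = math.sqrt(1.0 + lam) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    b = math.sqrt(lam) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return a, b


def compose(m2, m1):
    """First row of the matrix product M2 M1 (the product has the same form)."""
    a2, b2 = m2
    a1, b1 = m1
    return a2 * a1 + b2 * b1.conjugate(), a2 * b1 + b2 * a1.conjugate()


def chain(n, lam, rng):
    """Product M_n ... M_1 of n independent units with the same lam."""
    m = unit(lam, rng)
    for _ in range(n - 1):
        m = compose(unit(lam, rng), m)
    return m


rng = random.Random(1)
a, b = chain(50, 0.02, rng)
flux = abs(a) ** 2 - abs(b) ** 2      # current conservation: stays equal to 1
transmission = 1.0 / abs(a) ** 2      # T = 1 / (1 + lambda), lambda = |b|^2
print(f"|a|^2 - |b|^2 = {flux:.12f},  T = {transmission:.4f}")
```

With uniformly distributed phases, the average of $\ln(1+\lambda)$ over the ensemble is additive under composition, so for $n$ identical units it equals $n \ln(1+\lambda')$; this gives a convenient check on any such simulation.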

Just as in ordinary statistics, in the present problem too it is the convolution (defined in eq. (11.3)) of the individual distributions $p_1$ that gives the distribution of the product $M$ of eq. (12.1); i.e.

$$p_n = p_1 * p_1 * \cdots * p_1. \qquad (12.2)$$

The recursion relation for the distribution $p_n(M)$ associated with $n$ units is

$$p_{n+1}(M) = p_n * p_1 = \int p_n(M M'^{-1})\, p_1(M')\, d\mu(M'). \qquad (12.3)$$

Here we shall assume that the individual $p_1(M')$ is isotropic, i.e. independent of the unitary matrices of eq. (3.18); it is thus a function of $\lambda$ only. One can prove that the convolution of two isotropic functions is again isotropic, so that the resulting $p_{n+1}$ of eq. (12.3) depends on $\lambda$ only. We shall also assume T-invariance, just as in Mello and Shapiro (1988). If the effect of the added unit in the recursion relation (12.3) is small enough, one can make the expansion

$$w_{n+1}(\lambda) = w_n(\lambda) + \langle \mathrm{tr}\, \lambda' \rangle\, D\, w_n(\lambda) + O(\langle \lambda'^2 \rangle), \qquad (12.4)$$

which is now expressed in terms of the probability density $w(\lambda)$ of eq. (11.9). In eq. (12.4), $D$ represents the differential operator

6 - 7 ^ L K i A , + l - ) m K m , l 2- 5 )

Our next step is to iterate eq. (12.4) m times, in order to add m units to our system.

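We now make the change of variable $\tilde{\lambda} = \nu \lambda'$, where $\nu$ is the number of scattering units per unit length. We also define the weak-scattering limit, in which $\nu \to \infty$ while the probability density of $\tilde{\lambda}$ stays fixed; in particular

$$\langle \mathrm{tr}\, \tilde{\lambda} \rangle = \frac{1}{l}. \qquad (12.7)$$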

We notice that eq. (12.6) represents, for small $\lambda'$, the scattering probability per unit length, which is just the inverse mean free path. At the same time, we take the limit $n \to \infty$, so that $n/\nu = L$ (the fixed length of the wire), and $m \to \infty$, so that $m/\nu = \delta L$ (the fixed added piece). We then have from eq. (12.4)

$$w_{L+\delta L}(\lambda) - w_L(\lambda) = \frac{\delta L}{l}\, D\, w_L(\lambda) + O((\delta L/l)^2), \qquad (12.8)$$

and, taking the limit $\delta L \to 0$, we obtain the differential equation

$$\frac{\partial w_L(\lambda)}{\partial L} = \frac{1}{l}\, D\, w_L(\lambda). \qquad (12.9)$$

One also finds the initial condition

$$w_0(\lambda) = \delta(\lambda). \qquad (12.10)$$

The crucial feature of our result is that all the moments of the distribution other than (12.6) have disappeared: information about the microscopic details enters only via a single parameter, the mean free path. Thus, whatever the solution of eq. (12.9) is, it can depend only upon one global property (eq. (12.6)) of the distribution for the individual scattering units. This statement constitutes the CLT for a weakly disordered conductor and demonstrates that, in that case, the disorder potential enters only through a single parameter.
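This insensitivity to microscopic details can be probed numerically in the one-channel case. The sketch below is an illustration under assumptions that are not in the text: a single T-invariant channel with uniformly random phases, and two hypothetical microscopic ensembles for the radial parameter $\lambda'$ of each unit (a fixed value, and an exponential distribution) that share the same mean. In the weak-scattering regime the average transmission of the chain should agree for the two ensembles within statistical error.

```python
import cmath
import math
import random


def random_unit(lam, rng):
    """First row (a, b) of a one-channel T-invariant transfer matrix, |b|^2 = lam."""
    a = math.sqrt(1.0 + lam) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    b = math.sqrt(lam) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return a, b


def mean_transmission(draw_lam, n_units, trials, rng):
    """Average T = 1/|a|^2 over `trials` chains of `n_units` independent units."""
    total = 0.0
    for _ in range(trials):
        a, b = 1.0 + 0.0j, 0.0j          # start from the identity matrix
        for _ in range(n_units):
            a2, b2 = random_unit(draw_lam(rng), rng)
            # Matrix product (new unit) x (chain so far), first row only.
            a, b = a2 * a + b2 * b.conjugate(), a2 * b + b2 * a.conjugate()
        total += 1.0 / abs(a) ** 2
    return total / trials


EPS = 0.02      # mean of lambda' per unit (assumed value); N_UNITS * EPS plays the role of L/l
N_UNITS = 50


def fixed_lam(rng):
    """Every unit has exactly lambda' = EPS."""
    return EPS


def exponential_lam(rng):
    """lambda' exponentially distributed with the same mean EPS."""
    return rng.expovariate(1.0 / EPS)


t_fixed = mean_transmission(fixed_lam, N_UNITS, 2000, random.Random(2))
t_expo = mean_transmission(exponential_lam, N_UNITS, 2000, random.Random(3))
print(f"<T> fixed lambda' = {t_fixed:.3f},  <T> exponential lambda' = {t_expo:.3f}")
```

The two ensembles differ in every moment of $\lambda'$ beyond the first, yet the residual difference between the averages is of second order in `EPS` and shrinks as the units are made weaker at fixed `N_UNITS * EPS`.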

Equation (12.9) coincides with the diffusion equation (11.10), that had been obtained via a local maximum-entropy assumption. We can thus state our CLT saying that for a fixed mean free path $l$, $w_n(\lambda)$ approaches, as $n$ increases, in the weak-scattering limit, a distribution that locally has maximum entropy. We shall see in the next section that the local maximum-entropy distribution, in turn, gives rise (under the conditions specified below) to a global maximum-entropy distribution for the full conductor, among distributions with the same level density.

This explains why the maximum-entropy approach that we have used works. It is a CLT that lies behind the scene that is responsible for its success. It is not our subjective 'ignorance' or 'lack of information' about the system (as one sometimes finds in the literature) that justifies the application of the method. In other words, it is not the physical constraints that we happen to know about the system that we need to introduce in the model, but rather the constraints the system allows us to know, all other information being lost (or pushed to the tails of the distribution) by the convolution process which leads to the appropriate CLT.

We finally make a few comments on the limitations of the model. First, we can only prove a relevant CLT in the weak-disorder limit and show that only one parameter, $l$, appears in the distribution. We do not know the generalization when disorder is not weak, except in the one-channel problem (see, e.g. Erdos and Herndon 1982, Mello 1986, Shapiro 1987) where one finds, in general, two parameters that merge into just one for weak disorder (this was proved with isotropic distributions).


As we saw in the previous sections, the assumption of isotropy leads directly to a backward-scattering enhancement of vanishing width, and a similar narrowing of the correlation function (11.14) between transmission coefficients. More generally, there are several indications that the isotropy assumption limits the quantitative validity of the approach to quasi-one-dimensional systems. An even more basic objection is that as a result of isotropy and weak disorder, the dimensionality $d$ of the system enters the model only through the number of channels $N$; yet we know that a strip may behave very differently from a bar with the same $N$, due to the different connectivity of the system. It would thus be very interesting to investigate the possibility of a CLT starting with a nonisotropic distribution for the individual scatterers and to study under what conditions a completely isotropic distribution occurs as $N \to \infty$, $L \to \infty$, as one aspect of the CLT. Relaxing the isotropy and the weak-scattering assumptions are both important problems for future work.

13. Compatibility of global and local approaches

We have defined two maximum entropy models for describing quantum transport in a disordered sample of dimensionless length $s = L/l \gg 1$. The global model is characterized by an ansatz for the joint probability distribution of the $\{\lambda_n\}$ describing the entire conductor of length $s$:

$$P_s(\lambda_1, \lambda_2, \ldots, \lambda_N) = C_{\beta,N} \prod_{m<n}^{N} |\lambda_m - \lambda_n|^{\beta} \prod_{n=1}^{N} F_s(\lambda_n), \qquad (13.1)$$

where $C_{\beta,N}$ is a normalization constant and $F_s(\lambda)$ is related to the input global density $\rho_s(\lambda)$ by

(13.2)

The local maximum entropy model, on the other hand, is characterized by an ansatz for the joint probability density of $\{\lambda_n\}$ which describes a small segment of the conductor of length $s \ll 1$; this ansatz, combined with the multiplicative combination law for $M$, yields a diffusion equation for $P_s(\lambda_1, \lambda_2, \ldots, \lambda_N)$, which describes its evolution with $s$. Due to the Jacobian factor $J_\beta(\lambda_n)$ arising from the measure $d\mu(M)$, this diffusion equation constitutes a set of $N$ strongly coupled partial differential equations. Not surprisingly, these equations are difficult to solve in general. However, it is possible to establish the compatibility of the global and local approaches without obtaining such an explicit solution for the JPD, at least in the large-$N$ limit.

First consider the relationship between the two maximum entropy hypotheses. The maximum entropy distribution for the 'building block' assumed in the


local approach is

Ν

Π ,1=1

Ρ. = ι(λ1,λ2,...9λΝ) = fl Π exp[a-H], (13.3)

which is of precisely the same form as the JPD for the whole conductor assumed in the global approach. Hence if the two approaches are compatible, then it must follow that if we put some number $p$ of building blocks in series, representing $p$ good conductors, the only effect of the series addition is to weaken the confining potential, since the conductance must decrease on average as the length of the conductor is increased.

A proof of this compatibility between the two maximum entropy descriptions has been given in Mello and Pichard (1989). If the two descriptions are compatible in some limit, then the solution of the diffusion equation (11.10) of the local approach, for $s \gg 1$, must be of the form (13.1) assumed by the global approach. If we assume a trial solution of the diffusion equation of this form, then we may integrate over $\lambda_2, \ldots, \lambda_N$, to obtain an integro-differential equation for the density $\rho_s(\lambda)$ in the large-$N$ limit:

9s 9/1 (13.4)

The density must satisfy this equation in order for the global ansatz to be a solution of the diffusion equation. We note that β does not appear in this equation. This is important because it implies that the eigenvalue density is insensitive to the symmetry of the system, a crucial assumption needed to derive the universal dependence of the localization length on β (eq. (10.8)).

On the other hand, integrating the diffusion equation over $\lambda_2, \ldots, \lambda_N$, we obtain an exact integro-differential equation for the evolution of $\rho_s(\lambda)$ as a function of $s$, as implied by the local approach:

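$$\frac{\partial \rho_s(\lambda)}{\partial s} = \frac{2}{\beta N + 2 - \beta}\, \frac{\partial}{\partial \lambda} \left\{ \lambda(1+\lambda) \left[ \frac{\partial \rho_s(\lambda)}{\partial \lambda} - \beta \int \frac{K_2(\lambda, \lambda')\, d\lambda'}{\lambda - \lambda'} \right] \right\}, \qquad (13.5)$$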

where $K_2(\lambda, \lambda')$ is the two-point correlation function for the distribution of $\{\lambda_n\}$. If we now ignore the second-order correlations contained in the function $T_2(\lambda, \lambda')$, and assume

$$K_2(\lambda, \lambda') \approx \rho(\lambda)\, \rho(\lambda'), \qquad (13.6)$$

we see that in the limit $N \to \infty$, the diffusion equation for the density becomes identical to that derived above by assuming that the solution of the full diffusion equation for the JPD was of the form assumed in the global approach. Hence such a solution is consistent with the assumptions of the local approach.


We note that the 'mean-field' approximation of factorizing the two-point correlation function is exactly the same as that leading to the expression for the confining potential in terms of the charge density in section 8 above. Such an approximation completely throws away the non-trivial correlations leading to universal conductance fluctuations, but is sufficient to correctly predict the large-scale variation of the density. We saw in section 6 above, for example, that the approximation correctly predicts the Wigner semi-circle law in the Gaussian ensembles. Hence we expect the approximation to give the large-scale behavior of the density in this case as well.

We underline the importance of this compatibility proof. Since the global ansatz for the JPD is of the same form as the local ansatz for the building block, and is compatible with the diffusion equation for the JPD in the local approach, we have shown here that such an ansatz for the $P_\beta(\{\lambda_n\})$ has the very remarkable property of remaining stable under connection of conductors in series. This strongly supports our earlier conclusion that the validity of the global approach is not restricted to good conductors, where $L < \xi$, but still holds after increasing the length until $L > \xi$ and the sample is an Anderson insulator. We note also that insofar as the global approach is equivalent to the local approach, which we have seen is only quantitatively correct in the quasi-one-dimensional limit, the global approach is also restricted to this regime of validity. However, both the numerical results of section 9, which were mostly obtained from two-dimensional samples, and the experimental results on Mott hopping cited in section 10, indicate a wider range of validity for the global approach.

An explicit solution for the density $\rho(\lambda)$ for quasi-one-dimensional systems can be achieved by a solution of the integro-differential equation (13.4). This has only been possible in the limit $s \to \infty$. Not surprisingly, it is useful to use as variables the Liapunov exponents $\alpha = \nu/2L$, where as before $\lambda = \tfrac{1}{2}(\cosh \nu - 1)$.

It is then straightforward to show that the density $\sigma(\alpha)$ is uniform in the interval [0,1] in the large-$N$ limit. The detailed proof is given in Mello and Pichard (1989). This means that the Liapunov exponents of $M$ are equally spaced in the local approach. Such a uniform density is exactly what is assumed in the global approach. We point out again that a tendency for the Liapunov exponents to be uniformly distributed is not specific to transfer matrices describing quantum transport but is a generic property of matrix products observed in different contexts, e.g. dynamical systems (Pichard and Andre 1986).
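A quick consistency check of the uniform density is that it reproduces the Ohmic term $Nl/L$ of eq. (11.20). The sketch below rests on assumptions that are ours, not statements of this section: the transmission is taken as $T = \sum_a 1/(1+\lambda_a)$, the length $L$ in $\alpha = \nu/2L$ is measured in units of the mean free path (so $\nu_a = 2 s \alpha_a$), and the $N$ exponents are placed at the midpoints of equal cells in [0,1]. The function name is hypothetical.

```python
import math


def conductance_uniform_exponents(n_channels, s):
    """Sum over channels of 1/(1 + lambda_a) for equally spaced alpha_a in [0, 1]."""
    total = 0.0
    for a in range(n_channels):
        alpha = (a + 0.5) / n_channels        # midpoint of the a-th cell in [0, 1]
        nu = 2.0 * s * alpha                  # alpha = nu / 2L with L in units of l (assumption)
        lam = 0.5 * (math.cosh(nu) - 1.0)     # lambda = (cosh nu - 1) / 2
        total += 1.0 / (1.0 + lam)            # equals 1 / cosh^2(nu / 2)
    return total


n_channels, s = 200, 10.0
g = conductance_uniform_exponents(n_channels, s)
print(f"g = {g:.4f},  N/s = {n_channels / s:.4f}")
```

In the continuum limit the sum becomes $(N/s)\tanh s$, which tends to $N/s = Nl/L$ for $s \gg 1$: only the exponents with $\nu_a \lesssim 1$, a fraction of order $1/s$ of the channels, contribute appreciably.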

14. Summary and conclusions

We have shown that many of the diverse phenomena of quantum transport in disordered conductors can be understood in terms of maximum-entropy models for the random transfer matrices which determine the conductance. In particular, the global maximum-entropy approach was shown to provide a unified framework for deriving the universal conductance fluctuations in the metallic regime and the Gaussian fluctuations of $\ln g$ (or equivalently, the inverse localization length) in the localized regime. The local maximum-entropy approach was shown to predict quantitatively all the known results of microscopic calculations in the metallic regime (in the quasi-one-dimensional limit). Both approaches utilize only the fundamental symmetries of the problem, current conservation, and if appropriate, time-reversal and spin symmetry, and a small number of physical inputs. Hence both approaches free the theory from detailed microscopic models and elucidate to some extent the origin of universality and scaling behavior in the quantum theory of disordered conductors. In particular, we believe this approach has clarified greatly the crucial influence of the symmetry of the system on the transport properties, as illustrated by the universal dependences of measurable quantities on the symmetry parameter $\beta$ obtained in both the localized and metallic regimes. The major shortcoming of the approaches at present is that they have not been generalized to describe the characteristic phenomena of two- and three-dimensional localization. It seems likely that such a generalization requires relaxing of the isotropy assumption for the eigenvectors of the transfer matrix.

We conclude by summarizing the relation of the present work to the classic theory of random matrices, and in particular the maximum-entropy approach. It appears that in many respects the ensemble of random transfer matrices which describes disordered conductors provides a better physical application for the general theory than any known previously.

First, due to the multiplicative compositional law, as shown above, there is a kind of central-limit theorem which enforces the convergence of the joint probability density to one of maximum information entropy. This is in contrast to the Hamiltonian ensembles proposed in applications of random matrix theory to nuclear physics, in which a maximum-entropy hypothesis subject to the constraint of a given level density has no deeper justification, and is merely chosen to give both the correct local statistics and, of course, the right level density. Presumably this limiting behavior is intimately related to the existence of a limiting density of Liapunov exponents for random matrix products; however, there is clearly room for more rigorous and extensive mathematical study of this question.

Second, the ensemble of disordered mesoscopic conductors provides a physical realization of the three classic ensembles, the orthogonal, unitary and symplectic, and the unique possibility of experimentally inducing transitions between ensembles by application of a weak magnetic field, or by the addition of impurities with strong spin-orbit interactions. Hence it becomes possible to measure universal dependences of physical observables on the symmetry of the ensemble.


Finally, we have found that the ensemble of random transfer matrices, although similar to the classic ensembles, has important differences arising from the 'image-charge' effect and breakdown of translational invariance in the metallic regime, and the necessity of treating a low-density phase, not described by the $N \to \infty$ limit, to understand the localized regime. We therefore believe that the ensemble of random multiplicative transfer matrices constitutes a new random matrix ensemble, whose interesting and novel properties are only partially explored, and which provides a fruitful new research area for physicists interested in this elegant and universal approach to random systems.

Acknowledgements

The authors gratefully acknowledge helpful discussions with Y. Imry, R. Jalabert, and K. Slevin. A.D. Stone and P. Mello thank CEN Saclay for support during visits in which part of this work was performed, and J.-L. Pichard thanks the IBM T.J. Watson Research Center for similar support. Research at Yale was partially funded by NSF grant DMR-8658135. Research at the University of Florida was partially funded by NSF grant DMR-8813402. All of the authors thank the Aspen Center for Physics for support during visits in which the final draft was prepared.

References

Abrahams, E., P.W. Anderson, D.C. Licciardello and T.V. Ramakrishnan, 1979, Phys. Rev. Lett. 42, 673.

Al'tshuler, B.L., 1985, JETP Lett. 41, 648.

Al'tshuler, B.L., and D.E. Khmel'nitskii, 1986, JETP Lett. 42, 359.

Al'tshuler, B.L., and B.I. Shklovskii, 1986, Sov. Phys.-JETP 64, 127.

Anderson, P.W., 1981, Phys. Rev. B 23, 4828.

Anderson, P.W., D.J. Thouless, E. Abrahams and D.S. Fisher, 1980, Phys. Rev. B 22, 3519.

Aronov, A.G., and Yu.V. Sharvin, 1987, Rev. Mod. Phys. 59, 755.

Balian, R., 1968, Nuovo Cimento 57, 183.

Baranger, H.U., and A.D. Stone, 1989, Phys. Rev. B 40, 8169.

Bergmann, G., 1984, Phys. Rep. 107, 11.

Berry, M.V., 1985, Proc. R. Soc. London A 400, 229.

Birge, N.O., B. Golding and W.H. Haemmerle, 1989, Phys. Rev. Lett. 62, 195.

Birge, N.O., B. Golding and W.H. Haemmerle, 1990, Phys. Rev. B, in press.

Bohigas, O., M.-J. Giannoni and C. Schmit, 1984, Phys. Rev. Lett. 52, 1.

Brody, T.A., J. Flores, J.B. French, P.A. Mello, A. Pandey and S.S.M. Wong, 1981, Rev. Mod. Phys. 53, 385.

Buttiker, M., 1986, Phys. Rev. Lett. 57, 1761.

Debray, P., J.-L. Pichard, J. Vicente and U. Tung, 1989, Phys. Rev. Lett. 63, 2264.

Dorokhov, O.N., 1983, Sov. Phys.-JETP 58, 606.

Dyson, F.J., 1962a, J. Math. Phys. 3, 140.

Dyson, F.J., 1962b, J. Math. Phys. 3, 157.

Dyson, F.J., 1962c, J. Math. Phys. 3, 166.

Dyson, F.J., and M.L. Mehta, 1963, J. Math. Phys. 4, 701.

Eckmann, J.P., and C.E. Wayne, 1988, J. Stat. Phys. 50, 853.

Edwards, J.T., and D.J. Thouless, 1972, J. Phys. C 5, 807.

Efetov, K.B., and A.I. Larkin, 1983, Sov. Phys.-JETP 58, 444.

Erdelyi, A., 1953, Bateman Manuscript Project, Vol. 2, section X (McGraw-Hill, New York).

Erdos, P., and R.C. Herndon, 1982, Adv. Phys. 31, 65.

Feng, S., P.A. Lee and A.D. Stone, 1986, Phys. Rev. Lett. 56, 1960, 2772(E).

Feng, S., C.L. Kane, P.A. Lee and A.D. Stone, 1988, Phys. Rev. Lett. 61, 834.

Fisher, D.S., and P.A. Lee, 1981, Phys. Rev. B 23, 6851.

Gaudin, M., and P.A. Mello, 1981, J. Phys. G 7, 1085.

Hammermesh, M., 1962, Group Theory and Its Applications to Physical Problems (Addison-Wesley, Reading, MA).

Imry, Y., 1986a, in: Directions in Condensed Matter Physics, eds G. Grinstein and G. Mazenko (World Scientific, Singapore).

Imry, Y., 1986b, Europhys. Lett. 1, 249.

Landauer, R., 1970, Philos. Mag. 21, 863.

Lee, P.A., 1984, Phys. Rev. Lett. 52, 1641.

Lee, P.A., and A.D. Stone, 1985, Phys. Rev. Lett. 55, 1622.

Lee, P.A., A.D. Stone and H. Fukuyama, 1987, Phys. Rev. B 35, 1039.

Livi, R., A. Politi and S. Ruffo, 1986, J. Phys. A 19, 2033.

Mehta, M.L., 1967, Random Matrices and the Statistical Theory of Energy Levels (Academic Press, NY).

Mehta, M.L., and M. Gaudin, 1960, Nucl. Phys. 18, 420.

Mello, P.A., 1986, J. Math. Phys. 27, 2876.

Mello, P.A., 1988, Phys. Rev. Lett. 60, 1089.

Mello, P.A., and J.-L. Pichard, 1989, Phys. Rev. B 40, 5276.

Mello, P.A., and B. Shapiro, 1988, Phys. Rev. B 37, 5860.

Mello, P.A., and A.D. Stone, 1990, submitted to Phys. Rev. B.

Mello, P.A., P. Pereyra and N. Kumar, 1988a, Ann. Phys. 181, 290.

Mello, P.A., E. Akkermans and B. Shapiro, 1988b, Phys. Rev. Lett. 61, 459.

Muttalib, K.A., J.-L. Pichard and A.D. Stone, 1987, Phys. Rev. Lett. 59, 2475.

Paladin, G., and A. Vulpiani, 1986, J. Phys. A 19, 1881.

Pichard, J.-L., 1984, Thesis (University of Paris, Orsay, no. 2858).

Pichard, J.-L., 1990, in: Quantum Coherence in Mesoscopic Systems, Proc. Nato Advanced Study Institute (Les Arcs, France).

Pichard, J.-L., and G. Andre, 1986, Europhys. Lett. 2, 477.

Pichard, J.-L., and G. Sarma, 1981, J. Phys. C 14, L127, L617.

Pichard, J.-L., N. Zanon, Y. Imry and A.D. Stone, 1990a, J. Phys. (Paris) 51, 587.

Pichard, J.-L., M. Sanquer, K. Slevin and P. Debray, 1990b, Phys. Rev. Lett. 65, 1812.

Porter, C.E., 1965, Statistical Theory of Spectra: Fluctuations (Academic Press, NY).

Sanquer, M., D. Mailly, J.-L. Pichard and P. Pari, 1989, Europhys. Lett. 8, 471.

Shapiro, B., 1986, Phys. Rev. B 34, 4394.

Shapiro, B., 1987, Philos. Mag. 56, 1031.

Stone, A.D., 1985, Phys. Rev. Lett. 54, 2692.

Stone, A.D., 1989, Phys. Rev. B 39-11, 10736.

Stone, A.D., and R. Jalabert, 1990, unpublished.

Stone, A.D., and A. Szafer, 1988, IBM J. Res. Dev. 32, 384.

van Wees, B.J., H. van Houten, C.W.J. Beenakker, J.G. Williamson, L.P. Kouwenhoven, D. van der Marel and C.T. Foxon, 1988, Phys. Rev. Lett. 60, 848.

Washburn, S., and R.A. Webb, 1986, Adv. Phys. 35, 375.

Webb, R.A., A.B. Hartstein, J.J. Wainer and A.B. Fowler, 1985, Phys. Rev. Lett. 54, 1577.

Wharam, D.A., T.J. Thornton, R. Newbury, M. Pepper, H. Ahmed, J.E.F. Frost, D.G. Hasko, D.C. Peacock, D.A. Ritchie and G.A.C. Jones, 1988, J. Phys. C 21, L209.

Wigner, E.P., 1953, Ann. Math. 53, 36.

Wigner, E.P., 1955, Ann. Math. 62, 548.

Wigner, E.P., 1957, Ann. Math. 65, 205.

Wigner, E.P., 1958, Ann. Math. 67, 325.

Zanon, N., and J.-L. Pichard, 1988, J. Phys. (France) 49, 907.