Ising models and neural networks
H.G. Schaap

The work described in this thesis was performed at the Center for Theoretical Physics
in Groningen, with support from the “Stichting voor Fundamenteel Onderzoek der
Materie” (FOM).
Printed by Universal Press - Science Publishers / Veenendaal, The Netherlands.
Cover design by Karlijn Hut.
Copyright © 2005 Hendrikjan Schaap.

Rijksuniversiteit Groningen
Ising models and neural networks
Doctoral thesis
to obtain the doctorate in
Mathematics and Natural Sciences
at the Rijksuniversiteit Groningen
on the authority of the
Rector Magnificus, dr. F. Zwarts,
to be defended in public on
Monday 23 May 2005
at 16:15
by
Hendrikjan Gerrit Schaap
born on 7 May 1977
in Emmen

Supervisors (promotores):
Prof. dr. A. C. D. van Enter
Prof. dr. M. Winnink
Assessment committee (beoordelingscommissie):
Prof. dr. A. Bovier
Prof. dr. H. W. Broer
Prof. dr. W. Th. F. Den Hollander
ISBN: 90-367-2260-8

Contents
1 Introduction
2 General overview
2.1 Gibbs measures: Ising model
2.1.1 The Ising model
2.1.2 Thermodynamical limit
2.1.3 Some choices of boundary conditions
2.2 Spin glasses
2.2.1 Ising spin glasses
2.2.2 Mean field: SK-model
2.3 Hopfield model
2.3.1 Setting
2.3.2 Dynamics and ground states
2.3.3 System-size-dependent patterns
2.3.4 Some generalizations
2.4 Scenarios for the spin glass
2.4.1 Droplet-picture short-range spin-glasses
2.4.2 Parisi’s Replica Symmetry breaking picture
2.4.3 Chaotic Pairs
2.5 Metastates
3 Gaussian Potts-Hopfield model
3.1 Introduction
3.2 Notations and definitions
3.3 Ground states
3.3.1 Ground states for 1 pattern
3.3.2 Ground states for 2 patterns
3.4 Positive temperatures
3.4.1 Fixed-point mean-field equations
3.4.2 Induced measure on order parameters
3.4.3 Radius of the circles labeling the Gibbs states
3.5 Stochastic symmetry breaking for q = 3
4 The 2d Ising model with random boundary conditions
4.1 Introduction
4.2 Set-up
4.3 Results
4.4 Geometrical representation of the model
4.5 Cluster expansion of balanced contours
4.6 Absence of large boundary contours
4.7 Classification of unbalanced contours
4.8 Sequential expansion of unbalanced contours
4.8.1 Renormalization of contour weights
4.8.2 Cluster expansion of the interaction between n-aggregates
4.8.3 Expansion of corner aggregates
4.8.4 Estimates on the aggregate partition functions
4.9 Asymptotic triviality of the constrained Gibbs measure $\nu^{\eta}_{\Lambda}$
4.10 Random free energy difference
4.11 Proofs
4.11.1 Proof of Proposition 4.20
4.11.2 Proof of Proposition 4.24
4.11.3 Proof of Proposition 4.25
4.11.4 Proof of Lemma 4.26
4.11.5 Proof of Lemma 4.28
4.12 High field
4.12.1 Contour representation
4.12.2 Partitioning contour families
4.13 Concluding remarks and some open questions
4.14 Appendix on cluster models
4.15 Appendix on interpolating local limit theorem
A Cluster expansions for Ising-type models
A.1 High temperature results
A.1.1 1D Ising model by Mayer expansion
A.1.2 Uniqueness of Gibbs measure for high temperature or d=1
A.1.3 High temperature polymer expansions
A.2 Low temperature expansions of 2D Ising model
A.2.1 Upper bound on the pressure by cluster expansion
A.2.2 Site percolation
A.3 Nature of the clusters
A.4 Multi-scale expansion for random systems
Publications
Bibliography
Samenvatting (summary in Dutch)
Dankwoord (acknowledgements)

Chapter 1
Introduction
Imagine a river, continuously flowing on, in a constant way running its course; the
water in the river always tends to flow in the direction with the minimal resistance
possible. Suppose during heavy rain the water in the river becomes higher and higher.
Then, suddenly, the water becomes so high that the river overflows its banks, allowing
the water to flow in the nearby meadows. Then it takes some time before the river is
adjusted to this new situation.
Statistical physics is an attempt to model natural processes. It tries to connect
the microscopic properties with the macroscopic properties. For the river, the movement
of the water molecules is connected to the overall flow. The global properties of the river
are given by some universal laws for some characteristics, notwithstanding the huge
number of water molecules involved. Because of this huge number we have to apply
probabilistic methods (stochastics) to the underlying microscopic differential equations
defining the movement of all of the water molecules. Then if we look at the macroscopic
properties, we know the global behavior almost for certain. This global behavior can
be described by equations depending only on macroscopic properties. The underlying
microscopic part is removed by the stochastics performed.
Let us take a die to demonstrate some of the stochastic principles involved. As we
know, we have the same probability to throw a 1 or a 6. However, experience tells us that
after a small number of throws, the resulting relative frequencies of the individual numbers
can differ significantly from each other. Only when we throw a die a large number of
times do the resulting relative frequencies of the numbers become more and more equal.
The same result is obtained when we throw not one die many times, but rather
a lot of dice at once, and then look at the relative frequencies of the numbers.
Statistically, throwing one die 1000 times is equivalent to throwing 1000 dice one time.
Eventually all of the relative frequencies approach $\frac{1}{6}$: on average every number appears
once if we throw a die six times. This relative frequency of a number is sometimes
identified with the probability of the number to appear. To shorten notation it is denoted
by $P(i)$. For every number i on our die, $P(i) = \frac{1}{6}$. The function which assigns to
each number i the corresponding probability we call the probability distribution of the
property.
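The approach of the relative frequencies to 1/6 can be tried out numerically. The following is a minimal sketch, assuming a fair die simulated with Python's pseudo-random generator; the function name `relative_frequencies` and the fixed seed are illustrative choices and do not come from the text.

```python
import random
from collections import Counter

def relative_frequencies(num_throws, seed=0):
    """Throw a simulated fair die num_throws times and return the
    relative frequency of each face 1..6."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(num_throws))
    return {face: counts[face] / num_throws for face in range(1, 7)}

# Largest deviation from 1/6 for a small, a moderate and a large number of throws.
for n in (12, 1000, 100000):
    freqs = relative_frequencies(n)
    worst = max(abs(f - 1 / 6) for f in freqs.values())
    print(n, round(worst, 4))
```

For a small number of throws the deviation is typically large; it shrinks as the number of throws grows, in line with the discussion above.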
Suppose we have two dice. Then the combined probability equals $P(n_1, n_2)$, where
$n_i$ represents the number on die i. For instance $P(1,1)$, the probability of
throwing a 1 with both dice, equals $\frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}$. Note that the throw of one die does not
depend on the outcome of the throw of the other die. We say that the outcome of die
1 is independent of the outcome of die 2. When two properties are dependent on each
other the expression for the combined probability is in general more complicated.
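The product rule for two independent dice can be checked by listing all 36 outcomes. This is a small sketch under the assumption that every pair of faces is equally likely; the helper `marginal` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
outcomes = list(product(faces, faces))           # all pairs (n1, n2)
p_joint = {o: Fraction(1, len(outcomes)) for o in outcomes}

def marginal(die_index, value):
    """Probability that the die with index 0 or 1 shows the given value."""
    return sum(p for o, p in p_joint.items() if o[die_index] == value)

# Independence: the joint probability factorizes for every pair of faces.
independent = all(
    p_joint[(a, b)] == marginal(0, a) * marginal(1, b) for a, b in outcomes
)
print(p_joint[(1, 1)], independent)
```

Exact fractions are used so that the comparison is not affected by floating-point rounding.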
All matter around us is made out of atoms. Every gram of matter contains around
$10^{23}$ atoms. Often one considers a collection of some global (bulk) properties in addition
to the atomic properties. In the description of matter, all the atoms together with
the mentioned global properties define the system. Any particular realization of the
corresponding atomic values is called a configuration of the system. Determining the
configuration resembles the throwing of $10^{23}$ dice at once.
Suppose one wants to measure a macroscopic property, for instance the average
density. Because of the large number of atoms, the probability distribution of the
atomic values makes it unnecessary to track the locations of the individual
atoms. In the case of the river: during the heavy rainfall, the river has a way of flowing
which does change in time. After some adjustments are made, the changes stop;
the river flow becomes stationary. The time scale of the adjustments is extremely large
compared to the scale of the local movements of the water molecules. The way in which
the system properties evolve we call the dynamics of the system. In the global flow
of the river the microscopic movements of the water molecules have been averaged out.
In the next chapter we give a general overview of the part of statistical physics
which is important for the two particular kinds of models we study in this thesis: neural
networks and Ising models.
Neural networks
The first subject of the thesis is a model originating in the theory of neural networks.
In particular we would like to understand the concept of memory. Our brain is built up
out of billions of neurons connected in a highly non-trivial way. This structure we call
a neural network. It is difficult to study it directly, because of the huge number of neurons
involved in a relatively small area. In order to understand how the memory works,
a common approach is to build a simpler model which captures its main features. Just
like the neural network of the brain, the model should be sufficiently robust: in transmitting
signals between neurons there are always some small errors involved. Given this
slightly deformed signal, the brain is able to remove the noise and to reconstruct the
pure signal. For a good general overview of neural networks see [19].
Figure 1.1: Components of a neuron (taken from [19])
A neuron is built up of three parts: the cell body, the dendrites and the axon, see
Figure 1.1. The dendrites have a tree-like branched structure and are connected to the
cell body. The axon is the only outgoing connection from the neuron. At the end of the
axon it branches and it is connected to the dendrites of other neurons via synapses. The
end of any branch of the axon is separated from a dendrite by a space called the synaptic
gap.
Neurons communicate with other neurons via electric signals. The electric signal of
a neuron i transfers to a neuron j in the following way, see Figure 1.2. First it travels
from the cell body of neuron i into the axon which is connected to neuron j. This is the
output signal of neuron i. When the signal of the neuron arrives at the end of the axon
it transmits neurotransmitters into the synaptic gap. Then by receptors on the dendrite
of neuron j the neurotransmitters are transformed back to an electric signal. There are
several types of neurotransmitters. Some transmitters amplify the incoming signal
before transmitting it to the dendrites of other neurons, whereas others weaken it.
This resulting signal originating from the receptors of neuron j we call the input
signal from neuron i to neuron j. Finally the signal arrives at the cell body of neuron j.
In the cell body of neuron j all the inputs come together. The cell processes the
inputs (which we will model mathematically by a weighted sum) into what we call
the total input $h_j$ of neuron j. Then, depending on the outcome, the cell produces a new
signal which is transported to the axon of neuron j in order to be transferred to other
neurons. This is called the output or the condition of neuron j.
Figure 1.2: The synapse (taken from [19])
To make a useful model based on these neural processes we need to make some
simplifications. As a first simplification we assume that every neuron interacts with
every other neuron. We say that the neural network is fully connected. Further we
assume that each neuron can have only two possible outputs, i.e. it can only be in two
conditions. For reference we denote by $\sigma_i$ the condition of neuron i: $\sigma_i = +1$ if it is
excited and $\sigma_i = -1$ when it is at rest.
We also assume that no alteration of the signal takes place when it travels across a
synaptic gap. As a result the input to neuron j which comes from neuron i is equal to the
output $\sigma_i$ of neuron i which is sent to neuron j.
To model the dynamics of our model we introduce the time t. At every time step
$\Delta t$ (with $\Delta t$ very small) every neuron output is changed simultaneously. The processing
in the cell body of every neuron j we model by two steps:
1. At time t we multiply every input $\sigma_i(t)$ coming from the other neurons with a
weight. To obtain the total input $h_j(t)$ at time t we sum the result over all of the
neurons (except neuron j).
2. For the output $\sigma_j(t + \Delta t)$ of neuron j at time $t + \Delta t$ we take the outcome of a
probability distribution over the two possible neuron conditions. This distribution
is formed by a stochastic rule on $h_j(t)$.
We assume that the connections are treated by the neuron cell bodies in a symmetrical
way: the weight given in neuron j to the input of neuron i is equal to the weight
given in neuron i to the input of neuron j. In realistic neural networks this
interaction symmetry does not hold in general.
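The two processing steps and the symmetry assumption can be sketched as a single simultaneous update of all neurons. The text does not specify the stochastic rule at this point, so the sketch assumes a commonly used choice, P(σ_j = +1) = (1 + tanh(β h_j))/2, with a noise parameter `beta`. The function names and the random symmetric weights are hypothetical and serve only as an illustration.

```python
import math
import random

def total_input(weights, sigma, j):
    """Step 1: weighted sum over the outputs of all neurons except neuron j."""
    return sum(weights[j][i] * sigma[i] for i in range(len(sigma)) if i != j)

def parallel_step(weights, sigma, beta, rng):
    """Step 2, applied to all neurons simultaneously."""
    new_sigma = []
    for j in range(len(sigma)):
        h_j = total_input(weights, sigma, j)  # uses only the outputs at time t
        # Assumed stochastic rule (not specified in the text):
        # P(sigma_j = +1) = (1 + tanh(beta * h_j)) / 2.
        p_up = 0.5 * (1.0 + math.tanh(beta * h_j))
        new_sigma.append(1 if rng.random() < p_up else -1)
    return new_sigma

# Hypothetical example: 6 neurons with random symmetric weights.
rng = random.Random(1)
n = 6
weights = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        weights[i][j] = weights[j][i] = rng.gauss(0.0, 1.0)  # symmetry: w_ij = w_ji

sigma_t = [rng.choice((-1, 1)) for _ in range(n)]
sigma_next = parallel_step(weights, sigma_t, beta=2.0, rng=rng)
print(sigma_t, "->", sigma_next)
```

All new outputs are computed from the configuration at time t before any of them is written back, which is what makes the update simultaneous rather than sequential.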
The dynamics of our model is summarized by Figure 1.3. The stable configurations
under this dynamics form the memory of the system. Stability means that starting from
a stable configuration the system only reaches configurations which are very much alike
(in the sense of neuron configurations). By choosing appropriate weights we can tune
the dynamics such that the memory is formed by a finite number m of preselected
neuron configurations $\xi^{(m)}$, also called patterns.
Figure 1.3: Dynamics of the neural network model. Output of all of the neurons at time t → total input $h_j(t)$: weighted sum of all of the outputs → output of neuron j at time $t + \Delta t$: the outcome of the stochastic rule on $h_j(t)$.
The stochastic rule depends on a parameter β. The inverse T = 1/β of the parameter
β is called the temperature. If the parameter β is large, the neuron has a strong tendency (high
probability) to become equal to the sign of its total input. As β approaches infinity the
stochastic rule turns into a deterministic one. Then, if we put in a configuration which
is close enough to e.g. $\xi^{(1)}$, the system evolves to configurations equal to the pattern
$\xi^{(1)}$. In other words the neural network remembers the configuration $\xi^{(1)}$ of its memory.
This means that the neuron configuration becomes equal to $\xi^{(1)}$ and, afterwards, the
system stays in this configuration. This is the so-called zero-temperature dynamics of
the Hopfield model, see Section 2.3.2. It is e.g. very useful for information transmission.
The dynamics defines algorithms to remove noise from the received signals. Often it is
advisable to allow the parameter β to be finite. Then, when we perform the dynamics of
Figure 1.3, we have excluded the possibility of getting trapped in undesired so-called
metastable configurations.
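The zero-temperature retrieval of a pattern from a noisy input can be illustrated on a very small network. The sketch below assumes the standard Hebb rule for the weights, w_ij = (1/N) Σ_μ ξ_i^(μ) ξ_j^(μ) with zero self-interaction, which this introduction does not spell out; the two patterns, the noisy input and all function names are hypothetical.

```python
def hebb_weights(patterns):
    """Symmetric weights built from the patterns (assumed Hebb rule)."""
    n = len(patterns[0])
    return [[0.0 if i == j else sum(p[i] * p[j] for p in patterns) / n
             for j in range(n)] for i in range(n)]

def zero_temperature_step(weights, sigma):
    """Deterministic limit: every neuron takes the sign of its total input."""
    new_sigma = []
    for j, row in enumerate(weights):
        h_j = sum(w * s for w, s in zip(row, sigma))  # diagonal weight is zero
        # A neuron with zero total input keeps its current condition.
        new_sigma.append(sigma[j] if h_j == 0 else (1 if h_j > 0 else -1))
    return new_sigma

def retrieve(weights, sigma, max_steps=20):
    """Iterate the dynamics until the configuration no longer changes."""
    for _ in range(max_steps):
        nxt = zero_temperature_step(weights, sigma)
        if nxt == sigma:
            break
        sigma = nxt
    return sigma

xi_1 = [1, 1, 1, 1, 1, 1, 1, 1]
xi_2 = [1, 1, 1, 1, -1, -1, -1, -1]
weights = hebb_weights([xi_1, xi_2])

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]     # xi_2 with its first neuron flipped
print(retrieve(weights, noisy) == xi_2)   # the stored pattern is recovered
```

In this example the flipped neuron is corrected after one step and the pattern is then left unchanged by the dynamics, which is the sense in which it belongs to the memory.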
In order to increase the capacity of the memory, a natural generalization is to
increase the number of possible conditions to a larger finite number q. In
information transmission, if one takes q = 26, every neuron state corresponds to a letter.
Of course, for q < 26 one can also deal with words by a more careful encoding, but
then the encoding becomes less transparent. If we generalize the above model
to have more than two possible neuron conditions, the resulting model is known as
the Potts-Hopfield model.
In Chapter 3 we choose the weights in the total inputs in a different way. For this we
first need to define for each neuron a set of p continuous variables $\xi^{(p)}_i$, which we refer
to as patterns. We take a random realization of these variables. They have a Gaussian
distribution. This special distribution is often used in statistics. Then with the values
of the introduced patterns $\xi^{(j)}_i$ we determine the weights for the total input. What will
be the memory of the resulting model? Are there any stable neuron configurations? We
will also look at what happens when we increase the total number of neurons. What will
be the effect on the memory?
We form the weights of the total input from two Gaussian patterns. The possible number
of conditions of a neuron we set to three. When we increase the number of neurons,
for large numbers the following will happen. For a fixed number of neurons, the memory
is concentrated around six neuron configurations. These configurations are related
to each other by a discrete symmetry. Every neuron configuration is associated with a
point in a macroscopic space formed by some macroscopic variables. We can divide
the six stable neuron configurations into pairs of diametrically opposite points. When
one increases the number of neurons the discrete symmetry always occurs; however, the
six configurations tend to rotate on three circles. This we will see in Chapter 3. If we
look at the sequence of increasing numbers of neurons, then in the macroscopic space
above, the appearing stable neuron configurations fill up the three circles in a regular
uniform way.
Ferromagnets
In Chapter 4 we consider a famous model for magnetic materials: the Ising model. In
general there are several kinds of magnetism. A so-called paramagnet is magnetized only
while we apply an external field to it. Otherwise there is no
magnetization. Another important type of metal is the ferromagnet. These metals
retain their magnetization once they have been exposed to an external field. Initially
the ferromagnetic metals have no magnetization. This is comparable with what happens
when we magnetize pieces of iron with the help of a magnet. When we heat the material,
this effect eventually disappears: the metal behaves like a paramagnet. For more
general information about magnetism we refer to e.g. [46]. We will use the Ising model
as a model for ferromagnetism.
Ising models
For justification of the Ising model as a model for a ferromagnet, we need to make
some assumptions. We assume that the unpaired electrons of the outermost shell of the
atoms are localized: i.e. closely bound to the corresponding atoms. Only these unpaired
electrons are responsible for the magnetization. For the Ising model we assume that for
every atom only one unpaired electron is in the outermost shell.
Every electron has an intrinsic angular momentum which we call spin. This spin
generates a magnetic moment. Due to quantum mechanics the spin of the electron can
have only two orientations with respect to this magnetic moment, which we call up
and down [5]. With a slight abuse of notation we mostly refer to these orientations as the
values of the spin. Because we have assumed that every atom has only one unpaired electron
in the outermost shell, we also have only two orientations for the total spin per atom.
Most metals consist of atoms with more than one unpaired electron in the
outermost shell. For these metals there can be more than two orientations of the total
spin per atom. Most solid materials are crystalline. The atoms, ions or molecules
lie in a regularly repeated three-dimensional pattern. This makes some finite number of spin
orientations energetically favorable.
In magnetizable metals the metal is divided into domains which have net magnetic
moments. The boundaries between these domains are called domain walls [46]. The
Ising model only allows for configurations in which the spins of two neighboring electrons
are parallel or anti-parallel with respect to each other. If there is a domain wall
present, the thickness of the domain wall is automatically zero.
The interactions between the localized electrons are also called the Weiss interactions.
In general two types of interactions occur frequently: nearest-neighbor and
mean-field. When we restrict ourselves to nearest-neighbor interactions, we assume that
all of the remaining interactions between the electrons, which are not nearest neighbors,
are zero. When the interaction is mean-field, the interaction between the moments
of any pair of sites is non-zero and all of them are equal.
For the Ising model we restrict ourselves to the nearest-neighbor interactions. For
the lanthanide series (a particular series of elements) this is a good approximation. Although
the model is simple and is for other magnetic metals at most a rough approximation,
it is and has been a very useful model. It is the first model (and for a long time
the only model in statistical physics) which displays the phenomenon of a phase transition
(e.g. think about the liquid → gas transition). Furthermore it is exactly solvable in
1 and 2 dimensions. Nowadays the Ising model (and generalizations of it) appears in
several places, e.g. all kinds of optimization problems, voter problems, models for gas
versus liquid, etc.
Now we give a mathematical description of the model. Take a piece of a lattice.
Every point where a vertical line crosses a horizontal one we refer to as a site. The
horizontal and vertical line pieces starting from a site and ending at the nearest next
crossing we refer to as bonds. On every site i there is an atom which has a net spin
magnetic moment, with respect to which the spin can have only two orientations. We denote the
spin-value by $\sigma_i = +1$ when the spin is oriented up and $\sigma_i = -1$ when the orientation
of the spin is down. We refer to both the atom and its spin orientation as the spin. The
configuration σ of the spins is in our case an array, which contains the spin-values $\sigma_i$ of
every site.
Between each pair of nearest-neighbor spins (i.e. every pair of spins associated with
a single bond) there is an interaction
$$E^{\mathrm{ex}}_{ij} = -\beta \sigma_i \sigma_j \equiv J \sigma_i \sigma_j \tag{1.1}$$
We often also call this interaction the exchange energy between the atoms on sites i and
j. The energy of a configuration is the total of these exchange energies. The variable β
is the inverse of the temperature times a constant, which depends on the type of material
considered.
The probabilities of the configurations are determined by these interactions. In ferromagnets
the nearest-neighbor spins tend to have equal orientations, i.e. they tend to be
aligned. Therefore we have chosen the interactions in the model such that it becomes
more probable for spins to be aligned: we have set J < 0. When J > 0, the model
behaves like an antiferromagnet. Then it becomes more probable for spins to be anti-aligned.
The higher the energy is, the less probable the configuration becomes. The
probability of a single configuration equals
$$P(\sigma) = \frac{\exp\left(\beta \sum_{\langle i,j \rangle} \sigma_i \sigma_j\right)}{Z(\sigma)} = \frac{\exp\left(-\sum_{\langle i,j \rangle} E^{\mathrm{ex}}_{ij}\right)}{Z(\sigma)} \tag{1.2}$$
where the sums run over the nearest-neighbor pairs and Z(σ) is the sum of the numerator
over all configurations. We see that when
the temperature gets lower, the interaction (1.1) becomes stronger. Then it becomes
more probable for nearest-neighbor spins to be aligned. From (1.2) we immediately
see that for zero temperature only the two configurations which minimize the energy
appear with positive probability: i.e. every spin has the same orientation. For very high
temperatures every configuration becomes almost equally probable. Then the model
behaves like a paramagnet. The temperature is thus a measure of the disorder in the
system. For low temperatures most of the spins align with each other; for high
temperatures the orientations of the spins are more or less randomly up or down. In
Chapter 4 we will consider the most interesting part, the low-temperature ferromagnetic
region of the Ising model.
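The behavior of (1.2) at low and high temperature can be checked by brute force on a very small piece of the lattice. The following is a minimal sketch, assuming a 3 x 3 piece of the square lattice with free boundary conditions; the function names `bonds` and `gibbs_distribution` are hypothetical, and the enumeration is only feasible for a handful of spins.

```python
import math
from itertools import product

def bonds(rows, cols):
    """Nearest-neighbor bonds of a rows x cols piece of the square lattice."""
    result = []
    for r in range(rows):
        for c in range(cols):
            site = r * cols + c
            if c + 1 < cols:
                result.append((site, site + 1))      # horizontal bond
            if r + 1 < rows:
                result.append((site, site + cols))   # vertical bond
    return result

def gibbs_distribution(rows, cols, beta):
    """Probability (1.2) of every configuration, free boundary conditions."""
    bond_list = bonds(rows, cols)
    weights = {}
    for sigma in product((-1, 1), repeat=rows * cols):
        aligned_sum = sum(sigma[i] * sigma[j] for i, j in bond_list)
        weights[sigma] = math.exp(beta * aligned_sum)
    z = sum(weights.values())                        # normalizing sum
    return {sigma: w / z for sigma, w in weights.items()}

all_up, all_down = (1,) * 9, (-1,) * 9
for beta in (0.01, 0.5, 3.0):
    probs = gibbs_distribution(3, 3, beta)
    # Total probability of the two fully aligned configurations.
    print(beta, round(probs[all_up] + probs[all_down], 4))
```

For small β the 512 configurations are nearly equally probable, while for large β almost all of the probability sits on the two fully aligned configurations, as described in the text.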
Until this moment we did not consider the environment. When the energy of
the system is independent of this environment, we say that the system has free boundary
conditions. But what happens when this environment is formed by a different material
with a particular chosen configuration of spins? The spins next to the
boundary not only tend to align with the internal spins but also feel their nearest-neighbor
spins in the environment.
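The effect of a fixed environment can be seen on a tiny box. The sketch below is an illustration under stated assumptions: a 2 x 2 box of spins, the probability (1.2) extended by bonds between boundary spins and fixed exterior spins, and a hypothetical function `eta(r, c)` that returns the exterior spin at a site outside the box (returning 0 reproduces free boundary conditions).

```python
import math
from itertools import product

def boundary_gibbs(size, beta, eta):
    """Gibbs probabilities on a size x size box with exterior spins eta(r, c)."""
    sites = [(r, c) for r in range(size) for c in range(size)]
    weights = {}
    for values in product((-1, 1), repeat=len(sites)):
        sigma = dict(zip(sites, values))
        total = 0
        for (r, c) in sites:
            # Bonds to the right and downwards: each interior bond counted once,
            # or a bond to the exterior if the neighbor lies outside the box.
            for nb in ((r + 1, c), (r, c + 1)):
                total += sigma[(r, c)] * (sigma[nb] if nb in sigma else eta(*nb))
            # Bonds upwards and to the left are added only if they leave the box.
            for nb in ((r - 1, c), (r, c - 1)):
                if nb not in sigma:
                    total += sigma[(r, c)] * eta(*nb)
        weights[values] = math.exp(beta * total)
    z = sum(weights.values())
    return {v: w / z for v, w in weights.items()}

free = boundary_gibbs(2, 1.0, lambda r, c: 0)    # no influence of the exterior
plus = boundary_gibbs(2, 1.0, lambda r, c: 1)    # all exterior spins up
up, down = (1, 1, 1, 1), (-1, -1, -1, -1)
print(free[up] == free[down], plus[up] > plus[down])
```

With free boundary conditions the all-up and all-down configurations are equally probable; with all exterior spins up the symmetry is broken in favor of the all-up configuration.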
In general a piece of metal contains a lot of atoms. Already one gram contains
around $10^{23}$ atoms. One likes to consider volumes which are of the order of the size
of the piece of metal. The volume size is measured in the number of atoms, and is thus also
of the order of $10^{23}$. In the mathematical description of the model we approximate
this huge number by infinity. First one takes a large finite-volume version of the Ising
model. Then one tries to extrapolate the resulting expressions to an ’infinite’ volume
size model.
What will happen to the system when we increase the volume size, and choose for
each step the orientations of the external spins arbitrarily up or down, i.e. we take random
boundary conditions? How does the alignment of the spins change in the process
of increasing the volume? It turns out to depend on the way we let the volume size
increase. Our results depend on letting it increase fast enough. Furthermore we need to
choose the temperature very low so that there is a strong tendency for the spins to align.
Then, in the long run, by (1.2), in the appearing configurations almost every spin
has the same orientation. However, because we have chosen the boundary conditions
randomly, for half of the volumes the appearing configurations will have almost all of
the spins up and for the other half of the volumes the configurations have almost all of
their spins down.
But what if we look inside the volume, far away from the boundary? Do we still
see an effect of the boundary conditions? We prove that the local volume density of the
areas of aligned spins becomes asymptotically independent of the boundary conditions.
However, even for very large volumes, there is a significant effect on the density of spin
values. If we look at a fixed (very large) volume, then with probability one, either
all configurations have all the spins up or have all the spins down. Almost all of the
orientations become equal to the orientation of the majority of the external spins which
are involved in the boundary condition. Because of the non-zero temperature a small
part of the spins has an opposite orientation.
Because we have increased our volumes fast enough the so-called mixtures do not
appear. This means we do not have both types of configurations with nonzero probability:
i.e. configurations with most of the spins up and configurations with most of the
spins down.
This is the subject of Chapter 4. There, as a technical tool, we need to introduce
non-trivial expansion techniques, called multi-scale cluster expansions. Our multi-scale
expansion method is inspired by the ideas of Fröhlich and Imbrie [35]. The multi-scale
expansion is a generalization of the more familiar ’uniform’ cluster expansion
technique. To simplify our estimates we choose to use a different representation of
the expansions from the one used in [35], the so-called Kotecký-Preiss representation,
which was developed just two years later [50].
In order to have useful expansions, one needs to prove certain criteria: we need the
convergence of some summations related to the expansions. For cluster expansions it is
crucial to check the Kotecký-Preiss criterion. However, in our expansions it is impossible
to prove it directly. Therefore we introduce a new criterion, which we prove to be
equivalent. This new criterion enables us to obtain useful estimates even for our expansions.
In the final chapter the uniform and multi-scale cluster expansions are explained
more thoroughly.
Schematically the thesis is built up as follows:
1. Introduction → 2. General overview
3. Hopfield model
4. Ising model → 5. Cluster expansions
Chapter 2
General overview
2.1 Gibbs measures: Ising model
2.1.1 The Ising model
In some metals, some fraction of the atoms becomes spontaneously magnetized when
the temperature is low enough. This happens for instance in iron and nickel. The
magnetized spins, which are the intrinsic magnetic moments of the atoms, tend to be
polarized in the same direction (e.g. all up), which gives rise to a macroscopic magnetic
field. We call this ferromagnetic behaviour. However, when the temperature is
above some $T_c$, all spins are oriented randomly and there is no macroscopic magnetic
field anymore [44]. The interaction between the magnetic moments is short-range.
However, these short-range interactions do provoke long-range ferromagnetic behavior
in the system. These metals have a rather homogeneous crystalline structure with the
atoms fixed, apart from some minor movement. As a consequence the short-range interactions
are typically homogeneous ones.
The Ising model tries to model this transformation of typically homogeneous short-range
interactions into long-range phenomena in physical ferromagnets. In this model
we look only at the basic features of a ferromagnet. We assume that the metal atoms are
on a regular crystalline lattice Λ, which is in general a subset of $\mathbb{Z}^d$. Every point of the
lattice contains precisely one atom.
Furthermore this atom is fixed and the only degree of freedom is its spin, i.e. its
magnetic moment. In reality the atom moves a bit around its lattice point, but because
of strong crystalline binding this movement is limited. In this model we have
neglected the effect of these movements. We assume that the environment outside the
metal changes adiabatically slowly. For real ferromagnets this is indeed typically the
case when we compare the microscopic changes in the crystal with the macroscopic
exterior environment. Without any harm we consider this environment to be fixed: the
so-called boundary condition to Λ. On every point i of Λ there is precisely one particle
which has only its spin-value $\sigma_i$ as degree of freedom. The spins can only point up or
down, or equivalently the spin-values $\sigma_i$ are restricted to $\sigma_i = \pm 1$. There is some rescaling
involved here, but for the total picture this pre-factor is not important. There are only
nearest-neighbor pair interactions between the spins.
In reality crystals are never perfect, and because of thermal excitations some points
of the lattice are empty and other parts of the lattice are deformed. Also more spin-
values are allowed. Despite its serious restrictions compared to reality, the Ising model
still shows the long-range ferromagnetism it was designed for (if the dimension d ≥ 2
and the temperature T is low enough) e.g. [20]. This is in contrast to Ising’s claim; he
found no ferromagnetism in dimension 1 and he conjectured wrongly that the same was
true for d ≥ 2.
As usually happens with simple models, all sorts of generalizations of the Ising model have been made. The connection with real ferromagnets is often not so clear, or not there at all. Nevertheless Ising models are used in various places to explain many phenomena: they are equivalent to lattice gases, closely related to many percolation problems, and useful for optimization problems as well.
We can generalize the Ising model by allowing the spins to take more values. When the number $q$ of spin values is finite the result is a so-called Potts model. It was proposed by Domb as a subject for his student Potts. Using duality arguments Potts was able to determine, for the standard Potts model in $d = 2$, the critical points $\beta_c$ for all values of $q$. Further on, in Chapter 3, we will see these Potts spins of the standard Potts model. Another way of generalizing is to allow the spins to take continuous values on a sphere, as in the Heisenberg model.
We return to the Ising model and make things more concrete; in other words, let us put the model into mathematics. We use the canonical-ensemble description from statistical physics. It describes systems whose exterior functions as a heat reservoir. Each member of the ensemble is represented by a point in the phase space. All possible system behavior is described by this phase space together with a probability distribution on the ensemble. For Ising systems the phase space is discrete because the only degrees of freedom of the system are the spins. Because each spin can take only two values, the phase space equals $\{-1,1\}^{\mathbb{Z}^d}$. Each point of the phase space we call a (spin) configuration. Denote by $\sigma \in \{-1,1\}^{\mathbb{Z}^d}$ a spin configuration. The restriction of $\sigma$ to the finite-sized $\Lambda$ we refer to as $\sigma_\Lambda \in \{-1,1\}^\Lambda$, where
$$\Lambda = \{-L, -L+1, \cdots, L-1, L\}^d \qquad (2.1)$$
and $\Lambda^c = \mathbb{Z}^d \setminus \Lambda$.
When the system settles into thermodynamical equilibrium, the probability of the spins to be in the configuration $\sigma_\Lambda$ is described by the so-called (finite-size) Gibbs measure:
$$\mu^\eta_\Lambda(\sigma_\Lambda) = \frac{\exp(-H^\eta_\Lambda(\sigma_\Lambda))}{Z^\eta_\Lambda} \qquad (2.2)$$
We denote by $\langle \cdot \rangle^\eta$ the expectation of the argument with respect to the Gibbs measure $\mu^\eta_\Lambda$. $Z^\eta_\Lambda$ is the partition function, which we obtain by summing the Gibbs weights of all configurations.
The free energy of the system per spin equals
$$F^\eta_\Lambda = -\frac{1}{\beta|\Lambda|} \log Z^\eta_\Lambda, \quad \text{with } \beta = \frac{1}{T} \qquad (2.3)$$
For a set $A \subset \mathbb{Z}^d$, the symbol $|A|$ refers to the number of sites contained in $A$. For more details and the derivation of the particular choice of the Gibbs measure $\mu_\Lambda$ we refer to any statistical mechanics book, for instance [44].
The functions $H^\eta_\Lambda(\sigma)$ are the energy functions or Hamiltonians of the configurations $\sigma_\Lambda$. For the Ising model they are defined as follows:
$$H^\eta_\Lambda(\sigma_\Lambda) = -\beta \sum_{\langle x,y \rangle \subset \Lambda} (\sigma_x \sigma_y - 1) \; - \; \beta \sum_{\substack{\langle x,y \rangle \\ x \in \Lambda,\, y \in \Lambda^c}} \sigma_x \eta_y \qquad (2.4)$$
where $\langle x,y \rangle$ stands for a pair of nearest-neighbor sites, i.e. $\|x - y\| = 1$ with $\|\cdot\|$ the Euclidean norm. By $\eta$ we denote the fixed boundary condition, i.e. the spin values of the spins in $\Lambda^c$. When we do not include boundary conditions we speak of free boundary conditions; equivalently, we drop the second term in the Hamiltonian. The resulting free energy is then independent of the boundary condition. For the corresponding Hamiltonian we write $H_\Lambda(\sigma)$.
Note that because the interactions are nearest-neighbor only, only the $\eta_x$ at sites $x \in \Lambda^c$ with $d(x,\Lambda) = 1$ are involved. As we see from (2.4), the spins tend to align with each other.
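For very small boxes the definitions (2.2) and (2.4) can be evaluated by brute force. The following Python sketch is only an illustration: it assumes $d = 2$, uses the box $\{0,\ldots,n-1\}^2$ instead of $\{-L,\ldots,L\}^2$, and the names `hamiltonian` and `gibbs_measure` are hypothetical helpers, not notation from the text.

```python
import itertools
import math

NEIGHBOURS = ((1, 0), (0, 1), (-1, 0), (0, -1))

def hamiltonian(sigma, beta, eta=None):
    """Energy (2.4) of a configuration sigma: dict site (x, y) -> +1/-1.

    eta is a function site -> +1/-1 giving the boundary spins outside the
    box; eta=None means free boundary conditions (second term dropped).
    """
    energy = 0.0
    for (x, y), s in sigma.items():
        for dx, dy in NEIGHBOURS:
            nb = (x + dx, y + dy)
            if nb in sigma:
                # bond inside the box: count it once (right/up direction only)
                if dx + dy > 0:
                    energy -= beta * (s * sigma[nb] - 1)
            elif eta is not None:
                # bond between a site in the box and a boundary site
                energy -= beta * s * eta(nb)
    return energy

def gibbs_measure(n, beta, eta=None):
    """Return (mu, Z): the finite-volume Gibbs measure (2.2) on an n x n box."""
    sites = [(x, y) for x in range(n) for y in range(n)]
    weights = {}
    for values in itertools.product((-1, 1), repeat=len(sites)):
        sigma = dict(zip(sites, values))
        weights[values] = math.exp(-hamiltonian(sigma, beta, eta))
    Z = sum(weights.values())
    mu = {cfg: w / Z for cfg, w in weights.items()}
    return mu, Z

# + boundary condition on a 3 x 3 box: 2**9 = 512 configurations
mu_plus, Z_plus = gibbs_measure(3, beta=0.5, eta=lambda site: +1)
most_likely = max(mu_plus, key=mu_plus.get)
```

With the + boundary condition the most likely configuration is the all-plus one. The enumeration grows as $2^{n^2}$, so this approach is feasible only for tiny boxes.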
Mean field: Curie Weiss
In general, the partition function $Z^\eta_\Lambda$ is hard to compute for the Ising model. In one dimension it can be treated simply by the so-called transfer-matrix method. For $d = 2$ there is the famous, much more involved, Onsager solution, which gives a complete analytic expression, also by using transfer matrices. For higher dimensions, however, only partial results are known. So some approximation is introduced: the mean-field theory (we follow closely [68]). With this approximation we are able to obtain an explicit expression for the Gibbs average of the global magnetization.
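As an illustration of the transfer-matrix idea in one dimension, the following Python sketch compares a brute-force sum with a transfer-matrix product for an open chain of $n$ spins with free boundary conditions and the Hamiltonian (2.4). The function names and the $2 \times 2$ matrix $T$ with entries $\exp(\beta(\sigma\sigma' - 1))$ are assumptions made for this example.

```python
import itertools
import math

def z_brute_force(n, beta):
    """Sum exp(-H) over all 2**n configurations of an open chain."""
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        h = -beta * sum(sigma[k] * sigma[k + 1] - 1 for k in range(n - 1))
        total += math.exp(-h)
    return total

def z_transfer_matrix(n, beta):
    """Z = 1^T T^(n-1) 1 with T[s][s'] = exp(beta * (s * s' - 1))."""
    spins = (-1, 1)
    T = [[math.exp(beta * (s * t - 1)) for t in spins] for s in spins]
    vec = [1.0, 1.0]  # sum over the last spin
    for _ in range(n - 1):
        vec = [sum(T[a][b] * vec[b] for b in range(2)) for a in range(2)]
    return sum(vec)  # sum over the first spin

def z_closed_form(n, beta):
    """(1, 1) is an eigenvector of T with eigenvalue 1 + exp(-2 beta)."""
    return 2.0 * (1.0 + math.exp(-2.0 * beta)) ** (n - 1)
```

The brute-force sum costs $2^n$ terms, whereas the matrix product costs a number of operations linear in $n$.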
We look at free boundary conditions and rewrite the Hamiltonian as
$$H_\Lambda(\sigma_\Lambda) = \beta N(L) - \beta \sum_{\langle x,y \rangle \subset \Lambda} \sigma_x \sigma_y \qquad (2.5)$$
where $N(L)$ is the number of nearest-neighbor bonds. Because the first term does not depend on the spin variables it drops out of the Gibbs measure, so we are allowed to ignore it.
Then we 'expand' every spin $\sigma_i$ around its Gibbs mean value $\langle \sigma_i \rangle \equiv m$ and denote the fluctuations by $\Delta_i = \sigma_i - m$. Rewriting the Hamiltonian (2.4) gives for free boundary conditions
$$H_\Lambda(\sigma_\Lambda) = -\beta \sum_{\langle x,y \rangle \subset \Lambda} (m + \Delta_x)(m + \Delta_y) \qquad (2.6)$$
Now we assume that we can neglect the terms of second order in $\Delta$, so
$$H_\Lambda(\sigma_\Lambda) = \beta m^2 N(L) - \beta m \sum_{\langle x,y \rangle} (\sigma_x + \sigma_y) = \beta m^2 N(L) - 2d\beta m \sum_x \sigma_x \qquad (2.7)$$
Here we have assumed that every site has $2d$ bonds coming out of it. The corners and intersecting planes on the boundary of $\Lambda$ are of lower dimension and are therefore ignored.
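With the above the partition function easily follows:
$$Z_\Lambda = \mathrm{Tr}_\sigma \exp(-H_\Lambda(\sigma_\Lambda)) = \mathrm{Tr}_\sigma \exp\Big(-\beta m^2 N(L) + 2d\beta m \sum_x \sigma_x\Big) = \exp(-\beta m^2 N(L)) \, (2\cosh 2d\beta m)^{|\Lambda|} \qquad (2.8)$$
By $\mathrm{Tr}_\sigma$ we mean the sum over all $2^{|\Lambda|}$ possible configurations; the factor $\beta$ is already contained in $H_\Lambda$, see (2.7). Now we remember that $m = \langle \sigma_i \rangle$ is the Gibbs expectation of a single spin value. When we put this in, we obtain the so-called mean-field equation for $m$:
$$m = \frac{\mathrm{Tr}_\sigma \, \sigma_i \exp(-H_\Lambda(\sigma_\Lambda))}{Z_\Lambda} = \tanh 2d\beta m \qquad (2.9)$$
This equation has three solutions $m^*$, $0$ and $-m^*$ whenever $2d\beta > 1$, i.e. when $\beta > 1/2d$. The critical value $\beta_c$, defined by $2d\beta_c = 1$, marks the end of the region without global magnetization, i.e. the region in which there is no non-zero solution.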
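The mean-field equation (2.9) can be solved numerically by fixed-point iteration. The sketch below is a minimal Python illustration; the function name, the starting value `m0` and the tolerances are assumptions of this example.

```python
import math

def mean_field_magnetization(beta, d, m0=0.5, tol=1e-12, max_iter=100000):
    """Iterate m -> tanh(2 d beta m) until successive values differ by less than tol."""
    m = m0
    for _ in range(max_iter):
        m_new = math.tanh(2 * d * beta * m)
        if abs(m_new - m) < tol:
            return m_new
        m = m
        m = m_new
    return m

# d = 2, so the critical value is beta_c = 1 / (2 d) = 0.25
m_high_temperature = mean_field_magnetization(beta=0.2, d=2)   # below beta_c
m_low_temperature = mean_field_magnetization(beta=0.5, d=2)    # above beta_c
```

Starting from a positive `m0` the iteration tends to the positive solution when $2d\beta > 1$; a negative `m0` gives the mirrored solution, and for $2d\beta < 1$ the iteration decays to zero. Close to $\beta_c$ convergence becomes slow.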
It turns out that the above mean-field equation (2.9) is (after rescaling) the exact solution for $m = \langle \sigma_i \rangle$ of the infinite-range version of the Ising model (see e.g. [68]). This version is also called the Curie-Weiss model, which has as Hamiltonian
$$H_N(\sigma) = -\frac{\beta}{N} \sum_{i \neq j} \sigma_i \sigma_j \qquad (2.10)$$
where $1 \le i,j \le N$. Each spin has a (uniform) interaction with every other spin. We will encounter more mean-field equations in Chapter 3.
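Because the Curie-Weiss Hamiltonian depends on the configuration only through the total magnetization $M = \sum_i \sigma_i$, the Gibbs weight of each value of $M$ can be computed exactly. The Python sketch below is a hedged illustration: it assumes that the sum in (2.10) runs over ordered pairs $i \neq j$, so that $\sum_{i \neq j} \sigma_i \sigma_j = M^2 - N$, and under that assumption the rescaled mean-field equation reads $m = \tanh(2\beta m)$. The function names are hypothetical.

```python
import math

def curie_weiss_mode(N, beta):
    """Most probable magnetization per spin M/N (restricted to M >= 0) under exp(-H_N)."""
    best_log_weight, best_m = -math.inf, 0.0
    for k in range(N // 2, N + 1):          # k spins up, so M = 2k - N >= 0
        M = 2 * k - N
        # log of the number of configurations with k spins up
        log_binom = math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
        log_weight = log_binom + beta * (M * M - N) / N
        if log_weight > best_log_weight:
            best_log_weight, best_m = log_weight, M / N
    return best_m

def fixed_point(beta, m0=0.9, n_iter=2000):
    """Iterate m -> tanh(2 beta m), the mean-field equation for this normalisation."""
    m = m0
    for _ in range(n_iter):
        m = math.tanh(2.0 * beta * m)
    return m
```

For large $N$ the most probable magnetization approaches the non-negative solution of the fixed-point equation, up to finite-size corrections.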
2.1.2 Thermodynamical limit
In nature macroscopic systems are extremely large, of the order of $10^{23}$ atoms and more. So it is natural to take the system-size limit $L \to \infty$. But when we take this limit the Hamiltonian goes to infinity as well, so the infinite-volume expression for the Hamiltonian does not make sense. How then do we define an infinite-volume Gibbs measure that depends on this divergent function?
All is settled by defining the infinite-volume Gibbs measure through the condition that all its conditional probabilities on finite volumes are finite-size Gibbs measures, in a consistent way. The equations corresponding to this condition are called the DLR equations.
Definition 2.1. An infinite-volume measure $\mu$ is a Gibbs measure if it satisfies the so-called DLR equations:
$$\mu(\,\cdot \mid \eta_{\Lambda^c}) = \mu^\eta_\Lambda(\,\cdot\,) \qquad (2.11)$$
for all finite $\Lambda$ and $\mu$-almost every $\eta$.
Equivalently: if we condition $\mu$ on the configuration $\eta$ outside $\Lambda$ we obtain the finite-volume Gibbs measure $\mu^\eta_\Lambda$.
If we look at the finite-size Gibbs measures $\mu^\eta_{\Lambda_L}$ along the sequence $L = 1, 2, \cdots$, it depends on the boundary condition $\eta$ what happens for very large $L$. The sequence need not settle to a single limiting Gibbs measure: for $L \to \infty$ it may oscillate between two or even more infinite-volume Gibbs measures.
To see some limiting structure one can define metastates. These metastates are probability measures over the infinite-volume Gibbs measures. We give more details about metastates in Section 2.5.
When we cannot write the Gibbs measure $\mu$ as a combination of other Gibbs measures, e.g. $\mu = (\mu' + \mu'')/2$, we call $\mu$ an extremal Gibbs measure or a pure state. From (2.11) it follows that when $\mu'$ and $\mu''$ are pure states, all the convex combinations in between are infinite-volume Gibbs measures.
[Figure 2.1: Typical configuration for $\mu^+_\beta$ on $\Lambda_9$: a sea of + spins containing a few small islands of − spins.]
As $T \to 0$ the inverse temperature $\beta \to \infty$. From (2.2) we see that for infinite volumes we then obtain only Gibbs measures $\mu$ for which the configurations of strictly non-zero weight (with respect to $\mu$) minimize the corresponding energy function $H^\eta(\cdot)$.
We call these states ground states and the corresponding set of non-zero-weight configurations ground-state configurations, because of the following property. Let $\sigma$ be a ground-state configuration. For every configuration $\sigma'$ that we can create by flipping any finite number of spins in $\sigma$, it holds that $H^\eta(\sigma') - H^\eta(\sigma) \ge 0$. Note that we need to be careful, because in the infinite-volume limit $\Lambda \to \infty$ the energy tends to $-\infty$ for a lot of configurations.
This does not mean that there are no configurations $\sigma'$ for which the difference $H^\eta(\sigma') - H^\eta(\sigma) < 0$, where $\sigma$ is a ground-state configuration. What it does mean, in a dynamical sense, is that the system will stay in the same state for an infinite amount of time.
2.1.3 Some choices of boundary conditions
For getting a better understanding of the Gibbs measure subjects we just introduced,
we consider some examples. All is for the Ising model defined in Section 2.1.1. For
simplicity we restrict ourselves mostly to 2 dimensions.
Uniformly agreeing
First we take as boundary condition $\eta \equiv +1$, i.e. every site $y$ has $\eta_y = +1$. Looking at (2.4) we see easily that only the configuration $\sigma \equiv +1$ minimizes the Hamiltonian. This means that there is exactly one ground state $\mu^+$, which equals $\mu^+_{\beta \to \infty}(\sigma) = \delta(\sigma \equiv +1)$.
For $\beta$ large enough but finite, the Gibbs state $\mu^+_\beta$ that appears tends to concentrate around this configuration $\sigma \equiv +1$. The set of configurations $\sigma$ that appear with $\mu^+_\beta$-measure 1 has the following structure: $\sigma$ typically has small islands of −-spins in a sea of +-spins. The small islands have small lakes of +-spins, which can contain islands of −-spins, and so on. This set we will refer to as the +-ensemble later on. See Figure 2.1 for an example.
The same is true for the boundary condition $\eta \equiv -1$. Then the configurations have small islands of +-spins surrounded by −-spins: the −-ensemble.
We can make this picture plausible by proving the absence of large contours, in the literature often referred to as a Peierls bound. Consider all the bonds of the dual lattice of $\mathbb{Z}^2$ that lie between nearest-neighbor spins of opposite sign. Taking their union, the resulting closed curves form the boundary between + and − spins. Every such closed curve we call a contour $\Gamma$. The length $|\Gamma|$ of the contour is the number of dual bonds involved. Because of the boundary condition $\eta \equiv +1$ every contour appears as a closed curve. For the +-boundary condition every set of non-intersecting contours defines exactly one configuration, and vice versa. Later on, for different boundary conditions, a more general definition is needed and more general curves appear.
When we look at the definition of the Hamiltonian (2.4) we see that
$$H^+_\Lambda(\sigma = \{\Gamma\}) - H^+_\Lambda(\sigma \equiv +) = 2\beta|\Gamma| \qquad (2.12)$$
This means that for the relative probability it holds that
$$\frac{\mu^+(\sigma = \{\Gamma\})}{\mu^+(\sigma \equiv +1)} = \exp(-2\beta|\Gamma|) \qquad (2.13)$$
also called the weight or the cost of the contour $\Gamma$. Note that the weight of a configuration consisting of several contours factorizes into the weights of the single contours making up the configuration.
Now we can prove the statement:
Peierls bound: Assume $\beta > (\log 3)/2$ and + boundary conditions. Then for any $\theta > 0$, with $\mu^+$-probability one there are no contours larger than $L^\theta$ when $L \to \infty$.
Proof. For $\theta > 0$ (possibly with $L^\theta \ll L^d$) we abbreviate
$$\mu^+_\Lambda(\sigma : \exists\, \Gamma \text{ with } |\Gamma| \ge L^\theta) \equiv \mu^+_L(\sigma : \ast) \qquad (2.14)$$
Because of factorization, $H^+_\Lambda(\{\Gamma_1, \Gamma_2\}) = H^+_\Lambda(\{\Gamma_1\}) + H^+_\Lambda(\{\Gamma_2\})$, and therefore
$$\mu^+_\Lambda(\sigma : \ast) = \frac{1}{Z^+_\Lambda} \sum_{\sigma : \ast} \exp(-H^+_\Lambda(\sigma)) = \sum_{\Gamma : |\Gamma| \ge L^\theta} \exp(-2\beta|\Gamma|) \, \frac{1}{Z^+_\Lambda} \sum_{\sigma' : \, \sigma = \sigma' \cup \Gamma} \exp(-H^+_\Lambda(\sigma')) < \sum_{n = L^\theta}^{L^d} 3^n \exp(-2\beta n) \le 2 L^d \exp(-(2\beta - \log 3) L^\theta) \to 0$$
$$\text{for } L \to \infty, \; \theta > 0, \; \beta > \tfrac{1}{2} \log 3 \qquad (2.15)$$
Note that the proof of the Peierls bound depends heavily on the uniform exponential decay of the contour weights in the contour size.
[Figure 2.2: Alternating boundary conditions $\eta$ for $\Lambda_7$: the boundary spins alternate between + and − around the box.]
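To get a feeling for the last two expressions in (2.15), one can evaluate them numerically. The Python sketch below does so; the function names and the chosen parameter values are assumptions of this example, and the lower summation limit is taken as $\lceil L^\theta \rceil$.

```python
import math

def peierls_tail(L, d, theta, beta):
    """Sum over n from ceil(L**theta) to L**d of 3**n * exp(-2 beta n)."""
    log_ratio = math.log(3.0) - 2.0 * beta   # negative iff beta > (log 3) / 2
    n_min = math.ceil(L ** theta)
    return sum(math.exp(n * log_ratio) for n in range(n_min, L ** d + 1))

def peierls_bound(L, d, theta, beta):
    """Right-hand side 2 L**d exp(-(2 beta - log 3) L**theta) of (2.15)."""
    return 2.0 * L ** d * math.exp(-(2.0 * beta - math.log(3.0)) * L ** theta)

# beta = 1 > (log 3) / 2, d = 2, theta = 1/2
tails = [peierls_tail(L, 2, 0.5, 1.0) for L in (10, 50, 100)]
bounds = [peierls_bound(L, 2, 0.5, 1.0) for L in (10, 50, 100)]
```

Each term is written as $\exp(n(\log 3 - 2\beta))$ to avoid forming the large number $3^n$ explicitly.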
Alternating
Now we choose the boundary condition $\eta$ as an alternation of + and − spins, see Figure 2.2. Every boundary spin involved has a sign opposite to that of its nearest neighbors. Note that this boundary condition gives rise to contours which are not closed curves.
Because the boundary condition does not favor any sign, the ground state is $\mu(\sigma) = \frac{1}{2}(\delta(\sigma \equiv +1) + \delta(\sigma \equiv -1)) = \frac{1}{2}(\mu^+ + \mu^-)$ when we take even volume sizes. We see that this boundary condition gives rise to a mixture: the ground state is a combination of the two pure states $\mu^+ = \delta(\sigma \equiv +1)$ and $\mu^- = \delta(\sigma \equiv -1)$.
[Figure 2.3: Typical configuration for $\mu^{\mathrm{Dobrushin}}_\beta$ on $\Lambda_7$: + spins in the upper half and − spins in the lower half, separated by an interface.]
The Gibbs states now concentrate around both pure states, which together make up the ground state: $\mu = \frac{1}{2}(\mu^+ + \mu^-)$. The measure $\mu^+$ is the Gibbs measure which concentrates only on the +-ensemble, and $\mu^-$ concentrates on the −-ensemble.
We claim that this means that with probability one no interfaces are involved. By an interface we mean a contour which crosses the square lattice (so it has length at least of order $L$). For a (vertically crossing) interface at most half of the vertical bonds cancel when considering the weight; the weight of an interface is at most $\exp(-\beta|\Gamma|)$, so we can again apply the Peierls bound to prove the claim.
Dobrushin
Now we create a boundary condition for which interfaces do exist. We choose η as
follows. For the upper half of the boundary we take all the spins +1 and for lower
half we do the opposite: all the spins −1. This boundary condition is also called the
Dobrushin boundary condition.
The possible form of the ground states is dimension-dependent. For d = 2 ground
states and for d = 3 also Gibbs states do exist with an interface like in Figure 2.3. This
means that the interfaces appear with non-zero probability at a particular position.
Chaotic size dependence
When we choose the boundary conditions carefully we can ensure that the system does not have a limiting Gibbs measure. Take for even system size $L$ the boundary condition + and for odd $L$ the boundary condition −. Then the even subsequence $\mu_{\Lambda_{2L}}$ converges to the Gibbs measure $\mu^+$, and the odd subsequence $\mu_{\Lambda_{2L+1}}$ converges to $\mu^-$. However, the full sequence $\mu_{\Lambda_L}$ oscillates between $\mu^+$ and $\mu^-$ and never settles to a limit. What the measure looks like depends on the volume size, even when this size goes to infinity. This limiting dependence we call size dependence. Instead of alternating with the parity of $L$ we can also choose the + or − boundary conditions in a random way. Then for very large $L$ the measure may still depend randomly on $L$. This is called chaotic size dependence.
Quenched-random boundary conditions
The environment around the system can be changing randomly in time. However in
reality when this happens then these changes are typically adiabatically slow with re-
spect to the dynamical changes of the system. To model this we assume the external
environment is fixed and is an outcome of the random variables making the randomness
of the environment. This type of randomness we call quenched disorder. However we
must be a bit careful when we say that the external environment is fixed. Although
the disorder is quenched and therefore fixed, the boundary condition changes randomly
when we look at increasing sequences of volumes which are independently chosen of
the disorder.
Choose all $\eta_i$ i.i.d. (independent and identically distributed) according to the following distribution:
$$P(\eta_i = \pm 1) = \frac{1}{2} \qquad (2.16)$$
What will happen now? This is the question a considerable part of this thesis is all
about, in particular Chapter 4. Are there Gibbs states or ground states involved which
do contain all the above features: mixtures, interfaces, + and - ensembles? The answer
is not obvious from the beginning. Because although the probability of having interfaces
goes to zero in the η-distribution (Dobrushin-type configurations), it certainly does not
immediately follow that the Gibbs probability µ also goes to zero for the interfaces.
Furthermore there is no such thing as a limiting Gibbs state, because chaotic size
dependence is involved. For sparse enough sequences the limiting measure oscillates
randomly between measures concentrated on the +-ensemble and measures concen-
trated on the −-ensemble. For large enough volumes, with η probability one, neither
interfaces nor mixtures will occur. The above model is one of the simplest in which one
can study these things rigorously. The concepts have been developed for spin glasses,
in which much less is clear even at a heuristic level.
2.2 Spin glasses
2.2.1 Ising spin glasses
In the previous section we have considered the Ising model, which models metals with uniform spin interactions. The spin glasses we now consider are modelled with the same concepts, but now the spin interactions are random. The atoms do not lie regularly on a crystal lattice but are randomly placed in space. In these spin glasses the random positions change very slowly in time; compared to the dynamics of the spins these positions are fixed. After some time the spin values look more or less randomly distributed but no longer change in time. This rather unusual behavior is seen in some alloys of ferromagnets and conductors, like AuFe and CuMn. In these metals the so-called RKKY spin interactions are rapidly oscillating and slowly decreasing. Because the atoms are randomly placed, the sign of the interactions is also randomly distributed.
We use the term glass because of the similarity with the glass of windows, which
are fluids but where the flow is almost infinitely slow. In the literature the term spin
glass is often used for a wider class of models which have a high amount of quenched
disorder in common but where the connection to the alloys is often lost.
Spin-glass models with infinite-range interactions turn out to be useful also for
explaining pattern recognition in neural networks, in error-correcting codes, image
restoration, and in all kinds of optimization problems [68].
For an explanation of the spin-glass phenomena the Edwards-Anderson model has
been introduced. This is an Ising spin glass with only nearest neighbor interactions
and therefore has only interactions between neighboring pairs of spins. The rapidly
oscillating interactions are modelled by i.i.d. Gaussians.
The Hamiltonian is as follows:
$$H_\Lambda = -\beta \sum_{\langle i,j \rangle \in \Lambda} J_{ij} \sigma_i \sigma_j + h\beta \sum_i \sigma_i \qquad (2.17)$$
The $J_{ij}$ are quenched i.i.d. non-trivial random variables with common mean $\mathbb{E}[J_{ij}] = \mathbb{E}[J_{12}] \equiv \mathbb{E}[J]$, and $h$ is a uniform magnetic field. The set $\Lambda$ is a subset of $\mathbb{Z}^d$. Because no boundary conditions enter the Hamiltonian, the boundary conditions here are free.
In real life, when we study spin systems we expect to observe only quantities which depend on macroscopic properties. The couplings are microscopic and in practice we do not know all the random positions of the individual atoms. When we make the setup we do this without knowing the particular realization of the couplings. So for a proper measurement we need the macroscopic properties we measure to be coupling-independent. Therefore we should observe only states which we can create in a coupling-independent way. These states we call observable states. If a state is not observable we will call it an invisible state [67].
When $h = 0$ it depends on $\mathbb{E}[J]$ and on $\beta$ for which kind of configurations there is a tendency to order in the system. When $\beta$ is low enough, $\beta < \beta_c$, there is no ordering in the system at all. The spins behave approximately independently of each other; the system shows paramagnetic behavior. For some systems only this behavior is possible and $\beta_c = \infty$.
When $\beta \ge \beta_c$ there are three possibilities. When $\mathbb{E}[J] > 0$ the system prefers ferromagnetic behavior: all spins tend to have equal values. For $\mathbb{E}[J] < 0$ the spin values of nearest-neighbor spins tend to differ from each other, which means the system prefers to be anti-ferromagnetic. The third possibility, happening typically when $\mathbb{E}[J] = 0$, is the spin-glass phase, which we will consider now.
A good way to see these tendencies is to look at the so-called Edwards-Anderson order parameter $q_{EA}$:
$$q_{EA} = \frac{1}{|\Lambda|} \sum_{i \in \Lambda} \langle \sigma_i \rangle^2 \qquad (2.18)$$
where $\langle \cdot \rangle$ is the Gibbs mean of the argument and $|\Lambda|$ is the total size or volume of $\Lambda$. For paramagnetic behavior $\langle \sigma_i \rangle = 0$ for every site $i$, making $q_{EA} = 0$. When the system behaves like a ferromagnet or an anti-ferromagnet, $\langle \sigma_i \rangle^2 = 1$ for every site $i$. This makes $q_{EA} = 1$, its maximal possible value.
When we set the average $\mathbb{E}[J]$ of the couplings to zero, the system prefers as many spin pairs with $\sigma_i \sigma_j = +1$ as with $\sigma_i \sigma_j = -1$.
With the field $h$ we have some control over the spin values when the average $\mathbb{E}[J]$ is zero or small in magnitude compared to $h$. When we put $h > 0$ the system gives preference to +-spins; when $h < 0$ the −-spins are favored. However, when the field is not too large there is an intimate interplay between the tendency due to the field $h$ and the tendency due to the couplings $J_{ij}$.
Denote by $[\,\cdot\,]$ the coupling average: the average over the disorder $J_{ij}$. To calculate e.g. the averaged free energy we first calculate the trace as before, with a fixed random realization. Then we take the average over the randomness. This is because the change in randomness over time is adiabatically slow compared to the changes in the spin values due to thermal activity.
The free energy per spin turns out to be a self-averaging quantity. This means that
with probability 1 the free energy per spin for a fixed realization of the couplings is
equal to the coupling mean when we take the system size limit Λ → ∞. So the limit is
independent of the realization of the couplings. This is as it should be, because the free
energy is a macroscopic object.
When the temperature is not too low, $\langle \sigma_i \rangle = 0$ for any site $i$ because of paramagnetism. This makes both the average magnetization $m = [\langle \sigma_i \rangle] = 0$ and $q = [\langle \sigma_i \rangle^2] = 0$. For low enough temperature in general $\langle \sigma_i \rangle \neq 0$ due to the quenched couplings. However, when we average over the disorder it can happen, due to alternating signs, that $[\langle \sigma_i \rangle] = 0$ although $\langle \sigma_i \rangle \neq 0$ for the typical realizations of the quenched couplings. But then $q = [\langle \sigma_i \rangle^2] = [q_{EA}] > 0$. This scenario is called the spin-glass phase, which is the third possibility we mentioned earlier.
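The difference between $[\langle \sigma_i \rangle]$ and $[\langle \sigma_i \rangle^2]$ can be made concrete in a toy example. The Python sketch below is not the EA model in $d \ge 2$ and shows no phase transition: it assumes an open chain with the spin $\sigma_0$ pinned to $+1$, zero field and couplings $J_k = \pm 1$ with equal probability, for which $\langle \sigma_n \rangle = \prod_k \tanh(\beta J_k)$. The function names are hypothetical.

```python
import itertools
import math

def local_magnetization(couplings, beta):
    """<sigma_n> for an open chain with sigma_0 = +1 pinned: prod_k tanh(beta J_k)."""
    result = 1.0
    for J in couplings:
        result *= math.tanh(beta * J)
    return result

def disorder_averages(n, beta):
    """Average <sigma_n> and <sigma_n>**2 over all 2**n sign choices J_k = +1 or -1."""
    m_sum = q_sum = 0.0
    for couplings in itertools.product((-1.0, 1.0), repeat=n):
        s = local_magnetization(couplings, beta)
        m_sum += s
        q_sum += s * s
    count = 2 ** n
    return m_sum / count, q_sum / count

m, q = disorder_averages(n=6, beta=2.0)
```

For every fixed realization $\langle \sigma_n \rangle$ is non-zero, but its sign depends on the couplings. The disorder average `m` therefore vanishes, while `q` equals $\tanh(\beta)^{2n} > 0$.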
2.2.2 Mean field: SK-model
Because calculations for the short-range EA-model are extremely hard, we can try the mean-field approximation as we did for the Ising model. The result is the infinite-range Sherrington-Kirkpatrick model, of which the Hamiltonian is
$$H_N = -\frac{\beta}{\sqrt{N}} \sum_{i<j}^{N} J_{ij} \sigma_i \sigma_j + h\beta \sum_i \sigma_i \qquad (2.19)$$
again with the $J_{ij}$ the outcome of an i.i.d. random distribution. Often one takes the $J_{ij}$ standard Gaussian, $J_{ij} \sim N(0,1)$, and the external field $h = 0$. It is believed that in the infinite-dimension limit $d \to \infty$ the free energy density of the EA-model equals the free energy density of the SK-model, since in the SK-model each spin interacts directly with infinitely many other spins.
For the SK-model one can show that the spin-glass phase does occur when the
temperature is low enough and the coupling mean not too large [68].
When one tries to take the limit $N \to \infty$ various limit Gibbs states $\mu$ seem to appear. In general a Gibbs state then looks like a mixture of infinitely many pure states $\mu_\alpha$:
$$\mu(\sigma) \approx \sum_\alpha w_J(\alpha) \mu_\alpha \qquad (2.20)$$
where $w_J(\alpha)$ is the relative weight of the pure state $\mu_\alpha$ [66]. Note that the decomposition weights as well as the $\mu_\alpha$ depend on the disorder $J$. Recently it was proven that in the limit $N \to \infty$ each configuration is a ground state [67]. However, taking the infinite-volume limit is problematic. A slightly better-behaved class are the Hopfield models.
2.3 Hopfield model
Our brain is a complex structure of many neurons which interact with each other in a non-trivial long-range way. For instance, the cerebral cortex alone already consists of about $10^{10}$ neurons. Nowadays there is a lot of research in this area. It seems that our brain network is scale-free. It has the structure of a so-called small-world network: i.e. a small path length between two neurons, of the order of the path length in a random-edge neural network, but with a relatively high amount of connections (e.g. [18]).
[Figure 2.4: The zero-temperature neuron dynamics: $\sigma_1(t), \ldots, \sigma_N(t) \;\longrightarrow\; h_i(t) = \sum_{j : j \neq i} J_{ij} \sigma_j(t) \;\longrightarrow\; \sigma_i(t + \Delta t) = \mathrm{sign}\, h_i(t)$]
Originally Pastur and Figotin invented the Hopfield model as a model for a special
type of spin-glasses. Then Hopfield came up with it independently as a model for neural
networks as above. However it is a simplified model and the geometric structure of the
neural connections is totally missing.
2.3.1 Setting
Assume that the neural network contains $N$ neurons and that every neuron interacts with every other neuron (i.e. the interactions are mean-field like). These interactions are composed of 2-neuron interactions only. Now consider a neuron $i$. The state of the neuron is labelled by the variable $\sigma_i$. If the neuron is excited then $\sigma_i = +1$; when $\sigma_i = -1$ the neuron is at rest. As a current goes from neuron $j$ into neuron $i$, the signal is altered by chemical transmitters in neuron $i$ itself. This synaptic efficacy we denote by $J_{ij}$. It alters the signal $\sigma_j$ into $J_{ij} \sigma_j$. The total input $h_i$ of neuron $i$ equals
$$h_i = \sum_{j : j \neq i} J_{ij} \sigma_j \qquad (2.21)$$
2.3.2 Dynamics and ground states
Let us define the dynamics of this model. If $\sigma_i(t)$ denotes the state of neuron $i$ at time $t$, then it becomes (or stays) excited at time $t + \Delta t$ whenever $h_i$ exceeds a threshold $\theta_i$. Otherwise it is at rest at time $t + \Delta t$:
$$\sigma_i(t + \Delta t) = \mathrm{sign}(h_i(t) - \theta_i) \qquad (2.22)$$
For simplicity we set this threshold to zero. Then the state of neuron $i$ at time $t + \Delta t$ becomes [19]
$$\sigma_i(t + \Delta t) = \mathrm{sign}\Big( \sum_{j : j \neq i} J_{ij} \sigma_j(t) \Big) \qquad (2.23)$$
so without an extra constant term. See Figure 1.3 and also Figure 2.4.
A particular fixed configuration of neurons we denote by the $N$-dimensional vector $\xi \in \{-1,1\}^{\otimes N}$. We call this a pattern. If this pattern $\xi$ is a stable fixed point of the evolution defined by (2.23), then we say it is in the system's memory. Assume for the moment that there are no metastable fixed points. If there is only one stable fixed point, then whatever the initial state of the neurons, after a long enough time the system will be in the state defined by $\xi$, i.e. the system remembers the pattern. Of course there can be more patterns in the memory. Then if we pick an initial configuration of neurons at random, we will always end up in one of the patterns. Furthermore every pattern in the memory can be reached with non-zero probability. In general, however, it can happen that the dynamics gets stuck in a metastable fixed point.
To illustrate this process better, imagine you want to answer a question for a quiz.
It can happen that you need a bit of time to remember the answer, because the question
appears to be difficult. But still you have the feeling that you might know this one.
In the process of remembering you try to find associations with the -according to you-
presumably right answer. You are fine-tuning your first thought. You are altering your
initial condition to a better one, a one with less energy. Then after some time you think
you know an answer and you believe it is right. You have recovered something of your
memory: either you are in the state of the right answer or in a state of wrongness.
A good way of measuring how well a configuration $\sigma$ agrees with a given pattern $\xi$ is to look at the corresponding order parameter $q_\xi$:
$$q_\xi = \frac{1}{N} \sum_{i=1}^{N} \sigma_i \xi_i \qquad (2.24)$$
When $q_\xi = 1$ the configuration $\sigma$ is equal to the pattern $\xi$, and when $q_\xi = -1$ it is equal to the opposite: $\sigma = -\xi$. The parameters $q_\xi$ are also called overlap parameters. Whenever $\pm q_\xi > 0$ the configuration $\pm\sigma$ agrees with the pattern $\xi$ for more than half of the neurons.
A good way of studying this system is to use the Gibbs description of statistical mechanics. Then the memory of the system is formed by the ground states of the model. The equilibrium features are governed by the Gibbs measure. We choose the Hamiltonian as
$$H_N = -\sum_{i=1}^{N} h_i \sigma_i = -\sum_{i=1}^{N} \sigma_i \sum_{j : j \neq i} J_{ij} \sigma_j \qquad (2.25)$$
Now the zero-temperature dynamics of this system is equivalent to the earlier-defined neuron dynamics (2.23). We see this as follows. The energy equals
$$H_N(t + \Delta t) = -\sum_{i=1}^{N} h_i(t) \, \sigma_i(t + \Delta t) \qquad (2.26)$$
For zero temperature the Gibbs measure for time $t + \Delta t$ becomes a $\delta$-measure on the configurations $\sigma(t + \Delta t)$ for which the energy $H_N(t + \Delta t)$ is minimal. As we see from (2.26), this is the configuration $\sigma$ for which $\sigma_i(t + \Delta t) = \mathrm{sign}\, h_i(t)$ for every neuron $i$. It is easy to see that the energy cannot increase in this operation. However, it is not clear whether in the end, when $t$ is very large, we will end up in a single ground-state configuration or oscillate between several.
The above dynamics is deterministic. In reality the neurons might not be determin-
istic in this way. Furthermore we need to allow for some probability that the energy
in the operation can increase. This is to have the ability to get out of the local minima
formed by the metastable fixed points. Therefore we introduce a parameter β to con-
trol the uncertainty in the model. The smaller β, the more uncertainty. When β = 0
the neurons behave perfectly random, because the energies of the configurations do not
matter. Taking β → ∞ makes the system behave like the zero-temperature dynamics of
(2.23).
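The text does not fix a particular stochastic update rule. One common choice, used here purely as an assumption, is a heat-bath rule in which neuron $i$ is set to $+1$ with probability $\frac{1}{2}(1 + \tanh(\beta h_i))$. The normalisation of $\beta$ in this rule is a convention of this Python sketch, and the neurons are updated one at a time rather than in parallel.

```python
import math
import random

def prob_up(h, beta):
    """Probability that a neuron with local field h is set to +1 (assumed heat-bath rule)."""
    return 0.5 * (1.0 + math.tanh(beta * h))

def stochastic_sweep(J, sigma, beta, rng):
    """Update the neurons one at a time, each using the fields of the current state."""
    N = len(sigma)
    sigma = list(sigma)
    for i in range(N):
        h = sum(J[i][j] * sigma[j] for j in range(N) if j != i)
        sigma[i] = 1 if rng.random() < prob_up(h, beta) else -1
    return sigma

# Example: uniform couplings J_ij = 1/N and an (almost) deterministic sweep
N = 10
J_uniform = [[0.0 if i == j else 1.0 / N for j in range(N)] for i in range(N)]
after_sweep = stochastic_sweep(J_uniform, [1] * N, beta=1000.0, rng=random.Random(0))
```

For $\beta = 0$ the rule sets each neuron to $\pm 1$ with probability $\frac{1}{2}$ regardless of the field, and for $\beta \to \infty$ it reduces to taking the sign of $h_i$.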
As an example we take $J_{ij} \equiv +1$. Then the system turns into the Curie-Weiss model. The behaviour of this model is well understood. The ground states are $\sigma \equiv \pm 1$. It is also easy to see that the same is true when $J_{ij}$ contains only one pattern $\xi$, i.e. $J_{ij} = \xi_i \xi_j$. Indeed the Hamiltonian (2.25) is minimized whenever $\sigma \equiv \pm\,\mathrm{sign}\, \xi$. In this case the pattern coordinates $\xi_i$ are allowed to have a more general distribution, but they need to be i.i.d.
In reality one often knows only the global properties of the memory of the system. Furthermore the memory also changes in time, but these changes are very slow compared to the time scale on which the states of the neurons change. A good way of modelling this is, instead of choosing the patterns ourselves, to let the patterns be chosen according to a quenched random distribution. All the randomness is i.i.d. and is thus described by a product measure. The measure with respect to this randomness $\xi$ we denote by $\mathbb{P}_\xi$. Usually one takes the randomness to be symmetric Bernoulli, i.e. $\xi$ uniformly distributed on $\{-1,1\}^{\otimes N}$.
Note an important difference between the 1-pattern case and the SK-model. In the
SK-model all the bonds have independent disorder $J_{ij}$. In the 1-pattern Hopfield model
pairs of bonds with a common neuron, e.g. $(ij)$ and $(jk)$, have highly dependent disorder.
We generalize from the 1-pattern system to the finite p-pattern system with patterns
$\xi^1,\ldots,\xi^p$. For the $J_{ij}$ we take
\[
J_{ij} = \frac{1}{N} \sum_{\mu=1}^{p} \xi^\mu_i \xi^\mu_j \tag{2.27}
\]
For the patterns we take a random outcome of the uniform distribution on
$(\{-1,1\}^{\otimes N})^{\otimes p}$. In the literature this choice of the $J_{ij}$ is referred to as the
Hebb rule. Because of the scaling the quenched patterns are asymptotically orthonormal
to each other:
\[
\frac{1}{N}\, \xi^{i} \cdot \xi^{j} = \delta_{ij} + O(1/\sqrt{N}) \qquad (N \to \infty) \tag{2.28}
\]
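It is easy to see that the states $\sigma = \pm\xi^\mu$ are equilibrium states of the system. Indeed, if
we insert $\sigma(t) = \pm\xi^\mu$ into the $\beta \to \infty$ dynamics (2.23), we obtain for $N \to \infty$
\[
\begin{aligned}
\sigma_i(t + \Delta t)
 &= \mathrm{sign}\Big( \pm \sum_{j:\, j \neq i} J_{ij}\, \xi^\mu_j \Big)
  = \mathrm{sign}\Big( \pm \frac{1}{N} \sum_{\nu=1}^{p} \xi^\nu_i \sum_{j:\, j \neq i} \xi^\nu_j \xi^\mu_j \Big) \\
 &= \mathrm{sign}\Big( \pm \sum_{\nu=1}^{p} \xi^\nu_i\, \delta_{\nu\mu} \Big)
  = \mathrm{sign}(\pm \xi^\mu_i) = \pm \xi^\mu_i
\end{aligned} \tag{2.29}
\]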
In other words $\sigma(t) = \sigma(t + \Delta t)$. This makes $\sigma(t) \equiv \pm\xi^\mu$ fixed-point configurations
and therefore equilibrium states. However, it is not clear from these calculations whether
these states are also ground states, because the states could be unstable fixed points.
Furthermore we might not be allowed to omit the $O(1/\sqrt{N})$ term in the calculations
as we have done. It could also be that the states can only be reached from a set
of $\mathbb{P}_\xi$-measure 0, or that there are more ground states than these fixed points. After
more analysis it turns out that the 2p states $\sigma = \pm\xi^\mu$ are indeed the only ground
states for this system (e.g. [10]).
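The fixed-point property of the patterns can be checked numerically. The following Python sketch builds Hebb couplings as in (2.27) and applies one zero-temperature update to each stored pattern. The sizes N = 200 and p = 3, the random seed, the parallel update and the tie-breaking convention sign(0) = +1 are illustrative assumptions; the function names are hypothetical.

```python
import random

def hebb_couplings(patterns):
    """J[i][j] = (1/N) * sum_mu xi^mu_i * xi^mu_j for i != j, and J[i][i] = 0."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                J[i][j] = sum(xi[i] * xi[j] for xi in patterns) / n
    return J

def zero_temperature_sweep(J, sigma):
    """Parallel update sigma_i -> sign(sum_{j != i} J_ij sigma_j)."""
    n = len(sigma)
    new = []
    for i in range(n):
        h = sum(J[i][j] * sigma[j] for j in range(n) if j != i)
        new.append(1 if h >= 0 else -1)  # assumed convention: sign(0) = +1
    return new

rng = random.Random(0)   # fixed seed, only for reproducibility
N, p = 200, 3            # illustrative sizes with p much smaller than N
patterns = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(p)]
J = hebb_couplings(patterns)
is_fixed = [zero_temperature_sweep(J, xi) == xi for xi in patterns]
```

For p much smaller than N the cross-talk between different patterns is of order $\sqrt{p/N}$, so each stored pattern is expected to be reproduced unchanged. This only probes the fixed-point property, not whether the patterns are ground states.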
2.3.3 System-size-dependent patterns
When the number p of patterns depends on the system size N several things can happen.
Denote by α the ratio between the number of patterns and the system size: α = p/N.
We consider the phase regions in the (T,α) plane, see Figure 2.5. Whenever $T > T_g$,
where $T_g = 1 + \sqrt{\alpha}$, the system behaves like a paramagnet.
There is a curve $T_c$ such that below this line all of the p patterns are stable, i.e.
absolute minima of the free energy. Between the curves $T_c$ and $T_g$ there is a different
curve $T_M$, below which the patterns are at least metastable.
Figure 2.5: Phase diagram for the Hopfield model
(after [3])
Between the curves $T_M$ and $T_c$ the patterns become metastable; they are local minima
of the free energy. The global minima correspond to the spin-glass states. These
states have vanishingly small overlap $q_\mu$ (of order $O(1/\sqrt{\alpha N})$) with all of the patterns
μ. So the system will remember a pattern only if the initial configuration is close
enough to it.
Above $T_M$ and below $T_g$ the spin-glass states are the only ones present, so
none of the patterns can be remembered. In this spin-glass phase ageing occurs:
the decay of the energy becomes slower for longer waiting times. According
to numerical research the spin-glass properties seem to be closely related to properties
of the SK-model, which we obtain by taking the limit α → ∞. Analytic research of
the corresponding dynamics is highly complicated; it cannot be described only by the
overlap values $q_{\mu,t}$ and the neuron states $\sigma_t$ at times t [3, 58, 68].
For a more extensive discussion of the Hopfield model, including some history and
its relation with the theory of neural networks, see [10, p. 133 ff.] or [12].
2.3.4 Some generalizations
To add more realism we take into account that not every neuron needs to be connected
with every other. To this end we define the matrix $\Lambda_{ij}$, which represents the structure
of the network. If neuron i is connected with j then $\Lambda_{ij} = 1$, otherwise $\Lambda_{ij} = 0$. When
the network is undirected the matrix $\Lambda_{ij}$ is symmetric. Now we use the Hopfield
dynamics of (2.23) but we replace $J_{ij}$ by the value $\Lambda_{ij} J_{ij}$.
Numerical research seems to suggest that the task of recognition of a finite number
of patterns is performed better (i.e. higher overlap after a long time) when we decrease
the clustering coefficient of a network [48]. By the clustering coefficient we mean
the following. Take a vertex v of a graph. Suppose v has N neighboring vertices. Then
at most N(N − 1)/2 edges can exist between these neighboring vertices. Denote by
$C_v$ the actual number of these edges divided by the maximal number possible. The
clustering coefficient C is the mean of $C_v$ over all vertices v.
Another generalization is to allow the neurons to have more values. The neuron
states become $\sigma_i \in \{1,\ldots,q\}$ instead of $\sigma_i = \pm 1$. In spin-glass language we say
that we have q-state Potts spins instead of Ising spins. For the patterns we can still take
the restriction to $\xi^\mu_i = \pm 1$. When the system has only one pattern in its memory then
we easily see that the ground states have the following form. Every site i for
which $\xi_i = +1$ has $\sigma_i \equiv j$ and every site for which $\xi_i = -1$ has $\sigma_i \equiv k$, with $j \neq k$.
Of course it is more realistic to consider Potts patterns when the neuron states are
Potts spins. Then the ground states are $\{\xi^\mu\}$ with $\mathbb{P}_\xi$-probability one
whenever the number p of patterns is not too large: for arbitrary α with 0 ≤ α < 1,
$p < (\alpha/\ln q)\ln N$ [37]. Note that p is allowed to become infinite when N → ∞.
2.4 Scenarios for the spin glass
In the last decades researchers have tried to get an analytic, rigorous grip on the
phenomena of short-range spin glasses. During this process various competing theories were
formed, none of which is conclusive. Most theories can be grouped into three scenarios:
the droplet picture of Fisher and Huse [33], the chaotic-pairs picture of Newman
and Stein [66] and the replica symmetry breaking picture, which resulted from the mean-field
theory for the infinite-range SK spin glass developed by Parisi [56].
2.4.1 Droplet-picture short-range spin-glasses
At the end of the eighties Fisher and Huse introduced the so-called droplet picture [33] to
describe the equilibrium phenomena for short-range spin glasses. For a clear example
of this picture we take a model which has the energy function (2.17) of the Edwards-Anderson
model. The couplings (the spin interactions) we choose symmetrically and
continuously distributed. We set the field h to zero.
For finite-dimensional Ising spin glasses there are two possibilities for the equilibrium
behavior at small T. There is a critical dimension $d_l$ such that:
$d < d_l$: The system is paramagnetic at all T > 0, so $T_c = 0$.
$d \geq d_l$: There exists exactly one pair of (flip-related) ground states. For
$0 < T < T_c < \infty$ the behavior of the system is described by a small non-zero density of
excitations with a volume which is non-zero relative to the (infinite-sized) system.
Now we take a particular ground state G and look at its excitations. Because T > 0
there are excited regions where the spins have opposite values compared to G. As in the
Ising model we can define contours by the boundaries of these regions. The contours
do exist on various scales. For a large enough system the probability of having at least
one large contour is of order 1, although the probability of having a particular large
contour is small. For low enough T we assume that the contours with the lowest energies
dominate the physics. These contours we call droplets. More precisely
Definition 2.2. A droplet $D_L(j)$ of length scale L is a contour Γ that encloses site j and has
the minimal energy among all contours Γ enclosing j and containing between
$L^d$ and $(2L)^d$ spins.
The energy $F_L(j)$ of a droplet $D_L(j)$ equals
\[
F_L(j) = \min_{\Gamma \text{ encl. } j,\; L^d \le |\Gamma| < (2L)^d} E_G(\Gamma) \tag{2.30}
\]
where $E_G(\Gamma)$ is the energy of the configuration $\{\Gamma\}$ relative to the ground state energy
$E_G(\emptyset)$.
In case of an Ising ferromagnet (i.e. (2.17) with $J_{ij} \equiv 1$ and h = 0) we have
$F_L(j) = O(L^{d-1})$. For the current Ising spin glass with random symmetric couplings it is
expected that the droplet energy is much lower, because there is a large amount
of frustration and there are many configurations which are almost like the ground
states. However, for a generic contour the energy still scales like $L^{d-1}$. Given this we
make the scaling ansatz:
\[
F_L(j) = O(L^{\theta}), \qquad \theta < d - 1 \tag{2.31}
\]
In [33] it is argued that
\[
\theta \le \frac{d - 1}{2} \tag{2.32}
\]
However, the arguments in favor of (2.32) use some assumptions which need not hold in
general [24].
For θ > 0 we expect the following picture. Because of the almost degenerate
ground state the Gibbs weight of the event $F_L \approx 0$ is bigger than zero even for zero
energy. As we see from the Hamiltonian, only the droplets at length scale L with energy
$F_L \le O(T)$ contribute significantly to the Gibbs measure. When $T \ll O(L^{\theta})$, only
a small fraction of these droplets appears. Because of the positive weight of $F_L$ near
zero, some of the droplets will be excited at any positive temperature. Together these
properties mean that θ > 0 implies $d \ge d_l$.
When θ < 0, the energy cost is so low that the entropy dominates and the
droplet picture breaks down. Because every spin can be flipped with arbitrarily small
energy cost (by taking the system size large enough) the system is expected to
behave like a paramagnet. Therefore θ < 0 implies $d < d_l$.
2.4.2 Parisi’s Replica Symmetry breaking picture
Parisi cleverly conjectured in the eighties an expression for the free energy function of
the SK-model and also an expression for the (Parisi) overlap distribution [56]. The idea
behind this solution is known as replica symmetry breaking (RSB). Recently the conjectured
free energy expression was rigorously proven to be correct
by Talagrand [74], who used in his proof results of Guerra and co-workers.
However, some aspects of Parisi's RSB picture are still open.
This RSB picture predicts that in the infinite-volume limit states appear which
are composed of infinitely many pure states. It is not clear what a pure state means
for the infinite-range SK-model. Assuming we can still define overlaps between different
'pure states', the overlap between 'pure state' α and α′ is
\[
q_{\alpha\alpha'} = \frac{1}{N} \sum_{i=1}^{N} \langle \sigma_i \rangle_{\alpha} \langle \sigma_i \rangle_{\alpha'} \tag{2.33}
\]
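By $\langle \cdot \rangle_{\alpha}$ we mean the expectation with respect to the pure Gibbs state $\mu_\alpha$. From the
overlap we can read off how much state $\mu_\alpha$ resembles state $\mu_{\alpha'}$. For every pure state
$\mu_\alpha$ it holds that
\[
q_{\alpha\alpha} = \langle \sigma \rangle_{\alpha}^{2} = q_{EA} \tag{2.34}
\]
where $q_{EA}$ is the same parameter as in (2.18). Furthermore we see
\[
-q_{EA} \le q_{\alpha\alpha'} \le q_{EA} \tag{2.35}
\]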
Constructing the pure states explicitly is impossible. However, some things can still be
said about the distribution of the overlaps. We choose at random two pure states from
the Gibbs measures appearing in the limit of the SK-model. Then we denote by P(q)dq
the probability that the overlap of these two states lies between q and q + dq. This
distribution is also called the Parisi overlap distribution. It has the form
\[
P_J(q) = \sum_{\alpha,\alpha'} w_J(\alpha)\, w_J(\alpha')\, \delta(q - q_{\alpha\alpha'}) \tag{2.36}
\]
For high temperature the SK-model becomes a paramagnet and P(q) = δ(q).
However, the symmetric overlap distribution P(q) is highly non-trivial when the temperature
T is low enough, and consists of many δ-functions of non-zero weight. Furthermore
it is coupling dependent, i.e. a non-self-averaging object. When we average over the
couplings the resulting distribution turns out to be continuous and non-zero between two δ
spikes at $\pm q_{EA}$. Furthermore there is chaotic size dependence: when we look at two
different volumes whose sizes differ greatly, then in general
P(q) also looks very different.
Another interesting concept which holds according to Parisi's theory is ultrametricity.
Recall that the overlap of a pure state with itself equals $q_{EA}$. With this we define a
distance function between two pure states
\[
d_{\alpha\alpha'} = q_{EA} - q_{\alpha\alpha'} \tag{2.37}
\]
Then we take at random three states 1, 2, 3. With these states we can make three pairs.
Ultrametricity then claims that either
\[
d_{12} = d_{13} = d_{23} \quad\text{or}\quad d_{12} = d_{13} > d_{23} \quad\text{or}\quad
d_{12} = d_{23} > d_{13} \quad\text{or}\quad d_{13} = d_{23} > d_{12} \tag{2.38}
\]
that is, the two largest of the three distances always coincide. So the three overlaps of
the state pairs are intimately related. A mathematically rigorous
proof of this ultrametric structure is still an open problem.
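As a small illustration, ultrametricity of a triple can be tested through the standard ultrametric inequality $d_{13} \le \max(d_{12}, d_{23})$ (and its permutations), which is equivalent to the two largest distances being equal. The Python sketch below is only a sanity check on three given numbers; the function names and the tolerance are assumptions.

```python
def distance(q_ea, q_overlap):
    """Distance (2.37) between two pure states with mutual overlap q_overlap."""
    return q_ea - q_overlap

def is_ultrametric_triple(d12, d13, d23, tol=1e-12):
    """True when the two largest of the three distances coincide (within tol)."""
    smallest, middle, largest = sorted((d12, d13, d23))
    return largest - middle <= tol
```

In terms of overlaps this says that the two smallest of the three overlaps are equal, for instance overlaps (0.1, 0.1, 0.5) with $q_{EA} = 0.8$.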
2.4.3 Chaotic Pairs
The remaining possibility is [65, 66] that for large L the Gibbs measure looks like (with
dependence on J)
\[
\mu_L \approx \frac{1}{2}\, \mu_{\alpha_L, J} + \frac{1}{2}\, \mu_{-\alpha_L, J} \tag{2.39}
\]
When we let L → ∞ and take the union of all possible states emerging, we obtain
a set of uncountably many states. The Gibbs measure is approximately a combination
of two pure Gibbs states $\alpha_L$ and $-\alpha_L$ out of the infinitely many. These two states are
each other's global spin-flip. The state labels $\alpha_L$ depend chaotically on L.
We encounter in Chapter 3 an example of an infinite-range system which has in-
finitely many ground states. For fixed size L only two pairs of ground states do appear
(or triples of pairs in case of 3-Potts spins) in the way of the Chaotic Pairs scenario.
2.5 Metastates
To get a grip on the quenched disorder in spin glasses we consider the following. Look
at a sequence of finite-volume Gibbs measures $\mu^{\eta}_{\Lambda}$. The disorder of the spin glass is
prescribed by the parameter η. It is treated as quenched disorder so we consider it as
fixed. Then we take the empirical average of these measures
\[
K^{\eta}_N = \frac{1}{N} \sum_{n=1}^{N} \delta_{\mu^{\eta}_{\Lambda_n}} \tag{2.40}
\]
We try to take the limit N → ∞. The result provides the so-called (empirical) metastate.
This metastate is a probability measure on the set of Gibbs measures and depends on the
quenched disorder η [66]. The metastate gives the relative weight of the event that
a quenched disordered system of very large volume behaves like a particular Gibbs
measure.
In general, however, (2.40) does not converge for almost every configuration η unless
we take a sparse enough subsequence. It does, however, converge in distribution. The
resulting distribution over the infinite-volume Gibbs measures does not depend on η anymore.
The limiting process of the whole path $t \to \mu^{\eta}_{\Lambda_{[tN]}}$ is described by the so-called
superstate. Here [tN] is the largest integer smaller than or equal to tN [54]. In [51] and
[53] there are two examples for which this behavior has been examined thoroughly.
For the d = 2 random boundary field Ising model, which we consider in Chapter
4, the metastate concentrates on two extremal Gibbs measures $\mu^{+}$ and $\mu^{-}$. We
conjecture that for d = 2, 3 every mixture of $\mu^{+}$ and $\mu^{-}$ can appear as a limit point along
the regular sequence of cubes. These mixtures are null-recurrent, so in the metastate
they do not appear and for this particular model the metastate is a.s. convergent.
For d > 3, for the random weak boundary field Ising model, the limit points along
the regular sequences are almost surely only $\mu^{+}$ and $\mu^{-}$. Each extremal Gibbs measure
appears with probability 1/2 [28].
As an example of an a.s. non-converging metastate we take the Curie-Weiss random
field Ising model. Its Hamiltonian is
\[
H_N = -\frac{\beta}{N} \sum_{i<j} \sigma_i \sigma_j - \beta\varepsilon \sum_{i=1}^{N} \eta_i \sigma_i \tag{2.41}
\]
The random variables $\eta_i$ are i.i.d. with $\mathbb{P}(\eta_i = \pm 1) = 1/2$. For β large enough
and ε small the model behaves like a ferromagnet with one +-phase $\mu^{+,\eta}$ and one
−-phase $\mu^{-,\eta}$. When one takes the sequence n = 1, 2, ··· the corresponding metastate
converges in distribution:
\[
\lim_{N\to\infty} K^{\eta}_N = \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \delta_{\mu^{\eta}_n}
 \overset{\text{law}}{=} n_\infty\, \delta_{\mu^{+,\eta}} + (1 - n_\infty)\, \delta_{\mu^{-,\eta}} \tag{2.42}
\]
The variable $n_\infty$ is a random variable independent of η. It is distributed as
$\mathbb{P}(n_\infty < x) = \frac{2}{\pi} \arcsin(\sqrt{x})$ [51, 53].
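The distribution of the random weight is the arcsine law on [0, 1]. A short Python sketch of its distribution function and of inverse-CDF sampling may help to see how strongly the mass concentrates near 0 and 1; the function names are hypothetical and the sampler is just one standard way to draw from this law.

```python
import math
import random

def arcsine_cdf(x):
    """P(n < x) = (2/pi) * arcsin(sqrt(x)) for 0 <= x <= 1."""
    return 2.0 / math.pi * math.asin(math.sqrt(x))

def sample_arcsine(rng):
    """Inverse-CDF sampling: x = sin^2(pi * u / 2) with u uniform on [0, 1)."""
    u = rng.random()
    return math.sin(math.pi * u / 2.0) ** 2
```

Evaluating `arcsine_cdf(0.1)` gives a value above 0.2, so the interval [0, 0.1] alone already carries more than a fifth of the mass: the empirical average typically puts most of its weight on one of the two phases.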
Chapter 3
Gaussian Potts-Hopfield model
In this chapter we study a Gaussian Potts-Hopfield model. Whereas for Ising spins and
two disorder variables per site the chaotic pair scenario is realized, we find that for q-
state Potts spins q(q − 1)-tuples occur. Beyond the breaking of a continuous stochastic
symmetry, we study the fluctuations and obtain the Newman-Stein metastate description
for our model.
3.1 Introduction
The Gaussian Potts-Hopfield model is equal to the Potts-Hopfield model but with Gaus-
sian noise as patterns. What happens for two patterns with Ising or Potts-like neurons
is, surprisingly, that there are infinitely many ground-states. We study the mean-field
Potts model with Hopfield-Mattis disorder, and more in particular with Gaussianly dis-
tributed disorder. This model is a generalization of the Ising version of the model stud-
ied in [11]. It provides yet another example of a disordered model with infinitely many
low-temperature pure states, such as is sometimes believed to be typical for spin-glasses
[33]. In our model, however, in contrast to [11], instead of chaotic pairs we find that the
chaotic size dependence is realized by chaotic q(q − 1)-tuples.
A somewhat different generalization of the Hopfield model to Potts spins was intro-
duced by Kanter in [47] and was mathematically rigorously analysed in [37]. However,
whereas the version we treat here (in which the form of the disorder is the Mattis-
Hopfield one) displays the phenomenon of stochastic symmetry breaking, in which a
finite-spin, “finite pattern” model can end up with chaotic size dependence, and a real-
ization of chaotic n-tuples out of infinitely many “pure states”, we do not see how to
obtain such results in a version of Kanter’s form of the disorder distribution.
We are concerned in particular with the infinite-volume limit behaviour of the Gibbs
and ground state measures. The possible limit points are labelled by the minima of
an appropriate mean-field (free) energy functional. These minima can be obtained as
solutions of a suitable mean-field equation. They lie on the minimal-free-energy
surface, which is an m(q − 1)-sphere in the $(e_1, \cdots, e_q)^{\otimes m}$ space. This space for
q-state Potts spins and m patterns is formed by the m-fold product of the hyperplane
spanned by the end points of the unit vectors $e_1, \cdots, e_q$, which are the possible values of
the spins. But only a limited area of the minimal-free-energy surface is accessible: only
those values for which certain mean-field equations hold are allowed. These equations
have the structure of fixed point equations. We derive them in Section 3.4. To obtain
the Gibbs states we need to find the solutions of these equations on the minimal-free-energy
surface.
The structure of the ground or Gibbs states for Ising spins, where q = 2, and 2
standard Gaussian patterns ξ, η has been known for a few years [11]. Due to the Gaussian
distribution we have a nice symmetric structure: the extremal ground (and Gibbs) states
form a circle. This degeneracy of the ground states due to the rotational
symmetry of the Gaussians was first mentioned in [2].
For a fixed configuration and a large finite volume the possible order-parameter
values become close to two diametrically opposite points (which ones depends on the volume
of the system) on this circle. This chapter treats the generalization of this structure to q-state
Potts spins with q > 2. To have a concrete example, we concentrate on the case q = 3.
It turns out that we again obtain a circle symmetry but also a discrete symmetry, which
generalizes the one for Ising spins. Instead of a single pair one gets a triple of pairs
(living on 3 separate circles), where for each pair one has a similar structure as for the
single pair for q = 2. For q > 3 we get $\frac{q(q-1)}{2}$ pairs and a similar higher-dimensional
structure.
Our model contains quenched disorder. It turns out that there is some kind of self-
averaging. The thermodynamic behaviour of the Hamiltonian is the same for almost
every realization. This is the case for the free energy and the associated fixed point
equations, as is familiar from many quenched disordered models. However, this is not
precisely true for the order parameters. We will see that they show a form of chaotic
size dependence, i.e. the behaviour strongly depends both on the chosen configura-
tion and on the way one takes the infinite-volume limit N → ∞ (that is, along which
subsequence).
3.2 Notations and definitions
We start with some definitions. Consider the set $\Lambda_N = \{1, \cdots, N\} \subset \mathbb{N}^{+}$. Let the
single-spin space χ be a finite set and the N-spin configuration space be $\chi^{\otimes N}$. We
denote a spin configuration by σ and its value at site i by $\sigma_i$. We will consider Potts
spins in the Wu representation [76]. Each of the q possible values provides a spin vector
$e_i$, whose j-th coordinate is given by $e_{i,j} = \delta_{i,j}$. Then the set $\chi^{\otimes N}$ is the N-fold
tensor product of the single-spin space $\chi = \{e_1, \cdots, e_q\}$. In what follows $e_{\sigma_i}$ stands
for the projection of this spin vector on the hypertetrahedron in $\mathbb{R}^{q-1}$ spanned by the end
points of the unit vectors, so every spin value $\sigma_i$ is represented by the projection vector
$e_{\sigma_i}$.
Figure 3.1: Wu representation spin values for q = 3, with the three vectors
$(1, 0)$, $(-\frac{1}{2}, \frac{1}{2}\sqrt{3})$ and $(-\frac{1}{2}, -\frac{1}{2}\sqrt{3})$
For q = 3 we get for example for $e_1$, $e_2$ and $e_3$ the vectors of Figure 3.1. We have set
the projection of the origin (0,0,0) to (0,0) and rescaled the projection of $e_1$ to (1,0).
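The Hamiltonian of our model is defined as follows:
\[
-\beta H_N = \frac{\beta}{N} \sum_{k=1}^{m} \sum_{i,j=1}^{N} \xi^k_i \xi^k_j\, \delta(\sigma_i, \sigma_j) \tag{3.1}
\]
with
\[
\delta(\sigma_i, \sigma_j) = \frac{1}{q} \left[ 1 + (q - 1)\, e_{\sigma_i} \cdot e_{\sigma_j} \right] \tag{3.2}
\]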
where $\xi^k_i$ is the i-th component of the random N-component vector $\xi^k$. For the $\xi^k_i$'s we
choose i.i.d. N(0,1) distributions. The vectors $\xi^k = (\xi^k_1, \cdots, \xi^k_N)$, by analogy with the
standard Hopfield model, are called patterns. If we combine the above, we can rewrite
the Hamiltonian $H_N$ as:
\[
-\beta H_N = \beta\, \frac{q - 1}{q}\, N \sum_{k=1}^{m} \left[ \left( \frac{\sum_{i=1}^{N} \xi^k_i e_{\sigma_i}}{N} \right)^{2}
 + \frac{1}{q - 1} \left( \frac{\sum_{i=1}^{N} \xi^k_i}{N} \right)^{2} \right] \tag{3.3}
\]
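So asymptotically
\[
-\beta H_N = N\, \frac{K}{2} \sum_{k=1}^{m} q_{kN}^{2} \quad\text{with}\quad K = 2\beta\, \frac{q - 1}{q}
 \quad\text{and order parameters}\quad q_{kN} = \frac{1}{N} \sum_{i=1}^{N} \xi^k_i e_{\sigma_i} \tag{3.4}
\]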
The last term inside the brackets in (3.3) is an irrelevant constant; in fact it approaches
zero, due to the strong law of large numbers. Note that for the infinite pattern-limit
m → ∞ the Hamiltonian is still of the same form asymptotically. (The $\xi^k_i$'s are i.i.d.
N(0,1) distributed so $\mathbb{E}\xi^k_i = 0$.) Note that any i.i.d. distribution with zero mean,
finite variance and symmetrically distributed around zero will give an analogous form
of $H_N$, but we plan to consider only Gaussian distributions, for which we will find that
a continuous symmetry can be stochastically broken, just as in [11]. From now on we
drop the subscript N to simplify the notation, when no confusion can arise.
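The rewriting of (3.1) into (3.3) can be checked numerically for a small random instance. The Python sketch below is only a sanity check under stated assumptions: spin values are labelled 0, ..., q − 1, the projected spin vectors are written in $\mathbb{R}^{q}$ coordinates inside the hyperplane (which leaves all inner products unchanged), and the function names, sizes and seed are hypothetical.

```python
import math
import random

def wu_vectors(q):
    """q unit vectors with e_a . e_b = -1/(q-1) for a != b, in R^q coordinates."""
    scale = math.sqrt(q / (q - 1.0))
    return [[scale * ((1.0 if a == b else 0.0) - 1.0 / q) for b in range(q)]
            for a in range(q)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def minus_beta_h_pairwise(beta, xi, sigma):
    """Right-hand side of (3.1), with the Kronecker delta evaluated directly."""
    n = len(sigma)
    total = 0.0
    for pattern in xi:
        for i in range(n):
            for j in range(n):
                if sigma[i] == sigma[j]:
                    total += pattern[i] * pattern[j]
    return beta * total / n

def minus_beta_h_order_params(beta, xi, sigma, q):
    """Right-hand side of (3.3), built from the projected spin vectors."""
    n = len(sigma)
    e = wu_vectors(q)
    total = 0.0
    for pattern in xi:
        vec = [sum(pattern[i] * e[sigma[i]][c] for i in range(n)) / n for c in range(q)]
        mean = sum(pattern) / n
        total += dot(vec, vec) + mean * mean / (q - 1.0)
    return beta * (q - 1.0) / q * n * total

rng = random.Random(1)
q, n, m, beta = 3, 12, 2, 0.7
xi = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
sigma = [rng.randrange(q) for _ in range(n)]
lhs = minus_beta_h_pairwise(beta, xi, sigma)
rhs = minus_beta_h_order_params(beta, xi, sigma, q)
```

The two values agree up to rounding for any configuration, since (3.3) is an exact identity; only the step from (3.3) to (3.4) uses the law of large numbers.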
3.3 Ground states
Now it is time to reveal the characteristics of the ground states for the Potts model. First
we discuss the simple behaviour for 1 pattern. Then the more interesting part: q > 2
and 2 patterns.
3.3.1 Ground states for 1 pattern
For one pattern ξ the Hamiltonian is of the following form:
\[
-\beta H_N = N\, \frac{K}{2}\, q_1^{2} = \frac{\beta}{N} \sum_{i,j=1}^{N} \xi_i \xi_j\, \delta(\sigma_i, \sigma_j) \tag{3.5}
\]
We easily see that the ground states are obtained by directing the spins with $\xi_i > 0$
in one direction and the spins with $\xi_i \le 0$ in a different direction. If the
distribution of the $\xi_i$'s is $P(\xi_i = \pm 1) = \frac{1}{2}$, then the order parameter is of the form
$q_1 = \frac{1}{2}(e_{\sigma_i} - e_{\sigma_j})$, with $1 \le i, j \le q$ and $i \neq j$, see also [27]. So for q = 3 we have
only 6 ground states. They form a regular hexagon:
\[
\left( \pm\tfrac{3}{4},\, \tfrac{\sqrt{3}}{4} \right), \quad
\left( \pm\tfrac{3}{4},\, -\tfrac{\sqrt{3}}{4} \right), \quad
\left( 0,\, \pm\tfrac{\sqrt{3}}{2} \right) \tag{3.6}
\]
This regular hexagon with its interior is the convex set of possible order parameter
values. It is easy to see that for $\xi_i$ N(0,1)-distributed we get the same ground states
except for a scaling factor $\sqrt{2/\pi}$ multiplying the order parameter values.
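The six hexagon points can be generated directly from the three spin vectors of Figure 3.1. The following Python sketch lists the values $\frac{1}{2}(e_a - e_b)$ for all ordered pairs a ≠ b, once for ±1 patterns and once rescaled by $\sqrt{2/\pi}$, the mean of |ξ| for a standard Gaussian ξ. The names `E` and `hexagon_points` are hypothetical.

```python
import math

# The three projected spin vectors for q = 3 (Figure 3.1).
E = [(1.0, 0.0), (-0.5, math.sqrt(3) / 2), (-0.5, -math.sqrt(3) / 2)]

def hexagon_points(scale=1.0):
    """Order parameters scale * (e_a - e_b) / 2 for all ordered pairs a != b."""
    return [
        (scale * (E[a][0] - E[b][0]) / 2, scale * (E[a][1] - E[b][1]) / 2)
        for a in range(3) for b in range(3) if a != b
    ]

bernoulli_points = hexagon_points()
gaussian_points = hexagon_points(math.sqrt(2 / math.pi))  # E|xi| for xi ~ N(0,1)
```

All six points have the same length $\sqrt{3}/2$ (times the scale factor), which is what makes the hexagon regular.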
3.3.2 Ground states for 2 patterns
The Hamiltonian for 2 patterns (Gaussian i.i.d.) is:
\[
-\beta H_N = \frac{\beta}{N} \sum_{i,j=1}^{N} \left( \xi^1_i \xi^1_j + \xi^2_i \xi^2_j \right) \delta(\sigma_i, \sigma_j)
 = N\, \frac{K}{2}\, (q_1^{2} + q_2^{2}) \tag{3.7}
\]
As in [11], we make use of the fact that the joint distribution of 2 independent
identically distributed Gaussians has a continuous rotation symmetry. This symmetry
also shows up in the order parameters.
Ising spins
First we consider Ising spins (i.e. we take q = 2). In [11] it is proven that the ground
states are as follows. The order parameters become $\pm(r\cos\theta, r\sin\theta)$, with $\theta \in [0, \pi)$
and $r = \sqrt{2/\pi}$. Note that there are uncountably many ground states.
This can be made plausible by the following observations. Note that the random
fields $\{\mathrm{sign}(\xi^\mu_i)\}$ are distributed as standard Hopfield patterns:
$P(\mathrm{sign}(\xi^\mu_i) = \pm 1) = 1/2$. So if we choose σ such that for each i: $\sigma_i = \mathrm{sign}(\xi^1_i)$, then
we obtain the state with corresponding order parameters $(r, 0)$ in the limit N → ∞,
which is the ground state configuration corresponding to θ = 0. The spin configuration
$\sigma_i = \mathrm{sign}(\xi^2_i)$ for all i corresponds to θ = π/2. By the global spin-flip symmetry of the
Hamiltonian we obtain the ground states corresponding to θ = π and θ = 3π/2.
But what about the θ values in between? The set of Gaussian patterns has a continuous
rotation symmetry. We obtain two new patterns by multiplying the patterns
with a rotation matrix, i.e. rotating the patterns over an angle θ (with 0 ≤ θ < π/2):
\[
\begin{pmatrix} \eta^1_i(\theta) \\ \eta^2_i(\theta) \end{pmatrix}
 = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}
   \begin{pmatrix} \xi^1_i \\ \xi^2_i \end{pmatrix} \tag{3.8}
\]
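The corresponding order parameters we denote by q(θ). To obtain the original patterns
from $\eta^1(\theta)$ and $\eta^2(\theta)$ simply perform the rotation again:
\[
\begin{pmatrix} \xi^1_i \\ \xi^2_i \end{pmatrix}
 = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}
   \begin{pmatrix} \eta^1_i(\theta) \\ \eta^2_i(\theta) \end{pmatrix} \tag{3.9}
\]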
By the rotation (3.8) of the standard Gaussian patterns $\xi^1$ and $\xi^2$ we obtain two new
patterns $\eta^1$ and $\eta^2$ which again are Gaussian distributed. Note that
$\mathbb{E}\eta^1_i(\theta) = \mathbb{E}\eta^2_i(\theta) = \mathbb{E}\xi^1_i = \mathbb{E}\xi^2_i = 0$. Furthermore the variance of $\eta^1_i(\theta)$ and
$\eta^2_i(\theta)$ is the same as for $\xi^1_i$ and $\xi^2_i$, i.e. 1. Therefore the distribution of the rotated
patterns $\eta^1(\theta)$ and $\eta^2(\theta)$ is the same as for the old ones, namely standard N-variate
Gaussian. It is easily checked that for each i the variables $\eta^1_i(\theta)$ and $\eta^2_i(\theta)$ are
uncorrelated, and because they are jointly Gaussian they are also independent.
For any θ it holds
\[
\xi^1_i \xi^1_j + \xi^2_i \xi^2_j = \eta^1_i(\theta)\, \eta^1_j(\theta) + \eta^2_i(\theta)\, \eta^2_j(\theta) \tag{3.10}
\]
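Both the invariance (3.10) and the fact that applying the matrix of (3.8) twice returns the original patterns are easy to confirm numerically. The Python sketch below does this for a handful of Gaussian pairs; the angle, the sample size, the seed and the name `transform` are arbitrary choices made for the illustration.

```python
import math
import random

def transform(theta, x1, x2):
    """Apply the matrix of (3.8) to the pair (x1, x2)."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x1 + s * x2, s * x1 - c * x2

rng = random.Random(2)
theta = 0.4
xi = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(5)]
eta = [transform(theta, a, b) for a, b in xi]
back = [transform(theta, a, b) for a, b in eta]   # second application, as in (3.9)

max_roundtrip_error = max(abs(u - a) + abs(v - b) for (u, v), (a, b) in zip(back, xi))
max_invariance_error = max(
    abs((xi[i][0] * xi[j][0] + xi[i][1] * xi[j][1])
        - (eta[i][0] * eta[j][0] + eta[i][1] * eta[j][1]))
    for i in range(5) for j in range(5)
)
```

Both errors are at the level of floating-point rounding, because the matrix is orthogonal and equal to its own inverse.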
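From this it follows that the energies of the configurations
\[
\sigma(\theta) = \{\mathrm{sign}(\eta^1_i(\theta))\} \tag{3.11}
\]
are the same in the limit N → ∞ and that these configurations are therefore ground
states. This we see by calculating the two corresponding energies:
\[
\begin{aligned}
-\beta H_N(\sigma(0)) &= \frac{\beta}{N} \sum_{i,j=1}^{N} |\xi^1_i \xi^1_j|
 + \frac{\beta}{N} \sum_{i,j=1}^{N} \xi^2_i \xi^2_j\, \mathrm{sign}(\xi^1_i \xi^1_j)
 = \frac{2\beta N}{\pi} + O(\beta\sqrt{N}), \\
-\beta H_N(\sigma(\theta)) &= \frac{\beta}{N} \sum_{i,j=1}^{N} \left( \xi^1_i \xi^1_j + \xi^2_i \xi^2_j \right)
 \mathrm{sign}(\eta^1_i(\theta)\, \eta^1_j(\theta)) \\
 &= \frac{\beta}{N} \sum_{i,j=1}^{N} \left( \eta^1_i(\theta)\eta^1_j(\theta) + \eta^2_i(\theta)\eta^2_j(\theta) \right)
 \mathrm{sign}(\eta^1_i(\theta)\, \eta^1_j(\theta))
 = \frac{2\beta N}{\pi} + O(\beta\sqrt{N})
\end{aligned} \tag{3.12}
\]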
So in the limit it indeed holds that
\[
\lim_{N\to\infty} H_N(\sigma(\theta)) = H_N(\sigma(0)) \quad \text{for all } \theta \tag{3.13}
\]
This means that we have an uncountable number of ground-state configurations in the
limit N → ∞.
Structure of the order parameters
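Now we look at what this symmetry means for the order parameters. We consider
the Gibbs measure with the original patterns $\xi^\mu$. Take a configuration σ(θ). The
corresponding order parameters we denote by q(θ). Rewriting the patterns $\xi^\mu$ in terms of
η(θ) according to (3.9) gives
\[
\begin{pmatrix} q_1(\theta) \\ q_2(\theta) \end{pmatrix}
= \begin{pmatrix}
 \frac{1}{N} \sum_{i=1}^{N} \left[ \cos(\theta)\, \eta^1_i(\theta) + \sin(\theta)\, \eta^2_i(\theta) \right] \mathrm{sign}(\eta^1_i(\theta)) \\[4pt]
 \frac{1}{N} \sum_{i=1}^{N} \left[ \sin(\theta)\, \eta^1_i(\theta) - \cos(\theta)\, \eta^2_i(\theta) \right] \mathrm{sign}(\eta^1_i(\theta))
\end{pmatrix}
= \sqrt{\frac{2}{\pi}} \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} + O(1/\sqrt{N}) \tag{3.14}
\]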
because $\mathrm{sign}(\eta^1_i)$ is independent of $\eta^2_i$. The $O(1/\sqrt{N})$ term is in general different
for different θ. However, the equality q(θ) = −q(θ + π) is exact because
$\eta^\mu(\theta) = -\eta^\mu(\pi + \theta)$. Because of this the energies of the configurations
$\sigma = \mathrm{sign}(\eta^\mu(\theta))$ and $\sigma = -\mathrm{sign}(\eta^\mu(\theta)) = \mathrm{sign}(\eta^\mu(\pi + \theta))$ are also the same.
In [11] it is proven that for finite N the energy attains its global minimum only for one
pair ($\theta_0(N)$ and $\theta_0(N) + \pi$). The value of $\theta_0(N)$ depends on the system size.
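The circle structure of the order parameters can be made visible with a small Monte Carlo experiment. The Python sketch below draws two Gaussian patterns, forms the Ising configuration $\sigma_i = \mathrm{sign}(\eta^1_i(\theta))$ for a few angles and compares q(θ) with $\sqrt{2/\pi}\,(\cos\theta, \sin\theta)$. The system size, the seed, the chosen angles and the function name are assumptions of this illustration.

```python
import math
import random

def order_parameters(theta, xi1, xi2):
    """q(theta) for the Ising configuration sigma_i = sign(eta^1_i(theta))."""
    c, s = math.cos(theta), math.sin(theta)
    n = len(xi1)
    q1 = q2 = 0.0
    for a, b in zip(xi1, xi2):
        sigma = 1.0 if c * a + s * b >= 0 else -1.0
        q1 += a * sigma
        q2 += b * sigma
    return q1 / n, q2 / n

rng = random.Random(3)
N = 20000
xi1 = [rng.gauss(0, 1) for _ in range(N)]
xi2 = [rng.gauss(0, 1) for _ in range(N)]
r = math.sqrt(2 / math.pi)
deviations = []
for theta in (0.0, 0.7, 1.5, 2.4):
    q1, q2 = order_parameters(theta, xi1, xi2)
    deviations.append(math.hypot(q1 - r * math.cos(theta), q2 - r * math.sin(theta)))
```

The deviations are expected to be of order $1/\sqrt{N}$, so for this N they should be small compared with the radius; they differ from angle to angle, which is the finite-size effect that singles out a preferred pair of angles.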
When we add to the Hamiltonian the term $-(\varepsilon/N) \sum_{i=1}^{N} \eta^1_i(\theta_1)\, \sigma_i$, with ε > 0 and
$\theta_1$ fixed, the degeneracy of the ground states is broken even when N → ∞. Now only
the configuration $\{\mathrm{sign}(\eta^1_i(\theta_1))\}$ is a ground state, i.e.
$q = \sqrt{2/\pi}\,(\cos\theta_1, \sin\theta_1)$. For β finite and large enough the same holds in the limit
N → ∞ but with r(β) instead of $\sqrt{2/\pi}$. This also corresponds to the results proven in [11].
Potts-spins
To obtain the ground states in the case of Potts neurons we follow the same strategy as
for the Ising neurons. We consider the sign fields $\{\mathrm{sign}(\eta^1_i(\theta))\}$. The corresponding
ground-state configurations σ(θ) we obtain as follows. If $\mathrm{sign}(\eta^1_i(\theta)) = 1$ we set
$\sigma_i = k$. When $\mathrm{sign}(\eta^1_i(\theta)) = -1$ we set $\sigma_i = k'$, with $k \neq k'$ and $k, k' \in \{1, \cdots, q\}$.
This gives us q(q − 1) possible values for the order parameters q(θ) for each θ, the
so-called discrete symmetry. If we look carefully at the values of q(θ) we see that when
we take the union of q(θ) over all θ the resulting curves consist of q(q − 1)/2 circles in
the order-parameter space. This provides the continuous symmetry of the ground states,
which originates from the continuous rotational symmetry between the two Gaussian
patterns $\xi^1$ and $\xi^2$.
We take one of the q(q − 1) values by considering the ground-state configurations
\[
\mathrm{sign}(\eta^1_i(\theta)) = 1 \;\to\; \sigma_i = e_1, \qquad
\mathrm{sign}(\eta^1_i(\theta)) = -1 \;\to\; \sigma_i = e_2 \tag{3.15}
\]
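In the same way as for (3.14) we obtain for $q_1(\theta)$, by using independence,
\[
q_1(\theta) = \frac{\cos(\theta)}{N} \sum_{i=1}^{N/2} |\eta^1_i(\theta)|
 \left[ \begin{pmatrix} 1 \\ 0 \end{pmatrix} - \begin{pmatrix} -\frac{1}{2} \\ \frac{1}{2}\sqrt{3} \end{pmatrix} \right]
 + O(1/\sqrt{N})
 = \cos(\theta) \sqrt{\frac{2}{\pi}} \begin{pmatrix} \frac{3}{4} \\ -\frac{1}{4}\sqrt{3} \end{pmatrix} + O(1/\sqrt{N}) \tag{3.16}
\]
and
\[
q_2(\theta) = \sin(\theta) \sqrt{\frac{2}{\pi}} \begin{pmatrix} \frac{3}{4} \\ -\frac{1}{4}\sqrt{3} \end{pmatrix} + O(1/\sqrt{N}) \tag{3.17}
\]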
By considering the other possibilities we obtain all six discrete points. These have
the $q_1$-coordinates of (3.6) multiplied by the factor $\sqrt{2/\pi}$. By rotating we obtain the
circles. Because q(θ) = −q(θ + π) we obtain the same structure as for the Ising spins.
The same is true for the order parameters resulting from the Gibbs states.
Without much effort this is also seen to be true for infinitely many patterns (as long
as their number grows logarithmically with the system size). However, the precise
structure of the Gibbs states has not been proven yet and is still being investigated.
This is an example of chaotic size dependence, based on the breaking of a stochastic
symmetry, of the same nature as in [11]. Because of weak compactness, different
subsequences exist whose q(q − 1)-tuples of ground states converge to q(q − 1)-tuples
associated to particular θ-values. These subsequences depend on the random pattern
realization. See Section 3.5.
For any finite number m ≥ 3 of patterns one has the same discrete structure as before, but
instead of a continuous circle symmetry we have a continuous m-sphere symmetry
(isomorphic to O(m)). The case of an infinite (that is, increasing with the system size) m is
still open. However, the limiting metastate structure of the Gibbs states when considering
infinite sequences in N is more complicated.
3.4 Positive temperatures
In this section we obtain an expression for the free energy which is maximized over
the order parameters $q_k$. By large deviation arguments we relate this expression, and
therefore the free energy, to the average of the energy over the induced measure of the
order parameters $q_k$.
3.4.1 Fixed-point mean-field equations
Recall that
\[
Z_N = \mathrm{Tr}_{\sigma} \exp\Big( N\, \frac{K}{2} \sum_{k=1}^{m} q_k^{2} \Big) \tag{3.18}
\]
Due to the quadratic dependence on $q_k$ this is hard to compute. Therefore we would like to
linearize the terms in the exponential. For this we use the following identity:
\[
e^{a x^{2}/2} = \sqrt{\frac{aN}{2\pi}} \int_{-\infty}^{\infty} dm\; e^{-N a m^{2}/2 + \sqrt{N}\, a m x} \tag{3.19}
\]
Note that
\[
q_k^{2} = \sum_{i=1}^{q-1} q_{ki}^{2} \tag{3.20}
\]
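So if we set $x = \sqrt{N}\, q_{ki}$ and a = K we obtain
\[
\exp\Big( N\, \frac{K}{2}\, q_k^{2} \Big) = \prod_{i=1}^{q-1} \sqrt{\frac{KN}{2\pi}} \int_{-\infty}^{\infty} dm_{ki}\,
 \exp\big( -K N m_{ki}^{2}/2 + K N m_{ki}\, q_{ki} \big) \tag{3.21}
\]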
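Applying this to each of the m order parameters $q_k$ and inserting the result into $Z_N$ we obtain
\[
Z_N = \mathrm{Tr}_{\sigma} \prod_{k=1}^{m} \Big( \frac{KN}{2\pi} \Big)^{\frac{q-1}{2}} \int_{\mathbb{R}^{q-1}} dm_k\,
 \exp\big( -K N m_k^{2}/2 + K N m_k \cdot q_k \big) \tag{3.22}
\]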
This transformation is called the Hubbard-Stratonovich transformation. Notice that the
dependence on $q_k$ is now linear. Because N → ∞ the integral is dominated by the maximal
value of its integrand. Maximizing the exponent in (3.22) gives the saddle point equations
for $m_{ki}$:
\[
\partial_{m_{ki}} \big( -K N m_k^{2}/2 + K m_k \cdot N q_k \big) = 0 \;\to\;
 -K N m_{ki} + K N q_{ki} = 0 \;\to\; m_k = q_k \tag{3.23}
\]
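Further rewriting of (3.22) gives that the partition function $Z_N$ is equal to
\[
Z_N = \Big( \frac{KN}{2\pi} \Big)^{\frac{m(q-1)}{2}} \int_{\mathbb{R}^{m(q-1)}} dm_1 \cdots dm_m\,
 \exp\Big( -K N \sum_{k=1}^{m} m_k^{2}/2
 + N \Big\langle \log \mathrm{Tr}_{\sigma} \exp\Big( \sum_{k=1}^{m} K m_k \cdot \xi^k_1 e_{\sigma} \Big) \Big\rangle_{\xi^1_1, \cdots, \xi^m_1} \Big) \tag{3.24}
\]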
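Now we maximize this exponent. Combining both conditions gives the so-called fixed-point
mean-field equations: maximizing and putting $m_k = q_k$ (the first equation) gives the
mean-field equations for the order parameters, which have the structure of a system of
fixed point equations q = F(q). When we have only two patterns $\xi^1$ and $\xi^2$ these are as
follows:
\[
\begin{aligned}
q_1 &= \Big\langle \xi^1_1\, \frac{\mathrm{tr}_{\sigma}\, e_{\sigma} \exp\left[ K (\xi^1_1 q_1 + \xi^2_1 q_2) \cdot e_{\sigma} \right]}
 {\mathrm{tr}_{\sigma} \exp\left[ K (\xi^1_1 q_1 + \xi^2_1 q_2) \cdot e_{\sigma} \right]} \Big\rangle_{\xi^1_1, \xi^2_1}, \\
q_2 &= \Big\langle \xi^2_1\, \frac{\mathrm{tr}_{\sigma}\, e_{\sigma} \exp\left[ K (\xi^1_1 q_1 + \xi^2_1 q_2) \cdot e_{\sigma} \right]}
 {\mathrm{tr}_{\sigma} \exp\left[ K (\xi^1_1 q_1 + \xi^2_1 q_2) \cdot e_{\sigma} \right]} \Big\rangle_{\xi^1_1, \xi^2_1}
\end{aligned} \tag{3.25}
\]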
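To see what such a fixed point system looks like in practice, here is a minimal Python sketch for the simplest case q = 2, which is an assumption made to keep the example short. For Ising spins the projected spin vectors are ±1 and the ratio of traces reduces to tanh, so the equations read $q_k = \langle \xi^k \tanh(K(\xi^1 q_1 + \xi^2 q_2)) \rangle$ with the pattern factor $\xi^k$ that comes from differentiating the exponent of (3.24). The Gaussian average is replaced by a normalised sum over a grid; the grid, the value K = 3, the starting point and all function names are hypothetical choices.

```python
import math

def gaussian_grid(half_width=6.0, step=0.15):
    """Grid points and normalised weights approximating a standard normal average."""
    count = int(round(2 * half_width / step)) + 1
    pts = [-half_width + k * step for k in range(count)]
    w = [math.exp(-x * x / 2.0) for x in pts]
    z = sum(w)
    return pts, [v / z for v in w]

def mean_field_map(q1, q2, K, pts, w):
    """One application of q -> F(q) for q = 2 (trace ratio equals tanh)."""
    f1 = f2 = 0.0
    for x1, w1 in zip(pts, w):
        for x2, w2 in zip(pts, w):
            t = math.tanh(K * (x1 * q1 + x2 * q2)) * w1 * w2
            f1 += x1 * t
            f2 += x2 * t
    return f1, f2

def solve(q1, q2, K, iterations=60):
    """Plain fixed point iteration started from (q1, q2)."""
    pts, w = gaussian_grid()
    for _ in range(iterations):
        q1, q2 = mean_field_map(q1, q2, K, pts, w)
    return q1, q2

start = (0.5, 0.2)
fixed = solve(*start, K=3.0)
```

Because of the rotation symmetry of the Gaussian pair, the iteration keeps the direction of the starting vector and only adjusts its length, which stays below $\sqrt{2/\pi}$. Every direction therefore yields a solution, in line with the circle of solutions described for the ground states.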
3.4.2 Induced measure on order parameters
Now we try to find an expression which in the infinite-neuron limit equals the induced
Gibbs measure $L_{\infty,\beta}$ on the order parameters. To this end we calculate the free energy
by using the Laplace method. When we look carefully at the integrand in (3.22) we see
\[
-\beta f(\beta) = \lim_{N\to\infty} \frac{1}{N} \log Z_N = \max_{m} \big( -Q(m) + c(Km) \big) \tag{3.26}
\]
where c(m) is the generating function of the pattern distributions:
\[
c(m) = \sum_{k=1}^{m} \big\langle \ln \mathbb{E}_{\sigma} \exp(\zeta_k\, m_k \cdot e_{\sigma}) \big\rangle_{\zeta_k} \tag{3.27}
\]
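Here
\[
Q(m) = K m^{2}/2, \qquad \nabla Q(m) = K m \tag{3.28}
\]
From the maximizers we can also read off the fixed point equations. When we differentiate
the expression maximized in (3.26) with respect to m componentwise, using (3.28), we get
\[
\nabla Q(m) = K\, \nabla c(\nabla Q(m)) \;\Rightarrow\; m = \nabla c(\nabla Q(m)) \tag{3.29}
\]
This we can relate to the rate function $c^{*}(m)$, which is the Legendre transform of c:
\[
c^{*}(m) = \sup_{t}\, [\, m \cdot t - c(t) \,] \tag{3.30}
\]
For fixed m the supremum is attained at the vector t for which $m = \nabla c(t)$. But because
of the fixed point equations $m = \nabla c(\nabla Q(m))$. Therefore $t = \nabla Q(m)$. So
\[
c^{*}(m) = m \cdot \nabla Q(m) - c(\nabla Q(m)) \tag{3.31}
\]
Insert this into (3.26) to obtain
\[
-\beta f(\beta) = \max_{m} \big( Q(m) - c^{*}(m) \big) \tag{3.32}
\]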
For N → ∞ the equation m = q holds. This gives that
lim
N→∞
1
N
log Z
N
= lim
N→∞