7 Entropy as a measure of disorder?

How, then, can the unknown function $h(p, t)$ [and $w(p, t)$] be determined? According to the last section, all the required information on $h(p, t)$ may be obtained from a knowledge of the term $Q$ in the differential equation (26) for $\rho(x, t)$. We shall try to solve this problem by means of the following two-step strategy: (i) find an additional physical condition for the fundamental probability density $\rho(x, t)$; (ii) determine the shape of $Q$ [as well as that of $h(p, t)$ and $w(p, t)$] from this condition.

At this point it may be useful to recall the way probability densities are determined in classical statistical physics. After all, the present class of theories is certainly not of a deterministic nature and belongs fundamentally to the same class of statistical theories (i.e. theories that are incomplete with regard to the description of single events) as classical statistical physics, no matter how important the remaining differences may be.

The physical condition for $\rho$ which determines the behavior of ensembles in classical statistical physics is the principle of maximal (Boltzmann) entropy. It agrees essentially with the information-theoretic measure of disorder introduced by Shannon [41]. Using this principle, both the micro-canonical and the canonical distribution of statistical thermodynamics may be derived under appropriate constraints. Let us discuss this classical extremal principle in some detail in order to see if it can be applied, after appropriate modifications, to the present problem. This question also entails a comparison of different types of statistical theories.

The Boltzmann-Shannon entropy is defined as a functional $S[\rho]$ of an arbitrary probability density $\rho$. The statistical properties characterizing disorder, which may be used to define this functional, are discussed in many publications [5][22]. Only one of these conditions will, for later use, be written down here, namely the so-called “composition law”: Let us assume that $\rho$ may be written in the form $\rho = \rho_1 \rho_2$, where $\rho_i$, $i = 1, 2$, depends only on points in a subspace $X_i$ of our $n$-dimensional sample space $X$, and let us further assume that $X$ is the direct product of $X_1$ and $X_2$. Thus, this system consists of two independent subsystems. Then, the composition law is given by

$$
S[\rho_1 \rho_2] = S^{(1)}[\rho_1] + S^{(2)}[\rho_2], \tag{41}
$$

where $S^{(i)}$ operates only on $X_i$.

For a countable sample space with events labeled by indices $i$ from an index set $I$ and probabilities $\rho_i$, the entropy is given by

$$
S[\rho] = -k \sum_{i \in I} \rho_i \ln \rho_i, \tag{42}
$$

where $k$ is a constant. To obtain meaningful results, the extrema of (42) under appropriate constraints, or subsidiary conditions, must be found. The simplest constraint is the normalization condition $\sum \rho_i = 1$. In this case the extrema of the function

$$
F[\rho, \lambda] = -k \sum_{i \in I} \rho_i \ln \rho_i + \lambda \left( \sum_{i \in I} \rho_i - 1 \right) \tag{43}
$$

with respect to the variables $\rho_1, \ldots, \rho_N, \lambda$ must be calculated. One obtains the reasonable result that the minimal value of $F[\rho, \lambda]$ is $0$ (one of the $\rho_i$ equal to $1$, all others equal to $0$) and the maximal value is $k \ln N$ (all $\rho_i$ equal, $\rho_i = 1/N$).
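
The two extreme values quoted above, and the composition law (41), can be checked numerically for a small sample space. The following is a minimal sketch, not part of the original derivation: it assumes k = 1, a sample space of N = 5 events, and the usual convention 0 ln 0 = 0; the helper names `shannon_entropy` and `random_distribution` are hypothetical and introduced only for illustration.

```python
import math
import random


def shannon_entropy(probs, k=1.0):
    """Discrete entropy of Eq. (42), with the convention 0 * ln 0 = 0."""
    return k * sum(-p * math.log(p) for p in probs if p > 0.0)


def random_distribution(n, rng):
    """Hypothetical helper: draw a normalized probability vector of length n."""
    weights = [rng.random() for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


N = 5  # assumed size of the sample space
uniform = [1.0 / N] * N            # all rho_i equal: maximal disorder
certain = [1.0] + [0.0] * (N - 1)  # one rho_i equal to 1: no disorder

s_uniform = shannon_entropy(uniform)
s_certain = shannon_entropy(certain)

# Entropies of randomly drawn normalized distributions lie between the extremes.
rng = random.Random(0)
samples = [shannon_entropy(random_distribution(N, rng)) for _ in range(1000)]

# Composition law (41): the entropy of a product distribution is additive.
rho1 = random_distribution(3, rng)
rho2 = random_distribution(4, rng)
joint = [p * q for p in rho1 for q in rho2]
s_joint = shannon_entropy(joint)
s_sum = shannon_entropy(rho1) + shannon_entropy(rho2)

print("uniform:", s_uniform, "k ln N:", math.log(N))
print("certain:", s_certain)
print("sample range:", min(samples), max(samples))
print("joint:", s_joint, "sum of parts:", s_sum)
```

Random sampling only illustrates the bounds and does not prove them; the proof is the Lagrange multiplier argument based on (43).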

For most problems of physical interest the sample space is non-denumerable. A straightforward generalization of Eq. (42) is given by

$$
S[\rho] = -k \int dx\, \rho(x) \ln \rho(x), \tag{44}
$$

where the symbol $x$ now denotes a point in the appropriate (generally $n$-dimensional) sample space. There are some problems inherent in this straightforward transition to a continuous set of events, which will be mentioned briefly in the next section. Let us put aside these problems for the moment and ask if (44) makes sense from a physical point of view. For non-denumerable problems the principle of maximal disorder leads to a variational problem, and the method of Lagrange multipliers may still be used to combine the requirement of maximal entropy with other defining properties (constraints). An important constraint is the property of constant temperature, which leads to the condition that the expectation value of the possible energy values $E(x)$ is given by a fixed number $E$,

$$
E = \int dx\, \rho(x) E(x). \tag{45}
$$

If, in addition, normalizability is implemented as a defining property, then the true distribution should be an extremum of the functional

$$
K[\rho] = -k \int dx\, \rho(x) \ln \rho(x) - \lambda_2 \int dx\, \rho(x) E(x) - \lambda_1 \int dx\, \rho(x). \tag{46}
$$

It is easy to see that the well-known canonical distribution of statistical physics is indeed an extremum of $K[\rho]$. Can we use a properly adapted version of this powerful principle of maximal disorder (entropy) to solve our present problem?
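
Before turning to that question, the extremum property just stated can be verified in a few lines. The following is a sketch under stated assumptions: $k > 0$, $\rho(x) > 0$, and the integral defining $Z$ converges. The symbols $Z$ and $\beta$ are shorthand introduced here and do not appear in the text above.

```latex
% Sketch: stationarity of K[\rho] from Eq. (46).
% Z and \beta are shorthand introduced here, not symbols from the text.
\begin{align*}
\frac{\delta K}{\delta \rho(x)}
  &= -k\,\bigl[\ln \rho(x) + 1\bigr] - \lambda_2 E(x) - \lambda_1 = 0 \\
\Rightarrow\quad \rho(x)
  &= \exp\!\Bigl(-1 - \frac{\lambda_1}{k}\Bigr)\,
     \exp\!\Bigl(-\frac{\lambda_2}{k}\,E(x)\Bigr) \\
  &= \frac{1}{Z}\, e^{-\beta E(x)},
  \qquad \beta = \frac{\lambda_2}{k},
  \qquad Z = \int dx\, e^{-\beta E(x)} .
\end{align*}
% Second variation: negative for k > 0 and \rho > 0, so the extremum is a maximum.
\[
\delta^2 K = -k \int dx\, \frac{[\delta\rho(x)]^2}{\rho(x)} < 0 .
\]
```

The last equality in the first display uses the normalization condition, which eliminates $\lambda_1$; the remaining multiplier $\lambda_2$ is fixed by the energy condition (45). If $k$ is read as Boltzmann's constant, the usual identification would be $\lambda_2 = 1/T$, although the text does not state this explicitly.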

Let us compare the class of theories derived in section 5 with classical theories like (46). This may be of interest also in view of a possible identification of ‘typical quantum mechanical properties’ of statistical theories. We introduce for clarity some notation, based on properties of the sample space. Classical statistical physics theories like (46) will be referred to as “phase space theories”. The class of statistical theories derived in section 5 will be referred to as “configuration space theories”.

The most fundamental difference between phase space theories and configuration space theories concerns the physical meaning of the coordinates. The coordinates x of phase space theories are (generally time-dependent) labels for particle properties. In contrast, configuration space theories are field theories; individual particles do not exist and the (in our case one-dimensional) coordinates x are points in space.

A second fundamental difference concerns the dimension of the sample space. Elementary events in phase space theories are points in phase space (of dimension 6 for a 1-particle system), including configuration-space and momentum-space (particle) coordinates, while the elementary events of configuration space theories are (space) points in configuration space (which would be of dimension 3 for a 1-particle system in three spatial dimensions). This fundamental difference is a consequence of a (generally nonlocal) dependence between momentum coordinates and space-time points contained in the postulates of the present theory, in particular in the postulated form of the probability current [see (7)]. This assumption, a probability current that takes the form of the gradient of a function $S$ (multiplied by $\rho$), is a key feature distinguishing configuration space theories, as potential quantum-like theories, from the familiar (many body) phase space theories. The existence of this dependence per se is not an exclusive feature of quantum mechanics; it is a property of all theories belonging to the configuration class, including the theory characterized by $\hbar = 0$, which will be referred to as the “classical limit theory”. What distinguishes the classical limit theory from quantum mechanics is the particular form of this dependence: for the former it is given by a conventional functional relationship (as discussed in section 4), for the latter it is given by a nonlocal relationship whose form is still to be determined.

This dependence is responsible for the fact that no “global” condition [like (45) for the canonical distribution] needs to be introduced for the present theory in order to guarantee conservation of energy in the mean; this conservation law can be guaranteed “locally” for arbitrary theories of the configuration class by adjusting the relation between $Q$ (the form of the dynamic equation) and $h$ (the definition of expectation values). In phase space theories the form of the dynamical equations is fixed (given by the deterministic equations of classical mechanics). Under constraints like (45) the above principle of maximal disorder creates, basically by selecting appropriate initial conditions, those systems which belong to a particular energy; for non-stationary conditions the deterministic differential equations of classical mechanics then guarantee that energy conservation holds for all times. In contrast, in configuration space theories there are no initial conditions (for particles). The conditions which are at our disposal are the mathematical form of the expectation values (the function $h$) and/or the mathematical form of the differential equation (the function $Q$). Thus, if something like the principle of maximal disorder can be used in the present theory, it will determine the form of the differential equation for $\rho$ rather than the explicit form of $\rho$.

These considerations raise some doubt as to the usefulness, for the present problem, of a measure of disorder like the entropy (44), which depends essentially on $E$ instead of $x$ and does not contain derivatives of $\rho$. We may still look for an information-theoretic extremal principle of the general form

$$
I[\rho] + \sum_l \lambda_l C_l[\rho] \to \text{extremum}. \tag{47}
$$

Here, the functional $I[\rho]$ attains its maximal value for the function $\rho$ which describes, under given constraints $C_l[\rho]$, the maximal disorder. But $I[\rho]$ will differ from the entropy functional, and appropriate constraints $C_l[\rho]$, reflecting the local character of the present problem, still have to be found. Both terms in (47) are at our disposal and will be defined in the next sections.