2 Basic equations for a class of statistical theories

In I three different types of theories have been defined which differ from each other with regard to the role of probability. We give a short review of the defining properties and supply some additional comments characterizing these theories.

The dogma underlying theories of type 1 is determinism with regard to single events; probability does not play any role. If nature behaves according to this dogma, then measurements on identically prepared individual systems yield identical results. Classical mechanics is obviously such a deterministic type 1 theory. We shall use below (as a ‘template’ for the dynamics of our statistical theories) the following version of Newton’s law, where the particle momentum $p_k(t)$ plays the role of a second dynamic variable besides the spatial coordinate $x_k(t)$:
$$
\frac{d}{dt} x_k(t) = \frac{p_k(t)}{m}, \qquad \frac{d}{dt} p_k(t) = F_k(x, p, t). \tag{2}
$$

In classical mechanics there is no restriction as regards the admissible forces. Thus, $F_k$ is an arbitrary function of $x_k$, $p_k$, $t$; it is, in particular, not required that it be derivable from a potential. Note that Eqs. (2) do not hold in the present theory; these relations are just used to establish a correspondence between classical mechanics and associated statistical theories.
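As an illustration of single-event determinism, the following minimal Python sketch integrates Eqs. (2) in one spatial dimension with a momentum-dependent force that is not derivable from a potential. The linear friction force $F = -\gamma p$, the parameter values, the fourth-order Runge-Kutta scheme and all function names are assumptions chosen for this example; none of them appears in the original text.

```python
import math

def force(x, p, t, gamma=0.5):
    """Hypothetical p-dependent force F(x, p, t) = -gamma * p (no potential exists for it)."""
    return -gamma * p

def rhs(x, p, t, m):
    """Right-hand sides of Eqs. (2): dx/dt = p/m, dp/dt = F(x, p, t)."""
    return p / m, force(x, p, t)

def integrate(x0, p0, m=1.0, t_end=2.0, n_steps=2000):
    """Classical fourth-order Runge-Kutta integration of Eqs. (2)."""
    x, p, t = x0, p0, 0.0
    h = t_end / n_steps
    for _ in range(n_steps):
        k1x, k1p = rhs(x, p, t, m)
        k2x, k2p = rhs(x + 0.5 * h * k1x, p + 0.5 * h * k1p, t + 0.5 * h, m)
        k3x, k3p = rhs(x + 0.5 * h * k2x, p + 0.5 * h * k2p, t + 0.5 * h, m)
        k4x, k4p = rhs(x + h * k3x, p + h * k3p, t + h, m)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        t += h
    return x, p

# Two "identically prepared" systems: same initial values, same dynamics.
run_a = integrate(0.0, 1.0)
run_b = integrate(0.0, 1.0)
print(run_a == run_b)
```

In this sketch the two runs agree exactly, which is the type 1 behaviour described above: identical preparation implies identical results.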

Experimental data from atomic systems, recorded since the beginning of the last century, indicate that nature does not behave according to this single-event deterministic dogma. A simple but somewhat unfamiliar idea is to construct a theory which is deterministic only in a statistical sense. This means that measurements on identically prepared individual systems do not yield identical results (no determinism with regard to single events), but repeated measurements on ensembles [consisting each time of a large (infinite) number of measurements on individual systems] yield identical results. In this case we have ‘determinism’ with regard to ensembles (expectation values, or probabilities).

Note that such a theory is far from chaotic even if our macroscopic anticipation of (single-event) determinism is not satisfied. Note also that there is no reason to assume that such a statistical theory for microscopic events is incompatible with macroscopic determinism. It is a frequently observed (but not always completely understood) phenomenon in nature that systems with many (microscopic) degrees of freedom can be described by a much smaller number of variables. During this process of elimination of variables the details of the corresponding microscopic theory for the individual constituents are generally lost. In other words, there is no reason to assume that a fundamental statistical law for individual atoms and a deterministic law for a piece of matter consisting of, say, $10^{23}$ atoms should not be compatible with each other. This way of characterizing the relation between two physical theories is completely different from the common reductionistic point of view. Convincing arguments in favor of the former may, however, be found in [4, 39].

As discussed in I, two types (referred to as type 2 and type 3) of indeterministic theories may be identified. In type 2 theories laws for individual particles exist (roughly speaking, the individuality of particles remains intact), but the initial values are unknown and are described by probabilities only. An example of such a (classical-statistical) type 2 theory is statistical thermodynamics. On the other hand, in type 3 theories the amount of uncertainty is still greater, insofar as no dynamic laws for individual particles exist any more. A possible candidate for this ‘extreme’ type of indeterministic theory is quantum mechanics.

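The method used in I to construct statistical theories was based on the following three assumptions: (i) a local conservation law of probability, i.e. a continuity equation with a particular form of the probability current; (ii) a set of ‘statistical conditions’, which have the structure of Eqs. (2) with observables replaced by averages; (iii) the requirement of maximal disorder, implemented as the principle of minimal Fisher information.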

These (properly generalized) assumptions also represent the formal basis of the present work. The first and second of these cover type 2 as well as type 3 theories, while it will be shown that the third, the requirement of maximal disorder, holds only for a single type 3 theory, namely quantum mechanics. In this sense quantum mechanics may be considered as the most reasonable theory among all statistical theories defined by the first two assumptions. There is obviously an analogy between quantum mechanics and the principle of minimal Fisher information on the one hand and classical statistical mechanics and the principle of maximal entropy on the other hand; both theories are realizations of the principle of maximal disorder.

Let us now generalize the basic equations of I (see section 3 of I) with respect to the number of spatial dimensions and with respect to gauge freedom. The continuity equation takes the form
$$
\frac{\partial \rho(x,t)}{\partial t} + \frac{\partial}{\partial x_k}\,\frac{\rho(x,t)}{m}\,\frac{\partial \tilde S(x,t)}{\partial x_k} = 0. \tag{3}
$$

We use the summation convention; indices $i, k, \ldots$ run from 1 to 3 and are omitted if the corresponding variable occurs in the argument of a function. The existence of a local conservation law for the probability density $\rho(x,t)$ is a necessity for a probabilistic theory. The same is true for the fact that the probability current takes the form $j_k(x,t) = \rho(x,t)\,\tilde p_k(x,t)/m$, where $\tilde p_k(x,t)$ is the $k$-th component of the momentum probability density. The only non-trivial assumption contained in (3) is the fact that $\tilde p_k(x,t)$ takes the form of the gradient,
$$
\tilde p_k(x,t) = \frac{\partial \tilde S(x,t)}{\partial x_k}, \tag{4}
$$

of a function $\tilde S(x,t)$. In order to gain a feeling for the physical meaning of (4) we could refer to the fact that a similar relation may be found in the Hamilton-Jacobi formulation of classical mechanics [63]; alternatively we could also refer to the fact that this condition characterizes ‘irrotational flow’ in fluid mechanics. Relation (4) could also be justified by means of the principle of simplicity; a gradient is the simplest way to represent a vector field, because it can be derived from a single scalar function.

In contrast to I we allow now for multi-valued functions $\tilde S(x,t)$. At first sight this seems strange since a multi-valued quantity cannot be an observable and should, consequently, not appear in equations bearing a physical meaning. However, only derivatives of $\tilde S(x,t)$ occur in our basic equations. Thus, this freedom is possible without any additional postulate; we just have to require that
$$
\tilde S(x,t)\ \text{multi-valued}, \qquad \frac{\partial \tilde S}{\partial t},\ \frac{\partial \tilde S}{\partial x_k}\ \text{single-valued}. \tag{5}
$$

(The quantity $\tilde p$ defined in (4) is not multi-valued; this notation is used to indicate that this quantity has been defined with the help of a multi-valued $\tilde S$.) As discussed in more detail in section 3, this new ‘degree of freedom’ is intimately related to the existence of gauge fields. In contrast to $\tilde S$, the second dynamic variable $\rho$ is a physical observable (in the statistical sense) and is treated as a single-valued function.

The necessary and sufficient condition for single-valuedness of a function $\tilde S(x,t)$ (in a subspace $G \subseteq \mathbb{R}^4$) is that all second order derivatives of $\tilde S(x,t)$ with respect to $x_k$ and $t$ commute with each other (in $G$) [see e.g. [28]]. As a consequence, for a multi-valued $\tilde S$ the order of two derivatives with respect to any one of the variables $x_k$, $t$ must not be changed. We introduce the (single-valued) quantities
$$
\tilde S_{[j,k]} = \left[\frac{\partial^2 \tilde S}{\partial x_j\,\partial x_k} - \frac{\partial^2 \tilde S}{\partial x_k\,\partial x_j}\right], \qquad
\tilde S_{[0,k]} = \left[\frac{\partial^2 \tilde S}{\partial t\,\partial x_k} - \frac{\partial^2 \tilde S}{\partial x_k\,\partial t}\right] \tag{6}
$$

in order to describe the non-commuting derivatives in the following calculations.
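A standard illustration of these conditions, not taken from the text, is the polar angle $\varphi(x_1, x_2)$ in a plane: it is multi-valued (it changes by $2\pi$ on each circuit around the origin), while its first derivatives are single-valued away from the origin. The sketch below uses $\varphi$ as a hypothetical stand-in for $\tilde S$ and integrates its gradient along closed circles. By Stokes' theorem, a nonzero circulation means that the antisymmetric combination of second derivatives (the analogue of $\tilde S_{[1,2]}$) cannot vanish everywhere inside the loop. All function names and parameter values are assumptions made for this example.

```python
import math

def grad_phi(x, y):
    """Single-valued gradient of the multi-valued polar angle phi = atan2(y, x)."""
    r2 = x * x + y * y
    return -y / r2, x / r2

def circulation(cx, cy, radius, n=20000):
    """Line integral of grad(phi) along a circle centered at (cx, cy), midpoint rule."""
    total = 0.0
    dtheta = 2.0 * math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dtheta
        x = cx + radius * math.cos(theta)
        y = cy + radius * math.sin(theta)
        gx, gy = grad_phi(x, y)
        # Tangent vector of the circle multiplied by d(theta)
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        total += gx * dx + gy * dy
    return total

enclosing = circulation(0.0, 0.0, 1.0)  # loop around the singular point
excluding = circulation(3.0, 0.0, 1.0)  # loop that does not enclose it
print(enclosing, excluding)
```

Under these assumptions the first value is close to $2\pi$ and the second close to zero, consistent with second derivatives that commute everywhere except at the singular point.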

The second of the assumptions listed above has been referred to in I as ‘statistical conditions’. For the present three-dimensional theory these are obtained in the same way as in I, by replacing the observables $x_k(t)$, $p_k(t)$ and the force field $F_k(x(t), p(t), t)$ of the type 1 theory (2) by averages $\overline{x_k}$, $\overline{p_k}$ and $\overline{F_k}$. This leads to the relations
\begin{align}
\frac{d}{dt}\,\overline{x_k} &= \frac{\overline{p_k}}{m}, \tag{7} \\
\frac{d}{dt}\,\overline{p_k} &= \overline{F_k(x, p, t)}, \tag{8}
\end{align}

where the averages are given by the following integrals over the random variables $x_k$, $p_k$ (which should be clearly distinguished from the type 1 observables $x_k(t)$, $p_k(t)$, which will not be used any more):

\begin{align}
\overline{x_k} &= \int_{-\infty}^{\infty} d^3x\, \rho(x,t)\, x_k, \tag{9} \\
\overline{p_k} &= \int_{-\infty}^{\infty} d^3p\, w(p,t)\, p_k, \tag{10} \\
\overline{F_k(x,p,t)} &= \int_{-\infty}^{\infty} d^3x\, d^3p\, W(x,p,t)\, F_k(x,p,t). \tag{11}
\end{align}

The time-dependent probability densities $W$, $\rho$, $w$ should be positive semidefinite and normalized to unity, i.e. they should fulfill the conditions
$$
\int_{-\infty}^{\infty} d^3x\, \rho(x,t) = \int_{-\infty}^{\infty} d^3p\, w(p,t) = \int_{-\infty}^{\infty} d^3x\, d^3p\, W(x,p,t) = 1. \tag{12}
$$

The densities ρ and w may be derived from the fundamental probability density W by means of the relations
$$
\rho(x,t) = \int_{-\infty}^{\infty} d^3p\, W(x,p,t); \qquad w(p,t) = \int_{-\infty}^{\infty} d^3x\, W(x,p,t). \tag{13}
$$
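The relations (9), (10), (12) and (13) can be checked numerically. The sketch below is restricted to one spatial dimension for brevity and uses a correlated Gaussian as a hypothetical $W(x,p,t)$ at a fixed time; the grid, the parameter values and all names are assumptions made for this example.

```python
import math

def W(x, p, x0=0.3, p0=-0.2, sx=1.0, sp=0.5, c=0.4):
    """Hypothetical correlated Gaussian phase-space density (1D, fixed time)."""
    u, v = (x - x0) / sx, (p - p0) / sp
    norm = 2.0 * math.pi * sx * sp * math.sqrt(1.0 - c * c)
    return math.exp(-(u * u - 2.0 * c * u * v + v * v) / (2.0 * (1.0 - c * c))) / norm

# Uniform grids, wide enough that W is negligible at the boundaries
n, lo_x, hi_x, lo_p, hi_p = 401, -8.0, 8.0, -5.0, 5.0
dx, dp = (hi_x - lo_x) / (n - 1), (hi_p - lo_p) / (n - 1)
xs = [lo_x + i * dx for i in range(n)]
ps = [lo_p + j * dp for j in range(n)]
grid = [[W(x, p) for p in ps] for x in xs]

# Marginals, Eq. (13): rho(x) = int dp W, w(p) = int dx W
rho = [sum(row) * dp for row in grid]
w = [sum(grid[i][j] for i in range(n)) * dx for j in range(n)]

# Normalization, Eq. (12)
norm_rho = sum(rho) * dx
norm_w = sum(w) * dp

# Averages, Eqs. (9) and (10)
x_mean = sum(r * x for r, x in zip(rho, xs)) * dx
p_mean = sum(wj * p for wj, p in zip(w, ps)) * dp
print(norm_rho, norm_w, x_mean, p_mean)
```

For this choice of $W$ both norms come out close to 1 and the averages close to the centers `x0` and `p0` of the Gaussian, even though $W$ does not factorize into $\rho$ and $w$.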

The present construction of the statistical conditions (7) and (8) from the type 1 theory (2) shows two differences as compared to the treatment in I. The first is that we allow now for a $p$-dependent external force. This leads to a more complicated probability density $W(x,p,t)$ as compared to the two decoupled densities $\rho(x,t)$ and $w(p,t)$ of I. The second difference, which is in fact related to the first, is the use of a multi-valued $\tilde S(x,t)$.

Note that the $p$-dependent probability densities $w(p,t)$ and $W(x,p,t)$ have been introduced in the above relations in a purely formal way. We defined an expectation value $\overline{p_k}$ [via Eq. (7)] and assumed [in Eq. (10)] that a random variable $p_k$ and a corresponding probability density $w(p,t)$ exist. But the validity of this assumption is not guaranteed. There is no compelling conceptual basis for the existence of these quantities in a pure configuration-space theory. If they exist, they must be defined with the help of additional considerations (see section 6 of I). The deeper reason for this problem is that the concept of measurement of momentum (which is proportional to the time derivative of position) is ill-defined in a theory whose observables are defined in terms of a large number of experiments at one and the same instant of time (measurement of a derivative requires measurements at different times). Fortunately, these considerations, which have been discussed in more detail in I, do not play a prominent role [apart from the choice of $W(x,p,t)$ discussed in section 4] in the derivation of Schrödinger’s equation reported in the present paper$^{3}$.
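The formal role of Eq. (7) can be made concrete in one dimension. Inserting the continuity equation (3) into the time derivative of Eq. (9) and integrating by parts gives $\overline{p} = \int dx\, \rho\, \partial \tilde S/\partial x$, provided $\rho$ vanishes at infinity. The sketch below checks this numerically for a Gaussian $\rho$ and a quadratic, single-valued $\tilde S$; both choices, the parameter values and all names are assumptions for illustration and are not taken from the text.

```python
import math

m, x0, sigma, a, b = 2.0, 0.5, 1.0, 0.7, -0.3

def rho(x):
    """Hypothetical Gaussian probability density."""
    return math.exp(-(x - x0) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def drho_dx(x):
    """Analytic spatial derivative of the Gaussian density."""
    return -(x - x0) / sigma ** 2 * rho(x)

def dS_dx(x):
    """Gradient of the hypothetical S(x) = a*x + b*x**2/2, i.e. the momentum field of Eq. (4)."""
    return a + b * x

def drho_dt(x):
    """Continuity equation (3) in 1D: d(rho)/dt = -d/dx [rho * dS/dx / m]."""
    return -(drho_dx(x) * dS_dx(x) + rho(x) * b) / m

n, lo, hi = 4001, -12.0, 12.0
h = (hi - lo) / (n - 1)
xs = [lo + i * h for i in range(n)]

total_change = sum(drho_dt(x) for x in xs) * h      # time derivative of the norm
dxbar_dt = sum(x * drho_dt(x) for x in xs) * h      # time derivative of Eq. (9)
p_mean = sum(rho(x) * dS_dx(x) for x in xs) * h     # integral of rho * dS/dx
print(total_change, m * dxbar_dt, p_mean)
```

In this example the norm is conserved and `m * dxbar_dt` agrees with `p_mean`, so the expectation value defined via Eq. (7) is fixed by $\rho$ and the gradient of $\tilde S$ alone, without any reference to a density $w(p,t)$.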

Using the continuity equation (3) and the statistical conditions (7) and (8), the present generalization of the integral equation Eq. (24) of I may be derived. The steps leading to this result are very similar to the corresponding steps in I and may be skipped. The essential difference from the one-dimensional treatment is, apart from the number of space dimensions, the non-commutativity of the second order derivatives of $\tilde S(x,t)$, leading to non-vanishing quantities $\tilde S_{[j,k]}$, $\tilde S_{[0,k]}$ defined in Eq. (6). The result takes the form
$$
\begin{aligned}
&-\int_{-\infty}^{\infty} d^3x\, \frac{\partial \rho}{\partial x_k}\left[\frac{\partial \tilde S}{\partial t} + \frac{1}{2m}\sum_j\left(\frac{\partial \tilde S}{\partial x_j}\right)^2 + V\right] \\
&+\int_{-\infty}^{\infty} d^3x\, \rho\left[\frac{1}{m}\frac{\partial \tilde S}{\partial x_j}\,\tilde S_{[j,k]} + \tilde S_{[0,k]}\right] = \overline{F_k^{(e)}(x,p,t)}.
\end{aligned} \tag{14}
$$

In the course of the calculation leading to (14) it has been assumed that the macroscopic force $F_k(x,p,t)$ entering the second statistical condition (8) may be written as a sum of two contributions, $F_k^{(m)}(x,t)$ and $F_k^{(e)}(x,p,t)$,
$$
F_k(x,p,t) = F_k^{(m)}(x,t) + F_k^{(e)}(x,p,t), \tag{15}
$$
where $F_k^{(m)}(x,t)$ takes the form of a negative gradient of a scalar function $V(x,t)$ (mechanical potential) and $F_k^{(e)}(x,p,t)$ is the remaining $p$-dependent part.
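For orientation, the steps leading to Eq. (14) can be sketched as follows. This is a reconstruction, not a quotation from I, and it assumes that $\rho$ vanishes sufficiently fast at infinity for all boundary terms to drop out; all integrals run over $\mathbb{R}^3$. The first line combines (7), (9) and (3); the third uses (3) again; the fourth uses the definitions (6) to bring the derivative with respect to $x_k$ to the outermost position.

```latex
\begin{align*}
\overline{p_k} &= m\,\frac{d}{dt}\,\overline{x_k}
  = -\int d^3x\; x_k\,\frac{\partial}{\partial x_j}\!\left(\rho\,\frac{\partial \tilde S}{\partial x_j}\right)
  = \int d^3x\; \rho\,\frac{\partial \tilde S}{\partial x_k}, \\
\frac{d}{dt}\,\overline{p_k}
  &= \int d^3x\; \frac{\partial \rho}{\partial t}\,\frac{\partial \tilde S}{\partial x_k}
   + \int d^3x\; \rho\,\frac{\partial^2 \tilde S}{\partial t\,\partial x_k} \\
  &= \frac{1}{m}\int d^3x\; \rho\,\frac{\partial \tilde S}{\partial x_j}\,
     \frac{\partial^2 \tilde S}{\partial x_j\,\partial x_k}
   + \int d^3x\; \rho\,\frac{\partial^2 \tilde S}{\partial t\,\partial x_k} \\
  &= \frac{1}{m}\int d^3x\; \rho\,\frac{\partial \tilde S}{\partial x_j}
     \left(\frac{\partial^2 \tilde S}{\partial x_k\,\partial x_j} + \tilde S_{[j,k]}\right)
   + \int d^3x\; \rho\left(\frac{\partial^2 \tilde S}{\partial x_k\,\partial t} + \tilde S_{[0,k]}\right) \\
  &= -\int d^3x\; \frac{\partial \rho}{\partial x_k}
     \left[\frac{\partial \tilde S}{\partial t}
     + \frac{1}{2m}\sum_j\left(\frac{\partial \tilde S}{\partial x_j}\right)^2\right]
   + \int d^3x\; \rho\left[\frac{1}{m}\frac{\partial \tilde S}{\partial x_j}\,\tilde S_{[j,k]}
     + \tilde S_{[0,k]}\right], \\
\overline{F_k(x,p,t)}
  &= -\int d^3x\; \rho\,\frac{\partial V}{\partial x_k} + \overline{F_k^{(e)}(x,p,t)}
   = \int d^3x\; \frac{\partial \rho}{\partial x_k}\,V + \overline{F_k^{(e)}(x,p,t)}.
\end{align*}
```

Equating the last two results according to (8) and moving the $V$ term to the left-hand side reproduces Eq. (14). The last line uses the decomposition (15) together with the first relation in (13) to reduce the average of $F_k^{(m)}(x,t)$ to an integral over $\rho$.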

Comparing Eq. (14) with the corresponding formula obtained in I [see Eq. (24) of I] we see that two new terms appear now: the expectation value of the $p$-dependent force on the r.h.s., and the second term on the l.h.s. of Eq. (14). The latter is a direct consequence of our assumption of a multi-valued variable $\tilde S$. In section 4 it will be shown that for vanishing multi-valuedness Eq. (14) has to agree with the three-dimensional generalization of the corresponding result [Eq. (24) of I] obtained in I. This means that the $p$-dependent term on the r.h.s. has to vanish too in this limit and indicates a relation between multi-valuedness of $\tilde S$ and $p$-dependence of the external force.