Theoretical and experimental justification for the Schrödinger equation
The theoretical and experimental justification for the Schrödinger equation motivates the discovery of the Schrödinger equation, the equation that describes the dynamics of nonrelativistic particles. The motivation uses photons, which are relativistic particles with dynamics described by Maxwell's equations, as an analogue for all types of particles.
- This article is at a postgraduate level. For a more general introduction to the topic see Introduction to quantum mechanics.
Classical electromagnetic waves
Nature of light
The quantum particle of light is called a photon. Light has both a wave-like and a particle-like nature. In other words, light can appear to be made of photons (particles) in some experiments and light can act like waves in other experiments. The dynamics of classical electromagnetic waves are completely described by Maxwell's equations, the classical description of electrodynamics. In the absence of sources, Maxwell's equations can be written as wave equations in the electric and magnetic field vectors. Maxwell's equations thus describe, among other things, the wave-like properties of light. When "classical" (coherent or thermal) light is incident on a photographic plate or CCD, the average number of "hits", "dots", or "clicks" per unit time that result is approximately proportional to the square of the electromagnetic fields of the light. By formal analogy, the wavefunction of a material particle can be used to find the probability density by taking its absolute-value squared. Unlike electromagnetic fields, quantum-mechanical wavefunctions are complex. (Often in the case of EM fields complex notation is used for convenience, but it is understood that in fact the fields are real. However, wavefunctions are genuinely complex.)
Maxwell's equations were completely known by the latter part of the nineteenth century. The dynamical equations for light were, therefore, well known long before the discovery of the photon. This is not true for other particles such as the electron. It was surmised from the interaction of light with atoms that electrons also had both a particle-like and a wave-like nature. Newtonian mechanics, a description of the particle-like behavior of macroscopic objects, failed to describe very small objects such as electrons. Abductive reasoning was used to obtain the dynamics of massive objects (particles with mass) such as electrons. The electromagnetic wave equation, which describes the dynamics of light, served as a prototype for discovering the Schrödinger equation, which describes the wave-like and particle-like dynamics of nonrelativistic massive particles.
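Plane sinusoidal waves
Electromagnetic wave equation
The electromagnetic wave equation describes the propagation of electromagnetic waves through a medium or in a vacuum. The homogeneous form of the equation, written in terms of either the electric field E or the magnetic field B, takes the form:
$$\nabla^2\mathbf{E} - \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0, \qquad \nabla^2\mathbf{B} - \frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2} = 0$$
Plane wave solution of the electromagnetic wave equation
The plane sinusoidal solution for an electromagnetic wave traveling in the z direction is
$$\mathbf{E}(\mathbf{r},t) = |\mathbf{E}|\left[\cos\theta\,\cos(kz-\omega t+\alpha_x)\,\hat{\mathbf{x}} + \sin\theta\,\cos(kz-\omega t+\alpha_y)\,\hat{\mathbf{y}}\right] = |\mathbf{E}|\,\mathrm{Re}\left\{|\psi\rangle\,e^{i(kz-\omega t)}\right\}$$
for the electric field and
$$\mathbf{B}(\mathbf{r},t) = \hat{\mathbf{z}}\times\mathbf{E}(\mathbf{r},t)$$
for the magnetic field, where k is the wavenumber,
$$\omega = ck$$
is the angular frequency of the wave, and $c$ is the speed of light. The hats on the vectors indicate unit vectors in the x, y, and z directions. In complex notation, the quantity $|\mathbf{E}|$ is the amplitude of the wave.
$$|\psi\rangle = \begin{pmatrix}\psi_x \\ \psi_y\end{pmatrix} = \begin{pmatrix}\cos\theta\,e^{i\alpha_x} \\ \sin\theta\,e^{i\alpha_y}\end{pmatrix}$$
is the Jones vector in the x-y plane. The notation for this vector is the bra–ket notation of Dirac, which is normally used in a quantum context. The quantum notation is used here in anticipation of the interpretation of the Jones vector as a quantum state vector. The angles $\theta$, $\alpha_x$, and $\alpha_y$ are the angle the electric field makes with the x axis and the two initial phases of the wave, respectively.
$$|\phi\rangle = e^{i(kz-\omega t)}\,|\psi\rangle$$
is the state vector of the wave. It describes the polarization of the wave and the spatial and temporal functionality of the wave. For a coherent state light beam so dim that its average photon number is much less than 1, this is approximately equivalent to the quantum state of a single photon.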
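Energy, momentum, and angular momentum of electromagnetic waves
Energy density of classical electromagnetic waves
Energy in a plane wave
The energy per unit volume in classical electromagnetic fields is (cgs units)
$$\mathcal{E}_c = \frac{1}{8\pi}\left[\mathbf{E}^2(\mathbf{r},t) + \mathbf{B}^2(\mathbf{r},t)\right]$$
For a plane wave of amplitude $|\mathbf{E}|$, converting to complex notation (and hence dividing by a factor of 2), this becomes
$$\mathcal{E}_c = \frac{|\mathbf{E}|^2}{8\pi}$$
where the energy has been averaged over a wavelength of the wave.
Fraction of energy in each component
The fraction of energy in the x component of the plane wave (assuming linear polarization at an angle $\theta$ to the x axis) is
$$f_x = \frac{|\mathbf{E}|^2\cos^2\theta}{|\mathbf{E}|^2} = \psi_x^*\psi_x = \cos^2\theta$$
with a similar expression for the y component. Here $\psi_x$ and $\psi_y$ are the components of the Jones vector $|\psi\rangle$.
The fraction in both components is
$$\psi_x^*\psi_x + \psi_y^*\psi_y = \langle\psi|\psi\rangle = 1$$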
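The energy fractions can be checked numerically. The sketch below assumes the usual Jones-vector parametrization, with components cos θ·exp(iα_x) and sin θ·exp(iα_y); the function names are illustrative only.

```python
import cmath
import math

def jones_vector(theta, alpha_x, alpha_y):
    """Jones vector (psi_x, psi_y) for polarization angle theta and
    initial phases alpha_x, alpha_y (assumed parametrization)."""
    return (math.cos(theta) * cmath.exp(1j * alpha_x),
            math.sin(theta) * cmath.exp(1j * alpha_y))

def energy_fractions(psi):
    """Fraction of the wave energy carried by each component: |psi_i|^2."""
    return tuple(abs(component) ** 2 for component in psi)

# Example: field at 30 degrees to the x axis, arbitrary phases.
psi = jones_vector(math.pi / 6, 0.3, 1.1)
f_x, f_y = energy_fractions(psi)
print(f_x, f_y, f_x + f_y)
```

The phases drop out of the fractions, which is why only the angle θ matters for how the energy is shared between the two components.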
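Momentum density of classical electromagnetic waves
The momentum density is proportional to the Poynting vector $\mathbf{S}$ (cgs units):
$$\boldsymbol{\mathcal{P}} = \frac{\mathbf{S}}{c^2} = \frac{1}{4\pi c}\,\mathbf{E}(\mathbf{r},t)\times\mathbf{B}(\mathbf{r},t)$$
For a sinusoidal plane wave traveling in the z direction, the momentum is in the z direction and is related to the energy density:
$$\mathcal{P}_z\,c = \frac{|\mathbf{E}|^2}{8\pi} = \mathcal{E}_c$$
The momentum density has been averaged over a wavelength.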
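Angular momentum density of classical electromagnetic waves
The angular momentum density is
$$\boldsymbol{\mathcal{L}} = \mathbf{r}\times\boldsymbol{\mathcal{P}} = \frac{1}{4\pi c}\,\mathbf{r}\times\left[\mathbf{E}(\mathbf{r},t)\times\mathbf{B}(\mathbf{r},t)\right]$$
For a sinusoidal plane wave the angular momentum is in the z direction and is given by (going over to complex notation)
$$\mathcal{L} = \frac{1}{\omega}\,\frac{|\mathbf{E}|^2}{8\pi}\left(|\langle R|\psi\rangle|^2 - |\langle L|\psi\rangle|^2\right) = \frac{\mathcal{E}_c}{\omega}\left(|\psi_R|^2 - |\psi_L|^2\right)$$
where again the density is averaged over a wavelength. Here right and left circularly polarized unit vectors are defined as
$$|R\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ i\end{pmatrix}, \qquad |L\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}1 \\ -i\end{pmatrix}$$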
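Unitary operators and energy conservation
A wave can be transformed by, for example, passing through a birefringent crystal or through slits in a diffraction grating. We can define the transformation of the state from the state at time t to the state at time $t+\tau$ as
$$|\phi(t+\tau)\rangle = U(\tau)\,|\phi(t)\rangle$$
To conserve energy in the wave we require
$$\langle\phi(t+\tau)|\phi(t+\tau)\rangle = \langle\phi(t)|U^\dagger U|\phi(t)\rangle = \langle\phi(t)|\phi(t)\rangle$$
where $U^\dagger$ is the adjoint of U, the complex conjugate transpose of the matrix.
This implies that a transformation that conserves energy must obey
$$U^\dagger U = I$$
where $I$ is the identity operator. Operators with this property are called unitary.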
Hermitian operators and energy conservation
If $\tau$ is an infinitesimal real quantity $dt$, then the unitary transformation is very close to the identity matrix (the final state is very close to the initial state) and can be written
$$U \approx I - iH\,dt$$
and the adjoint by
$$U^\dagger \approx I + iH^\dagger\,dt$$
The factor of i is introduced for convenience. With this convention, it will be shown that energy conservation requires H to be a Hermitian operator and that H is related to the energy of a particle.
Energy conservation requires
$$I = U^\dagger U \approx \left(I + iH^\dagger dt\right)\left(I - iH\,dt\right) = I - iH\,dt + iH^\dagger dt + H^\dagger H\,dt^2$$
Since $dt$ is infinitesimal, which means that $dt^2$ may be neglected with respect to $dt$, the last term can be omitted. Further, if H is equal to its adjoint:
$$H = H^\dagger$$
it follows that (for infinitesimal translations in time $dt$)
$$\langle\phi(t+dt)|\phi(t+dt)\rangle = \langle\phi(t)|\phi(t)\rangle$$
so that, indeed, energy is conserved.
Operators that are equal to their adjoints are called Hermitian or self-adjoint.
The infinitesimal translation of the polarization state is
$$|\phi(t+dt)\rangle = U\,|\phi(t)\rangle \approx |\phi(t)\rangle - iH\,dt\,|\phi(t)\rangle$$
Thus, energy conservation requires that infinitesimal transformations of a polarization state occur through the action of a Hermitian operator. While this derivation is classical, the concept of a Hermitian operator generating energy-conserving infinitesimal transformations forms an important basis for quantum mechanics. The derivation of the Schrödinger equation follows directly from this concept.
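The role of Hermiticity can be illustrated numerically. The sketch below assumes the first-order step U = I − iH dt and measures how far U†U is from the identity for a Hermitian and a non-Hermitian 2×2 matrix. The matrices and helper names are arbitrary illustrations, not taken from the derivation.

```python
def dagger(m):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def step_operator(h, dt):
    """First-order time step U = I - i*H*dt."""
    n = len(h)
    return [[(1 if i == j else 0) - 1j * h[i][j] * dt for j in range(n)]
            for i in range(n)]

def unitarity_defect(h, dt):
    """Largest entry of |U^dagger U - I|; zero means exact conservation."""
    u = step_operator(h, dt)
    prod = matmul(dagger(u), u)
    n = len(h)
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(n) for j in range(n))

hermitian = [[1.0 + 0j, 0.5 - 0.2j], [0.5 + 0.2j, -0.3 + 0j]]
non_hermitian = [[1.0 + 0j, 0.5 + 0j], [0.0 + 0j, -0.3 + 0j]]

# Shrinking dt by 10 shrinks the defect by ~100 (second order) when H is
# Hermitian, but only by ~10 (first order) when it is not.
ratio_hermitian = (unitarity_defect(hermitian, 1e-3)
                   / unitarity_defect(hermitian, 1e-4))
ratio_non_hermitian = (unitarity_defect(non_hermitian, 1e-3)
                       / unitarity_defect(non_hermitian, 1e-4))
print(ratio_hermitian, ratio_non_hermitian)
```

For the Hermitian matrix the first-order terms cancel and only the neglected dt² term survives, which is the statement that energy is conserved to first order in dt.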
Quantum analogy of classical electrodynamics
The treatment to this point has been classical. The quantum mechanical treatment of particles, however, follows along lines formally analogous to Maxwell's equations for electrodynamics. The analog of the classical "state vectors"
$$|\psi\rangle, \quad |\phi\rangle$$
in the classical description is quantum state vectors in the description of photons.
Energy, momentum, and angular momentum of photons
The early interpretation is based on the work of Max Planck on black-body radiation and the interpretation of that work by Albert Einstein, which was that electromagnetic radiation is composed of irreducible packets of energy, known as photons. The energy of each packet is related to the angular frequency of the wave by the relation
$$\epsilon = \hbar\omega$$
where $\hbar$ is an experimentally determined quantity known as the reduced Planck constant. If there are $N$ photons in a box of volume $V$, the energy (neglecting zero point energy) in the electromagnetic field is
$$N\hbar\omega$$
and the energy density is
$$\frac{N\hbar\omega}{V}$$
The energy of a photon can be related to classical fields through the correspondence principle, which states that for a large number of photons, the quantum and classical treatments must agree. Thus, for very large $N$, the quantum energy density must be the same as the classical energy density
$$\frac{N\hbar\omega}{V} = \mathcal{E}_c = \frac{|\mathbf{E}|^2}{8\pi}$$
The average number of photons in the box in a coherent state is then
$$\bar{N} = \frac{V}{8\pi\hbar\omega}\,|\mathbf{E}|^2$$
The correspondence principle also determines the momentum and angular momentum of the photon. For momentum
$$\mathcal{P}_z = \frac{N\hbar\omega}{cV} = \frac{N\hbar k}{V}$$
which implies that the momentum of a photon is
$$p_z = \hbar k$$
(or equivalently $p_z = h/\lambda$).
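As a worked example of the photon energy and momentum relations, the sketch below evaluates ħω and ħk for a given wavelength. It uses SI units rather than the cgs units of the text (an assumption made for convenience), and the 500 nm wavelength is an arbitrary illustrative choice.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant in J*s (exact SI value)
C_LIGHT = 299_792_458.0     # speed of light in m/s (exact SI value)
EV = 1.602176634e-19        # joules per electronvolt (exact SI value)
HBAR = H_PLANCK / (2 * math.pi)

def photon_energy(wavelength_m):
    """Photon energy hbar*omega, with omega = 2*pi*c/lambda."""
    omega = 2 * math.pi * C_LIGHT / wavelength_m
    return HBAR * omega

def photon_momentum(wavelength_m):
    """Photon momentum hbar*k, with k = 2*pi/lambda (equivalently h/lambda)."""
    k = 2 * math.pi / wavelength_m
    return HBAR * k

wavelength = 500e-9  # green light, illustrative
energy = photon_energy(wavelength)
momentum = photon_momentum(wavelength)
print(energy / EV, "eV", momentum, "kg*m/s")
```

The ratio of the two results is the speed of light, which mirrors the classical relation between momentum density and energy density for a plane wave.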
Angular momentum and spin
Similarly for the angular momentum
$$\mathcal{L} = \frac{\mathcal{E}_c}{\omega}\left(|\psi_R|^2 - |\psi_L|^2\right) = \frac{N\hbar}{V}\left(|\psi_R|^2 - |\psi_L|^2\right)$$
which implies that the angular momentum of the photon is
$$l_z = \hbar\left(|\psi_R|^2 - |\psi_L|^2\right)$$
The quantum interpretation of this expression is that the photon has a probability of $|\psi_R|^2$ of having an angular momentum of $\hbar$ and a probability of $|\psi_L|^2$ of having an angular momentum of $-\hbar$. We can therefore think of the angular momentum of the photon being quantized as well as the energy. This has indeed been experimentally verified. Photons have only been observed to have angular momenta of $\pm\hbar$.
The spin of the photon is defined as the coefficient of $\hbar$ in the angular momentum calculation. A photon has spin 1 if it is in the $|R\rangle$ state and -1 if it is in the $|L\rangle$ state. The spin operator is defined as the outer product
$$S = |R\rangle\langle R| - |L\rangle\langle L|$$
The expected value of a spin measurement on a photon is then
$$\langle\psi|S|\psi\rangle = |\psi_R|^2 - |\psi_L|^2$$
An operator S has been associated with an observable quantity, the angular momentum. The eigenvalues of the operator are the allowed observable values. This has been demonstrated for angular momentum, but it is in general true for any observable quantity.
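A minimal sketch of the spin operator as a 2×2 matrix follows. It assumes the circular basis |R⟩ = (1, i)/√2 and |L⟩ = (1, −i)/√2 expressed in the x-y basis; other handedness conventions exist and would flip the sign. Helper names are illustrative.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)
R = (complex(1 / SQRT2), 1j / SQRT2)    # assumed right-circular state
L = (complex(1 / SQRT2), -1j / SQRT2)   # assumed left-circular state

def inner(a, b):
    """Inner product <a|b>."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def outer(a, b):
    """Outer product |a><b| as a nested list."""
    return [[x * y.conjugate() for y in b] for x in a]

def spin_operator():
    """S = |R><R| - |L><L|."""
    rr, ll = outer(R, R), outer(L, L)
    return [[rr[i][j] - ll[i][j] for j in range(2)] for i in range(2)]

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix."""
    return tuple(sum(m[i][j] * v[j] for j in range(2)) for i in range(2))

def expected_spin(psi):
    """Expectation value <psi|S|psi> in units of hbar."""
    return inner(psi, apply(spin_operator(), psi)).real

linear_x = (1 + 0j, 0j)
elliptical = (math.cos(0.4) * cmath.exp(0.2j), math.sin(0.4) * cmath.exp(1.3j))
print(expected_spin(R), expected_spin(L), expected_spin(linear_x))
print(expected_spin(elliptical))
```

Linearly polarized light is an equal superposition of the two circular states, so its expected spin vanishes even though each individual measurement yields +1 or −1.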
Probability for a single photon
There are two ways in which probability can be applied to the behavior of photons; probability can be used to calculate the probable number of photons in a particular state, or probability can be used to calculate the likelihood of a single photon to be in a particular state. The former interpretation is applicable to thermal or to coherent light (see Quantum optics). The latter interpretation is the option for a single-photon Fock state. Dirac explains this [Note 1] in the context of the double-slit experiment:
Some time before the discovery of quantum mechanics people realized that the connection between light waves and photons must be of a statistical character. What they did not clearly realize, however, was that the "wave function" gives information about the probability of one photon being in a particular place and not the probable number of photons in that place. The importance of the distinction can be made clear in the following way. Suppose we have a beam of light consisting of a large number of photons split up into two components of equal intensity. On the assumption that the beam is connected with the probable number of photons in it, we should have half the total number going into each component. If the two components are now made to interfere, we should require a photon in one component to be able to interfere with one in the other. Sometimes these two photons would have to annihilate one another and other times they would have to produce four photons. This would contradict the conservation of energy. The new theory, which connects the wave function with probabilities for one photon gets over the difficulty by making each photon go partly into each of the two components. Each photon then interferes only with itself. Interference between two different photons never occurs.— Paul Dirac, The Principles of Quantum Mechanics, Fourth Edition, Chapter 1
The probability for a photon to be in a particular polarization state depends on the probability distribution over the fields as calculated by the classical Maxwell's equations (in the Glauber-Sudarshan P-representation of a one-photon Fock state.) The expectation value of the photon number in a coherent state in a limited region of space is quadratic in the fields. In quantum mechanics, by analogy, the state or probability amplitude of a single particle contains the basic probability information. In general, the rules for combining probability amplitudes look very much like the classical rules for composition of probabilities: (The following quote is from Baym, Chapter 1)
1. The probability amplitude for two successive probabilities is the product of amplitudes for the individual possibilities. ...
2. The amplitude for a process that can take place in one of several indistinguishable ways is the sum of amplitudes for each of the individual ways. ...
3. The total probability for the process to occur is the absolute value squared of the total amplitude calculated by 1 and 2.
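The three rules can be sketched for a two-path interference setup. The example assumes an idealized arrangement in which each path consists of two steps of amplitude 1/√2 and one path picks up an extra relative phase; this setup and the function names are illustrative assumptions, not part of the quoted text.

```python
import cmath
import math

def path_amplitude(step_amplitudes):
    """Rule 1: amplitudes of successive steps multiply."""
    total = 1 + 0j
    for amplitude in step_amplitudes:
        total *= amplitude
    return total

def total_probability(paths):
    """Rules 2 and 3: sum the amplitudes of the indistinguishable paths,
    then take the absolute value squared."""
    return abs(sum(path_amplitude(p) for p in paths)) ** 2

def two_path_probability(phase):
    """Detection probability when the lower path carries an extra phase."""
    split = 1 / math.sqrt(2)
    upper = [split, split]
    lower = [split, cmath.exp(1j * phase) * split]
    return total_probability([upper, lower])

for phase in (0.0, math.pi / 2, math.pi):
    print(phase, two_path_probability(phase))
```

Adding probabilities instead of amplitudes would give 1/4 + 1/4 = 1/2 regardless of the phase, so the phase dependence is what distinguishes the amplitude rules from classical composition of probabilities.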
de Broglie waves
In 1923 Louis de Broglie addressed the question of whether all particles can have both a wave and a particle nature similar to the photon. Photons differ from many other particles in that they are massless and travel at the speed of light. Specifically de Broglie asked the question of whether a particle that has both a wave and a particle associated with it is consistent with Einstein's two great 1905 contributions, the special theory of relativity and the quantization of energy and momentum. The answer turned out to be positive. The wave and particle nature of electrons was experimentally observed in 1927, two years after the discovery of the Schrödinger equation.
de Broglie hypothesis
De Broglie supposed that every particle has an associated wave. The angular frequency $\omega$ and wavenumber $k$ of the wave are related to the energy E and momentum p of the particle by
$$E = \hbar\omega, \qquad p = \hbar k$$
The question reduces to whether every observer in every inertial reference frame can agree on the phase of the wave. If so, then a wave-like description of particles may be consistent with special relativity.
First consider the rest frame of the particle. In that case the frequency and wavenumber of the wave are related to the energy and momentum of the particle by
$$\hbar\omega_0 = mc^2, \qquad \hbar k_0 = 0$$
where m is the rest mass of the particle.
This describes a wave of infinite wavelength and infinite phase velocity
$$\lambda_0 = \frac{2\pi}{k_0} \to \infty, \qquad v_p = \frac{\omega_0}{k_0} \to \infty$$
The wave may be written as proportional to
$$e^{-i\omega_0 t}$$
This, however, is also the solution for a simple harmonic oscillator, which can be thought of as a clock in the rest frame of the particle. We can imagine a clock ticking at the same frequency as the wave is oscillating. The phases of the wave and the clock can be synchronized.
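Frame of the observer
It is shown below that the phase of the wave in an observer frame is the same as the phase of the wave in the particle frame, and also the same as the phase of the clocks in the two frames. There is, therefore, consistency of both a wave-like and a particle-like picture in special relativity.
Phase of the observer clock
In the frame of an observer moving at relative speed v with respect to the particle, the particle clock is observed to tick at a frequency
$$\omega_c = \frac{\omega_0}{\gamma}, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$$
where $\omega_0 = mc^2/\hbar$ is the frequency in the rest frame of the particle. The phase of the observer clock is
$$\omega_c t = \frac{\omega_0 t}{\gamma} = \omega_0\tau$$
where $\tau = t/\gamma$ is time measured in the particle frame. Both the observer clock and the particle clock agree on the phase.
Phase of the observer wave
The frequency and wavenumber of the wave in the observer frame are given by
$$\omega = \frac{\gamma mc^2}{\hbar} = \gamma\omega_0, \qquad k = \frac{\gamma mv}{\hbar} = \frac{\gamma\omega_0 v}{c^2}$$
with a phase velocity
$$v_p = \frac{\omega}{k} = \frac{c^2}{v}$$
The phase of the wave in the observer frame, evaluated at the position $x = vt$ of the particle, is
$$\omega t - kx = \gamma\omega_0 t\left(1 - \frac{v^2}{c^2}\right) = \frac{\omega_0 t}{\gamma} = \omega_0\tau$$
The phase of the wave in the observer frame is the same as the phase in the particle frame, as the clock in the particle frame, and the clock in the observer frame. A wave-like picture of particles is thus consistent with special relativity.
In fact, we now know that these relations can be succinctly written using special relativistic 4-vector notation:
The relevant four-vectors are:
$$\mathbf{X} = (ct, \mathbf{r}), \qquad \mathbf{U} = \gamma(c, \mathbf{u}), \qquad \mathbf{P} = \left(\frac{E}{c}, \mathbf{p}\right), \qquad \mathbf{K} = \left(\frac{\omega}{c}, \mathbf{k}\right)$$
The relations between the four-vectors are as follows:
$$\mathbf{P} = m\mathbf{U}, \qquad \mathbf{K} = \frac{1}{\hbar}\mathbf{P} = \frac{m}{\hbar}\mathbf{U}$$
The phase of the wave is the relativistic invariant:
$$\mathbf{K}\cdot\mathbf{X} = \omega t - \mathbf{k}\cdot\mathbf{r}$$
with the metric signature (+, −, −, −).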
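The agreement of the phases can be checked numerically. The sketch below assumes units in which c = 1 and the rest-frame frequency ω₀ = mc²/ħ = 1, and takes the observer-frame frequency and wavenumber to be γω₀ and γω₀v/c². It compares the phase ωt − kx with ω₀ times the Lorentz-transformed time of the same event in the particle frame.

```python
import math

def lorentz_gamma(v, c=1.0):
    """Lorentz factor for relative speed v."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def observer_wave(v, omega0=1.0, c=1.0):
    """(omega, k) of the de Broglie wave seen by the observer."""
    g = lorentz_gamma(v, c)
    return g * omega0, g * omega0 * v / c ** 2

def wave_phase(t, x, v, omega0=1.0, c=1.0):
    """Phase omega*t - k*x of the wave at the event (t, x)."""
    omega, k = observer_wave(v, omega0, c)
    return omega * t - k * x

def particle_frame_time(t, x, v, c=1.0):
    """Time coordinate of the event (t, x) in the particle rest frame."""
    return lorentz_gamma(v, c) * (t - v * x / c ** 2)

v, t = 0.6, 10.0
phase_on_worldline = wave_phase(t, v * t, v)        # event on the particle
clock_phase = 1.0 * particle_frame_time(t, v * t, v)  # omega0 * tau
omega, k = observer_wave(v)
print(phase_on_worldline, clock_phase, omega / k)
```

The same equality holds for events off the particle's worldline, which reflects the fact that the phase is a Lorentz invariant rather than a coincidence of the chosen point.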
Inconsistency of observation with classical physics
The de Broglie hypothesis helped resolve outstanding issues in atomic physics. Classical physics was unable to explain the observed behaviour of electrons in atoms. Specifically, accelerating electrons emit electromagnetic radiation according to the Larmor formula. Electrons orbiting a nucleus should lose energy to radiation and eventually spiral into the nucleus. This is not observed. Atoms are stable on timescales much longer than predicted by the classical Larmor formula.
Also, it was noted that excited atoms emit radiation with discrete frequencies. Einstein used this fact to interpret discrete energy packets of light as, in fact, real particles. If these real particles are emitted from atoms in discrete energy packets, however, must the emitters, the electrons, also change energy in discrete energy packets? There is nothing in Newtonian mechanics that explains this.
The de Broglie hypothesis helped explain these phenomena by noting that the only allowed states for an electron orbiting an atom are those that allow for standing waves associated with each electron.
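The Balmer series identifies those frequencies of light that can be emitted from an excited hydrogen atom:
$$\hbar\omega = R\left(\frac{1}{n^2} - \frac{1}{m^2}\right)$$
where $R \approx 13.6$ eV is the Rydberg energy and $m > n$ are positive integers. The visible Balmer lines proper correspond to $n = 2$.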
Assumptions of the Bohr model
The Bohr model, introduced in 1913, was an attempt to provide a theoretical basis for the Balmer series. The assumptions of the model are:
1. The orbiting electrons exist in circular orbits that have discrete quantized energies. That is, not every orbit is possible but only certain specific ones.
2. The laws of classical mechanics do not apply when electrons make the jump from one allowed orbit to another.
3. When an electron makes a jump from one orbit to another the energy difference is carried off (or supplied) by a single quantum of light (called a photon) which has an energy equal to the energy difference between the two orbits.
4. The allowed orbits depend on quantized (discrete) values of orbital angular momentum, L, according to the equation
$$L = n\hbar = n\frac{h}{2\pi}$$
where n = 1, 2, 3, … is called the principal quantum number.
Implications of the Bohr model
In a circular orbit the centrifugal force balances the attractive force on the electron (cgs units)
$$\frac{mv^2}{r} = \frac{e^2}{r^2}$$
where m is the mass of the electron, v is the speed of the electron, r is the radius of the orbit, and e is the charge on the electron or proton.
The energy of the orbiting electron is
$$E = \frac{1}{2}mv^2 - \frac{e^2}{r} = -\frac{e^2}{2r}$$
which follows from the centrifugal force expression.
The angular momentum assumption of the Bohr model implies
$$L = mvr = n\hbar$$
which implies that, when combined with the centrifugal force equation, the radius of the orbit is given by
$$r_n = \frac{n^2\hbar^2}{me^2}$$
This implies, from the energy equation,
$$E_n = -\frac{me^4}{2\hbar^2}\,\frac{1}{n^2}$$
The difference between energy levels recovers the Balmer series.
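As a rough numerical check, the sketch below computes Bohr-model energy levels in the form −R/n² and the wavelengths of transitions ending on n = 2. The constants are rounded reference values supplied here as assumptions (Rydberg energy for an infinitely heavy nucleus, and hc in eV·nm), so the results agree with measured lines only to within a fraction of a nanometre.

```python
RYDBERG_EV = 13.605693   # Rydberg energy in eV (rounded, infinite nuclear mass)
HC_EV_NM = 1239.841984   # h*c in eV*nm (rounded)

def bohr_energy(n):
    """Energy of the n-th Bohr orbit in eV."""
    return -RYDBERG_EV / n ** 2

def emission_wavelength_nm(upper, lower):
    """Wavelength of the photon emitted in the jump upper -> lower."""
    if upper <= lower:
        raise ValueError("emission requires upper > lower")
    photon_energy = bohr_energy(upper) - bohr_energy(lower)
    return HC_EV_NM / photon_energy

# Transitions ending on n = 2 (the visible Balmer lines).
balmer = {m: emission_wavelength_nm(m, 2) for m in range(3, 7)}
for m, wavelength in balmer.items():
    print(m, "-> 2:", round(wavelength, 1), "nm")
```

The first line comes out near 656 nm and the second near 486 nm, matching the red and blue-green hydrogen lines.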
De Broglie's contribution to the Bohr model
The Bohr assumptions recover the observed Balmer series. The Bohr assumptions themselves, however, are not based on any more general theory. Why, for instance, should the allowed orbits depend on the angular momentum? The de Broglie hypothesis provides some insight.
If we assume that the electron has a momentum given by
$$p = \frac{h}{\lambda}$$
as postulated by the de Broglie hypothesis, then the angular momentum is given by
$$L = pr = \frac{hr}{\lambda}$$
where $\lambda$ is the wavelength of the electron wave.
If only standing electron waves are permitted in the atom then only orbits with perimeters equal to integral numbers of wavelengths are allowed:
$$2\pi r = n\lambda$$
This implies that allowed orbits have angular momentum
$$L = \frac{nh}{2\pi} = n\hbar$$
which is Bohr's fourth assumption.
Assumptions one and two immediately follow. Assumption three follows from energy conservation, which de Broglie showed was consistent with the wave interpretation of particles.
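The standing-wave condition can be verified directly. The sketch below assumes units in which ħ = m = e = 1 (Gaussian-style atomic units) and uses the Bohr-orbit radius and speed that follow from force balance together with angular momentum nħ. It then counts how many de Broglie wavelengths fit around each orbit.

```python
import math

HBAR = M_E = E_CHARGE = 1.0   # assumed unit system: hbar = m = e = 1

def bohr_orbit(n):
    """Radius and speed of the n-th orbit from m*v^2/r = e^2/r^2
    and m*v*r = n*hbar."""
    radius = n ** 2 * HBAR ** 2 / (M_E * E_CHARGE ** 2)
    speed = E_CHARGE ** 2 / (n * HBAR)
    return radius, speed

def wavelengths_on_orbit(n):
    """Orbit circumference divided by the de Broglie wavelength h/p."""
    radius, speed = bohr_orbit(n)
    wavelength = 2 * math.pi * HBAR / (M_E * speed)
    return 2 * math.pi * radius / wavelength

for n in range(1, 5):
    print(n, wavelengths_on_orbit(n))
```

Each allowed orbit holds a whole number of wavelengths equal to its quantum number, so quantized angular momentum and the standing-wave picture are the same condition.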
Need for dynamical equations
The problem with the de Broglie hypothesis as applied to the Bohr atom is that we have forced a plane wave solution valid in empty space onto a situation in which there is a strong attractive potential. We have not yet discovered the general dynamic equation for the evolution of electron waves. The Schrödinger equation is the immediate generalization of the de Broglie hypothesis and the dynamics of the photon.
Analogy with photon dynamics
The dynamics of a photon are given by
$$|\phi(t+dt)\rangle = |\phi(t)\rangle - iH\,dt\,|\phi(t)\rangle$$
where H is a Hermitian operator determined by Maxwell's equations. The Hermiticity of the operator ensures that energy is conserved.
Erwin Schrödinger assumed that the dynamics for massive particles were of the same form as the energy-conserving photon dynamics,
$$|\phi(t+dt)\rangle = |\phi(t)\rangle - iH\,dt\,|\phi(t)\rangle$$
where $|\phi\rangle$ is the state vector for the particle and H is now an unknown Hermitian operator to be determined.
Particle state vector
Rather than polarization states as in the photon case, Schrödinger assumed the state vector depended on the position of the particle. If a particle lives in one spatial dimension, then he divided the line up into an infinite number of small bins of equal length and assigned a component of the state vector to each bin
$$|\phi\rangle = \left(\ldots,\ \psi_{j-1},\ \psi_j,\ \psi_{j+1},\ \ldots\right)^T$$
The subscript j identifies the bin.
Matrix form and transition amplitudes
The transition equation can be written in matrix form as
$$\psi_j(t+dt) = \psi_j(t) - i\,dt\sum_k H_{jk}\,\psi_k(t)$$
The Hermitian condition requires
$$H_{jk} = H_{kj}^*$$
Schrödinger assumed that probability could only leak into adjacent bins during the small time step dt. In other words, all components of H are zero except for transitions between neighboring bins
$$H_{jk} = 0 \quad \text{for } |j-k| > 1$$
Moreover, it is assumed that space is uniform in that all transitions to the right are equal
The same is true for transitions to the left
The transition equation becomes
The first term on the right side represents the movement of probability amplitude into bin j from the right. The second term represents leakage of probability from bin j to the right. The third term represents leakage of probability into bin j from the left. The fourth term represents leakage from bin j to the left. The final term represents any change of phase in the probability amplitude in bin j.
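The leakage of probability amplitude into neighboring bins can be sketched with a small tridiagonal Hermitian matrix and one first-order time step. The bin count, the `hop` and `onsite` values, and the function names below are hypothetical choices made for illustration; a finite chain also stands in for the infinite line.

```python
def nearest_neighbour_h(n_bins, hop, onsite):
    """Tridiagonal Hermitian H with H[j][j+1] = hop, H[j+1][j] = conj(hop),
    and H[j][j] = onsite; all other entries are zero."""
    h = [[0j] * n_bins for _ in range(n_bins)]
    for j in range(n_bins):
        h[j][j] = complex(onsite)
        if j + 1 < n_bins:
            h[j][j + 1] = complex(hop)
            h[j + 1][j] = complex(hop).conjugate()
    return h

def step(psi, h, dt):
    """One step psi_j(t+dt) = psi_j(t) - i*dt*sum_k H_jk psi_k(t)."""
    n = len(psi)
    return [psi[j] - 1j * dt * sum(h[j][k] * psi[k] for k in range(n))
            for j in range(n)]

def total_probability(psi):
    """Sum of |psi_j|^2 over all bins."""
    return sum(abs(amplitude) ** 2 for amplitude in psi)

n_bins, start, dt = 11, 5, 1e-3
h = nearest_neighbour_h(n_bins, hop=-1.0 + 0.5j, onsite=0.7)
psi0 = [0j] * n_bins
psi0[start] = 1 + 0j            # particle starts entirely in bin 5
psi1 = step(psi0, h, dt)
occupied = [j for j, amplitude in enumerate(psi1) if abs(amplitude) > 0]
print(occupied, total_probability(psi1))
```

After one step only the starting bin and its two neighbors carry amplitude, and the total probability differs from 1 only at order dt², as expected for a Hermitian H.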
If we expand the probability amplitude to second order in the bin size and assume space is isotropic, the transition equation reduces to
Schrödinger equation in one dimension
The transition equation must be consistent with the de Broglie hypothesis. In free space the probability amplitude for the de Broglie wave is proportional to
$$e^{i(kx - \omega t)}, \qquad \hbar\omega = \frac{\hbar^2k^2}{2m}$$
in the non-relativistic limit.
The de Broglie solution for free space is a solution of the transition equation if we require
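The time derivative term in the transition equation can be identified with the energy of the de Broglie wave. The spatial derivative term can be identified with the kinetic energy. This suggests that the remaining term, which comes from the diagonal elements of H, is proportional to the potential energy. This yields the Schrödinger equation
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + U(x)\,\psi$$
where U is the classical potential energy and $\psi(x,t)$ is the probability amplitude for the particle at position x.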
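The link between nearest-neighbor transitions and the de Broglie kinetic energy can be checked numerically. The sketch below assumes units with ħ = m = 1, symmetric real transition elements −a between adjacent bins with diagonal element 2a, and the identification a = ħ/(2mδ²) for bin length δ. These specific choices are assumptions made for illustration, not values stated in the text.

```python
import cmath

HBAR = MASS = 1.0   # assumed units

def discrete_frequency(k, delta):
    """Frequency of exp(i*k*x) under the nearest-neighbour operator
    (H psi)_j = -a*(psi_{j+1} - 2*psi_j + psi_{j-1}),
    with a = hbar / (2*m*delta^2)."""
    a = HBAR / (2 * MASS * delta ** 2)

    def psi(j):
        return cmath.exp(1j * k * j * delta)

    h_psi = -a * (psi(1) - 2 * psi(0) + psi(-1))
    return (h_psi / psi(0)).real

def de_broglie_frequency(k):
    """Non-relativistic free-particle relation omega = hbar*k^2 / (2*m)."""
    return HBAR * k ** 2 / (2 * MASS)

k, delta = 2.0, 0.01
discrete = discrete_frequency(k, delta)
continuum = de_broglie_frequency(k)
print(discrete, continuum)
```

As the bin length shrinks relative to the wavelength, the discrete frequency approaches the de Broglie value, which is the sense in which the second spatial derivative represents the kinetic energy.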
Schrödinger equation in three dimensions
In three dimensions the Schrödinger equation becomes
$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + U(\mathbf{r})\,\psi$$
The solution for the hydrogen atom describes standing waves of energy exactly given by the Balmer series. This was a spectacular validation of the Schrödinger equation and of the wave-like behaviour of matter.
- Note 1: This explanation is in some sense antiquated or even obsolete, as we now know that the concept of a single-photon wavefunction is disputed, that in a coherent state one indeed deals with the probable number of photons, given by coherent-state Poissonian statistics, and that different photons can indeed interfere.