
[Rant] Antiparticles and conservation laws

While studying for my particle physics lecture, I came across this section of Thomson’s “Particle Physics” (2013):

To put it mildly, this section drove me up the wall. In my opinion, it is indicative of a large-scale lack of understanding of the nature of antiparticles, and of conserved quantities and Noether’s theorem in general. It illustrates how particle physicists often still cling to the ideas of 1920s quantum mechanics, even though they have been superseded by quantum field theory. In this rant, I’ll argue that the problem stems from failing to distinguish two concepts: symmetry generator eigenvalues (vertex conservation laws) on the one hand, and conserved Noether currents on the other.

How the reasoning on antiparticles normally goes

To see what I mean, let us first retrace the reasoning underlying the misunderstanding exemplified in the above excerpt.

If someone were to point a gun at me and ask me to summarize pre-QFT quantum mechanics in two sentences, I’d probably say something along the lines of “Particles are waves; the wave frequency corresponds to their energy, and the wavenumber corresponds to their momentum. Energy and momentum are linked together via dispersion relations”. Classically, particles are pointlike objects with an energy and a momentum, but quantum mechanics reveals that if we sufficiently “zoom in” to these points, they actually are tiny wavepackets. Perhaps the most spectacular experiment to demonstrate this is the double-slit experiment. If we do sufficiently many experiments, we will find that the momentum $p$ of the electrons we are firing into the double-slit apparatus corresponds to their measured wavenumber $k$ (or wavelength $\lambda$) via $$p = \hbar k, \quad \text{where} \quad k = \frac{2\pi}{\lambda}.$$ Similarly, the energy of the particle is $$E = \hbar \omega = hf,$$ where $\omega$ is the angular frequency and $f$ the ordinary frequency of the wave.

Given a plane wave $$\phi(x, t) \propto e^{i(-\omega t + kx)},$$ we can determine its energy and momentum by differentiating:
\begin{align*}
i\hbar \partial_t \phi(x, t) &= E \phi(x, t) \\
-i\hbar \partial_x \phi(x, t) &= p \phi(x, t)
\end{align*}
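As a quick numerical sanity check of the two operator relations for a plane wave, here is a minimal sketch. It assumes natural units ($\hbar = 1$) and approximates the derivatives by central differences; the function names, the sample point and the step size `h` are illustrative choices of mine, not anything from Thomson's book.

```python
import cmath


def plane_wave(x, t, omega, k):
    """Plane wave phi(x, t) = exp(i(-omega t + k x)), with hbar = 1."""
    return cmath.exp(1j * (-omega * t + k * x))


def energy_and_momentum(omega, k, x=0.3, t=0.7, h=1e-6):
    """Apply i d/dt and -i d/dx to the plane wave via central differences.

    Returns the (real parts of the) eigenvalues, which should reproduce
    omega and k up to discretization error.
    """
    phi = plane_wave(x, t, omega, k)
    dphi_dt = (plane_wave(x, t + h, omega, k) - plane_wave(x, t - h, omega, k)) / (2 * h)
    dphi_dx = (plane_wave(x + h, t, omega, k) - plane_wave(x - h, t, omega, k)) / (2 * h)
    energy = (1j * dphi_dt / phi).real
    momentum = (-1j * dphi_dx / phi).real
    return energy, momentum


if __name__ == "__main__":
    print(energy_and_momentum(2.0, 1.5))   # positive-frequency mode
    print(energy_and_momentum(-2.0, 1.5))  # negative-frequency mode
```

Feeding in a negative `omega` returns a negative "energy" eigenvalue, which is exactly the energy-as-in-frequency quantity this rant is about.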

In order to write down an equation telling us how a given wavefunction evolves in time, we need to link these two quantities together. Here’s where dispersion relations enter the game. For instance, the Schrödinger equation is derived by taking the Newtonian dispersion relation
\begin{align*}
E = \frac{p^2}{2m}
\end{align*}

and rewriting $E$ and $p$ as differential operators: $$i\hbar \partial_t \phi = -\frac{\hbar^2 \partial_x^2}{2m} \phi$$

This approach works perfectly well for ordinary, nonrelativistic quantum mechanics. It does however fail miserably once we choose the actual relativistic dispersion relation: $$E^2 = m^2c^4 + p^2c^2,$$ or $E^2 = m^2 + p^2$ in natural units. The reason is that for a given $m$ and $p$, there are two solutions: $+\sqrt{m^2 + p^2}$ and $-\sqrt{m^2 + p^2}$. When doing classical relativistic mechanics, we can just throw away the negative solution, but in quantum mechanics, this is not possible due to the way waves work. The negative-energy solutions of the dispersion relation necessarily correspond to negative-frequency solutions of the wave equation (for instance the Klein-Gordon or Dirac wave equation). And within the framework of QFT, these negative-frequency modes correspond to antiparticles, which apparently have negative energy. Now, Thomson argues that this is in clear contradiction to experimental results - after all, these antiparticles produce calorimeter showers just like their positive-energy counterparts. So how come they deposit a positive amount of energy in our calorimeters? This seems to violate conservation of energy.
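To make the two-root structure concrete, here is a tiny sketch in natural units ($c = 1$). The helper name `energy_roots` is hypothetical and only serves as illustration.

```python
import math


def energy_roots(m, p):
    """Both solutions of E^2 = m^2 + p^2 (natural units, c = 1)."""
    e = math.sqrt(m * m + p * p)
    return e, -e


if __name__ == "__main__":
    m, p = 3.0, 4.0
    for root in energy_roots(m, p):
        # Each root satisfies the dispersion relation equally well.
        print(root, math.isclose(root * root, m * m + p * p))
```

Nothing in the dispersion relation itself prefers one sign over the other; discarding the negative root is an extra choice that the wave equation does not let us make.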

What is energy, and what is momentum?

And this step in the reasoning is exactly where I think a major conceptual mistake pervades large parts of the particle physics/quantum field theory community, one which is also interlinked with the axiomatic foundations of quantum gravity. We interpret the wave frequency of the mode as the energy of the particle, and the wavenumber as the momentum. As we all know, energy and momentum are conserved - so a particle with a negative frequency should take away energy from the calorimeter. “Alright, sounds reasonable”, an interested student might say now. “Noether’s theorem tells us that every symmetry corresponds to a conservation law, and time invariance corresponds to energy conservation. We all learned about this in introductory quantum field theory - for instance, we saw that the U(1) symmetry of the Dirac equation corresponds to the Noether current $$j^\mu = -e \bar{\psi} \gamma^\mu \psi,$$ which is conserved, i.e. $$\partial_\mu j^\mu = 0.$$ So what is the corresponding conservation law for energy?”

** The dramatic sound of crickets chirping. **

There is none. If we define energy as the frequency of our wavemodes, energy is not conserved. But what about Noether’s theorem? Well, Noether’s theorem tells us that there is a conserved current for every symmetry, so we actually have to ask ourselves how to define the energy current. Let’s call it $E^\mu$ for a start, such that $E^0$ is the energy density and the $E^i$ are the three spatial components of the energy current. How do we obtain an explicit expression for $E^\mu$? Let’s take the Dirac field, because it makes this particularly simple. The symmetry operation of shifting a Dirac field $\psi(x)$ in time by an infinitesimal amount $\delta t$ is given by
\begin{align*}
\psi &\mapsto \psi + \delta t \, \partial_0 \psi \\
\mathcal{L} &\mapsto \mathcal{L} + \delta t \, \partial_0 \mathcal{L} = \mathcal{L} + \delta t \, \partial_\mu (\delta^\mu_0 \mathcal{L}) =: \mathcal{L} + \delta t \, \partial_\mu W^\mu
\end{align*}
Noether’s theorem in its full glory now states that, with $\Delta \psi = \partial_0 \psi$ and the Dirac Lagrangian $\mathcal{L} = \bar{\psi} (i \gamma^\nu \partial_\nu - m) \psi$,
\begin{align*}
E^\mu &= \frac{\partial \mathcal{L}}{\partial(\partial_\mu \psi)} \Delta \psi - W^\mu \\
&= i \bar{\psi} \gamma^\mu \partial_0 \psi - \delta^\mu_0 \bar{\psi} (i \gamma^\nu \partial_\nu - m) \psi
\end{align*}
Now let’s take an antiparticle plane-wave solution of the Dirac equation (see footnote 1): $$\psi(x) = u(p, s) e^{-ipx}$$ with $p^0 < 0$, such that we have a scary “negative-energy” state, and calculate the corresponding energy current $E^\mu$.

As this is a solution to the Dirac equation, the Lagrangian vanishes on it, so $W^\mu$ is zero and: $$E^\mu = i \bar{\psi} \gamma^\mu \partial_0 \psi = \bar{u}(p, s) e^{ipx} \gamma^\mu p_0 u(p, s) e^{-ipx} = \frac{1}{m} p^\mu p_0,$$ where the last step uses $\bar{u} \gamma^\mu u = p^\mu / m$, i.e. spinors normalized such that $\bar{u} u = 1$. Writing $p^0 = p_0 = -E$ with $E > 0$, it is now abundantly obvious that the energy density $E^0$ of this “negative-energy” state is positive, as the two minus signs cancel out: $$E^0 = \frac{1}{m} (-E)(-E) = \frac{E^2}{m}$$ And this is the conserved quantity - not the frequency of the wavemodes, but the Noether current corresponding to time invariance. The apparent contradiction described by Thomson dissolves into a cloud of dust.
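The spinor identity behind the last step, $\bar{u} \gamma^\mu u = p^\mu / m$, can be checked numerically. The sketch below is only a partial check under stated assumptions: it uses the Dirac representation of the gamma matrices, momentum along the $z$ axis, a spin-up spinor normalized to $\bar{u} u = 1$, the metric signature $(+,-,-,-)$, and ordinary c-number spinors. It therefore covers only the positive-frequency case $p^0 > 0$; the negative-frequency case relies on the convention described in footnote 1 and is not reproduced here. All function names are illustrative.

```python
import math

# Gamma matrices in the Dirac representation (assumed convention).
GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]


def matvec(matrix, vec):
    """4x4 matrix times 4-component column vector."""
    return [sum(matrix[i][j] * vec[j] for j in range(4)) for i in range(4)]


def inner(a, b):
    """Hermitian inner product a^dagger b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def spinor_u(m, pz):
    """Spin-up, positive-energy spinor for momentum pz along z, with ubar u = 1."""
    energy = math.sqrt(m * m + pz * pz)
    norm = math.sqrt((energy + m) / (2 * m))
    return energy, [norm, 0.0, norm * pz / (energy + m), 0.0]


def bilinear(u, gamma):
    """ubar gamma u = u^dagger gamma^0 gamma u."""
    return inner(u, matvec(GAMMA0, matvec(gamma, u)))


def energy_current(m, pz):
    """(E^0, E^3) = p_0 * ubar gamma^mu u for the positive-frequency mode."""
    energy, u = spinor_u(m, pz)
    p_lower_0 = energy  # p_0 = p^0 for signature (+,-,-,-)
    return p_lower_0 * bilinear(u, GAMMA0), p_lower_0 * bilinear(u, GAMMA3)


if __name__ == "__main__":
    m, pz = 1.0, 0.75
    energy, u = spinor_u(m, pz)
    print(bilinear(u, GAMMA0), energy / m)  # both should equal p^0 / m
    print(bilinear(u, GAMMA3), pz / m)      # both should equal p^3 / m
    print(energy_current(m, pz))            # (E^2 / m, E * pz / m)
```

For $m = 1$ and $p_z = 0.75$ the energy is $1.25$, so the energy density comes out as $E^2/m = 1.5625$, in line with the formula above.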

It is worth noting that in any Lorentz-covariant quantum field theory, we not only have time invariance, but also spatial translation invariance. The conserved Noether current corresponding to a translation along the $\nu$ axis is denoted as $T^{\mu \nu}$, and commonly called the energy-momentum tensor. Most prominently, it appears on the right-hand side of the Einstein field equations.

The anatomy of a misunderstanding

Energy-as-in-frequency is an eigenvalue of the temporal translation operator generating the time symmetry (also called the Hamiltonian), while the actual energy is the Noether current corresponding to said symmetry. The former is not conserved, while the latter is. The former is negative for antiparticles, while the latter is positive. So there’s no problem at all. The sad thing is, I don’t know a single book on QFT that discusses this difference. Instead, people came up with superfluous concepts like the Dirac sea or particles moving back in time (the “Feynman-Stueckelberg interpretation”) to hide the mess they are responsible for. If people had listened to Emmy Noether, arguably one of the greatest minds of the 20th century, instead of uselessly clinging to what Heisenberg and Schrödinger said, this would’ve been clear from the very beginning.

The basic distinction - eigenvalues of symmetry operators and conserved Noether currents - is often ignored in QFT. There is a semi-good reason for that - looking at Feynman diagrams, we can see that the former is what needs to sum to zero at vertices. We know that the four-momentum (-as-in-frequency-and-wavenumber) at a vertex must sum to zero (similarly, the incoming EM charges must sum to zero, the spins must sum to zero, etc). We could therefore be tempted to believe that these quantities are what is conserved as a whole. But this makes us run into Thomson’s fallacy showcased above. What the conservation of eigenvalues at Feynman diagram vertices really means is that the vertex is invariant under the symmetry transformation. The actual conserved current needs to be derived separately. In a way, we could say that energy-as-in-frequency is a symmetry, while energy-as-in-energy-density is the conserved current corresponding to that symmetry.

(Quiz question: Nobody seems to have a problem with antiparticles adding positive instead of negative charge to the calorimeters. Why don’t we have this problem here?)

This concludes my rant.


  1. The avid reader will have noticed that within the standard conventions, there is no such thing as $u(p, s)$ with $p^0 < 0$. This is correct - what we actually need to do here is to use $v(-p, s)$, define a second-quantized operator mode expansion with anticommuting creation and annihilation operators such that the norm between two antiparticle states is positive, and then write down the energy density in terms of the creation and annihilation operators. I have taken a shortcut and smuggled in anticommuting/Grassmann numbers through the backdoor by defining $u(p, s)$ for both $p^0 < 0$ and $p^0 > 0$ such that $$u(p, s) \bar{u}(p, s) = (p + m)(1 + is),$$ where $s$ is the spin bivector, orthogonal to $p$. Note that as long as our Dirac spinors are c-number-valued, this is only possible for $p^0 > 0$.↩︎