Disagreeing with “Our Mathematical Universe”

My wife and I have been listening to Max Tegmark’s book “Our Mathematical Universe: My Quest for the Ultimate Nature of Reality” as an audiobook during our trips to and from work lately.

When he hit his chapter explaining Quantum Mechanics and his “Level 3 multiverse” I found that I profoundly disagree with this guy. It’s clear that he’s a grade A cosmologist, but I think he skirts dangerously close to being a quantum crank when it comes to multi-universe theory. I’ve been disagreeing with his take for the last couple of driving sessions and I will do my best to sum up from memory the specific issues that I’ve taken. Since this is a physicist making these claims, it’s important that I be accurate about my disagreement. In fact, I’ll start with just one and see whether I feel like going further from there…

The first place where I disagree is where he seems to show physicist Dunning-Kruger when regarding other fields in which he is not an expert. Physicists are very smart people, but they have a nasty habit of overestimating their competence in neighboring sciences… particularly biology. I am in a unique position in that I’ve been doubly educated; I have a solid background in biochemistry and cell molecular biology in addition to my background in quantum mechanics. I can speak at a fair level on both.

Professor Tegmark uses an anecdote (got to be careful here; anecdotes inflate mathematical imprecision) to illustrate how he feels quantum mechanics connects to events at a macroscopic level in organisms. There are many versions, but essentially he says this: when he is biking, the quantum mechanical behavior of an atom crossing through a gated ion channel in his brain affects whether or not he sees an oncoming car, which then may or may not hit him. By quantum mechanics, whether he gets hit or not by the car should be a superposition of states depending on whether or not the atom passes through the membrane of a neuron and enables him to have the thought to save himself or not. He ultimately elaborates this by asserting that “collapse free” quantum mechanics states that there is one universe where he saved himself and one universe where he didn’t… and he uses this as a thought experiment to justify what he calls a “level 3” multiverse with parallel realities that are coherent to each other but differ by the direction that a quantum mechanical wave function collapse took.

I feel his anecdote is a massive oversimplification that more or less throws the baby out with the bath water. The quantum event in question is “whether or not a calcium ion in his brain passes through a calcium gate,” which he connects to the macroscopic biological phenomenon of “whether he decides to bike through traffic” or alternatively “whether or not he decides to turn his eye in the appropriate direction” or alternatively “whether or not he sees a car coming when he starts to bike.”

You may recognize this as a variant of the Schrodinger “Cat in a box” thought experiment. In this experiment, a cat is locked in a perfectly closed box with a sample of radioactive material and a Geiger counter that will dump acid onto the cat if it detects a decay; as long as the box is closed, the cat will remain in some superposition of states, conventionally considered “alive” or “dead,” connected with whether or not the isotope has decayed. I’ve made my feelings about this thought experiment known before here.

The fundamental difficulty comes down to what the superposition of states means when you start connecting an object with a very simple spectrum of states, like an atom, to an object with a very complex spectrum of states, like a whole cat. You could suppose that the cat and the radioactive emission become entangled, but I feel that there’s some question whether you could ever actually know whether or not they were entangled simply because you can’t discretely figure out what the superposition should mean: alive and dead for the cat are not a binary on-off difference from one another as “emitted or not” is for the radioactive atom. There are a huge number of states the cat might occupy that are very similar to one another in energy and the spectrum spanning “alive” to “dead” is so complicated that it might as well just be a thermal universe. Whether the entanglement actually happened or not, in this case, classical thermodynamics and statistical mechanics should be enough to tell you in classically “accurate enough” terms what you find when you open the box. If you wait one half-life of a bulk radioactive sample, when you open the box, you’ll find a cat that is burned by acid to some degree or another. At some point, quantum mechanics does give rise to classical reality, but where?

The “but where” is always where these arguments hit their wall.

In the anecdote Tegmark uses, as I’ve written above, the “whether a calcium ion crossed through a channel or not” is the quantum mechanical phenomenon connected to “whether an oncoming car hit me or not while I was biking.”

The problem that I have with this particular argument is that it loses scale. This is where quantum flapdoodle comes from. Does the scale make sense? Is all the cogitation associated with seeing a car and operating a bike on the same scale as where you can actually see quantum mechanical phenomena? No, it isn’t.

First, all the information coming to your brain from your eyes telling you that the car is present originates from many, many cells in your retina, involving billions of interactions with light. The muscles that move your eyes and your head to see the car are instructed by thousands of nerves firing simultaneously, and these nerves fire from gradients of calcium and other ions… molar scale quantities of atoms! A nerve doesn’t fire or not based on the collapse of possibilities for a single calcium ion. It fires based on thermodynamic quantities of ions flowing through many gated ion channels all at once. The net effect of one particular atom experiencing quantum mechanical ambivalence is swamped under statistically large quantities of atoms picking all of the choices they can pick from the whole range of possibilities available to them, giving rise to the bulk phenomenon of the neuron firing. Let’s put it this way: for the nerve to fire or not based on quantum mechanical superposition of calcium ions would demand that the nerve visit that single thermodynamic state where all the ions fail to flow through all the open ion gates in the membrane of the cell all at once… and there are statistically few states where this has happened compared to the statistically many states where some ions or many ions have chosen to pass through the gated pore (this is what underpins the chemical potential that drives the functioning of the cell). If you bothered to learn any stat mech at all, you would know that this state is such a rare one that it would probably not be visited even once in the entire age of the universe. Voltage gradients in nerve cells are established and maintained through copious application of chemical energy, which is truthfully constructed from quantum mechanics and mainly expressed at bulk level by plain old classical thermodynamics.
And this is merely the state of whether a single nerve “fired or not,” taken together with the fact that your capacity for “thought” doesn’t depend so heavily on any single nerve that losing that one nerve would leave you unable to think: if a single nerve in your retina failed to fire, all the sister nerves around it would still deliver an image of the car speeding toward you to your brain.

Do atoms like a single calcium ion subsist in quantum mechanical ambivalence when left to their own devices? Yes, they do. But, when you put together a large collection of these atoms simultaneously, it is physically improbable that every single atom will make the same choice all at once. At some point you get a bulk thermodynamic behavior, and the decisions that your brain makes are based on bulk thermodynamic behaviors, not isolated quantum mechanical events.
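To put rough numbers on how improbable "every atom makes the same choice at once" is, here is a minimal sketch in Python. It assumes, purely for illustration, that each ion independently passes through an open channel with probability 0.5; that probability, the ion counts and the function name are all made up for this example and are not measured values for any real neuron.

```python
import math


def log10_prob_all_fail(n_ions: int, p_cross: float) -> float:
    """Return log10 of the probability that NONE of n_ions crosses.

    Assumes (hypothetically) that every ion crosses independently
    with the same probability p_cross, so P(none) = (1 - p_cross)**n_ions.
    Working in log10 avoids floating point underflow for large n_ions.
    """
    return n_ions * math.log10(1.0 - p_cross)


if __name__ == "__main__":
    # Illustrative ion counts only; real channel populations differ.
    for n in (1, 10, 1_000, 1_000_000):
        print(f"{n:>9} ions: log10 P(all fail) = {log10_prob_all_fail(n, 0.5):.1f}")
```

Under these assumed numbers, a single ion fails to cross half the time, but a thousand ions all failing together already sits near a probability of 10^-301, and it only gets worse from there.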

Pretending that a person made a cognitive choice based on the quantum mechanical outcomes of a single atom is a reductio ad absurdum and it is profoundly disingenuous to start talking about entire parallel universes where you swerved right on your bike instead of left based on that single calcium atom (regardless of how liberally you wave around the butterfly effect). The nature of physiology in a human being at all levels is about biasing fundamentally random behavior into directed, ordered action, so focusing on one potential speck of randomness doesn’t mean that the aggregate should fail to behave as it always does. All the air in the room where you’re standing right now could suddenly pop into the far corner leaving you to suffocate (there is one such state in the statistical ensemble), but that doesn’t mean that it will… Closer to home, you might win a $500 million Powerball jackpot, but that doesn’t mean you will!

I honestly do not know what I think about the multiverse or about parallel universes. I would say I’m agnostic on the subject. But, if all parallel universe theory is based on such breathtaking Dunning-Kruger as Professor Tegmark exhibits when talking about the connection between quantum mechanics and actualization of biological systems, the only stance I’m motivated to take is that we don’t know nearly enough to be speculating. If Tegmark is supporting multiverse theory based on such thinking, he hasn’t thought about the subject deeply enough. Scale matters here and neglecting the scale means you’re neglecting the math! Is he neglecting the math elsewhere in his other huge, generalizing statements? For the scale of individual atoms, I can see how these ideas are seductive, but stretching them into statistical systems is just wrong: you end up claiming to see the effects of quantum mechanics at macroscopic biological levels where nobody actually observes them. It’s like Tegmark is trying to give Deepak Chopra ammunition!

Ok, just one gripe there. I figure I probably have room for another.

In another series of statements that Tegmark makes in his discussion of quantum mechanics, I think he probably knows better, but by adopting the framing he has, he risks misinforming the audience. After a short discussion of the origins of Quantum Mechanics, he introduces the Schrodinger Equation as the end-all, be-all of the field (despite speaking briefly of Lagrangian path integral formalism elsewhere). One of the main theses of his book is that “the universe is mathematical” and therefore the whole of reality is deterministic based on the predictions of equations like Schrodinger’s equation. If you can write the wave equation of the whole universe, he says, Schrodinger’s equation governs how all of it works.

This is wrong.

And, I find this to miss most of the point of what physics is and what it actually does. Math is valuable to the physics, but one must always be careful that the math not break free of its observational justification. Most of what physics is about is making measurements of the world around us and fitting those measurements to mathematical models, the “theories” (small caps) provided to us by the Einsteins and the Sheldon Coopers… if the fit is close enough, the regularity of a given equation will sometimes make predictions about further observations that have not yet been made. Good theoretical equations have good provenance in that they predict observations that are later made, but the opposite can be said for bad theory, and the field of physics is littered with a thick layer of mathematical theories which failed to account for the observations, in one way or another. The process of physics is a big selection algorithm where smart theorists write every possible theory they can come up with and experimentalists take those theories and see if the data fit to them, and if they do accommodate observation, such a theory is promoted to a Theory (big caps) and is explored to see where its limits exist. On the other hand, small caps “theories” are discarded if they don’t accommodate observation, at which point they are replaced by a wave of new attempts that try to accomplish what the failure didn’t. As a result, new theories fit over old theories and push back predictive limits as time goes on.

For the specific example of Schrodinger’s equation, the mathematical model that it offers fits over the Bohr model by incorporating de Broglie’s matter wave. Bohr’s model itself fit over a previous model and the previous models fit over still earlier ideas had by the ancient Greeks. Each later iteration extends the accuracy of the model, where the development is settled depending on whether or not a new model has validated predictive power: this is literally survival of the fittest applied to mathematical models. Schrodinger’s equation itself has a limit where its predictive power fails: it cannot handle Relativity except as a perturbation… meaning that it can’t exactly predict outcomes that occur at high speeds. The deficiencies of the Schrodinger equation are addressed by the Klein-Gordon equation and by the Dirac equation and the deficiencies of those in turn are addressed by the path integral formalisms of Quantum Field Theory. If you knew the state equation for the whole universe, Schrodinger’s equation would not accurately predict how time unfolds because it fails to work under certain physically relevant conditions. The modern Quantum Field Theories fail at gravity, meaning that even with the modern quantum, there is no assured way of predicting the evolution of the “state equation of the universe” even if you knew it. There are a host of follow-on theories, String Theory, Loop Quantum Gravity and so on and so forth, that vie for being The Theory That Fills The Holes, but, given history, they probably will only extend our understanding without fully answering all the remaining questions. That String Theory has not made a single prediction that we can actually observe right now should be lost on no one: there is a grave risk that it never will. We cannot at the moment pretend that the Schrodinger equation perfectly satisfies what we actually know about the universe from other sources.

It would be most accurate to say that reality seems to be quantum mechanical at its foundation, but that we have yet to derive the true “fully correct” quantum theory. Tegmark makes a big fuss about explaining that “wave function collapse” doesn’t fit within the premise of Schrodinger’s equation, but that the equation could still hold as good quantum if a “level three multiverse” is real. The opposite is also true: we’ve known Schrodinger’s equation is incomplete since the 1930s, so “collapse” may simply be another place where it’s incomplete for reasons we don’t yet understand. A multiverse does not necessarily follow from this. Maybe pilot wave theory is correct quantum, for all I know.

It might be possible to masturbate over the incredible mathematical regularity of physics in the universe, but beware of the fact that it wasn’t particularly mathematical or regular until we picked out those theories that fit the universe’s behavior very closely. Those theories have predictive power because that is the nature of the selection criteria we used to find them; if they lacked that power, they would be discarded and replaced until a theory emerged meeting the selection criteria. To be clear, mathematical models can be written to describe anything you want, including the color of your bong haze, but they only have power because of their self consistency. If the universe does something to deviate from what the math says it should, the math is simply wrong, not the universe. Every time you find neutrino mass, God help your massless neutrino Standard Model!

Wonderful how the math works… until it doesn’t.

Edit 12-19-17:

We’re still listening to this book during our car trips and I wanted to point out that Tegmark uses an argument very similar to my argument above to suggest why the human brain can’t be a quantum computer. He approaches the matter from a slightly different angle. He says instead that a coherent superposition of all the ions either inside or outside the cell membrane is impossible to maintain for more than a very very short period of time because eventually something outside of the superposition would rapidly bump against some component of the superposition and that since so many ions are involved, the frequency of things bumping on the system from the outside and “making a measurement” becomes high. I do like what he says here because it starts to show the scale that is relevant to the argument.

On the other hand, it still fails to necessitate a multiverse. The simple fact is that human choice is decoupled from the scale of quantum coherence.

Edit 1-10-18:

As I’m trying desperately to recover from stress in the process of thesis writing, I thought I would add a small set of thoughts on this subject in an effort to defocus and defrag a little. My wife and I have continued to listen to this book and I think I have another fairly major objection to Tegmark’s views.

Tegmark lives in a version of quantum mechanics that fetishizes the notion of wave function collapse where he views himself as going against the grain by offering an alternative where collapse does not have to happen.

For a bit of context, “collapse” is a side effect of the Copenhagen convention of quantum mechanics. In this way of looking at the subject, the wave function will remain in superposition until something is done to determine what state the wave function is in… at this point, the wave function will cease to be coherent and will drop into some allowed eigenstate, after which it will remain in that eigenstate. This is a big, dominant part of quantum mechanics, but I would suggest that it misses some of the subtlety of what actually happens in quantum mechanics by trying to interpret, perhaps wrongly, what the wave function is.

Fact of the matter is that you can never observe a wave function. When you actually look at what you have, you only ever find eigenstates. But, there is an added subtlety to this. If you make an observation, you find an object somewhere, doing something. That you found the object is indisputable and you can be pretty certain what you know about it at the time slice of the observation. Unfortunately, you only know exactly what you found; from this, directly, you actually have no idea either what the wave function was or even really what the eigenstates are. Location is clearly an eigenstate of the position operator, as quantum mechanics operates, but from finding a particle “here” you really don’t actually know what the spectrum of locations it was potentially capable of occupying actually was. In order to learn this, the experiment which is performed is to set up the situation in a second instance, put time in motion and see that you find the new particle ending up “there,” then to tabulate the results together. This is repeated a number of times until you get “here,” “there” and “everywhere.” Binning the trials together, you start to learn a distribution of how the possibilities could have played out. From this distribution, you can suddenly write a wave function, which tells the probability of making some observation across the continuum of the space you’re looking at… the wave function says that you have “this chance of finding the object ‘here’ or ‘there’.”

The wave function, however you try to pack it, is fundamentally dependent on the numerical weight of a statistically significant number of observations. From one observation, you can never know anything about the wave function.
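The tabulation described above can be mimicked with a toy sketch in Python. It assumes a hypothetical two-outcome system where the "true" chance of finding the object "here" is 0.6 and "there" is 0.4; those numbers, the seed and the function name are invented for illustration and stand in for whatever a real experiment would give.

```python
import random
from collections import Counter


def estimate_distribution(n_trials: int, p_here: float = 0.6, seed: int = 0) -> dict:
    """Repeat a simulated measurement n_trials times and bin the results.

    Each trial yields "here" with (assumed) probability p_here, else "there".
    Returns the observed frequency of each outcome.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    counts = Counter(
        "here" if rng.random() < p_here else "there" for _ in range(n_trials)
    )
    return {outcome: counts[outcome] / n_trials for outcome in ("here", "there")}


if __name__ == "__main__":
    # One observation: all you learn is the single place you found it.
    print(estimate_distribution(1))
    # Many observations: the underlying distribution starts to show.
    print(estimate_distribution(100_000))
```

With one trial the tally is all-or-nothing for whichever outcome turned up and says nothing about the odds; only the large tally approaches the assumed 60/40 split.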

The same thing holds true for coherence. If you make one observation, you find what you found that one time; you know nothing about the spectrum of possibilities. For that one hit, the particle could have been in coherence, or it could have been collapsed to an eigenstate. You don’t know. You have to build up a battery of observations, which gives you the ability to say “there’s a xx% chance this observation and that observation were correlated, meaning that coherence was maintained to yy degree.”

This comes back to Feynman’s old double slit experiment anecdote. For one BB passing through the system and striking the screen, you only know that it did, and not anything about how it did. The wave function written for the circumstances of the double slit provides a forecast of what the possible outcomes of the experiment could be. If you start measuring which slit a BB went through, the system becomes fundamentally different based upon how the observation is made and different things are knowable, giving the chance that the wave function will forecast different statistical outcomes. But, you cannot know this unless you make many observations in order to see the difference. If you measure the location of 1 BB at the slit and the location of 1 BB at the screen, that’s all you know.

In this way, the wave function is a bulk phenomenon, a beast of statistical weight. It can tell you observations that you might find… if you know the set up of the system. An interference pattern at the screen tells that the history was muddy and that there are multiple possible histories that could explain an observation at the screen. This doesn’t mean that a BB went through both slits, merely that you don’t know what history brought it to the place where it is. “Collapse” can only be known after two situations have been so thoroughly examined that the chances for the different outcomes are well understood. In a way, it is as if the phenomenon of collapse is written into the outcome of the system by the set-up of the experiment and that the types of observations that are possible are ordained before the experiment is carried out. In that way, the wave function really is basically just a forecast of possible outcomes based on what is known about a system… sampling for the BB at the slit or not, different information is present about the system, creating different possible outcomes, requiring the wave function to make a different forecast that includes that something different is known about the system. The wave function is something that never actually exists at all except to tell you the envelope of what you can know at any given time, based upon how the system is different from one instance to the next.

This view directly contradicts the notions in Tegmark’s book that individual quantum mechanical observations at “collapse” allow for two universes to be created based upon whether the wave function went one way or another. On a statistical weight of one, it cannot be known whether the observed outcome was from a collection of different possibilities or not. The possible histories or futures are unknown on a data point of one; that one is what it is, and it can’t be known that there may have been other choices without a large conspiracy to know what other choices could have happened. What that gives you is the ability to say “there’s a sixty percent chance this observation matches this eigenstate and a forty percent chance it’s that one,” which is fundamentally not the same as the decisiveness which would be required for a collapse of one data point to claim “we’re definitely in the universe where it went through the right slit.”

I guess I would say this: Tegmark’s level 3 multiverse is strongly contradicted by the Uncertainty Principle. Quantum mechanics is structurally based on indecisiveness, while Tegmark’s multiverse is based on a clockwork decisiveness. Tegmark is saying that the history of every particle is always known.

This is part of the issue with quantum computers: the quantum computer must run its processing experiment many times over in order to establish knowledge about coherence in the system. On a sampling of one, the wave function simply does not exist.

Tegmark does this a lot. He routinely puts the cart ahead of the horse, saying that math implies the universe rather than that math describes the universe (Tegmark: Math therefore Universe. Me: Universe, therefore Math). The universe is not math; math is simply so flexible that you can pick out descriptions that accurately tell what’s going on in the universe (until they don’t). For all his cherry picking of the “mathematical regularity of the universe,” Tegmark quite completely turns a blind eye to where math fails to work: most problems in quantum mechanics are not exactly solvable and most quantum advancement is based strongly on perturbation… that is, approximations and infinite expansions that are cranked through computers to churn out compact numbers that are close to what we see. In this, the math that ‘works’ is so overloaded with bells and whistles to make it approach the actual observational curve that one can only ever say that the math is adopting the form of the universe, not that the universe arises from the math.

Edit 1-17-18:

Still listening to this book. We listened through a section where Tegmark admits that he’s putting the cart ahead of the horse by putting math ahead of reality. He simply refers to it as a “stronger assertion” which I think is code for “where I know everyone will disagree with me.”

Tegmark slipped gently out of reality again when he started into a weird observer-observation duality argument about how time “flows” for a self-aware being. You know he’s lost it when his description fails to even once use the word “entropy.” Tegmark is under the impression that the quantum mechanical choice of every distinct ion in your brain is somehow significant to the functioning of thought. This shows an unbelievable lack of understanding of biology, where mass structures and mass action form behavior. Fact of the matter is that biological thought (the awareness of a thinking being) is not predictable from the quantum mechanical behavior of its discrete underpinning parts. In reality, quantum mechanics supplies the bulk steady state from which a mass effect like biological self-awareness is formed. Because of the difference in scale between the biological level and the quantum mechanical level, biology depends only on the prevailing quantum mechanical average… fluctuations away from that average, the weirdness of quantum, are almost entirely swamped out by simple statistical weight. A series of quantum mechanical arguments designed to connect the macroscale of thought to the quantum scale is fundamentally broken without taking this into account.

Consider this: the engine of your gas fueled car is dependent on a quantum mechanical behavior. Molecules of gasoline are mixed with molecules of oxygen in the cylinder and are triggered by a pulse of heat to undergo a chemical reaction where the atoms of the gas and oxygen reconfigure the quantum mechanical states of their electrons in order to organize into molecules of CO2 and H2O (with a little CO). After the reorganization, the collected atoms in these new molecules are at a different average state of quantum mechanical excitation than they were prior to the reconfiguration; you could say that they end up further from their quantum mechanical zero point for their final structure as compared to prior to the reorganization. In ‘human baggage’ we call this differential “heat” or “release of heat.” The quantum mechanics describe everything about how the reorganization would proceed, right down to the direction a CO2 molecule wants to speed off after it has been formed. What the quantum mechanics does not directly tell you is that 10^23 of these reactions happen and for all the different directions that CO2 molecules are moving after they are formed, the average distribution of their expansion is all that is needed to drive the piston… that this molecule speeds right or that one speeds left are immaterial: if it didn’t, another would, and if that one didn’t still another would and so on and so forth until you achieve a bulk behavior of expansion in CO2 atmosphere that can push the piston. The statistics are important here. That the gasoline is 87 octane versus 91 octane, two quantum mechanically different approaches to the same thing, does not change that both drive the piston… you could use ethanol or kerosene or RP-1 to perform the same action and the specifics of the quantum mechanics result in an almost indistinguishable state where an expanding gas pushes back the piston to produce torque on the crankshaft to drive the wheels around.
The quantum mechanics are watered out to a simple average where the quantum mechanical differences between one firing of the piston and the next are indistinguishable. But, to be sure, every firing of the piston is not quantum mechanically exactly the same as the one before it. In reality, that piston moves despite these differences. There is literally an unthinkably huge ensemble of quantum mechanical states that result in the piston moving and you cannot distinguish any of them from any other. There is literally no choice but to group them all together by what they hold in common and to treat them as if they are the same thing, even though at the basement layer of reality, they aren’t. Without what Tegmark refers to as “human baggage” there would be no way to connect the quantum level to the one we can actually observe in this case. That this particular molecule of fuel failed to react or not based on fluctuations of the quantum mechanics is pretty much immaterial.
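The way individual "left or right" choices wash out into a bulk average can be sketched with a toy model in Python. This is not a simulation of combustion: it simply assumes each molecule independently picks -1 (left) or +1 (right) with equal odds, and the function names are made up for the illustration. For such independent choices the typical size of the average scales as 1/sqrt(N).

```python
import math
import random


def mean_direction(n_molecules: int, seed: int = 0) -> float:
    """Average of n_molecules independent random choices of -1 or +1.

    A toy stand-in for 'this molecule speeds left, that one speeds right'.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    total = sum(rng.choice((-1, 1)) for _ in range(n_molecules))
    return total / n_molecules


def expected_spread(n_molecules: float) -> float:
    """Standard deviation of that average for independent +/-1 choices."""
    return 1.0 / math.sqrt(n_molecules)


if __name__ == "__main__":
    for n in (10, 1_000, 200_000):
        print(n, mean_direction(n), expected_spread(n))
    # Far too many molecules to simulate, but the formula still applies.
    print("1e23 molecules:", expected_spread(1e23))
```

In this toy model the spread for 10^23 molecules comes out around 3 x 10^-12 of the single-molecule scale, which is the sense in which any one molecule going left or right is immaterial to the bulk push.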

The brain is not different. If you were to consider “thought” to be a quantum mechanical action, the specific differences between one thought and the next are themselves huge ensembles of different quantum mechanical configurations… even the same thought twice is not the same quantum mechanical configuration twice. The “units” of thought are in this way decoupled from the fundamental level since two versions of the “same thing” are actually so statistically removed from their quantum mechanical foundation as to be completely unpredictable from it.

This is a big part of the problem with Tegmark’s approach; he basically says “Quantum underlies everything, therefore everything should be predictable from quantum.” This is a fool’s errand. The machineries of thought in a biological person are simply at a scale where the quantum mechanics has salted out into Tegmark’s “human baggage”… named conceptual entities, like neuroanatomy, free energy and entropy, that are not mathematically irreducible. He gets to ignore the actual mechanisms of “thought” and “self-awareness” in order to focus on things he’s more interested in, like what he calls the foundation structure of the universe. Unfortunately, he’s trying to attach two levels of reality that are not naturally associated… thought and awareness are by no means associated with fundamental reality. Time passage as experienced by a human being, for instance, has much more in common with entropy and statistical mechanics than it does with anything else, and Tegmark totally ignored it in favor of a rather ridiculous version of the observer paradox.

One thing that continues to bother me about this book is something that Tegmark says late in it. The man is clearly very skilled and very capable at what he does, but he dedicates the last part of his book to all the things he will not publish on for fear of destroying his career. He feels the ideas deserve to be out (and as an arrogant theorist, he feels that even the dross in his theories is gold), but by publishing a book about them, he gets to circumvent peer review and scientific discussion and bring these ideas straight to an audience that may not be able to sort which parts of what he says are crap from those few trinkets which are good. I don’t mean that he should be muzzled, he has the freedom of speech, but if his objective is to favor dissemination of scientific education, he should be a model of what he professes. If Tegmark truly believes these ideas are useful, he should damned well be publishing them directly into the scientific literature so that they can be subjected to real peer review. Like all people, this one should face his hubris, starting with his incredible weakness at stat mech and biology.


Interaction Picture

It’s not always about the cat. Here, I will show how to hop from the time dependent Schrodinger equation to the Interaction picture form.

This post is intended to help recover a tiny fraction of the since-destroyed post I originally entitled “NMR and Spin Flipping part II.” I have every intention to reconstruct that post when I have time, but I decided to do it in fragments because the original loss was 5,000 words. I don’t have time to bust my head against that whole mess for the moment, but I can do it in bits, I think.

One section of that post which stands pretty well as a separate entity from the NMR theme was the fraction of work where I spent time deriving a version of the time dependent Schrodinger equation in the interaction picture.

I thought I would go ahead and expand this a little bit and talk generally about some of the basic structural features of non-relativistic quantum mechanics. Likely, this will mostly not be very mathematical, except for the derivation at the end. I’ll warn you when the real derivation is about to start if you are math averse…

Everybody has heard about Schrodinger’s cat. Poor cat is dragged out and flogged semi-dead, semi-alive pretty much any time anybody wants to speak as if they know something about “quantum physics.” The cat might be the one great mascot of quantum in popular culture. The kitty drags with it a name that you no doubt have heard: Erwin Schrodinger, the guy who first coined the anecdote of feline torture as an abstraction to describe some features of quantum mechanics on a level that laymen can embrace, if not totally understand. This name is immediately synonymous with the spine of quantum mechanics as the Schrodinger equation. This equation is not so simple as E = mc^2 or F = ma, but it is a popular equation…

[Image: the time dependent Schrodinger equation]

I’ve included it here in its full-on psi-baiting time-dependent form with Planck’s constant uncompressed from ħ.

You hardly ever see it written this way anymore.

All this equation says is that the sum of kinetic and potential energy is total energy, which is tied implicitly to the evolution of the system with time. This equation is popular enough that I found it scrawled on a wall along with some Special Relativity inside the game “Portal 2” once. Admittedly, the game designers used ħ instead of h for Planck’s constant. It may not look that way, but the statement of this equation is no more complicated than F = ma or E = mc^2. It just says “conservation of energy” and that’s pretty much it.
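For readers who cannot see the image, here is a sketch of how that equation is conventionally written in one dimension with h left uncompressed. The symbols m (mass), V(x) (potential energy) and ψ(x,t) are the standard textbook ones, assumed here rather than copied from the original figure:

```latex
\[
  i\,\frac{h}{2\pi}\,\frac{\partial}{\partial t}\,\psi(x,t)
  = -\frac{h^{2}}{8\pi^{2}m}\,\frac{\partial^{2}}{\partial x^{2}}\,\psi(x,t)
    + V(x)\,\psi(x,t)
\]
```

The first term on the right is the kinetic energy, the second is the potential energy, and the left side is the total energy expressed through the time derivative. Substituting ħ = h/2π recovers the compact form most books print.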

Schrodinger’s equation is the source of wave mechanics, where Psi “ψ” is the notorious quantum mechanical wave function. If you care nothing more about Quantum mechanics, I could say that you’ve seen it all and we could stop here.

The structure of basic quantum mechanics has a great deal to it. Schrodinger’s equation tells you how dynamics happens in quantum physics. It says that the way the wave function changes in time is tied to some characteristics related to the momentum of the object in question and to where it’s located. Structurally, this is the foundation of all non-relativistic quantum mechanics (I say “non-relativistic” because the more complete form of the Schrodinger equation that accommodates special relativistic energy is the Klein-Gordon Equation, which I will not touch anywhere in this post.) Pretty much all of quantum mechanics is about manipulating this basic relation in some manner or another in order to get what you want to know out of it. Here, the connection between position and momentum as well as between energy and time hides the famous “uncertainty relations,” all built directly into the Schrodinger equation and implicit to its solutions.

One thing you may not immediately know about Schrodinger’s equation is that it’s actually a member of a family of similar equations. In this case, the equation written above tells about the motion of an object in some volume of space, where the space in question is literally only one dimensional, along an effective line. Another Schrodinger equation (as the one written in this post) expands space into three dimensions. Still other Schrodinger equation-like forms are needed to understand how an object tumbles or rotates, or even how it might turn itself inside out or how it might play hopscotch on a crystalline lattice or bend and twist in a magnetic field. There are many different ways that the functional form above might be repurposed to express some permutation of the same set of general ideas.

This tremendous diversity is accomplished by a mathematical structure called “operator formalism.” Operators are small parcels of mathematical operation that transform the entity of the wave function in particular ways. An operator is sort of like a box of gears that hides what’s going on. You might fold down the gull-wing door in the equation above and hide the gears in an operator called the “Hamiltonian.”

[Image: the Schrodinger equation written with the Hamiltonian operator]

This just shuffles everything you don’t care about at a given time under the rug and lets you work overarching operations on the outside. Operators can encode most everything you might want. There are a ton of rules that go into the manipulation of operators, which I won’t spend time on here because it distracts from where I’m headed. A hundred types of Schrodinger equation can be written by swapping out the inside of the Hamiltonian.
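As a sketch of what the folded-up version usually looks like, with the standard one dimensional position-representation Hamiltonian shown as one example of what might sit inside the box (textbook notation, assumed rather than taken from the original image):

```latex
\[
  i\hbar\,\frac{\partial}{\partial t}\,\psi = \hat{H}\,\psi ,
  \qquad
  \hat{H} = -\frac{\hbar^{2}}{2m}\,\frac{\partial^{2}}{\partial x^{2}} + V(x)
\]
```

Swapping in a different expression for the Hamiltonian, while leaving the left side alone, is what generates the rest of the family.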

An additional simplification of operators comes from what’s called “representation formalism.” The first Schrodinger equation I wrote above is within a representation of position. Knowing about the structure of the representation places many requirements which help to define the form of the Hamiltonian. I could as easily have written the same Schrodinger equation in a representation of momentum, where the position variable becomes some strange differential equation… momentum is in that equation above, but you would never know it to look at it because it’s in a form related to velocity, which is connected back to position, so that position and time are the only variables relevant to the representation. By backing out of a representation, into a representation free, “abstract form,” operators lose their bells and whistles while wave functions are converted to a structure called a “ket.”

Ket is one half of “Bra-Ket,” which is a representation free notation developed by Paul Dirac, another quantum luminary working in Schrodinger’s time. A “bra” is related to a “ket” by an operation called a “conjugate transpose,” but you need only know that it’s a way to talk about the wave function when you are not saying how the wave function is represented. If you’ve dealt with kets, you’ve probably been in a quantum mechanics class… “wave function” has a place in popular culture, “ket” does not.

Most quantum mechanics is performed with operators and kets. The operators act on kets to transform them.

One place where this general structure becomes slightly upset is when you start talking about time. And, of course time is needed if you’re going to talk about how things in the real world interact or behave. The variable of time is very special in quantum mechanics because of how it enters into Schrodinger’s equation… this may not be apparent from what I’ve written above, but time is treated as its own thing. Schrodinger’s equation can be rewritten to form what’s called a time displacement operation.

You might take a breath, derivation begins here….

[Image: time displacement of the state ket by the Hamiltonian]

This is just a way to completely twist around Schrodinger’s time dependent equation into a ket form where the ket now has its time dependence expressed by a time displacement modulated by the Hamiltonian. I’ve even broken up the Hamiltonian into static and time dependent parts (as this will be important to the Interaction Picture, down below). The time displacement operation just acts on the ket to push it forward in time. The thing inside the exponential is a form of quantum phase.
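A sketch of the form being described, in the notation Sakurai uses (the labels α, H_0 and V are assumptions about what the original image showed):

```latex
\[
  |\alpha, t\rangle
  = \exp\!\left(-\frac{i\,(H_{0} + V)\,t}{\hbar}\right) |\alpha, 0\rangle
\]
```

Strictly, this simple exponential holds when the Hamiltonian can be treated as constant over the interval. If V genuinely varies in time, the exponential has to be replaced by a time-ordered product of small steps, which is part of why the Interaction Picture is worth the trouble.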

This ket is an example of a “state ket.” It is the abstract representation of a generalized wave function that solves Schrodinger’s time dependent equation. A second form of ket, called an “eigen ket,” emerges from a series of special solutions to the Schrodinger equation that have no time dependence. An eigen ket (I often write “eigenket”) remains the same at all times and is considered a “stationary solution” to the Schrodinger equation. “Eigen solutions” tend to be very special solutions in many other forms of physics: the notes on your flute or piano are eigen solutions, or stationary wave solutions, for the oscillatory physics in that particular instrument. In quantum mechanics, eigen modes are exceptionally useful because any general time dependent solution to the Schrodinger equation can be fabricated out of a linear sum of eigenkets. This math is connected intimately to Fourier series. The collection of all possible eigenket solutions to a particular Schrodinger equation forms a complete description of a given representation of that Schrodinger equation, which is called a Hilbert space. You can write any general solution for one particular Schrodinger equation using the Hilbert space of that equation. A particular eigenket solves the Hamiltonian of a Schrodinger equation with a constant, called an eigenvalue, which is the same as saying that an eigenket doesn’t change with time (producing Schrodinger’s time-independent wave equation).

[Image: eigenvalue equation for the stationary part of the Hamiltonian]

This is just the eigenvalue equation for the stationary part of the Hamiltonian written above, which could be expanded into Schrodinger’s time independent equation.
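Written out, with |n⟩ standing for an eigenket and E_n for its eigenvalue (standard notation, assumed here):

```latex
\[
  H_{0}\,|n\rangle = E_{n}\,|n\rangle
\]
```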

Deep breath now, this dives into Interaction Picture quickly.

How quantum mechanics treats time can be reduced in its extrema to two paradigms which are called “Pictures.” The first picture is called the “Schrodinger Picture,” while the second is called the “Heisenberg Picture” for Werner Heisenberg. Heisenberg and Schrodinger developed the basics of non-relativistic quantum mechanics in parallel from two separate directions; Schrodinger gave us wave mechanics while Heisenberg gave us operator formalism. They are essentially the same thing and work extremely well when used together. Schrodinger and Heisenberg pictures are connected to each other through the time displacement operator. In Schrodinger picture, the time displacement operation acts on the state ket, causing the state to evolve forward in time. In Heisenberg picture, the time displacement is shifted onto the operators and the eigenkets, while the state ket remains constant in time. Schrodinger picture is like sitting on a curbside and watching a car drive past, while Heisenberg picture is like sitting inside the car and watching the world drive past. Both pictures agree that the car is traveling at the same speed, but they are looking at the situation from different vantage points. The Schrodinger time dependent equation is balanced by the Heisenberg equation of motion.
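The claim that both pictures agree on what is observed can be checked numerically. Below is a minimal sketch, assuming a made-up two-level Hamiltonian `H`, a made-up observable `A`, and units where ħ = 1; none of these numbers come from the derivation, they are only there to have something to compute.

```python
import numpy as np

HBAR = 1.0  # natural units, an assumption made purely for illustration


def time_displacement(H, t):
    """Return U(t) = exp(-i H t / hbar) for a constant Hermitian matrix H."""
    energies, vecs = np.linalg.eigh(H)
    phases = np.exp(-1j * energies * t / HBAR)
    return vecs @ np.diag(phases) @ vecs.conj().T


# Hypothetical two-level system: Hamiltonian, observable and starting state.
H = np.array([[1.0, 0.5], [0.5, -1.0]])
A = np.array([[0.0, 1.0], [1.0, 0.0]])
psi0 = np.array([1.0, 0.0], dtype=complex)


def schrodinger_expectation(t):
    """Push the state ket forward in time, leave the operator alone."""
    psi_t = time_displacement(H, t) @ psi0
    return np.vdot(psi_t, A @ psi_t).real


def heisenberg_expectation(t):
    """Push the operator forward in time, leave the state ket alone."""
    U = time_displacement(H, t)
    A_t = U.conj().T @ A @ U
    return np.vdot(psi0, A_t @ psi0).real


for t in (0.0, 0.3, 1.7):
    print(t, schrodinger_expectation(t), heisenberg_expectation(t))
```

The two columns printed for each time should match to rounding error: the phase has simply been moved from the ket to the operator.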

Where time dependence starts to become really interesting is if the Hamiltonian is not completely constant. As I wrote above, you might have a part of the Hamiltonian which contains some dependence on time. One way in which quantum mechanics addresses this is by a construction called the “Interaction Picture”… Sakurai also calls it the “Dirac Picture.” The interaction picture is sort of like driving along in your car and wondering at the car you’re passing; the world outside appears to be moving, as is the car you’re looking at, if only at different speeds and maybe in different directions.

I’ve likened this notion to switching frames of reference, but I caution you against pushing that analogy too far. The transformation between one picture and the next is by quantum mechanical phase, not by some sort of transformation of frame of reference. Switching pictures is simply changing where time dependent phase is accumulated. As the Schrodinger picture places all this phase in the ket, Heisenberg picture places it all on the operator. Interaction picture splits the difference: the stationary phase is stuck to the operator while the time dependent phase is accumulated by the ket. In all three pictures, the same observables result (rather, the same expectation values) but the phases are broken up. Here is how the phases can be split inside a state ket.

[Image: state ket in the Schrodinger picture expanded in eigenkets]

I’ve written the state ket as a sum of eigenkets |n>. The time dependence from a time varying potential “V” is hidden in the eigenket coefficient while the stationary phase remains behind. The “n” index of the sum allows you to step through the entire Hilbert space of eigenkets without writing any but the one. Often, the coefficient Cn(t) is what we’re ultimately interested in, so it helps to remember that it has the following form when represented in bra-ket notation:
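A sketch of that expansion as it appears in Sakurai, with the subscript S marking the Schrodinger picture (the labels are assumed, since the image is not reproduced here):

```latex
\[
  |\alpha, t\rangle_{S}
  = \sum_{n} c_{n}(t)\, e^{-iE_{n}t/\hbar}\, |n\rangle
\]
```

The exponential is the stationary phase each eigenket would pick up with no interaction at all; everything the potential V does is carried by c_n(t).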

[Image: expanding the ket with the identity operator]

I’ve skipped ahead a little by writing that ket in the Interaction picture (these images were created for the NMR post that died, so they’re not quite in sequence now), but the effect is consistent. The usage of “1” here is just a way to move into a Hilbert space representation of eigenkets… with probability normalized eigenkets, “spanning the space” means that you can construct a linear projection operator that is the same as identity. The 1 = sum is all that says. This is just a way to write the coefficient above in a bra-ket form.
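In symbols, a sketch of the identity trick and the coefficient it produces (subscript I marks the Interaction picture; the notation follows Sakurai and is assumed rather than copied from the image):

```latex
\[
  1 = \sum_{n} |n\rangle\langle n| ,
  \qquad
  |\alpha, t\rangle_{I}
  = \sum_{n} |n\rangle\langle n|\alpha, t\rangle_{I}
  = \sum_{n} c_{n}(t)\,|n\rangle ,
  \qquad
  c_{n}(t) = \langle n|\alpha, t\rangle_{I}
\]
```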

The actual transformation to the Interaction picture is accomplished by canceling out the stationary phase…

[Image: transformation of the state ket to the Interaction picture]

By multiplying through with the conjugate of the stationary phase, only the time varying phase in the coefficient remains. This extra phase will then show up on operators translated into the interaction picture…
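A sketch of that cancellation, assuming the usual definition of the Interaction picture ket:

```latex
\[
  |\alpha, t\rangle_{I}
  = e^{\,iH_{0}t/\hbar}\, |\alpha, t\rangle_{S}
  = \sum_{n} c_{n}(t)\, e^{\,iE_{n}t/\hbar}\, e^{-iE_{n}t/\hbar}\, |n\rangle
  = \sum_{n} c_{n}(t)\, |n\rangle
\]
```

The operator exponential turns into a plain number on each eigenket because H_0|n⟩ = E_n|n⟩.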

[Image: the potential in the Interaction picture]

This takes the potential as it appears in the Schrodinger picture and converts it to a form consistent with the Interaction picture.
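The standard form of that conversion, offered as a sketch with V_I denoting the Interaction picture potential (an assumed label):

```latex
\[
  V_{I}(t) = e^{\,iH_{0}t/\hbar}\; V \; e^{-iH_{0}t/\hbar}
\]
```

This is the same sandwich that moves any operator into the Heisenberg picture, except that only the stationary part H_0 appears in the exponentials.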

You can then start passing these relationships through the time dependent Schrodinger equation. One must only keep in mind that every derivative of time must be accounted for and that there are several…

[Image: the time dependent Schrodinger equation in the Interaction picture]

(edit 5-22-18: The image right here contains a bit of wrong math, see the end of the post for a more comprehensive and correct version. I made a mistake and I won’t try to hide it: see if you can find it!;-)

This little bit of algebra creates a new form for the time dependent Schrodinger equation where the time dependence is only due to the time varying potential “V”. You can then basically just drop into a representation and use all the equalities I’ve justified above…

[Image: differential equation for the coefficients in the Interaction picture]

The last result here has eliminated all the ket notation and created a version of the time dependent Schrodinger equation where the differential equation is for the coefficients describing how much of each eigenket shows up in the state ket. The dot over the coefficient is a shorthand to mean “time derivative.”
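For reference, the coefficient equation as it appears in Sakurai’s text, which is presumably what the image showed. The shorthand V_nm and ω_nm is Sakurai’s and is assumed here:

```latex
\[
  i\hbar\, \dot{c}_{n}(t)
  = \sum_{m} V_{nm}(t)\, e^{\,i\omega_{nm}t}\, c_{m}(t) ,
  \qquad
  V_{nm}(t) = \langle n|V(t)|m\rangle ,
  \qquad
  \omega_{nm} = \frac{E_{n} - E_{m}}{\hbar}
\]
```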

This form of the time dependent Schrodinger equation gives an interesting story. The interaction represented by the time dependent potential “V” scrambles eigenket m into eigenket n. As you might have guessed, this is one in the huge family of different equations related to the Schrodinger equation and this particular version has an apt use in describing interactions. Background quantum mechanical phase accumulated only by the forward passage of time is ignored in order to look at phase accumulated by an interaction.
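To see the scrambling in action, here is a minimal numerical sketch that integrates the coefficient equation for a two-state system with a hand-rolled fourth order Runge-Kutta step. The energies, the coupling strength `gamma`, the drive frequency and ħ = 1 are all invented for illustration; the oscillating off-diagonal potential is the textbook two-state setup, not something taken from the lost NMR post.

```python
import numpy as np

HBAR = 1.0  # natural units, assumed for illustration


def coefficient_derivative(t, c, V_of_t, energies):
    """dc_n/dt = (1 / (i hbar)) * sum_m V_nm(t) exp(i w_nm t) c_m."""
    omega = (energies[:, None] - energies[None, :]) / HBAR  # w_nm matrix
    return ((V_of_t(t) * np.exp(1j * omega * t)) @ c) / (1j * HBAR)


def integrate_rk4(c0, V_of_t, energies, t_end, steps):
    """Integrate the coefficients from t = 0 to t_end with fixed RK4 steps."""
    c = np.array(c0, dtype=complex)
    dt = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = coefficient_derivative(t, c, V_of_t, energies)
        k2 = coefficient_derivative(t + dt / 2, c + dt / 2 * k1, V_of_t, energies)
        k3 = coefficient_derivative(t + dt / 2, c + dt / 2 * k2, V_of_t, energies)
        k4 = coefficient_derivative(t + dt, c + dt * k3, V_of_t, energies)
        c = c + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return c


# Hypothetical two-state system driven exactly on resonance.
energies = np.array([0.0, 1.0])
gamma = 0.1
drive = (energies[1] - energies[0]) / HBAR


def V_of_t(t):
    """Off-diagonal oscillating potential coupling state 0 and state 1."""
    return np.array(
        [[0.0, gamma * np.exp(1j * drive * t)],
         [gamma * np.exp(-1j * drive * t), 0.0]]
    )


t_flip = np.pi * HBAR / (2 * gamma)  # time for a complete transfer on resonance
c_final = integrate_rk4([1.0, 0.0], V_of_t, energies, t_flip, 2000)
print(abs(c_final[0]) ** 2, abs(c_final[1]) ** 2)
```

Starting entirely in the lower state, the population should end up almost entirely in the upper state at `t_flip`, since on resonance the upper-state probability goes as the square of sin(γt/ħ). The background phase from H_0 never has to be tracked; it has already been cancelled out of the equation being integrated.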

I will ultimately use this to talk a bit more about the two state problem and NMR, as from the post that died. Much of this particular derivation appears in the Sakurai Quantum Mechanics text.

edit 5-22-18:

There is a quirk in this derivation for the interaction picture that continues to bother me. I didn’t really see it at first, but it bothers me having thought some time about it. The full Hamiltonian is defined to be some basic part plus some separable time-dependent potential. In the derivative that produces the evolution from the time-dependent potential, there is a basic assumption that this time-dependent potential does not contain time explicitly, meaning that no time derivative is taken on the potential. This seems like a self-contradiction to me: the potential is defined as time dependent, but must be the same form as the basic part of the Hamiltonian and not contain explicit time dependence in order for the derivation to work as shown above. I’m still thinking about it.

Here is a better version of the derivative that gives the time dependent Schrodinger form involving only the potential within the interaction picture:

[Image: derivation of the time dependent Schrodinger equation in the Interaction picture]
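For readers without the image, here is a sketch of how this derivation runs in Sakurai’s text. It assumes H = H_0 + V with H_0 time independent, and uses the usual Interaction picture definitions of the ket and of V_I:

```latex
\[
\begin{aligned}
  i\hbar\,\frac{\partial}{\partial t}\,|\alpha, t\rangle_{I}
  &= i\hbar\,\frac{\partial}{\partial t}
     \left( e^{\,iH_{0}t/\hbar}\,|\alpha, t\rangle_{S} \right) \\
  &= -H_{0}\, e^{\,iH_{0}t/\hbar}\,|\alpha, t\rangle_{S}
     + e^{\,iH_{0}t/\hbar}\,(H_{0} + V)\,|\alpha, t\rangle_{S} \\
  &= e^{\,iH_{0}t/\hbar}\; V \; e^{-iH_{0}t/\hbar}\;
     e^{\,iH_{0}t/\hbar}\,|\alpha, t\rangle_{S} \\
  &= V_{I}\,|\alpha, t\rangle_{I}
\end{aligned}
\]
```

In this version the only time derivatives taken are on the phase factor and on the Schrodinger picture ket. The two H_0 terms cancel because H_0 commutes with its own exponential, and as far as I can tell V itself is never differentiated along the way.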

Chemical Orbitals from Eigenstates

A small puzzle I recently set for myself was finding out how the hydrogenic orbital eigenstates give rise to the S- P- D- and F- orbitals in chemistry (and where s, p, d and f came from).

The reason this puzzle is important to me is that many of my interests sort of straddle how to go from the angstrom scale to the nanometer scale. There is a cross-over where physics becomes chemistry, but chemists and physicists often look at things very differently. I was not directly trained as a P-chemist; I was trained separately as a Biochemist and a Physicist. Remarkably, the Venn diagrams describing the education for these pursuits only overlap slightly. When Biochemists and Molecular Biologists talk, the basic structures below that are frequently just assumed (the scale here is >1nm), while Physicists frequently tend to focus their efforts toward going more and more basic (the scale here is <1 Angstrom). This leads to a clear non-overlap in the scale where chemistry and P-chem are relevant (~1 angstrom). Quite ironically, the whole periodic table of the elements lies there. I have been through P-chem and I’ve gotten hit with it as a Chemist, but this is something of an inconvenient scale gap for me. So, a cat’s paw of mine has been understanding, and I mean really understanding, where quantum mechanics transitions to chemistry.

One place is understanding how to get from the eigenstates I know how to solve to the orbitals structuring the periodic table.


This assemblage is pure quantum mechanics. You learn a huge amount about this in your quantum class. But, there are some fine details which can be left on the counter.

One of those details for me was the discrepancy between the hydrogenic wave functions and the orbitals on the periodic table. If you aren’t paying attention, you may not even know that the s-, p-, d- orbitals are not all directly the hydrogenic eigenstates (or perhaps you were paying a bit closer attention in class than I was and didn’t miss when this detail was brought up). The discrepancy is a very subtle one because often times when you start looking for images of the orbitals, the sources tend to freely mix superpositions of eigenstates with direct eigenstates without telling why the mixtures were chosen…

For example, here are the S, P and D orbitals for the periodic table:


This image is from http://www.chemcomp.com. Focusing on the P row, how is it that these functions relate to the pure eigenstates? Recall the images that I posted previously of the P eigenstates:

[Images: P-orbital probability density for P210 (left) and for P21-1 (right)]

In the image for the S, P and D orbitals, of the Px, Py and Pz orbitals, all three look like some variant of P210, which is the pure state on the left, rather than P21-1, which is the state on the right. In chemistry, you get the orbitals directly without really being told where they came from, while in physics, you get the eigenstates and are told somewhat abstractly that the s-, p-, d- orbitals are all superpositions of these eigenstates. I recall seeing a professor during an undergraduate quantum class briefly derive Px and Py, but I really didn’t understand why he selected the combinations he did! Rationally, it makes sense that Pz is identical to P210 and that Px and Py are superpositions that have the same probability distribution as Pz, but are rotated into the X-Y plane ninety degrees from one another. How do Px and Py arise from superpositions of P21-1 and P211? P21-1 and P211 have identical probability distributions despite having opposite angular momentum!

Admittedly, the intuitive rotations that produce Px and Py from Pz make sense at a qualitative level, but if you try to extend that qualitative understanding to the D-row, you’re going to fail. Four of the D orbitals look like rotations of one another, but one doesn’t. Why? And why are there four that look identical? I mean, there are only three spatial dimensions to fill, presumably. How do these five fit together three dimensionally?

Except for the Dz^2, none of the D-orbitals are pure eigenstates: they’re all superpositions. But what logic produces them? What is the common construction algorithm which unites the logic of the D-orbitals with that of the P-orbitals (which are all intuitive rotations)?

I’ll actually hold back on the math in this case because it turns out that there is a simple revelation which can give you the jump.

As it turns out, all of chemistry is dependent on angular momentum. When I say all, I really do mean it. The stability of chemical structures is dependent on cases where angular momentum has tended in some way to cancel out. Chemical reactivity in organic chemistry arises from valence choices that form bonds between atoms in order to “complete an octet,” which is short-hand for saying that species combine with each other in such a way that enough electrons are present to fill in or empty out the eight available valence states, one S-orbital and three P-orbitals each holding two electrons (roughly push the number of electrons orbiting one type of atom across the periodic table in its appropriate row to match the noble gases column). For example, in forming the salt crystal sodium chloride, sodium possesses only one electron in its valence shell while chlorine contains seven: if sodium gives up one electron, it goes to a state with no need to complete the octet (with the equivalent electronic completion of neon), while chlorine gaining an electron pushes it into a state that is electronically equal to argon, with eight electrons. From a physicist’s standpoint, this is called “angular momentum closure,” where the filled orbitals are sufficient to completely cancel out all angular momentum in that valence level. As another example, one highly reactive chemical structure you might have heard about is a “radical” or maybe a “free radical,” which is simply chemist shorthand for the situation a physicist would recognize as containing an electron with uncancelled spin and orbital angular momentum. Radical driven chemical reactions are about passing around this angular momentum! Overall, reactions tend to be driven to occur by the need to cancel out angular momentum. Atomic stoichiometry of a molecular species always revolves around angular momentum closure –you may not see it in basic chemistry, but this determines how many of each atom can be connected, in most cases.

From the physics, what can be known about an orbital is essentially the total angular momentum present and what amount of that angular momentum is in a particular direction, namely along the Z-axis. Angular momentum lost in the X-Y plane is, by definition, not in either the X or Y direction, but in some superposition of both. Without preparing a packet of angular momentum, the distribution ends up having to be uniform, meaning that it is in no particular direction except not in the Z-direction. For the P-orbitals, the eigenstates are purely either all angular momentum in the Z-direction, or none in that direction. For the D-orbitals, the states (of which there are five) can be combinations, two with angular momentum all along Z, two with half in the X-Y plane and half along Z and one with all in the X-Y plane.

What I’ve learned is that, for chemically relevant orbitals, the general rule is “minimal definite angular momentum.” What I mean by this is that you want to minimize situations where the orbital angular momentum is in a particular direction. The orbitals present on the periodic table are states which have canceled out angular momentum located along the Z-axis. This is somewhat obvious for the homology between P210 and Pz. P210 points all of its angular momentum perpendicular to the z-axis. It locates the electron on average somewhere along the Z-axis in a pair of lobes shaped like a peanut, but the orbital direction is undefined. You can’t tell how the electron goes around.

As it turns out, Px and Py can both be obtained by making simple superpositions of P21-1 and P211 that cancel out z-axis angular momentum… literally adding together these two states so that their angular momentum along the z-axis goes away. Px is the symmetric superposition while Py is the antisymmetric version. For the two states obtained by this method, if you look for the expectation value of the z-axis angular momentum, you’ll find it missing! It cancels to zero.

It’s as simple as that.

The D-orbitals all follow. D320 already has no angular momentum on the z-axis, so it is directly Dz^2. You therefore find four additional combinations by simply adding states that cancel the z-axis angular momentum: D321 and D32-1 symmetric and antisymmetric combinations and then the symmetric and antisymmetric combinations of D322 and D32-2.
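The cancellation is easy to check with a few lines of linear algebra. This is a minimal sketch, assuming the z-axis angular momentum is represented in units of ħ as a diagonal matrix in the basis m = -l, …, +l; the function names are made up for the illustration, and only the angular part of the state is modeled.

```python
import numpy as np


def lz_matrix(l):
    """z-axis angular momentum in units of hbar, basis ordered m = -l ... +l."""
    return np.diag(np.arange(-l, l + 1).astype(float))


def m_state(l, m):
    """Unit vector for the pure eigenstate with z-projection m."""
    v = np.zeros(2 * l + 1, dtype=complex)
    v[m + l] = 1.0
    return v


def cancelled_pair(l, m):
    """Symmetric and antisymmetric equal-weight mixes of the +m and -m states."""
    plus, minus = m_state(l, m), m_state(l, -m)
    sym = (plus + minus) / np.sqrt(2)
    anti = (plus - minus) / np.sqrt(2)
    return sym, anti


def lz_expectation(state, l):
    """Expectation value of the z-axis angular momentum for a normalized state."""
    return np.vdot(state, lz_matrix(l) @ state).real


for l in (1, 2):  # P-orbitals, then D-orbitals
    for m in range(1, l + 1):
        sym, anti = cancelled_pair(l, m)
        print(l, m, lz_expectation(sym, l), lz_expectation(anti, l))
```

Every printed expectation value should come out as zero, for the P pair and for both D pairs, while a pure state such as `m_state(2, 2)` gives 2. The mixed states are no longer eigenstates of the z-axis angular momentum; it is only the average that has been cancelled.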

Notice, all I’m doing to make any of these states is looking at the last index (the m-index) of the eigenstates and making a linear combination where the m-index of the first state plus the m-index of the second gives zero. 1-1=0, 2-2=0. That’s it. Admittedly, the symmetric combination sums these with a (+) sign and a 1/sqrt(2) weighting constant so that Px = (1/sqrt(2))(P211 + P21-1) is normalized and the antisymmetric combination sums with a (-) sign as in Py = (1/sqrt(2))(P211 – P21-1), but nothing more complicated than that! The D-orbitals can be generated in exactly the same manner. I found one easy reference online that loosely corroborated this observation, but said it instead as that the periodic table orbitals are all written such that the wave functions have no complex parts… which is also kind of true, but somewhat misleading because you sometimes have to multiply by a complex phase to put it genuinely in the form of sines for the polar coordinate (and as the polar coordinate is integrated over 360 degrees, expectation values on this coordinate, as z-axis momentum would contain, cancel themselves out; sines and cosines integrated over a full period, or multiples of a full period, integrate to zero.)
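For the angular parts, the “no complex parts” statement can be made concrete. The sketch below assumes the common Condon-Shortley sign convention for the spherical harmonics; with other phase conventions the plus and minus signs, and which combination gets called symmetric, shuffle around, but the resulting real functions are the same:

```latex
\[
  Y_{1}^{\pm 1}(\theta,\phi)
  = \mp \sqrt{\frac{3}{8\pi}}\; \sin\theta\; e^{\pm i\phi}
\]
\[
  \frac{1}{\sqrt{2}}\left( Y_{1}^{-1} - Y_{1}^{+1} \right)
  = \sqrt{\frac{3}{4\pi}}\; \sin\theta \cos\phi \;\propto\; \frac{x}{r} ,
  \qquad
  \frac{i}{\sqrt{2}}\left( Y_{1}^{-1} + Y_{1}^{+1} \right)
  = \sqrt{\frac{3}{4\pi}}\; \sin\theta \sin\phi \;\propto\; \frac{y}{r}
\]
```

Both results are real, both are equal-weight mixes of m = +1 and m = -1, and the factor of i in the second one is the complex phase that sometimes has to be multiplied in.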

Before I wrap up, I had a quick intent to touch on where S-, P-, D- and F- came from. “Why did they pick those damn letters?” I wondered one day. Why not A-, B-, C- and D-? The nomenclature emerged from how spectral lines appeared visually and groups were named: (S)harp, (P)rincipal, (D)iffuse and (F)undamental. (A second interesting bit of “why the hell???” nomenclature is the X-ray lines… you may hate this notation as much as me: K, L, M, N, O… “stupid machine uses the K-line… what does that mean?” These letters simply match the n quantum number –the energy level– as n=1,2,3,4,5… Carbon K-edge, for instance, is the amount of energy between the n=1 orbital level and the ionized continuum for a carbon atom.) The sharpness tends to reflect the complexity of the structure in these groups.

As a quick summary about structuring of the periodic table, S-, P-, D-, and F- group the vertical columns (while the horizontal rows are the associated relative energy, but not necessarily the n-number). The element is determined by the number of protons present in the nucleus, which creates the chemical character of the atom by requiring an equal number of electrons present to cancel out the total positive charge of the nucleus. Electrons, as fermions, are forced to occupy distinct orbital states, meaning that each electron has a distinct orbit from every other (fudging for the antisymmetry of the wave function containing them all). As electrons are added to cancel protons, they fall into the available orbitals depicted in the order on the periodic table going from left to right, which can be a little confusing because they don’t necessarily purely close one level of n before starting to fill S-orbitals of the next level of n; for example at n=3, l can equal 0, 1 and 2… but, the S-orbitals for n=4 will fill before D-orbitals for n=3 (which are found in row 4). This has purely to do with the S-orbitals having lower energy than P-orbitals which have lower energy than D-orbitals, but that the energy of an S-orbital for a higher n may have lower energy than the D-orbital for n-1, meaning that the levels fill by order of energy and not necessarily by order of angular momentum closure, even though angular momentum closure influences the chemistry. S-, P-, D-, and F- all have double degeneracy to contain up and down spin of each orbital, so that S- contains 2 instead of 1, P- contains 6 instead of 3, and D- contains 10 instead of 5. If you start to count, you’ll see that this produces the numerics of the periodic table.
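The counting can be done in a few lines. This is a minimal sketch that assumes the common empirical “n + l” ordering rule (lower n + l fills first, ties go to lower n) as a stand-in for “order of energy”; that rule is not named above and has known exceptions among real elements, so treat it as an approximation.

```python
LETTERS = "spdfghik"  # conventional labels for l = 0, 1, 2, ...


def filling_order(max_n):
    """Subshells (n, l) sorted by n + l, with ties broken by lower n."""
    shells = [(n, l) for n in range(1, max_n + 1) for l in range(n)]
    return sorted(shells, key=lambda nl: (nl[0] + nl[1], nl[0]))


def capacity(l):
    """Two spin states for each of the (2l + 1) orbitals in a subshell."""
    return 2 * (2 * l + 1)


def label(n, l):
    """Chemist-style name such as '3d'."""
    return f"{n}{LETTERS[l]}"


def row_lengths(max_n, rows):
    """Electrons added per periodic table row; each S-subshell opens a new row."""
    lengths = []
    for n, l in filling_order(max_n):
        if l == 0:
            lengths.append(0)
        lengths[-1] += capacity(l)
    return lengths[:rows]


print([label(n, l) for n, l in filling_order(4)[:8]])
print(row_lengths(8, 7))  # [2, 8, 8, 18, 18, 32, 32]
```

The first line printed shows 4s landing ahead of 3d, and the second reproduces the row lengths of the periodic table from nothing but the 2, 6, 10, 14 capacities and the filling order.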

The periodic table is a fascinating construct: it contains a huge amount of quantum mechanical information which really doesn’t look much like quantum mechanics. And, everybody has seen the thing! An interesting test to see the depth of a conversation about the periodic table is to ask those conversing if they understand why the word “periodic” is used in the name “Periodic table of the elements.” The choice of that word is pure quantum mechanics.

Parity symmetry in Quantum Mechanics

I haven’t written about my problem play for a while. Since last I wrote about rotational problems, I’ve gone through the entire Sakurai chapter 4, which is an introduction to symmetry. At the moment, I’m reading Chapter 5 while still thinking about some of the last few problems in Chapter 4.

I admit that I had a great deal of trouble getting motivated to attack the Chapter 4 problems. When I saw the first aspects of symmetry in class, I just did not particularly understand it. Coming back to it on my own was not much better. Abstract symmetry is not easy to understand.

In Sakurai chapter 4, the text delves into a few different symmetries that are important to quantum mechanics and pretty much all of them are difficult to see at first. As it turns out, some of these symmetries are very powerful tools. For example, use of the reflection symmetry operation in a chiral molecule (like the C-alpha carbon of proteins or the hydrated carbons of sugars) can reveal neighboring degenerate ground states which can be accessed by racemization, where an atomic substituent of the molecule tunnels through the plane of the molecule and reverses the chirality of the state at some infrequent rate. Another example is the translation symmetry operation, where a lattice of identical attractive potentials serves to hide a near infinite number of identical states where a bound particle can hop from one minimum to the next and traverse the lattice… this behavior is essentially a specific model describing the passage of electrons through a crystalline semiconductor.

One of the harder symmetries was time reversal symmetry. I shouldn’t say “one of the harder;” for me time reversal was the hardest to understand and I would be hesitant to say that I completely understand it yet. The time reversal operator causes time to translate backward, making momenta and angular momenta reverse. Time reversal is really hard because the operator is anti-unitary, meaning that the operation takes the complex conjugate of any complex number it is moved past (switching the sign on the imaginary part). Nevertheless, time reversal has some interesting outcomes. For instance, if a spinless particle is bound to a fixed center where the state in question is not degenerate (only one state at the given energy), time reversal says that the state can have no average angular momentum (it can’t be rotating or orbiting). On the other hand, if the particle has half-integer spin, like an electron, the bound state must be degenerate because the particle can’t have no angular momentum!

A quick digression here for the laymen: in quantum mechanics, the word “degenerate” is used to refer to situations where multiple states lie on top of one another and are indistinguishable. Degeneracy is very important in quantum mechanics because certain situations contain only enough information to know an incomplete picture of the model where more information is needed to distinguish alternative answers… coexisting alternatives subsist in superposition, meaning that a wave function is in a superposition of its degenerate alternative outcomes if there is no way to distinguish among them. This is part of how entanglement arises: you can generate entanglement by creating a situation where discrete parts of the system simultaneously occupy degenerate states encompassing the whole system. The discrete parts become entangled.

Symmetry is important because it provides a powerful tool by which to break apart degeneracy. A set of degenerate states can often be distinguished from one another by exploiting the symmetries present in the system. L- and R- enantiomers in a molecule are related by a reflection symmetry at a stereo center, meaning that there are two states of indistinguishable energy that are reflections of one another. People don’t often notice it, but chemists are masters of quantum mechanics even though they typically don’t know as much of the math: how you build molecules is totally governed by quantum mechanics and chemists must understand the qualitative results of the physical models. I’ve seen chemists speak competently of symmetry transformations in places where the physicists sometimes have problems.

Another place where symmetry is important is in the search for new physics. The way to discover new physical phenomena is to look for observational results that break the expected symmetries of a given mathematical model. The LHC was built to explore symmetries. Currently known models are said to hold CPT symmetry, referring to Charge, Parity and Time Reversal symmetry… I admit that I don’t understand all the implications of this, but simply put, if you make an observation that violates CPT, you have discovered physics not accounted for by current models.

I held back talking about Parity in all this because I wanted to speak of it in greater detail. Of the symmetries covered in Sakurai chapter 4, I feel that I made the greatest jump in understanding on Parity.

Parity is symmetry under space inversion.


Just saying that sounds diabolical. Space inversion. It sounds like that situation in Harry Potter where somebody screws up trying to disapparate and manages to get splinched… like they space invert themselves and can’t undo it.

The parity operation carries all the cartesian variables in a function to their negative values.

$$\Phi\, f(x, y, z) = f(-x, -y, -z)$$

Here Phi just stands in for the parity operator. By performing the parity operation, all the variables in the function which denote spatial position are turned inside out and sent to their negative value. Things get splinched.

You might note here that applying parity twice gets you back to where you started, unsplinching the splinched. This shows that the parity operator has the special property that it is its own inverse operation. You might understand how special this is by noting that we can’t all literally be our own brother, but the parity operator basically is.

$$\Phi^2 = 1$$

Applying parity twice is like multiplying by 1… which is how you know parity is its own inverse. This also makes parity a unitary operator since it doesn’t affect the absolute value of the function. Parity operation times inverse parity is one, so unitary.

$$\Phi\,\Phi^{-1} = 1 \quad \text{or} \quad \Phi\,\Phi^{\dagger} = 1$$

Here, the daggered superscript means “Hermitian conjugate” (the complex conjugate transpose), which is an automatic requirement for the inverse operation if you’re a unitary operator. Hello linear algebra. Be assured I’m not about to break out the matrices, so have no fear. We will stay in a representation free zone. In this regard, the parity operation is very much like a rotation: the inverse operation is the Hermitian conjugate of the operation, never mind the detail that the inverse operation is the operation.

Parity symmetry is “symmetry under the parity operation.” There are many states that are not symmetric under parity, but we would be interested in searching particularly for parity operation eigenstates, which are states that parity operator will transform to give back that state times some constant eigenvalue. As it turns out, the parity operator can only ever have two eigenvalues, which are +1 and -1. A parity eigenstate is a state that only changes its sign (or not) when acted on by the parity operator. The parity eigenvalue equations are therefore:

$$\Phi\,|\psi_{+}\rangle = +|\psi_{+}\rangle, \qquad \Phi\,|\psi_{-}\rangle = -|\psi_{-}\rangle$$

All this says is that under space inversion, the parity eigenstates will either not be affected by the transformation, or will be negative of their original value. If the sign doesn’t change, the state is symmetric under space inversion (called even). But, if the sign does change, the state is antisymmetric under space inversion (called odd). As an example, in a space of one dimension (defined by ‘x’), the function sine is antisymmetric (odd) while the function cosine is symmetric (even).


In this image, taken from a graphing app on my smartphone, the white curve is plain old sine while the blue curve is the parity transformed sine. As mentioned, cosine does not change under parity.

As you may be aware, sines and cosines are energy eigenstates for the particle-in-the-box problem and so would constitute one example of legit parity eigenstates with physical significance.
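To make the even/odd business concrete, here is a minimal sketch in Python. The helper names `parity` and `parity_eigenvalue` are my own inventions for illustration, and the check is only done on a handful of sample points, so treat it as a sanity check rather than a proof.

```python
import math

def parity(f):
    """Return the parity-transformed function, x -> f(-x)."""
    return lambda x: f(-x)

def parity_eigenvalue(f, samples, tol=1e-12):
    """Return +1 if f looks even, -1 if it looks odd, None if neither (on the samples)."""
    transformed = parity(f)
    if all(abs(transformed(x) - f(x)) <= tol for x in samples):
        return 1
    if all(abs(transformed(x) + f(x)) <= tol for x in samples):
        return -1
    return None

# Hypothetical sample points, chosen only for illustration.
samples = [0.1 * k for k in range(1, 31)]

sine_value = parity_eigenvalue(math.sin, samples)      # sine flips sign
cosine_value = parity_eigenvalue(math.cos, samples)    # cosine is unchanged
mixed_value = parity_eigenvalue(lambda x: math.sin(x) + math.cos(x), samples)

# Applying parity twice should hand back the original function.
twice = parity(parity(math.sin))
unsplinched = all(abs(twice(x) - math.sin(x)) <= 1e-12 for x in samples)

print(sine_value, cosine_value, mixed_value, unsplinched)
```

The sum of sine and cosine is included to show a function that is not a parity eigenstate at all: it comes back as neither even nor odd.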

Operators can also be transformed by parity. In order to see the significance, you just note that the definition of parity is that the position operator is reversed. So, a parity transformation of the position operator is this:

$$\Phi^{\dagger}\, x\, \Phi = -x$$

Kind of what should be expected. Position under parity turns negative.

As expressed, all of this is really academic. What’s the point?

Parity can give some insights that have deep significance. The deepest result that I understood is that matrix elements and expectation values are conserved under a parity transformation. Matrix elements are a generalization of the expectation value where the bra and ket do not necessarily belong to the same eigenfunction. The proof of the statement here is one line:

$$\langle m | V | n \rangle = \langle m |\,\Phi\Phi^{\dagger}\, V\, \Phi\Phi^{\dagger}\,| n \rangle = \langle \tilde{m} | \tilde{V} | \tilde{n} \rangle$$

At the end, the squiggles all denote parity transformed values, ‘m’ and ‘n’ are blanket eigenstates with arbitrary parity eigenvalues and V is some miscellaneous operator. First, the complex conjugation that turns a ket into a bra does not affect the parity eigenvalue equation, since parity is its own inverse operation and since the eigenvalues of 1 and -1 are not complex, so the bra above has just the same eigenvalue as if it were a ket. So, the matrix element does not change with the parity transformation –the combined parity transformation of all these parts are as if you just multiplied by identity a couple times, which should do nothing but return the original value.

What makes this important is that it sets a requirement on how many -1 eigenvalues can appear within the parity transformed matrix element (which is equal to the original matrix element): the number must be even (either zero or two). For the element to exist (that is, for it to have a non-zero value), if the initial and final states connected by the potential are both parity odd or both parity even, the potential connecting them must be symmetric. Conversely, if the potential is parity odd, either the initial or final state must be odd, while the other is even. To sum up, a parity odd operator has non-zero matrix elements only when connecting states of differing parity while a parity even operator must connect states of the same parity. This restriction is observed simply by noting that the sign can’t change between a matrix element and the parity transformed matrix element.

Now, since an expectation value (average position, for example) is always a matrix element connecting an eigenket to itself, expectation values taken in a parity eigenstate can only be non-zero for operators of even parity. For example, in a system defined across all space, average position ends up being zero because the position operator is odd, while both eigenbra and eigenket are of the same function, and therefore have the same parity. For average position to be non-zero, the wavefunction would need to be a superposition of eigenkets of opposite parity (and therefore not an eigenstate of parity at all!).
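A quick numerical sketch of this selection rule, using the particle-in-the-box states mentioned earlier. I am assuming a box centered on the origin with walls at -A and +A (my choice, so that the energy eigenfunctions are also parity eigenfunctions), and the function names are made up for the example.

```python
import math

A = 1.0  # half-width of the box; walls at x = -A and x = +A (assumed for illustration)

def psi(n):
    """Normalized energy eigenfunction n = 1, 2, 3, ... of the centered box."""
    k = n * math.pi / (2 * A)
    if n % 2 == 1:
        return lambda x: math.cos(k * x) / math.sqrt(A)   # even parity
    return lambda x: math.sin(k * x) / math.sqrt(A)       # odd parity

def matrix_element_x(bra, ket, steps=20000):
    """Midpoint-rule estimate of <bra| x |ket> for real wavefunctions."""
    dx = 2 * A / steps
    total = 0.0
    for i in range(steps):
        x = -A + (i + 0.5) * dx
        total += bra(x) * x * ket(x) * dx
    return total

ground, first = psi(1), psi(2)                              # even state, odd state
mix = lambda x: (ground(x) + first(x)) / math.sqrt(2)       # not a parity eigenstate

x_ground = matrix_element_x(ground, ground)   # same parity on both sides
x_first = matrix_element_x(first, first)      # same parity on both sides
x_cross = matrix_element_x(ground, first)     # opposite parity
x_mix = matrix_element_x(mix, mix)            # average position of the superposition

print(x_ground, x_first, x_cross, x_mix)
```

The two same-parity elements come out as zero to within rounding, while the element connecting the even and odd states does not, and the superposition picks up a non-zero average position from exactly that cross term.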

A tangible, far reaching result of this symmetry, related particularly to the position operator, is that no pure parity eigenstate can have an electric dipole moment. The dipole moment operator is built around the position operator, so a situation where position expectation value goes to zero will require dipole moment to be zero also. Any observed electric dipole moment must be from a mixture of states of opposite parity.

If you stop and think about that, that’s really pretty amazing. It tells you whether an observable is zero or not depending on which eigenkets are present and whether the operator for that observable can be inverted or not.

Hopefully I got that all correct. If anybody more sophisticated than me sees holes in my statement, please speak up!

Welcome to symmetry.

(For the few people who may have noticed, I still have it in mind to write more about the magnets puzzle, but I really haven’t had time recently. Magnets are difficult.)

What is a qubit?

I was trolling around in the comments of a news article presented on Yahoo the other day. What I saw there has sort of stuck with me and I’ve decided I should write about it. The article in question, which may have been by an outfit other than Yahoo itself, was about the recent decision by IBM to direct a division of people toward the task of learning how to program a quantum computer.

Using the word ‘quantum’ in the title of a news article is a sure fire way to incite click-bait. People flock in awe to quantum-ness even if they don’t understand what the hell they’re reading. This article was a prime example. All the article really talked about was that IBM has decided that quantum computers are now a promising enough technology that they’re going to start devoting themselves to the task of figuring out how to compute with them. Note, the article spent a lot of time kind of masturbating over how marvelous quantum computers will be, but it really actually didn’t say anything new. Another tech company deciding to pretend to be in quantum computing by figuring out how to program an imaginary computer is not an advance in our technology… digital quantum computers are generally agreed to be at least a few years off yet and they’ve been a few years off for a while now. There’s no guarantee that the technology will suddenly emerge into the mainstream –and I’m neglecting the D-Wave quantum computer because it is generally agreed among experts that D-Wave hasn’t even managed to prove that their qubits remain coherent through a calculation to actually be a useful quantum computer, let alone that they achieved anything at all by scaling it up.

The title of this article was a prime example of media quantum click-bait. The title boldly declared that “IBM is planning to build a quantum computer millions of times faster than a normal computer.” Now, that title was based on an extrapolation in the midst of the article where a quantum computer containing a mere 1000 qubits suddenly becomes the fastest computing machine imaginable. We’re very used to computers that contain gigabytes of RAM now, which is actually several billion on-off switches on the chip, so a mere 1,000 qubits seems like a really tiny number. This should be tempered by the general concerns of the physics community that an array of 100 entangled qubits may exceed what’s physically possible… and it neglects that the difficulty of dealing with entangled systems increases exponentially with the number of qubits to be entangled. Scaling up normal bits doesn’t bump into the same difficulty. I don’t know if it’s physically possible or not, but I am aware that IBM’s declaration isn’t a major break-through so much as splashing around a bit of tech gism to keep the stockholders happy. All the article really said was that IBM has happily decided to hop on the quantum train because that seems to be the thing to do right now.

I really should understand that trolling around in the comments on such articles is a lost cause. There are so many misconceptions about quantum mechanics running around in popular culture that there’s almost no hope of finding the truth in such threads.

All this background gets us to what I was hoping to talk about. One big misconception that seemed to be somewhat common among commenters on this article is that two identical things in two places actually constitute only one thing magically in two places. This may stem from a conflation of what a wave function is versus what a qubit is and it may also be a big misunderstanding of the information that can be encoded in a qubit.

In a normal computer we all know that pretty much every calculation is built around representing numbers using binary. As everybody knows, a digital computer switch has two positions: we say that one position is 0 and the other is 1. An array of two digital on-off switches then can produce four distinct states: in binary, to represent the on-off settings of these states, we have 00, 01, 10 and 11. You could easily map those four settings to mean 1, 2, 3 and 4.

Suppose we switch now to talk about a quantum computer where the array is not bits anymore, but qubits. A very common qubit to talk about is the spin of an atom or an electron. This atom can be in two spin states: spin-up and spin-down. We could easily map the state spin-up to be 1, and call it ‘on,’ while spin-down is 0, or ‘off.’ For two qubits, we then get the states 00, 01, 10 and 11 that we had before, where we know about what states the bits are in, but we also can turn around and invoke entanglement. Entanglement is a situation where we create a wave function that contains multiple distinct particles at the same time such that the states those particles are in are interdependent on one another based upon what we can’t know about the system as a whole. Note, these two particles are separate objects, but they are both present in the wave function as separate objects. For two spin-up/spin-down type particles, this can give access to the so-called singlet and triplet states in addition to the normal binary states that the usual digital register can explore.

The quantum mechanics works like this. For the system of spin-up and spin-down, the usual way to look at this is in increments of spinning angular momentum: spin-up is a 1/2 unit of angular momentum pointed up while spin-down is -1/2 unit of angular momentum, but pointed the opposite direction because of the negative sign. For the entangled system of two such particles, you can get three different values of entangled angular momentum: 1, 0 and -1. Spin 1 has both spins pointing up, but not ‘observed,’ meaning that it is completely degenerate with the 11 state of the digital register since it can’t fall into anything but 11 when the wave function collapses. Spin -1 is the same way: both spins are down, meaning that they have 100% probability of dropping into 00. The spin 0 state, on the other hand, is kind of screwy, and this is where the extra information encoding space of quantum computing emerges. The 0 states could be the symmetric combination of spin-up with spin-down or the anti-symmetric combination of the same thing. Now, these are distinct states, meaning that the size of your register just expanded from (00, 01, 10 and 11) to (00, 01, 10, 11 plus anti-symmetric 10-01 and symmetric 10+01). So, the two qubit register can encode 6 possible values instead of just 4. I’m still trying to decide if the spin 1 and -1 states could be considered different from 11 and 00, but I don’t think they can since they lack the indeterminacy present in the different spin 0 states. I’m also somewhat uncertain whether you have two extra states to give a capacity in the register of 6 or just 5 since I’m not certain what the field has to say about the practicality of determining the phase constant between the two mixed spin-up/spin-down eigenstates, since this is the only way to determine the difference between the symmetric and anti-symmetric combinations of spin.
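Here is a small sketch of the two spin 0 combinations, written the way they are usually bookkept: as a list of four amplitudes, one each for 00, 01, 10 and 11. That representation, the basis ordering and the helper names are my assumptions for the example, not something taken from the article or the comment thread.

```python
import math

# Assumed basis ordering for the two-qubit register.
BASIS = ["00", "01", "10", "11"]

def inner(a, b):
    """Inner product <a|b> of two amplitude lists."""
    return sum(complex(x).conjugate() * complex(y) for x, y in zip(a, b))

def probabilities(state):
    """Chance of reading each 0/1 pattern if the register is simply measured."""
    return {label: abs(amp) ** 2 for label, amp in zip(BASIS, state)}

s = 1 / math.sqrt(2)
symmetric = [0.0, s, s, 0.0]        # 10 + 01
antisymmetric = [0.0, -s, s, 0.0]   # 10 - 01

overlap = inner(symmetric, antisymmetric)
p_sym = probabilities(symmetric)
p_anti = probabilities(antisymmetric)

print(abs(overlap))
print(p_sym)
print(p_anti)
```

In this bookkeeping the two combinations have zero overlap, so they are distinct states, yet a plain 0/1 readout gives the same 50/50 split between 01 and 10 for both. The relative sign (the phase constant) is the only thing separating them, which is the practical worry raised above.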

As I was writing here, I realized also that I made a mistake myself in the interpretation of the qubit as I was writing my comment last night. At the very unentangled minimum, an array of two qubits contains the same number of states as an array of two normal bits. If I consider only the states possible by entangled qubits, without considering the phasing constant between 10+01 and 10-01, this gives only three states, or at most four states with the phase constant. I wrote my comment without including the four purely unentangled cases, giving fewer total states accessible to the device, or at most the same number.

Now, the thing that makes this incredibly special is that the number of extra states available to a register of qubits grows exponentially with the number of qubits present in the register. This means that a register of 10 qubits can encode many more numbers than a register of ten bits! Further, this means that fewer bits can be used to make much bigger calculations, which ultimately translates to a much faster computer if the speed of turning over the register is comparable to that of a more conventional computer –which is actually somewhat doubtful since a quantum computer would need to repeat calculations potentially many times in order to build up quantum statistics.

One of the big things that is limiting the size of quantum computers at this point is maintaining coherence. Maintaining coherence is very difficult and proving that the computer maintains all the entanglements that you create 100% of the time is exceptionally non-trivial. This comes back to the old cat-in-the-box difficulty of truly isolating the quantum system from the rest of the universe. And, it becomes more non-trivial the more qubits you include. I saw a seminar recently where the presenting professor was expressing optimism about creating a register of 100 Josephson junction type qubits, but was forced to admit that he didn’t know for sure whether it would work because of the difficulties that emerge in trying to maintain coherence across a register of that size.

I personally think it likely that we’ll have real digital quantum computers in the relatively near future, but I think the jury is still out as to exactly how powerful they’ll be when compared to conventional computers. There are simply too many variables yet which could influence the power and speed of a quantum computer in meaningful ways.

Coming back to my outrage at reading comments in that thread, I’m still at ‘dear god.’ Quantum computers do not work by teleportation: they do not have any way of magically putting a single object in multiple places. The structure of a wave function is defined simply by what you consider to be a collection of objects that are simultaneously isolated from the rest of the universe at a given time. A wave function quite easily spans many objects all at once since it is merely a statistical description of the disposition of that system as seen from the outside, and nothing more. It is not exactly a ‘thing’ in and of itself insomuch as collections of indescribably simple objects tend to behave in absolutely consistent ways among themselves. Where it becomes wave-like and weird is that we have definable limits to how precisely we can understand what’s going on at this basic level and that our inability to directly ‘interact’ with that level more or less assures that we can’t ever know everything about that level or how it behaves. Quantum mechanics follows from there. It really is all about what’s knowable; building a situation where certain things are selectively knowable is what it means to build a quantum computer.

That’s admittedly pretty weird if you stop and think about it, but not crazy or magical in that wide-eyed new agey smack-babbling way.

Beyond F=ma

Every college student taking that requisite physics class sees Newton’s second law. I saw it once even in a textbook for a martial art: Force equals mass times acceleration… the faster you go, the harder you hit! At least, that’s what they were saying, never mind that the usage wasn’t accurate. F=ma is one of those crazy simple equations that is so bite-sized that all of popular culture is able to comprehend it. Kind of.

Newton’s second law is, of course, one of three fundamental laws. You may even already know all of Newton’s laws without realizing that you do. The first law is “An object in motion remains in motion while an object at rest remains at rest,” which is really actually just a specialization of Newton’s second law where F = 0. Newton’s third law is the ever famous “For every action there is an equal and opposite reaction.” The three laws together are pretty much everything you need to get started on physics.

Much is made of Newton’s Laws in engineering. Mostly, you can comprehend how almost everything in the world around you operates based on a first approximation with Newton’s Laws. They are very important.

Now, as a Physicist, freshman physics is basically the last time you see Newton’s Laws. However important they are, physicists prefer to go other directions.

What? Physicists don’t use Newton’s Laws?!! Sacrilege!

You heard me right. Most of modern physics opens out beyond Newton. So, what do we use?

Believe it or not, in the time before computer games, TVs and social media, people needed to keep themselves entertained. While Newton invented his physics in the 1600s, there were a couple hundred years yet between his developments and the era of modern physics… two hundred years even before electrodynamics and thermodynamics became a thing. In that time, physicists were definitely keeping themselves entertained. They did this by reinventing the wheel repeatedly!

As a field, classical mechanics is filled with the arcane formalisms that gird the structure of modern physics. If you want to understand Quantum Mechanics, for instance, it did not emerge from a vacuum; it was birthed from all this development between Newtonian Mechanics and the Golden years of the 20th century. You can’t get away from it, in fact. People lauding Quantum Mechanics as somehow breaking Classical physics generally don’t know jack. Without the Classical physics, there would be no Quantum Mechanics.

For one particular thread, consider this. The Heisenberg Uncertainty Principle depends on operator commutation relations, or commutators. Commutators, then, emerged from an arcanum called Poisson brackets. Poisson brackets emerged from a structure called Hamiltonian formalism. And, Hamiltonian formalism is a modification of Lagrangian formalism. Lagrangian formalism, finally, is a calculus of variations readjustment from D’Alembert’s principle which is a freaky little break from Newtonian physics. If you’ve done any real quantum, you’ll know that you can’t escape from the Hamiltonians without tripping over Lagrangians.

This brings us to what I was hoping to talk about. Getting past Newton’s Laws into this unbounded realm of the great Beyond is a non-trivial intellectual break. When I called it a freaky little break, I’m not kidding. Everything beyond that point hangs together logically, but the stepping stone at the doorway is a particularly high one.

Perhaps the easiest way to see the depth of the jump is to see the philosophy of how mechanics is described on either side.

With Newton’s laws, the name of the game is to identify interactions between objects. An ‘interaction’ is another name for a force. If you lean back against a wall, there is an interaction between you and the wall, where you and the wall exert forces on one another. Each interaction corresponds to a pair of forces: the wall pushing against you and you pushing against the wall. Newton’s second law then states that if the sum of all forces acting on one object is not equal to zero, the object will undergo an acceleration in some direction, and the instantaneous forces then work together to describe the path the object will travel. The logical strategy is to find the forces and then calculate the accelerations.
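The find-the-forces-then-the-accelerations strategy can be sketched in a few lines of Python. This is only an illustration under my own assumptions: a thrown object feeling nothing but gravity, made-up starting values, and a simple fixed time step.

```python
G = 9.81  # gravitational acceleration in m/s^2

def net_force(mass):
    """Step 1: add up the forces on the object (only gravity here, by assumption)."""
    return (0.0, -mass * G)

def simulate(mass, position, velocity, dt, steps):
    """Step 2: turn the net force into an acceleration and walk the path forward."""
    x, y = position
    vx, vy = velocity
    path = [(x, y)]
    for _ in range(steps):
        fx, fy = net_force(mass)
        ax, ay = fx / mass, fy / mass            # Newton's second law, a = F / m
        vx, vy = vx + ax * dt, vy + ay * dt      # acceleration updates velocity
        x, y = x + vx * dt, y + vy * dt          # velocity updates position
        path.append((x, y))
    return path

# Hypothetical throw: 2 kg object, launched at (3, 4) m/s, followed for one second.
path = simulate(mass=2.0, position=(0.0, 0.0), velocity=(3.0, 4.0), dt=0.001, steps=1000)
final_x, final_y = path[-1]
print(final_x, final_y)
```

With this time step the end point lands within about a centimeter of the textbook parabola, and shrinking `dt` closes the gap further.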

On the far side of the jump is the lowest level of non-Newtonian mechanics, Lagrangian mechanics. You no longer work with forces at all and everything is expressed instead using energies. The problem proceeds by generating an energy laden mathematical entity called a ‘Lagrangian’ and then pushing that quantity through a differential equation called Lagrange’s equation. After solving Lagrange’s equation, you gain expressions for position as a function of time. This tells you ultimately the same information that you gain by working Newton’s laws, which is that some object travels along a path through space.

Reading these two paragraphs side-by-side should give you a sense of the great difference between these two methods. Newtonian mechanics is typically very intuitive since it divides up the problem into objects and interactions while Lagrangian mechanics has an opaque, almost clinical quality that defies explanation. What is a Lagrangian? What is the point of Lagrange’s equation? This is not helped by the fact that Lagrangian formalism usually falls into generalized coordinates, which can hide some facets of coordinate position in favor of expedience. To the beginner, it feels like turning a crank on a gumball machine and hoping answers pop out.

There is a degree of menace to it while you’re learning it the first time. The teaching of where Lagrange’s equation comes from is from an opaque branch of mathematics called the “Calculus of variations.” How very officious! Calculus of variations is a special calculus where the objective of the mathematics is to optimize paths. This math is designed to answer the question “What is the shortest path between two points?” Intuitively, you could say the shortest path is a line, but how do you know for sure? Well, you compare all the possible paths to each other and pick out the shortest among them. Calculus of variations does this by noting that for small variations from the optimal path, the lengths of neighboring paths barely differ from one another. So, in the collection of all paths, those that are most alike tend to cluster around the one that is most optimal.

This is a very weird idea. Why should the density of similar paths matter? You can have an infinite number of possible paths! What is variation from the optimal path? It may seem like a rhetorical question, but this is the differential that you end up working with.

A recasting of the variational problem can express one place where this kind of calculus was extremely successful.


Roller coasters!

Under action of gravity where you have no sliding friction, what is the fastest path traveling from point A to point B where point B does not lie directly beneath point A? This is the Brachistochrone problem. Calculus of variations is built to handle this! The strategy is to optimize a path of undetermined length which gives the shortest time of travel between two points. As it turns out, by happy mathematical contrivance, the appropriate path satisfies Lagrange’s equation… which is why Lagrange’s equation is important. The optimal path here is called the curve of quickest descent.
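As a rough check on the idea, here is a sketch comparing the slide time down a straight ramp with the slide time down a cycloid between the same two points. That the curve of quickest descent is a cycloid is the standard textbook result, which I am bringing in from outside this post, and the end point and function names are picked purely for convenience.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def straight_line_time(x_end, drop):
    """Time to slide from rest down a frictionless straight ramp."""
    length = math.hypot(x_end, drop)
    acceleration = G * drop / length          # component of gravity along the ramp
    return math.sqrt(2 * length / acceleration)

def cycloid_time(radius, theta_end):
    """Time to slide from rest along x = R(t - sin t), y = R(1 - cos t), y measured downward."""
    return theta_end * math.sqrt(radius / G)

# Hypothetical end point: the bottom of a cycloid arch generated by a 1 m circle.
R = 1.0
theta_end = math.pi
x_end = R * (theta_end - math.sin(theta_end))
drop = R * (1 - math.cos(theta_end))

t_line = straight_line_time(x_end, drop)
t_cycloid = cycloid_time(R, theta_end)
print(t_line, t_cycloid)
```

The cycloid is the longer track, but it wins the race by roughly a fifth of a second here because it drops steeply at the start and builds speed early.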

Now, the jump to Lagrangian mechanics is but a hop! It turns out that if you throw a mathematical golden cow called a “Lagrangian” into Lagrange’s equation, the optimal path that pops out is the physical trajectory that a given system described by the Lagrangian tends to follow in reality –and when I say trajectory in the sense of Lagrange’s equation, the ‘trajectory’ is delineated by position or merely the coordinate state of the system as a function of time. If you can express the system of a satellite over the Earth in terms of a Lagrangian, Lagrange’s equation produces the orbits.

This is the very top of a deep physical idea called the “Principle of Least Action.”

In physics, adding up the Lagrangian at every point along some path in time gives a quantity called, most appropriately, “the Action.” The system could conceivably take any possible path among an infinite number of different paths, but physical systems follow paths that minimize the Action. If you find the path that gives the smallest Action, you find the path the system takes.
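A minimal numerical sketch of this: take a ball thrown straight up and caught at the same height a second later, add up the Lagrangian along the real path, and then do the same for paths that have been wiggled away from it. The setup, the sine-shaped wiggle and the names are all my own choices for illustration.

```python
import math

M, G, T = 1.0, 9.81, 1.0   # mass, gravity, flight time (illustrative values)
N = 1000                   # number of time slices

def path(epsilon):
    """Heights of the true up-and-down flight plus a wiggle of size epsilon."""
    times = [T * i / N for i in range(N + 1)]
    return [0.5 * G * t * (T - t) + epsilon * math.sin(math.pi * t / T) for t in times]

def action(heights):
    """Add up (kinetic - potential) * dt over every time slice."""
    dt = T / N
    total = 0.0
    for a, b in zip(heights, heights[1:]):
        kinetic = 0.5 * M * ((b - a) / dt) ** 2
        potential = M * G * 0.5 * (a + b)
        total += (kinetic - potential) * dt
    return total

actions = {eps: action(path(eps)) for eps in (-0.5, -0.1, 0.0, 0.1, 0.5)}
best = min(actions, key=actions.get)

for eps, value in actions.items():
    print(eps, value)
```

Every wiggled path keeps the same start and end points, yet each one comes out with a larger Action than the unwiggled flight, and the bigger the wiggle the bigger the penalty.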

As an aside to see where this reasoning ultimately leads, Quantum Mechanics finds that while objects tend to follow paths that minimize the Action, they actually try to take every conceivable path… but that the paths which don’t tend to minimize the Action rapidly cancel each other out because their phases vary so wildly from one another. In a way, the minimum Action path does not cancel out from a family of nearby paths since their phases are all similar. From this, a quantum mechanical particle can seem to follow two paths of equal Action at the same time. In a very real way, the weirdness of quantum mechanics emerges directly because of path integral formalism.

All of this, all of the ability to know this, starts with the jump to Lagrangian formalism.

In that, it always bothered me: why the Lagrangian? The path optimization itself makes sense, but why specifically does the Lagrangian matter? Take this one quantity out of nowhere and throw it into a differential equation that you’ve rationalized as ‘minimizing action’ and suddenly you have a system of mechanics that is equal to Newtonian mechanics, but somehow completely different from it! Why does the Lagrangian work? Through my schooling, I’ve seen the derivation of Lagrange’s equation from path integral optimization more than once, but the spark of ‘why optimize using the Lagrangian’ always eluded me. Early on, I didn’t even comprehend enough about the physics to appreciate that the choice of the Lagrangian is usually not well motivated.

So, what exactly is the Lagrangian?

The Lagrangian is defined as the difference between kinetic and potential energy. Kinetic energy is the description that an object is moving while potential energy is the expression that by having a particular location in space, the object has the capacity to gain a certain motion (say by falling from the top of a building). The formalism can be modified to work where energy is not conserved, but typically physicists are interested in cases where it is. Energies emerge in Newtonian mechanics as an adaptation which allows descriptions of motion to be detached from progression through time, where the first version of energy the freshman physicist usually encounters is “Work.” Work is the Force over a displacement times the spatial length of that displacement. It’s just a product of length times force. And, there is no duration over which the displacement is known to take place, meaning no velocity or acceleration. Potential energy and kinetic energy come next, where kinetic energy is simply a way to connect physical velocity of the object to the work that has been done on it and potential energy is a way to connect a physical situation, typically in terms of a conservative field, to how much work that field can enact on a given object.

When I say ‘conservative,’ the best example is usually the gravitational field that you see under everyday circumstances. When you lift your foot to take a step, you do a certain amount of work against gravity to pick it up… when you set your foot back down, gravity does an equal amount of work on your foot pulling it down. Energy was invested into potential energy picking your foot up, which was then released again as you put your foot back down. And, since gravity worked on your foot pulling it down, your foot will have a kinetic energy equal to the potential energy from how high you raised it before it strikes the ground again and stops moving (provided you aren’t using your muscles to slow its descent). It becomes really mind-bending to consider that gravity did work on your foot while you lifted it up, also, but that your muscles did work to counteract gravity’s work so that your foot could raise. As a quantity, you can chase energy around in this way. In a system like a spring or a pendulum, there are minimal dispersive interactions, meaning that after you start the system moving, it can trade energy back and forth from potential to kinetic forms pretty much without limit so that the sum of all energies never changes, which is what we call ‘conservative.’

Energy, as it turns out, is one of the chief tokens of all physics. In fields like thermodynamics, which are considered classical but not necessarily Lagrangian, you only rarely see force directly… usually force is hidden behind pressure. The idea that the quantity of energy can function as a gearbox for attaching interactions to one another conceals Newton’s laws, making it possible to talk about interactions without knowing exactly what they are. ‘Heat of combustion’ is a black-box of energy that tells you a way to connect the burning of a fuel to how much work can be derived from the pressure produced by that fuel’s combustion. On one side, you don’t need to know what combustion is, you can tell that it will deliver a stroke of so much energy when the piston compresses, while on the other side, you don’t need to know about the engine, just that you have a process that will suck away some of the heat of your fire to do… something.

Because of the importance of energy, two quantities that are of obvious potential utility are 1.) the difference between kinetic and potential energy and 2.) the sum of kinetic and potential energy. The first quantity is the Lagrangian, while the second is the so-called Hamiltonian.
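Written out, with T for kinetic energy and V for potential energy, the two quantities look like this. The mass-on-a-spring example is my own illustrative choice, with m the mass, k the spring constant and x the displacement; the simple sum form of the Hamiltonian assumes the usual case of a potential that depends only on position.

```latex
L = T - V, \qquad H = T + V

% Illustrative example: a mass m on a spring of stiffness k
T = \tfrac{1}{2} m \dot{x}^{2}, \qquad V = \tfrac{1}{2} k x^{2}

L = \tfrac{1}{2} m \dot{x}^{2} - \tfrac{1}{2} k x^{2}, \qquad
H = \tfrac{1}{2} m \dot{x}^{2} + \tfrac{1}{2} k x^{2}
```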

There is some clear motivation here why you would want to explore using the quantity of the Lagrangian in some way. Quantities that can conserve, like energy and momentum, are convenient ways of characterizing motion because they can tell you about what to expect from the disposition of your system without huge effort. But for all of these manipulations, the clear connection between F=ma and Lagrange’s equation is still a subtle leap.

The final necessary connection to get from F=ma to the Lagrangian is D’Alembert’s Principle. The principle states simply this: for a system in equilibrium (rather, while the system isn’t static, it’s not taking in more or less energy than it’s losing), perturbative forces ultimately do no net work. So, all interactions internal to a system in equilibrium can’t shift it away from equilibrium. This statement turns out to be another variational principle.

There is a way to drop F = ma into D’Alembert’s principle and directly produce that the quantity which should be optimized in Lagrange’s equation is the Lagrangian! May not seem like much, but it turns out to be a convoluted mathematical thread… and so, Lagrangian formalism directly follows as a consequence of a special case of Newtonian formalism.
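For the curious, the thread runs roughly as follows. This is a compressed sketch of the standard textbook route, not a full derivation; the notation is my assumption: generalized coordinates q_j, virtual displacements δr_i, applied forces F_i, and generalized forces Q_j. It also assumes the forces come from a potential V that does not depend on velocity.

```latex
% D'Alembert's principle with F = ma folded in (constraint forces drop out):
\sum_i \left( \mathbf{F}_i - m_i \ddot{\mathbf{r}}_i \right) \cdot \delta \mathbf{r}_i = 0

% Rewrite the displacements in generalized coordinates:
\delta \mathbf{r}_i = \sum_j \frac{\partial \mathbf{r}_i}{\partial q_j} \, \delta q_j ,
\qquad
Q_j = \sum_i \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}

% After some algebra the principle becomes a statement about kinetic energy T:
\sum_j \left[ \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_j}
            - \frac{\partial T}{\partial q_j} - Q_j \right] \delta q_j = 0

% For independent \delta q_j each bracket vanishes. With Q_j = -\partial V / \partial q_j
% and V independent of \dot{q}_j, defining L = T - V gives Lagrange's equation:
\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0
```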

As a parting shot, what does all this path integral, variational stuff mean? The Principle of Least Action has really profound implications on the functioning of reality as a whole. In a way, classical physics observes that reality tends to follow the lazy path: a line is the shortest path between two points and reality operates in such a way that at macroscopic scales the world wants to travel in the equivalent of ‘straight lines.’ The world appears to be lazy. At the fundamental quantum mechanical scale, it thinks hard about the peculiar paths and even seems to try them out, but those efforts counteract one another such that only the lazy paths win.

Reality is fundamentally slovenly, and when it tries not to be, it’s self-defeating. Maybe not the best message to end on, but it gives a good reason to spend Sunday afternoon lying in a hammock.

Nonlocality and Simplicity

I just read an article called “How quantum mechanics could be even weirder” in the Atlantic.

The article is actually relatively good in explaining some of how quantum mechanics actually works in terms that are appropriate to laymen.

Neglecting almost everything about ‘super-quantum,’ there is one particular element in this article which I feel somewhat compelled to respond to. It relates to the following passages:

But in 1935, Einstein and two younger colleagues unwittingly stumbled upon what looks like the strangest quantum property of all, by showing that, according to quantum mechanics, two particles can be placed in a state in which making an observation on one of them immediately affects the state of the other—even if they’re allowed to travel light years apart before measuring one of them. Two such particles are said to be entangled, and this apparent instantaneous “action at a distance” is an example of quantum nonlocality.

Erwin Schrödinger, who invented the quantum wave function, discerned at once that what later became known as nonlocality is the central feature of quantum mechanics, the thing that makes it so different from classical physics. Yet it didn’t seem to make sense, which is why it vexed Einstein, who had shown conclusively in the theory of special relativity that no signal can travel faster than light. How, then, were entangled particles apparently able to do it?

This is outlining the appearance of entanglement. The way that it’s detailed here, the implication is that there’s a signal being broadcast between the entangled particles and that it breaks the limits of speed imposed by relativity. This is a real argument that is still going on, and not being an expert, I can’t claim that I’m at the level of the discussion. On the other hand, I feel fairly strongly that it can’t be considered a ‘communication.’ I’ll try to rationalize my stance below.

One thing that is very true is that if you think a bit about the scope of the topic and the simultaneous requirements of the physics in order to assure the validity of quantum mechanics, the entanglement phenomenon becomes less metaphysical overall.

Correcting several common misapprehensions of the physics shrinks the loopiness from gaga bat-shit Deepak Chopra down to real quantum size.

The first tripping stone is highlighted by Schrödinger’s Cat, as I’ve mentioned previously. In Schrödinger’s Cat, the way the thought experiment is most frequently constructed, the idea of quantum superposition is imposed on states of “Life” and “Death.” A quantum mechanical event creates a superposition of Life and Death that is not resolved until the box is opened and one state is discovered to dominate. This is flawed because Life and Death are not eigenstates! I’ve said it elsewhere and I’ll repeat it as many times as necessary. There are plenty of brain-dead people whose bodies are still alive. The surface of your skin is all dead, but the basement layer is alive. Your red blood cells live about four months, and then die… but you do not! Death and Life in the biological sense are very complicated states of being that require a huge number of parameters to define. This is in contrast with an eigenstate which literally is defined by requiring only one number to describe it, the eigenvalue. If you know the eigenvalue of a nondegenerate eigenstate, you know literally everything there is to know about the eigenstate –end of story! I won’t talk about degeneracy because that muddies the water without actually violating the point.

Quantum mechanical things are objects stripped down to such a degree of nakedness that they are simple in a very profound way. For a single quantum mechanical degree of freedom, if you have an eigenvalue to define it, there is nothing else to know about that state. One number tells you everything! For a half-spin magnetic moment, it can exist in exactly two possible eigenstates, either parallel or antiparallel. Those two states can be used together to describe everything that spin can ever do. By the nature of the object, you can’t find it in any other disposition, except parallel or antiparallel… it won’t wander off into some undefined other state because its entire reality is to be pointing in some direction with respect to an external magnetic field… meaning that it can only ever be found as some combination of the two basic eigenstates. There is not another state of being for it. There is no possible “comatose and brain-dead but still breathing” other state.
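As a small illustration of how little there is to say about a single spin, here is a sketch using NumPy. The mixing angle `theta` is an arbitrary value I picked for the example; everything else is the standard two-state description in the basis of parallel (‘up’) and antiparallel (‘down’).

```python
import numpy as np

# The spin-z operator for a spin one-half, in units where the eigenvalues are +1 and -1.
sigma_z = np.array([[1.0, 0.0],
                    [0.0, -1.0]])

# Only two eigenvalues exist; each one labels exactly one eigenstate.
eigenvalues, eigenvectors = np.linalg.eigh(sigma_z)
print("eigenvalues:", eigenvalues)

up = np.array([1.0, 0.0])     # parallel
down = np.array([0.0, 1.0])   # antiparallel

# Any state of the spin is some combination of those two and nothing else.
theta = 0.7  # arbitrary illustrative mixing angle (an assumption)
psi = np.cos(theta / 2) * up + np.sin(theta / 2) * down

p_up = abs(np.vdot(up, psi)) ** 2
p_down = abs(np.vdot(down, psi)) ** 2
print("P(up) =", p_up, " P(down) =", p_down, " sum =", p_up + p_down)
```

The two probabilities always add up to one: there is no third bucket for the spin to fall into.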

This is what it means to be simple. We humans do not live where we can ever witness things that are that simple.

The second great tripping stone people never quite seem to understand about quantum mechanics is exactly what it means to have the system ‘enclosed by a box’ prior to observation. In Schrödinger’s Cat, your intuition is led to think that we’re talking about a paper box closed by packing tape and that the obstruction of our line of vision by the box lid is enough to constitute “closed.” This is not the case… quantum mechanical entities are some combination of so infinitesimal and so low in energy that an ‘observation’ literally usually means nothing more than bouncing a single corpuscle of light off of it. An upshot of this is that as far as the object is concerned, the ‘observer’ is not really different from the rest of the universe. ‘Closed’ in the sense of a quantum mechanical ‘box’ is the state where information is not being exchanged between the rest of the universe and our quantum mechanical system.

Now, that’s closed!

If a simple system which is so simple that it can’t occupy a huge menu of states is allowed to evolve where it is not in contact with the rest of the universe, can you expect to see anything in that system different from what’s already there? One single number is all that’s needed to define what the system is doing behind that closed door!

The third great tripping stone is decoherence. Decoherence is when the universe slips between the observer and the quantum system and talks to it behind our backs. Decoherence is why quantum computers are difficult to build out of entangled quantum states. So the universe fires a photon into or pulls a photon out of our quantum mechanical system, and suddenly the system doesn’t give the entangled answers we thought that it should anymore. Naturally: information moved around. That is what the universe does.

With these several realizations, while it may still not be very intuitive, the magic of entanglement is tempered by the limits of the observation. You will not find a way to argue that ‘people’ are entangled, for instance, because they lack this degree of utter simplicity and identicalness.

One example of an entangled state is a spin singlet state with angular momentum equal to zero. This is simply two spin one-half systems added together in such a way that their spins cancel each other out. Preparing the state gives you two spins that are not merely in superposition but are entangled together by the spin zero singlet. You could take these objects and separate them from one another and then examine them apart. If the universe has not caused the entanglement to decohere, these spins are so simple and identical that they can both only occupy expected eigenstates. They evolve in exactly the same manner since they are identical, but the overarching requirement –if decoherence has not taken place and scrambled things up– is that they must continue to be a net spin-zero state. Whatever else they do, they can’t migrate away from the prepared state behind closed doors simply because entropy here is meaningless. If information is not exchanged externally, any communication by photons between the members of the singlet can only ever still produce the spin singlet.

If you then take one of those spins and determine its eigenstate, you find that it is either the parallel or antiparallel state. Entanglement then requires the partner, separated from it no matter how far, to be in the opposite state. They can’t evolve away from that.
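Here is a minimal numerical sketch of that singlet, assuming NumPy and units where ħ = 1. It checks two things: that the total spin of the pair really is zero, and that finding spin A ‘up’ leaves only ‘down’ available for spin B. The variable names are mine.

```python
import numpy as np

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

# Singlet: (|up,down> - |down,up>) / sqrt(2); first slot is spin A, second is spin B.
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

# Pauli matrices; each spin operator is half the matching Pauli matrix (hbar = 1).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
identity = np.eye(2, dtype=complex)

def total_spin_squared():
    """Build S^2 = Sx^2 + Sy^2 + Sz^2 for the two spins taken together."""
    total = np.zeros((4, 4), dtype=complex)
    for s in (sx, sy, sz):
        component = 0.5 * (np.kron(s, identity) + np.kron(identity, s))
        total += component @ component
    return total

# Expectation value of total spin squared: zero for the singlet.
s_squared = np.vdot(singlet, total_spin_squared() @ singlet).real

# Joint probabilities of the four possible (A, B) outcomes along z.
labels = ["up,up", "up,down", "down,up", "down,down"]
joint = {label: float(abs(amp) ** 2) for label, amp in zip(labels, singlet)}

# Given that A was found up, how likely is B to be down?
p_b_down_given_a_up = joint["up,down"] / (joint["up,up"] + joint["up,down"])

print("total spin squared:", s_squared)
print("joint probabilities:", joint)
print("P(B down | A up):", p_b_down_given_a_up)
```

The ‘same direction’ outcomes carry zero probability, so whichever way one spin turns out, the partner is pinned to the opposite answer.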

What makes this so brain bending is that the Schrödinger equation can tell you exactly how the entangled state evolves as long as the box remains unopened (that is, as long as the universe has not traded information with the quantum mechanical degree of freedom). There is some point in time when you have a high probability of finding one spin ‘up’ while the other is ‘down,’ and the probability switches back and forth over time as the wave function evolves. When you make the observation to find that one spin is up, the probability distribution for the partner ceases to change and it always ends up being down. After you bounce a photon off of it, that’s it, it’s done… the probability distribution for the ‘down’ particle only ever ends up ‘down.’

This is what they mean by ‘non-locality’: that you can separate the entangled states by a great distance and still see this effect where one entangled spin ‘knows’ that the other has decided to be in a particular state. ‘Knowledge’ of the collapse of the state moves between the spins faster than light can travel, apparently.

From this arise heady ideas that maybe this can be the basis of a faster-than-light communication system: like you can tap out Morse code by flipping entangled spins like a light switch.

Still, what information are we asking for?

The fundamental problem is that when you make the entangled state, you can’t set a phase which can tell you which partner starts out ‘up’ and which starts out ‘down.’ They are in a superposition of both states and the jig is up if you stop to see which is which. One is up and one is down in order to be the singlet state, but you can’t set which. You make a couplet that you can’t look at, by definition! The wave function evolves without there being any way of knowing. When you stop and look at them, you get one up and one down, but no way of being able to say “that one was supposed to be ‘up’ and the other ‘down.’”

You can argue that they started out exactly as they ended up on only a single trial. As I understand it, the only way to know about entanglement is literally by running the experiment enough times to know about the statistical distributions of the outcome, that ‘up’ and ‘down’ are correlated. If you’re separated by light years, one guy finds that his partner particle is ‘up’… he can’t know that the other guy looked at his particle three days ago to find ‘down’ and was expecting the answer in the other party’s hands to be ‘up.’ So much for flipping a spin like a switch and sending a message! When was it that the identities of ‘up’ and ‘down’ were even picked?

But these things are very simple, uncomplicated things! If neither party does anything to disrupt the closed box you started out with, you can argue that the choice of which particle ends with which spin was decided before they were ever separated from one another and that they have no need after the separation to be anything but very identical and so simple that you can’t find them in anything but two possible states. No ‘communication’ was necessary and the outcome observed was preordained to be observed. You didn’t look and can’t look, so you can’t know if they always would have given the same answer that they ultimately give. If the universe bumps into them before you can look, you scream ‘decoherence’ and any information preserved from the initial entanglement becomes unknowable. Without many trials, how do you ever even know with one glance if the particles decohered before you could look, or if a particle was still in coherence? That’s the issue with simple things that are in a probability distribution. Once you build up statistics, you see evidence that spins are correlated to a degree that requires an answer like quantum entanglement, but it’s hard to look at them beforehand and know what state they’re in –nay: by definition, it’s impossible. The entangled state gives you no way of knowing which is up or down, and that’s the point!

As such, since you are unable to pick a starting phase that biases things so that one guy has ‘up’ and the other ‘down,’ there is no way to transmit information by looking –or not– at set times.
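The standard way to make this point numerically is to look at what the second party can actually see. The sketch below (NumPy again, with function names and measurement angles that are my own choices) lets the first party measure along any axis in the x-z plane without reporting the result, then traces that party out. What is left in the second party’s hands is the same featureless 50/50 mixture every time, whatever the first party did.

```python
import numpy as np

# Singlet in the (up,up), (up,down), (down,up), (down,down) basis; first slot is party A.
singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
identity = np.eye(2)

def projectors(theta):
    """Projectors for spin up/down along an axis tilted by theta from z (in the x-z plane)."""
    plus = np.array([np.cos(theta / 2), np.sin(theta / 2)])
    minus = np.array([-np.sin(theta / 2), np.cos(theta / 2)])
    return [np.outer(plus, plus), np.outer(minus, minus)]

def b_state_after_a_measures(theta):
    """State held by party B after party A measures along theta and keeps the result private."""
    rho = np.outer(singlet, singlet)
    post = np.zeros((4, 4))
    for projector in projectors(theta):
        k = np.kron(projector, identity)   # A measures, B's spin is untouched
        post += k @ rho @ k                # sum over both unknown outcomes
    # Partial trace over A: indices are (a, b, a', b'), summed over a = a'.
    return np.einsum("abad->bd", post.reshape(2, 2, 2, 2))

# What B holds if A never measures at all.
b_untouched = np.einsum("abad->bd", np.outer(singlet, singlet).reshape(2, 2, 2, 2))

for angle in (0.0, 0.4, np.pi / 2, 2.0):   # arbitrary illustrative angles
    print(angle, np.round(b_state_after_a_measures(angle), 12).tolist())
print("no measurement:", np.round(b_untouched, 12).tolist())
```

Every line prints the same matrix, half the identity, so nothing the first party chooses to do shows up in the second party’s local statistics. The correlation only becomes visible when the two sets of results are brought together and compared.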

Since I’m not an experimentalist that works with entangled states, there is some chance that I’ve misunderstood something. In the middle of writing this post, I trolled around looking for information about how entanglement is examined in the lab. As far as I could tell, the information about entanglement is based upon statistics for the correlation of entangled states with each other. The statistics ultimately tell the story.

I won’t say that it isn’t magical. But, I feel that once you know the reality, the wide-eyed extravagance of articles like the one that spawned this post seems ignorant. It’s hard not to crawl through the comments section screaming at people “No, no, no! Dear God, no!”

So then, to take the bull by the horns, I made an earlier statement that I should follow up on explicitly. Why doesn’t entanglement violate relativity? The conventional answer is that the information about knowing of the wave function collapse is useless! The guy who looked first can’t tell the guy holding the other particle that he can look now. Even if the particles know that the wavefunction has collapsed, the parties holding those particles can’t be sure whether or not the state collapsed or decohered. Since the collapse can’t carry information from one party to the other, it doesn’t break relativity. That’s the standard physicist party line.

My own personal feeling is that it’s actually a bit stiffer than that. Once the collapse occurs, the particles in hand seem as if they’ve _always_ made the choice you finally learn them to contain. They don’t talk: it’s just the concrete substrate of reality determined before they’re separated. The on-line world talks about this in two ways: either information can be written backward in time (yeah, they do actually say that) or reality is so deterministic as to eliminate all free will: as if the experiment you chose to carry out is foreordained at the time when the spin singlet is created, meaning that the particles know what answer they’ll give before you know that you’ve been predestined to ask.

This is not necessarily a favored interpretation. People don’t like the idea that free will doesn’t exist. I personally am not sure why it matters: life and death aren’t eigenstates, so why must free will exist? Was it necessary that your mind choose to be associated with your anus or tied to a substrate in the form of your brain? How many fundamental things about your existence do you inherit by birth which you don’t control? Would it really matter in your life if someone told you that you weren’t actually choosing any of it when there’s no way at all to tell the difference from if you were? Does this mean that Physics says that it can’t predict for you what direction your life will go, but that your path was inevitable before you were born?

At some level one must simply shrug. What I’m suggesting is not a nihilistic stance or that people should just give up because they have no say… I’m suggesting that, beyond the scope of your own life and existence, you are not in a position to make any claims about your own importance in the grand scheme of the universe. The whirr and tick of reality is not in human hands.

If you wish to know more about entanglement, the EPR paradox and this stuff about non-locality and realism, I would recommend learning something about Bell’s inequality.
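As a taste of what Bell’s inequality says, here is a hedged sketch of its CHSH form, which I have not spelled out in this post. Any account in which each particle carries a preset answer for every measurement direction keeps a particular combination of correlations between -2 and 2. The code evaluates that combination for the spin singlet using a commonly quoted set of measurement angles; the angles and the function names are assumptions chosen for illustration.

```python
import numpy as np

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])

# Singlet in the (up,up), (up,down), (down,up), (down,down) basis.
singlet = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)

def spin_along(angle):
    """Spin observable (outcomes +1 or -1) along an axis tilted by angle from z in the x-z plane."""
    return np.cos(angle) * sz + np.sin(angle) * sx

def correlation(angle_a, angle_b):
    """Average of the product of the two parties' outcomes for the singlet."""
    joint = np.kron(spin_along(angle_a), spin_along(angle_b))
    return float(singlet @ joint @ singlet)

# A commonly quoted choice of two settings per party (illustrative assumption).
a, a_prime = 0.0, np.pi / 2
b, b_prime = np.pi / 4, 3 * np.pi / 4

chsh = (correlation(a, b) - correlation(a, b_prime)
        + correlation(a_prime, b) + correlation(a_prime, b_prime))

print("CHSH combination:", chsh)
print("preset-answer bound: 2, quantum value 2*sqrt(2):", 2 * np.sqrt(2))
```

The magnitude comes out at 2√2, about 2.83, which is beyond the bound of 2. That gap is what the statistics gathered over many trials are checked against, and it is where the argument about what was or wasn’t decided in advance gets genuinely subtle.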