Magnets, how do they work? (part 1)

Subtitle: Basic derivation of Ampere’s Law from the Biot-Savart equation.

Know your meme.

It’s been a while since this became a thing, but I think it’s actually a really good question. If you stop to think about it, magnets are one of those things where the structure goes deep down and the pieces which drive the phenomenon become quite confusing and mind bending. Truly, the original meme exploded from an unlikely source who wanted to revel in things that seem magical without appreciating how mind-bending and thought-expanding the explanation to this seemingly earnest question actually is.

As I got on with this writing, I realized that the scope of the topic is bigger than can be tackled in a single post. What is presented here will only be the first part. The succeeding posts may end up being as mathematical as this, but perhaps less so. Moreover, as I got to writing, I realized that I haven’t posted a good bit of math here in a while: what good is the mathematical poetry of physics if nobody sees it?

Magnets do not get less magical when you understand how they work: they get more compelling.


This image, taken from a website that sells quackery, highlights the intriguing properties of magnets. A solid object with apparently no moving parts has this manner of influencing the world around it. How can that not be magical? Lodestones have been magic forever and they do not get less magical with the explanation.

Truthfully, I’ve been thinking about the question of how they work for a couple days now. When I started out, I realized that I couldn’t just answer this out of hand, even though I would like to think that I’ve got a working understanding of magnetic fields. How the details fit together gets deep in a hurry. What makes a bar magnet like the one in the picture above special? You don’t put batteries in it. You don’t flick a switch. It just works.

For most every person, that pattern above is the depth of how it works. How does it work? Well, it has a magnetic field. When a piece of a certain kind of metal is in a magnetic field, it feels a force due to the magnet and this causes the magnet to pull on it or maybe to stick to it. If you have two magnets together, you can orient them in a certain way and they push each other apart.


In this picture from penguin labs, these magnets are exerting sufficient force on one another that many of them apparently defy gravity. Here, the rod simply keeps the magnets confined so that they can’t change orientations with respect to one another and they exert sufficient repulsive force to climb up the rod as if they have no weight.

It’s definitely cool, no denying.

But, is it better knowing how they work, or just blindly appreciating them because it’s too hard to fill in the blank?

Maybe we can answer that.

The central feature of how magnets work is quite effortlessly explained by the physics of Electromagnetism. Or, maybe it’s better to say that the details are laboriously and completely explained. People rebel against how hard it is to understand the details, but no true explanation is required to be easily explicable.

The forces which hold those little pieces of metal apart are relatively understandable.

Lorentz force

Here’s the Lorentz force law. It says that the force (F) on an object with a charge is equal to the sum of the electric force on the object (qE) plus the magnetic force (qv×B). Magnets interact solely by the magnetic force, the second term.
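In case the equation image doesn’t render here, this is the standard form being described:

```latex
\vec{F} = q\vec{E} + q\,\vec{v}\times\vec{B}
```

The cross product in the second term is what makes the magnetic part of the force point at right angles to both the velocity and the field.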


In this picture from Wikipedia, if a charge (q) moving with speed (v) passes into a region containing this thing we call a “magnetic field,” it will tend to curve in its trajectory depending on whether the charge is negative or positive. We can ‘see’ this magnetic field thing in the image above with the bar magnet and iron filings. What is it, how is it produced?

The fundamental observation of magnetic fields is tied up into a phenomenological equation called the Biot-Savart law.


This equation is immediately intimidating. I’ve written it in all of its horrifying Jacksonian glory. You can read this equation like a sentence. It says that all the magnetic field (B) you can find at a location in space (r) is proportional to a sum of all the electric currents (J) at all possible locations where you can find any current (r’) and inversely proportional to the square of the distance between where you’re looking for the magnetic field and where all the electrical currents are –it may say ‘inverse cube’ in the equation, but it’s actually an inverse square since there’s a full power of length in the numerator. Yikes, what a sentence!

Additionally, the equation says that the direction of the magnetic field is at right angles to both the direction that the current is traveling and the direction given by the line between where you’re looking for magnetic field and where the current is located. These directions are all wrapped up in the arrow scripts on every quantity in the equation and are determined by the cross-product as denoted by the ‘x’. The difference between the two ‘r’ vectors in the numerator creates a pure direction between the location of a particular current element and where you’re looking for magnetic field.

The ‘d’ at the end is the differential volume that confines the electric currents and simply means that you’re adding up locations in 3D space. The scaling constants outside the integral sign are geometrical and control strength; the 4 and Pi relate to the dimensionality of the field source radiated out into a full solid angle (it covers a singularity in the field due to the location of the field source) and the ‘μ’ essentially tells how space broadcasts magnetic field… where the constant ‘μ’ is closely tied to the speed of light. This equation has the structure of a propagator: it takes an electric current located at r’ and propagates it into a field at r.
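For reference, the equation being described (in its standard Jacksonian form) is:

```latex
\vec{B}(\vec{r}) = \frac{\mu_0}{4\pi}\int \frac{\vec{J}(\vec{r}\,')\times(\vec{r}-\vec{r}\,')}{\left|\vec{r}-\vec{r}\,'\right|^{3}}\,d^3r'
```

Note the full power of length in the numerator: the cube in the denominator really works out to an inverse square.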

It may also be confusing to you that I’m calling current ‘J’ when nearly every basic physics class calls it ‘I’… well, get used to it. ‘J’ is the current density, a vector that tells how much current flows through a unit of area and in which direction –a subtle variation of the plain scalar current.

I looked for some diagrams to help depict Biot-Savart’s components, but I wasn’t satisfied with what Google coughed up. Here’s a rendering of my own with all the important vectors labeled.

biotsavart diagram

Now, I showed the crazy Biot-Savart equation, but I can tell you right now that it is a pain in the ass to work with. Very few people wake up in the morning and say “Boy oh boy, Biot-Savart for me today!” For most physics students this equation comes with a note of dread. Directly using it to analytically calculate magnetic fields is not easy. That cross product and all the crazy vectors pointing in every which direction make this equation a monster. There are some basic features here which are common to many fields, particularly the inverse square, which you can find in the Newtonian gravity formula or Coulomb’s law for electrostatics, and the field being proportional to some source, in this case an electric current, where gravity has mass and electrostatics has charge.

Magnetic field becomes extraordinary because of that flipping (God damned, effing…) cross product, which means that it points in counter-intuitive directions. With electrostatics and gravity, the field usually points toward or away from the source, while with magnetism the field seems to go ‘around’ the source. Moreover, unlike electrostatics and gravity, the source isn’t exactly a something, like a charge or a mass; it’s dynamic… as in a change in state: electric charges are present in a current, but if you have those charges sitting stationary, even though they are still present, they can’t produce a magnetic field. What’s more, if you neutralize the charge, a magnetic field can still be present if those now invisible charges are moving to produce a current: current flowing in a copper wire is electric charges moving along the wire, and this produces a magnetic field around the wire, but the presence of positive charges fixed to the metal atoms of the wire neutralizes the negative charges of the moving electrons, resulting in a state of otherwise net neutral charge. So, no electrostatic field, even though you have a magnetic field.

It might surprise you to know that neutron stars have powerful magnetic fields, even though there are no electrons or protons present in order to give any actual electric currents at all. The requirement that charges move in order to produce a magnetic field is nicely consistent with the requirement that a charge move in order to feel a force from a magnetic field. Admittedly, there’s more to it than just ‘currents’ but I’ll get to that in another post.

With a little bit of algebraic shenanigans, Biot-Savart can be twisted around into a slightly more tractable form called Ampere’s Law, which is one of the four Maxwell’s equations that define electromagnetism. I had originally not intended to show this derivation, but I had a change of heart when I realized that I’d forgotten the details myself. So, I worked through them again just to see that I could. Keep in mind that this is really just a speed bump along the direction toward learning how magnets work.

For your viewing pleasure, the derivation of the Maxwell-Ampere law from the Biot-Savart equation.

In starting to set up for this, there are a couple fairly useful vector identities.

Useful identities 1

This trio contains several basic differential identities which can be very useful in this particular derivation. Here, the variables r are actually vectors in three dimensions. For those of you who don’t know these things, all it means is this:
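Written out in components, with the hats marking unit vectors:

```latex
\vec{r} = x\,\hat{x} + y\,\hat{y} + z\,\hat{z} \qquad\qquad \vec{r}\,' = x'\,\hat{x} + y'\,\hat{y} + z'\,\hat{z}
```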


These can be diagrammed like this:

vector example

This little diagram just treats the origin like the corner of a 3D box and each distance is a length along one of the three edges emanating from the corner.

I’ll try not to get too far afield with this quick vector tutorial, but it helps to understand that this is just a way to wrap up a 3D representation inside a simple symbol. The hatted symbols of x,y and z are all unit vectors that point in the relevant three dimensional directions where the un-hatted symbols just mean a variable distance along x or y or z. The prime (r’) means that the coordinate is used to tell where the electric current is located while the unprime (r) means that this is the coordinate for the magnetic field. The upside down triangle is an operator called ‘del’… you may know it from my hydrogen wave function post. What I’m doing here is quite similar to what I did over there before. For the uninitiated, here are gradient, divergence and curl:


Gradient works on a scalar function to produce a vector, divergence works on a vector to produce a scalar function and curl works on a vector to produce a vector. I will assume that the reader can take derivatives and not go any further back than this. The operations on the right of the equal sign are wrapped up inside the symbols on the left.

One final useful bit of notation here is the length operation. Length operation just finds the length of a vector and is denoted by flat braces as an absolute value. Everywhere I’ve used it, I’ve been applying it to a vector obtained by finding the distance between where two different vectors point:
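In symbols:

```latex
\left|\vec{r}-\vec{r}\,'\right| = \sqrt{(x-x')^2 + (y-y')^2 + (z-z')^2}
```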


As you can see, notation is all about compressing operations away until they are very compact. The equations I’ve used to this point all contain a great deal of math lying underneath what is written, but you can muddle through by the examples here.

Getting back to my identity trio:

Useful identities 1

The first identity here (I1) takes the vector object written on the left and produces a gradient from it… the thing in the denominator of that function is the length of the difference between those two vectors, which is simply a scalar number without a direction, as shown in the length operation written above.

The second identity (I2) here takes the divergence of the gradient and reveals that it’s the same thing as a Dirac delta (incredibly easy way to kill an integral!). I’ve not written the operation as divergence on a gradient, but instead wrapped it up in the ‘square’ on the del… you can know it’s a divergence of a gradient because the function inside the parenthesis is a scalar, meaning that the first operation has to be a gradient, which produces a vector, which automatically necessitates the second operation to be a divergence, since that only works on vectors to produce scalars.

The third identity (I3) shows that the gradient with respect to the unprimed coordinate r is equal to minus the same gradient taken with respect to the primed coordinate r’… which is a very easy way to switch between a derivative with respect to the first r and the same form of derivative with respect to the second r’.
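For readers who can’t see the image, the trio in its standard form (as found in Jackson) is:

```latex
\begin{aligned}
\text{(I1)}\qquad & \frac{\vec{r}-\vec{r}\,'}{\left|\vec{r}-\vec{r}\,'\right|^{3}} = -\nabla\left(\frac{1}{\left|\vec{r}-\vec{r}\,'\right|}\right)\\[4pt]
\text{(I2)}\qquad & \nabla^{2}\left(\frac{1}{\left|\vec{r}-\vec{r}\,'\right|}\right) = -4\pi\,\delta^{3}(\vec{r}-\vec{r}\,')\\[4pt]
\text{(I3)}\qquad & \nabla\left(\frac{1}{\left|\vec{r}-\vec{r}\,'\right|}\right) = -\nabla'\left(\frac{1}{\left|\vec{r}-\vec{r}\,'\right|}\right)
\end{aligned}
```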

To be clear, these identities are tailor-made to this problem (and similar electrodynamics problems) and you probably will never ever see them anywhere but the *cough cough* Jackson book. The first identity can be proven by working the gradient operation and taking derivatives. The second identity can be proven by using the vector divergence theorem in a spherical polar coordinate system and is the source of the 4*Pi that you see everywhere in electromagnetism. The third identity can also be proven by the same method as the first.

There are two additional helpful vector identities that I used which I produced in the process of working this derivation. I will create them here because, why not! If the math scares you, you’re on the wrong blog. To produce these identities, I used the component decomposition of the cross product and a useful Levi-Civita/Kronecker delta identity –I’m really bad at remembering vector identities, so I put a great deal of effort into learning how to construct them myself: my Levi-Civita is ghetto, but it works well enough. For those of you who don’t know the ol’ Levi-Civita symbol, it’s a pretty nice tool for constructing things in a component-wise fashion: εijk. To make this work, you just have to remember it as I just wrote it… if any two indices are equal, the symbol is zero; if they are all different, it is 1 or -1. If you take it as ijk, with the indices all different as I wrote, it equals 1 and becomes -1 if you swap two of the indices: ijk=1, jik=-1, jki=1, kji=-1 and so on and so forth. Here are the useful Levi-Civita identities as they relate to the cross product:
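In symbols, the two standard relations are:

```latex
(\vec{A}\times\vec{B})_i = \epsilon_{ijk}\,A_j B_k \qquad\qquad \epsilon_{ijk}\,\epsilon_{ilm} = \delta_{jl}\,\delta_{km} - \delta_{jm}\,\delta_{kl}
```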


Using these small tools, the first vector identity that I need is a curl of a curl. I derive it here:

vector id 1

Let’s see how this works. I’ve used colors to show the major substitutions and tried to draw arrows where they belong. If you follow the math, you’ll note that the Kronecker deltas have the intriguing property of trading out indices in these sums. Kronecker delta works on a finite sum the same way a Dirac delta works on an integral, which is nothing more than an infinite sum. Also, the index convention says that if you see duplicated indices, but without a sum on that index, you associate a sum with that index… this is how I located the divergences in that last step. This identity is a soft stopping point for the double curl: I could have used the derivative product rule to expand it further, but that isn’t needed (if you want to see it get really complex, go ahead and try it! It’s do-able.) One will note that I have double del applied on a vector here… I said that it only applies on scalars above… in this form, it would only act on the scalar portion of each vector component, meaning that you would end up with a sum of three terms multiplied by unit vectors! Double del only ever acts on scalars, but you actually don’t need to know that in the derivation below.

This first vector identity I’ve produced I’ll call I4:

useful vector id 1
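In case the image is missing, I4 is the standard double-curl identity:

```latex
\nabla\times\left(\nabla\times\vec{A}\right) = \nabla\left(\nabla\cdot\vec{A}\right) - \nabla^{2}\vec{A}
```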

Here’s a second useful identity that I’ll need to develop:

useful vector id 2

This identity I’ll call I5:

vector id 2

*Pant Pant* I’ve collected all the identities I need to make this work. If you don’t immediately know something off the top of your head, you can develop the pieces you need. I will use I1, I2, I3, I4 and I5 together to derive the Maxwell-Ampere Law from Biot-Savart. Most of the following derivation comes from Jackson Electrodynamics, with a few small embellishments of my own.

first line amp dev

In this first line of the derivation, I’ve rewritten Biot-Savart with the constants outside the integral and everything variable inside. Inside the integral, I’ve split the meat so that the different vector and scalar elements are clear. In what follows, it’s very important to remember that unprimed del operators are in a different space from the primed del operators: a value (like J) that is dependent on the primed position variable is essentially a constant with respect to the unprimed operator and will render a zero in a derivative by the unprimed del. Moreover, unprimed del can be moved into or out of the integral, which is with respect to the primed position coordinates. This observation is profoundly important to this derivation.

BS to amp 1

The usage of the first two identities here manages to extract the cross product from the midst of the function and puts it into a manipulable position where the del is unprimed while the integral is primed, letting me move it out of the integrand if I want.

BS to amp 2

This intermediate contains another very important magnetic quantity in the form of the vector potential (A) –“A” here not to be confused with the alphabetical placeholder I used while deriving my vector identities. I may come back to vector potential later, but this is simply an interesting stop-over for now. From here, we press on toward the Maxwell-Ampere law by acting in from the left with a curl onto the magnetic field…
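In symbols, this stop-over is the standard relation (reconstructed here in case the images don’t render):

```latex
\vec{B}(\vec{r}) = \nabla\times\vec{A}(\vec{r}) \qquad \text{where} \qquad \vec{A}(\vec{r}) = \frac{\mu_0}{4\pi}\int \frac{\vec{J}(\vec{r}\,')}{\left|\vec{r}-\vec{r}\,'\right|}\,d^3r'
```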

BS to amp 3

The Dirac delta I end with in the final term allows me to collapse r’ into r at the expense of that last integral. At this point, I’ve actually produced the magnetostatic Ampere’s law if I feel like claiming that the current has no divergence, but I will talk about this later…

BS to amp 4

This substitution switches del from being unprimed to primed, putting it in the same terms as the current vector J. I use integration by parts next to switch which element of the first term the primed del is acting on.

BS to amp 5

Were I being really careful about how I depicted the integration by parts, there would be a unit vector dotted into the J in order to turn it into a scalar sum in that first term ahead of the integral… this is a little sloppy on my part, but nobody ever cares about that term anyway because it’s presupposed to vanish at the limits where it’s being evaluated. This is a physicist trick similar to pulling a rug over a mess on the floor –I’ve seen it performed in many contexts.

BS to amp 6

This substitution is not one of the mathematical identities I created above, this is purely physics. In this case, I’ve used conservation of charge to connect the divergence of the current vector to the change in charge density over time. If you don’t recognize the epic nature of this particular substitution, take my word for it… I’ve essentially inverted magnetostatics into electrodynamics, assuring that a ‘current’ is actually a form of moving charge.

BS to amp 75

In this line, I’ve switched the order of the derivatives again. Nothing in the integral is dependent on time except the charge density, so almost everything can pass through the derivative with respect to time. On the other hand, only the distance is dependent on the unprimed r, meaning that the unprimed del can pass inward through everything in the opposite direction.

BS to amp 8

At this point something amazing has emerged from the math. Pardon the pun; I’m feeling punchy. The quantity I’ve highlighted blue is a form of Coulomb’s law! If that name doesn’t tickle you at the base of your spine, what you’re looking at is the electrostatic version of the Biot-Savart law, which makes electric fields from electric charges. This is one of the reasons I like this derivation and why I decided to go ahead and detail the whole thing. This shows explicitly a connection between magnetism and electrostatics where such connection was not previously clear.

BS to amp 9

And thus ends the derivation. In this casting, the curl of the magnetic field is dependent both on the electric field and on currents. If there is no time varying electric field, that first term vanishes and you get the plain old magnetostatic Ampere’s law:

Ampere's law
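In symbols (the standard differential forms, written out in case the images don’t render), the full result and its magnetostatic limit are:

```latex
\nabla\times\vec{B} = \mu_0\vec{J} + \mu_0\epsilon_0\,\frac{\partial\vec{E}}{\partial t} \qquad\longrightarrow\qquad \nabla\times\vec{B} = \mu_0\vec{J}
```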

This says simply that the curl of the magnetic field is proportional to the electric current density. There are some interesting qualities to this equation because of how the derivation leaves only a single positional dependence. As you can see, there is no separate position coordinate to describe the magnetic field independently from its source. And, really, it isn’t describing the magnetic field as ‘generated’ by the current, but rather saying that a deformation to the linearity of the magnetic field is due to the presence of a current at that location… which is an interesting way to relate the two.

This relationship tends to cause magnetic lines to orbit around the current vector.


This image from hyperphysics sums up the whole situation –I realize I’ve been saying something similar since way back up above, but this equation is the proof. If you have current passing along a wire, magnetic field will tend to wrap around the wire in a right-handed sense. For all intents and purposes, this is all Ampere’s law says, neglecting that you can manipulate the geometry of the situation to make the field do some interesting things. But, this is all.
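If you’d like to see this behavior fall out of the math numerically, here’s a quick sketch (my own, not part of the original derivation): brute-force the Biot-Savart integral for a long straight wire on the z-axis and compare against the textbook circling field B = μ0 I/(2πr).

```python
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space, T·m/A

def wire_field(field_point, current=1.0, half_length=1000.0, n=200001):
    """Brute-force Biot-Savart sum for a straight wire running along z."""
    z = np.linspace(-half_length, half_length, n)
    dz = z[1] - z[0]
    # separation vectors (r - r') from each current element to the field point
    sep = field_point - np.stack([np.zeros_like(z), np.zeros_like(z), z], axis=1)
    dist = np.linalg.norm(sep, axis=1)
    dl = np.array([0.0, 0.0, dz])  # each current element points along +z
    dB = (MU0 * current / (4 * np.pi)) * np.cross(dl, sep) / dist[:, None] ** 3
    return dB.sum(axis=0)

# field 5 cm from the wire, on the x-axis
B = wire_field(np.array([0.05, 0.0, 0.0]))
B_theory = MU0 * 1.0 / (2 * np.pi * 0.05)
```

The summed field comes out pointing purely along +y –at right angles to both the current (+z) and the separation (+x), exactly the right-handed wrap in the picture– and its magnitude lands on μ0 I/(2πr).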

Well, so what? I did a lot of math. What, if anything, have I gained from it? How does this help me along the path to understanding magnets?

Ampere’s law is useful in generating very simple magnetic field configurations that can be used in the Lorentz force law, ultimately showing a direct dynamical connection between moving currents and magnetic fields. I have it in mind to show a freshman-level example of how this is done in the next part of this series. Given the length of this post, I will do more math in a different post.

This is a big step in the direction of learning how magnets work, but it should leave you feeling a little unsatisfied. How exactly do the forces work? In physics, it is widely known that magnetic fields do no work, so why is it that bar magnets can drag each other across the counter? That sure looks like work to me! And if electric currents are necessary to drive magnets, why is it that bar magnets and horseshoe magnets don’t require batteries? Where are the electric currents that animate a bar magnet and how is it that they seem to be unlimited or unpowered? These questions remain to be addressed.

Until the next post…

What is a qubit?

I was trolling around in the comments of a news article presented on Yahoo the other day. What I saw there has sort of stuck with me and I’ve decided I should write about it. The article in question, which may have been by an outfit other than Yahoo itself, was about the recent decision by IBM to direct a division of people toward the task of learning how to program a quantum computer.

Using the word ‘quantum’ in the title of a news article is a surefire way to incite click-bait. People flock in awe to quantum-ness even if they don’t understand what the hell they’re reading. This article was a prime example. All the article really talked about was that IBM has decided that quantum computers are now a promising enough technology that they’re going to start devoting themselves to the task of figuring out how to compute with them. Note, the article spent a lot of time kind of masturbating over how marvelous quantum computers will be, but it really actually didn’t say anything new. Another tech company deciding to pretend to be in quantum computing by figuring out how to program an imaginary computer is not an advance in our technology… digital quantum computers are generally agreed to be at least a few years off yet and they’ve been a few years off for a while now. There’s no guarantee that the technology will suddenly emerge into the mainstream –and I’m neglecting the D-Wave quantum computer because it is generally agreed among experts that D-Wave hasn’t even managed to prove that their qubits remain coherent through a calculation, which they would need to do to actually be a useful quantum computer, let alone that they achieved anything at all by scaling it up.

The title of this article was a prime example of media quantum click-bait. The title boldly declared that “IBM is planning to build a quantum computer millions of times faster than a normal computer.” Now, that title was based on an extrapolation in the midst of the article where a quantum computer containing a mere 1,000 qubits suddenly becomes the fastest computing machine imaginable. We’re very used to computers that contain gigabytes of RAM now, which is actually several billion on-off switches on the chip, so a mere 1,000 qubits seems like a really tiny number. This should be weighed against the general concerns of the physics community that an array of 100 entangled qubits may exceed what’s physically possible… and it neglects that the difficulty of dealing with entangled systems increases exponentially with the number of qubits to be entangled. Scaling up normal bits doesn’t bump into the same difficulty. I don’t know if it’s physically possible or not, but I am aware that IBM’s declaration isn’t a major break-through so much as splashing around a bit of tech gism to keep the stockholders happy. All the article really said was that IBM has happily decided to hop on the quantum train because that seems to be the thing to do right now.

I really should understand that trolling around in the comments on such articles is a lost cause. There are so many misconceptions about quantum mechanics running around in popular culture that there’s almost no hope of finding the truth in such threads.

All this background gets us to what I was hoping to talk about. One big misconception that seemed to be somewhat common among commenters on this article is that two identical things in two places actually constitute only one thing magically in two places. This may stem from a conflation of what a wave function is versus what a qubit is and it may also be a big misunderstanding of the information that can be encoded in a qubit.

In a normal computer we all know that pretty much every calculation is built around representing numbers using binary. As everybody knows, a digital computer switch has two positions: we say that one position is 0 and the other is 1. An array of two digital on-off switches then can produce four distinct states: in binary, to represent the on-off settings of these states, we have 00, 01, 10 and 11. You could easily map those four settings to mean 1, 2, 3 and 4.

Suppose we switch now to talk about a quantum computer where the array is not bits anymore, but qubits. A very common qubit to talk about is the spin of an atom or an electron. This atom can be in two spin states: spin-up and spin-down. We could easily map the state spin-up to be 1, and call it ‘on,’ while spin-down is 0, or ‘off.’ For two qubits, we then get the states 00, 01, 10 and 11 that we had before, where we know about what states the bits are in, but we also can turn around and invoke entanglement. Entanglement is a situation where we create a wave function that contains multiple distinct particles at the same time such that the states those particles are in are interdependent on one another based upon what we can’t know about the system as a whole. Note, these two particles are separate objects, but they are both present in the wave function as separate objects. For two spin-up/spin-down type particles, this can give access to the so-called singlet and triplet states in addition to the normal binary states that the usual digital register can explore.

The quantum mechanics works like this. For the system of spin-up and spin-down, the usual way to look at this is in increments of spinning angular momentum: spin-up is a 1/2 unit of angular momentum pointed up while spin-down is -1/2 unit of angular momentum, pointed the opposite direction because of the negative sign. For the entangled system of two such particles, you can get three different values of entangled angular momentum: 1, 0 and -1. Spin 1 has both spins pointing up, but not ‘observed,’ meaning that it is completely degenerate with the 11 state of the digital register since it can’t fall into anything but 11 when the wave function collapses. Spin -1 is the same way: both spins are down, meaning that they have 100% probability of dropping into 00. The spin 0 state, on the other hand, is kind of screwy, and this is where the extra information encoding space of quantum computing emerges. The 0 states could be the symmetric combination of spin-up with spin-down or the anti-symmetric combination of the same thing. Now, these are distinct states, meaning that the size of your register just expanded from (00, 01, 10 and 11) to (00, 01, 10, 11 plus anti-symmetric 10-01 and symmetric 10+01). So, the two qubit register can encode 6 possible values instead of just 4. I’m still trying to decide if the spin 1 and -1 states could be considered different from 11 and 00, but I don’t think they can since they lack the indeterminacy present in the different spin 0 states. I’m also somewhat uncertain whether you have two extra states to give a capacity in the register of 6 or just 5 since I’m not certain what the field has to say about the practicality of determining the phase constant between the two mixed spin-up/spin-down eigenstates, since this is the only way to determine the difference between the symmetric and anti-symmetric combinations of spin.
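To make the state-counting concrete, here’s a small numpy sketch (my own illustration, with the obvious basis conventions assumed): the four product states of the two-spin register, plus the symmetric and antisymmetric combinations discussed above.

```python
import numpy as np

up = np.array([1.0, 0.0])    # spin-up, the '1' of the register
down = np.array([0.0, 1.0])  # spin-down, the '0' of the register

def pair(a, b):
    """Joint state of two spins: the tensor (Kronecker) product, a 4-vector."""
    return np.kron(a, b)

# the four ordinary 'digital' register states
s11, s10, s01, s00 = pair(up, up), pair(up, down), pair(down, up), pair(down, down)

# the two extra entangled combinations: symmetric 10+01 and antisymmetric 10-01
sym = (s10 + s01) / np.sqrt(2)
antisym = (s10 - s01) / np.sqrt(2)

# the two combinations are normalized and perfectly distinguishable from each other
overlap = np.dot(sym, antisym)  # zero overlap: genuinely distinct states
```

The zero overlap between the symmetric and antisymmetric combinations is what makes them two distinct register entries rather than one, which is the book-keeping behind counting six states instead of four.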

As I was writing here, I realized also that I made a mistake myself in the interpretation of the qubit as I was writing my comment last night. At the very unentangled minimum, an array of two qubits contains the same number of states as an array of two normal bits. If I consider only the states possible by entangled qubits, without considering the phasing constant between 10+01 and 10-01, this gives only three states, or at most four states with the phase constant. I wrote my comment without including the four purely unentangled cases, giving fewer total states accessible to the device, or at most the same number.

Now, the thing that makes this incredibly special is that the number of extra states available to a register of qubits grows exponentially with the number of qubits present in the register. This means that a register of 10 qubits can encode many more numbers than a register of ten bits! Further, this means that fewer bits can be used to make much bigger calculations, which ultimately translates to a much faster computer if the speed of turning over the register is comparable to that of a more conventional computer –which is actually somewhat doubtful since a quantum computer would need to repeat calculations potentially many times in order to build up quantum statistics.

One of the big things that is limiting the size of quantum computers at this point is maintaining coherence. Maintaining coherence is very difficult and proving that the computer maintains all the entanglements that you create 100% of the time is exceptionally non-trivial. This comes back to the old cat-in-the-box difficulty of truly isolating the quantum system from the rest of the universe. And, it becomes more non-trivial the more qubits you include. I saw a seminar recently where the presenting professor was expressing optimism about creating a register of 100 Josephson junction type qubits, but was forced to admit that he didn’t know for sure whether it would work because of the difficulties that emerge in trying to maintain coherence across a register of that size.

I personally think it likely that we’ll have real digital quantum computers in the relatively near future, but I think the jury is still out as to exactly how powerful they’ll be when compared to conventional computers. There are simply too many variables yet which could influence the power and speed of a quantum computer in meaningful ways.

Coming back to my outrage at reading comments in that thread, I’m still at ‘dear god.’ Quantum computers do not work by teleportation: they do not have any way of magically putting a single object in multiple places. The structure of a wave function is defined simply by what you consider to be a collection of objects that are simultaneously isolated from the rest of the universe at a given time. A wave function quite easily spans many objects all at once, since it is merely a statistical description of the disposition of that system as seen from the outside, and nothing more. It is not exactly a ‘thing’ in and of itself so much as a reflection of the fact that collections of indescribably simple objects behave in absolutely consistent ways among themselves. Where it becomes wave-like and weird is that there are definable limits to how precisely we can understand what’s going on at this basic level, and our inability to directly ‘interact’ with that level more or less assures that we can’t ever know everything about it or how it behaves. Quantum mechanics follows from there. It really is all about what’s knowable; building a situation where certain things are selectively knowable is what it means to build a quantum computer.

That’s admittedly pretty weird if you stop and think about it, but not crazy or magical in that wide-eyed new agey smack-babbling way.

A Physicist Responds to “The Three Body Problem” part 2

To start with, this post will be almost pure spoiler. I’m assuming, if you got through part 1, that you’ve read Cixin Liu’s book.

I’ve gotten partway through the second book in the trilogy myself, meaning that I’ve had some additional time to think about the contents of this post, but that I don’t know the ultimate outcome of the series.

This post is addressing a central conclusion of the first book, a major piece of science fiction that I didn’t address in the previous post because it is so intrinsic to the plot. This is about the idea of the Sophon induced ‘science lock-down.’ An alien race is going to invade the planet Earth in 400 years and this race is concerned that Human technology will advance in that time to be more powerful than the alien race’s own technology, so the aliens have played a trick to prevent humans from performing fundamental scientific research in order to prevent human technology from developing.

The key to this is the idea of the “Sophon.” As mentioned in the previous post, the word ‘proton’ was chosen over the name of an actual fundamental particle in order to facilitate a wordplay in Chinese… particularly the Chinese word that got translated into English as “Sophon.” This word was chosen as a modification of the word “Sophont.” As any science fiction aficionado can tell you, this word means “intelligent creature.” A Sophon is intended to be an intelligent proton, a robot the size and mass of a subatomic particle. These Sophons are capable, to some extent, of changing their size and shape, and can communicate back to the aliens instantaneously. Sophons can also travel, as subatomic particles, at very nearly the speed of light.

You can see right from that paragraph the first place where the Sophon (and therefore the idea of science lock-down) is broken. Sophons communicate with the aliens instantaneously by means of quantum entanglement. If you’ve read anything else I’ve written, you know how I feel about the cliche of the ‘Ansible.’ Entanglement can’t be used to pass information: quantum mechanics doesn’t allow for it, no matter how you misinterpret it. Entanglement means correlation, not necessarily communication. This quantum mechanical effect is an interesting and very real phenomenon, but to understand what it actually means, you need to understand more about the rest of quantum mechanics… the story of ‘Three Body Problem’ never goes there. I won’t go there either, except to suggest learning about the Bell Inequality.

The reason that Sophons are capable of producing science lock-down is that they can falsify data coming out of particle accelerators. Sophons can fly through the sensors in particle detectors and trigger them falsely, creating intelligently designed noise. At the surface, this is a horrible prospect, making it impossible for Humans to probe the deep structure of matter and therefore attain the understanding necessary to build Sophons ourselves. Do not pass go: no ‘correct’ results means no good science!

Obviously, this looks really bad. Very interesting science fiction idea. On the other hand, it also demands a bit of discussion, both about how particle accelerators work and on how science works.

Particle accelerators are the wrecking ball of the scientific enterprise. They generate data almost entirely by accelerating charged particles up to substantial fractions of the speed of light and slamming them into each other and into stationary targets. Particle physicists are all about impact cross sections and statistical probabilities of outcomes. The gold standard of a discovery in particle physics is a 5-sigma observation. ‘Sigma’ is, of course, the standard deviation of the Gaussian distribution –it’s the Bell Curve– which scientists use to judge probability of occurrence. The average sits at the peak of this curve, while one standard deviation is one sigma to the left or right of the average. Particle physics is set up around a simple statistical weight tabulation which can be couched as a question: “How likely is it that my observation is false?” If an event observed in the accelerator is spurious –that is, if the event is noise– the statistical machinery of particle physics places it close to the peak of the Bell Curve, which is to say that the event observed is ‘not different’ from noise. A 5-sigma event is an event which has been so well observed statistically that its difference from noise is five standard deviations from the peak of the Bell Curve out into the tail; all but about 0.00003% of the curve’s area lies closer to the peak than that. This is essentially like saying that a conclusion is better than 99.9999% certain to NOT be noise.
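As a quick sanity check on those numbers, the one-sided Gaussian tail area can be computed with nothing fancier than the standard library’s complementary error function (a back-of-envelope sketch, not how collider significance is actually computed in practice):

```python
import math

def gaussian_tail(n_sigma):
    """One-sided tail area of the standard normal beyond n_sigma
    standard deviations: p = 0.5 * erfc(n / sqrt(2))."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))

for n in (1, 3, 5):
    p = gaussian_tail(n)
    print(f"{n}-sigma: p = {p:.3g} (about 1 in {1 / p:,.0f})")
```

Five sigma works out to a one-sided tail probability of roughly 2.9 × 10⁻⁷, or about 1 in 3.5 million: the chance that pure noise would fluctuate that far from its average.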

Do you know how big a particle accelerator data set is? They include billions of events. Particle accelerators run for months to years on end, collecting data automatically 24 hours a day. And, the whole enterprise is based on the assumption that every observation independently might be a false outcome. Statistical weight determines the correctness of an observation. Physical theory exists to model both the trends and noise of an experiment.

As I said above, the purpose of the Sophons is to produce false results within the sensors of an accelerator’s detector apparatus. The principal detection devices in modern systems are calorimeters and photomultipliers. Calorimeters detect energy deposited as heat within the sensor volume, while photomultipliers produce a small current pulse when struck by light, such as the scintillation flash given off as a charged particle passes through. Usually, detector assemblies contain layers of sensors wrapped around the collision target, where photomultipliers form multiple inner layers and calorimeters reside around the outside of the whole assembly. There are usually also magnetic fields applied through the detector so that charged particles will tend to follow curving paths as they pass outward through the different layers away from the collision site. There are other detector technologies and refinements of these ideas, but this gives a basic taste.

Here is the ATLAS detector at the Large Hadron Collider:


Using this layered design, photomultipliers can resolve the path of outward flying particles, determining their charges based upon their path curvature through the magnetic fields established by the solenoids, and then the calorimeters determine how much energy was in each particle when it heats the calorimeter upon crashing into it. Certain particle types penetrate shields differently, necessitating layers of calorimeters with different structural characteristics in order to resolve different particle types. Computers correlate detection traces between the layers and tabulate which heat depositions relate to which flight paths. Particle physicists can then do simple arithmetic to count up all the heats and all the charges on all the particles detected for one collision event and deduce which subatomic particles appeared during a particular collision. Momentum and energy/mass are conserved relativistically while charge is conserved directly, and you simply add up what went in in order to account for what comes out during a collision.
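That “simple arithmetic” can be sketched as a toy bookkeeping exercise. The particle names and energies below are made up purely for illustration (real analyses also track full momentum vectors, missing energy, and detector resolution), but the conservation logic is the same: the sums over what comes out must match what went in.

```python
# Toy conservation bookkeeping (illustrative numbers, not real collider
# data). Charge is conserved exactly; total energy, including mass-energy,
# is conserved relativistically.

def totals(particles):
    """Sum (charge, energy) over a list of (name, charge, energy_GeV)."""
    charge = sum(q for _, q, _ in particles)
    energy = sum(e for _, _, e in particles)
    return charge, energy

incoming = [("proton", +1, 6500.0), ("proton", +1, 6500.0)]
outgoing = [("pi+", +1, 2000.0), ("pi-", -1, 1500.0),
            ("proton", +1, 5000.0), ("proton", +1, 4500.0)]

# Both sides must tally to the same (charge, energy) pair.
assert totals(incoming) == totals(outgoing)
print(totals(incoming))  # (2, 13000.0)
```

A real reconstruction is vastly more involved, but every candidate event is ultimately vetted against exactly this kind of ledger.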

In order to falsify data within such a detector, the smart subatomic particle, the Sophon, would need to fly back and forth through the detector layers, switching its charge polarity between passes and somehow dumping heat into calorimeters without being destroyed or lost in some way. How the Sophons get their kinetic energy is somewhat opaque in the story and I spent some time abortively rereading the TBP trying to figure this out, but it can be assumed that they possess a self-contained power supply which enables them to either recharge themselves from their surroundings, or simply dip into a long term battery reserve whenever they need it. They are clearly able to accelerate to highly relativistic velocities in a self-contained manner since they flew across the void from the alien homeworld to Earth, and then slowed down without external assistance at Earth. You could presume that they are able to write completely fake collision events into the detector, pretending to travel wrong velocities and masquerading as false charges and masses.

Now, like I said, this is terrible! The experiments can’t always give reliable results. Never mind that the real experiments must always be filtered for the fact that false results exist in the data set anyway.

In the paragraph above, I said “can’t always give reliable results” because the real data set of collision events still exists behind the fake data set. The Sophon flying back and forth can’t prevent real particle collisions from occurring and also interacting with the detector. The particle physicists would actually know right away that something isn’t right with the systematic structure of the experiment because they know how many particles are in their particle beams and also know the cross-sections of interaction, meaning that they start the experiment knowing statistically how many collision events to expect in a unit of time: Sophon interference with the experiment would only increase over the expected number. What you get is two overlapping data sets, one that is false and one that’s true. If the false data is much different from the true data, you inevitably bin them as distinct results because they would create a bimodal distribution to your data set… some measurements add up to five-sigma toward one result while a distinct set will ultimately add up as five-sigma toward something distinctly different. Then, you just let the theorists work out what’s what.
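The bimodal argument is easy to illustrate with a toy simulation (the measured quantity, means, and spreads below are invented purely for illustration): overlay fake events with a distinctly different mean on top of real ones and the combined data set falls apart into two cleanly separable populations.

```python
import random

random.seed(0)

# Hypothetical sketch: true events measure some quantity near 100 (arb.
# units), Sophon fakes cluster near 120, both with spread 2. The combined
# data set is bimodal, and a simple midpoint cut recovers both populations.
true_events = [random.gauss(100, 2) for _ in range(5000)]
fake_events = [random.gauss(120, 2) for _ in range(5000)]
combined = true_events + fake_events

cut = 110  # midpoint between the two modes, 5 sigma from each

def mean(xs):
    return sum(xs) / len(xs)

low = [x for x in combined if x < cut]
high = [x for x in combined if x >= cut]

print(len(low), len(high))                  # ~5000 each
print(round(mean(low)), round(mean(high)))  # ~100 and ~120
```

Because the cut sits five sigma from either mode, essentially no events leak across it: each sub-population can then be analyzed on its own, which is the “bin them as distinct results” step.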

In the story, the scientists just throw up their hands and declare ‘sophon barrier’ saying that science ‘can’t advance’ because it can’t discern correctness.

This prospect has really kind of sat in the back of my mind, nagging me. I’m not completely certain that the author understands the overall scientific mindset or philosophy. Science starts out assuming that all results might be false! Having a falsehood layered on top of other potential falsehoods is really not that much of a deterrent to me, particularly since the scientists know the Sophon interference is present by the end of the story. Science as a process is intrinsically concerned with error checking and finding systematic interference, even intelligent fabrication of data within the scientific community –you think the Sophons are bad: somebody simply altering the data set as they see fit, completely independent of the experiment, is worse. And, we deal with this in reality! At least with the Sophons, a real data set must sit behind the mixture of false events. If the data set is merely bimodal or multimodal with statistics backing up each conclusion, you design experiments to address each… at some point, consistency of a result must ultimately dominate. Sorting out this noise would take time, but it would be unable to stop progress overall, especially since the scientists know the noise is present!

Now, giving false data is actually somewhat different than prohibiting data collection. This facet of the story is somewhat unclear to me –my memory fails. You can imagine that the Aliens realize that the humans know about the tampering and, rather than leaving humans with a data set that contains some good data, they would simply have their Sophons swamp the detectors. In this, the Sophons fly back and forth within the detector giving so many false events that they prohibit the detector from being able to trigger for the resolution of real events. They could simply white us out!

While this would indeed be a bad thing, it would have a sort of a perverse effect on a real scientist. Consider: you know how fast your instrument triggers and you know the latency required for it to recover… this gives you a measure for how quickly and in what frequency the Sophon must act! You can just imagine the particle beam physicist salivating at the prospect of his Nobel prize in the nascent field of Sophon physics. Imagine the flood of grant proposals around the subject of baiting a Sophon into a particle beam line by the performance of basic science only to try to turn the particle beam against the Sophon in order to smash it apart and see how it works!

Really, if you were a high energy physicist and you knew unequivocally that a smart particle was flying around inside your instrument, how could you not be trying to figure out a way to probe it? It’s like getting Maxwell’s demon handed to you on a shiny platter!

A realistic outcome here is actually not the prohibition of science. It would be an arm-wrestling match with the Aliens: at the very best, leaving us with a partial data set that we can ultimately advance with, or giving us the chance to probe the Sophons directly.

The prospect of probing the Sophons directly contains the danger that it would be hard to distinguish engineered results from real ones, but every demonstration by the Sophons of some other confusing behavior is in fact data itself. The author made a huge argument in “Three Body Problem” that Sophons are typically point-like and would probably subscribe to the notion that they can’t be probed since they would essentially have no collision cross-section; I would resist this idea because it either violates or misunderstands quantum mechanics, which I detailed a bit in the previous post. The author might even suggest that Sophons can’t be probed because they can dodge collisions with other particles in the collider, but I would doubt that simply because of the Sophon’s inability to know things about other particles due to simple quantum mechanics and the effect of relativity altering the rates of information flow: the decision would need to be made very quickly and it would have a built-in imprecision from Uncertainty! Moreover, the more time the Sophons spend performing confusing behavior in order to foil their own direct examination, the less time they can spend faking data in the experiments directed at basic research. As you may be aware, machines like the LHC are actually devoted to many lines of research simultaneously, and physicists are remarkably adept at piggybacking one experiment on top of another in order to conserve resources and obtain additional bang for the same buck.

One final aspect of the “science lock-down” with which I take some umbrage is the notion that only particle accelerators are responsible for fundamental research. They aren’t. There is a huge branch of physics and chemistry probing quantum mechanics based on spectroscopy. Lasers are unequivocally a quantum mechanical device, and much probing into basic quantum mechanics is performed by some variation on the theme of lasing. The Nobel prize winning discovery of the Bose-Einstein condensed matter phase did not occur in a super-collider; it occurred on an optical bench. Most super-precise clock mechanisms used by the human race at this point are optical devices and have absolutely nothing to do with particle accelerators –optical gratings and optical metrology are driving the expansion of precision measurement! The leaps which are in the process of producing quantum computers (one device the author specifically prohibits in book 2 under the science lockdown!) are not being made at particle accelerators at all: they are being made in optical lattice traps on lab benches and in photo-etched masks used to produce nano-scale solid state resonators. We are currently in the process of building analog quantum computers for the purposes of simulating quantum chromodynamic systems using optical and nano-resonator devices… and this development has nothing to do with particle accelerators, except as a means of reproducing results! The author made the argument that humans couldn’t build massive super-collider accelerators, synchrotrons and linacs, fast enough to match the production capacity that the Aliens have for making the Sophons needed to foil these instruments, but the author never even touched on the rapidly expanding field of plasma wakefield acceleration, which uses lasers to accelerate particles to relativistic speeds in bench-top apparatuses for a fraction of the price of a super-collider.

The bleeding edge of physics is very multi-pronged; the Higgs boson discovery carried out in a synchrotron may someday be reproduced by a bench-top plasma wakefield accelerator for a tiny fraction of the price. Can ‘locking down’ big particle accelerators like the LHC prohibit the extensive physical exploration that is occurring due to a mostly unrelated black swan technological development like lasers? I really don’t think it can. Tying one arm behind your back leaves you with the other arm. It’s true that the mothballing of the Superconducting Super Collider in the United States prevented humans from definitively discovering the Higgs boson for more than a decade, but that isn’t to say that there aren’t other avenues to the same discovery.

Do I think that science lockdown is possible by the means suggested by the author? Not really. And, especially not for devices like quantum computers, which is one critical development that the author suggests is prohibited by sophon interference in the second book.

Don’t get me wrong, this is a good piece of science fiction and it’s a wonderful thought experiment, but like many thought experiments, it’s arguable.

edit 2-16-17

I saw a physics colloquium yesterday delivered by a Nobel prize winner. His lab is currently working on a molecular spectroscopy experiment directed at measuring the electric dipole moment of the electron. A precision measurement of this value ties directly to the existence (or not) of supersymmetric particle theory… which is one candidate expansion of the Standard Model of particle physics. This experiment is not being done in a super-collider, but on an optics bench for a fraction of the price. Experiments like this one completely invalidate the thesis of Three Body Problem: that locking down colliders leaves no other way for particle physics to advance. There are other ways that are comparatively cheap and require fewer resources and less manpower. Physics would find a way.

A Physicist Responds to “The Three Body Problem”

I’ve not had much motivation to post recently: it seems like I read another article every week or so where some fool is making the same wrong conclusions about Quantum Mechanics or Relativity or AI, or all of the above, simultaneously. It gets exhausting to read. I also haven’t had time for constructing a post on my recent problem work in part because I’m prepping for a major exam.

But, I need some time to take a break and change my focus. So, I decided to write a bit about some things I saw in Liu Cixin’s “The Three-Body Problem” of which I read Ken Liu’s translation. If you’re not familiar with this book, I would highly recommend it. This book deservedly won the Nebula and Hugo awards –both– and it is one piece of science fiction that is truly worth going through.

One of my non-spoiling responses here is that it shows how another culture, namely the Chinese culture, can go to extremes with how it treats Scientists and Intelligentsia and all the different ways that this relationship can oscillate back and forth. It shows too the humanity of scientists, both for better and worse. Based on the structure of the story, it’s clear to me that the author has respect for the scientific disciplines which is usually not so present in western literature anymore. I was also quite happy that characters were not meaninglessly fed to the meat grinder in the way they are too often in many western books in the supposed name of ‘authenticity.’

With that said and my badge of worthiness placed, we will get to the actual purpose of this post… some places where Liu Cixin’s Science fiction Authoritis shows through.


The great problem with many science fiction writers is that they know just enough to be dangerous, but not enough to be right. Where they fall apart is when they start to over-explain the phenomenology of what’s happening in their stories in order to ‘make it work.’ There are two places I will talk about where this happened in ‘3BP’.

The first is the Zither.

To start with, I loved the idea of the zither. It was a very classy, ingenious use for the cliche of the monofilament wire. Note first that this is a cliche (a ‘trope’ maybe, but I detest that word for its cliche overuse). In the form that appeared in 3BP, nanomaterial monofilament 1/1000th the thickness of hair is strung in strings like a zither between pilings across a straight section of the Panama canal as an ambush trap for an oil tanker being used by the villains. The strings are strung between the banks of the canal attached to chains that can be raised and lowered so that ships which aren’t the target can be allowed through the canal unhindered. When the target ship approaches, the monofilaments are pulled up across the canal by tightening the chains such that the filaments are held in an invisible web of horizontal strands above water line, spaced from each other by only a few feet, like a big hardboiled egg slicer. The author even makes allowances for how the monofilaments can be attached to the chains so as not to shred the anchoring when the target ship pushes against them. When the ship hits the zither, it sails silently through and continues on until the engine of the ship rips itself to pieces and causes the whole boat to slide apart in sections.

You have to admit, it’s a nifty trap. The monofilament in question is described as a material intended for use building orbital elevators and is dubbed ‘nanotechnology’ by the story.

The great stumbling point most people have with nanotechnology is that it is not tiny without limit: it lives in a scale gap below about 1 micron and above about 1 nanometer. For comparison, hair is about 100 microns thick and a carbon-carbon sigma bond is about 0.1 nanometers long; the zither monofilaments, at 1/1000th the thickness of hair, are therefore about 100 nanometers across. This is a crossover regime where building structures by top-down bulk techniques, like photo etching, becomes hard, while building from the bottom up by chemistry is also hard. It is a big enough scale that quantum mechanical effects become small and statistical mechanics tends to dominate manipulation. At the nanoscale, everything we understand about how the basic level of material stuff holds together remains true. In a way, nanoscale objects are small, but not so small that they are markedly described by quantum mechanics, and not so big that they behave like bulk objects. That’s why ‘nano’ is difficult: it sits at an uncomfortable seam between the classical and quantum universes, where the tools for one or the other aren’t quite right for doing what needs to be done.
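The scale arithmetic is worth a quick back-of-envelope check, using the figures quoted in the text:

```python
# Back-of-envelope scale check (figures from the text): hair is roughly
# 100 microns across, the monofilament is 1/1000th of that, and a C-C
# sigma bond is about 0.1 nm.

hair_m = 100e-6            # ~100 microns
filament_m = hair_m / 1000  # 1/1000th the thickness of hair
cc_bond_m = 0.1e-9          # ~0.1 nm

print(filament_m * 1e9)        # 100.0 -> about 100 nanometers
print(filament_m / cc_bond_m)  # ~1000 C-C bond lengths across
```

So the fiber is only about a thousand chemical bonds wide: big enough to be a classical object, small enough that there is very little material in it.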

Cutting a material happens by a process called scission. The act of ‘scission’ is, by definition, the breakage of a long chain molecule into two shorter chain molecules. It means separating at least one chemical bond in order to free a unified mass into two independent parts. And, a chemical bond always has at least two electrons, since the bond state must consist of spin-up and spin-down parts in order to cancel out angular momentum… and that’s pretty much the theme of chemistry: stable states mostly have angular momentum canceled. There are some special exceptions, but these do not define the rule. Still, since you can’t subdivide an electron, splitting a bond means intact electrons residing somewhere that are no longer in a quantum mechanical ground state, and also atoms lacking complete valence shells. This means that the system, immediately after scission, will have a strong desire to rearrange by chemical reaction into a more stable state. What will it react with? Whatever is close by… in this case, the monofilament wire! This kind of process is part of why blades dull over time: for a conventional metal knife cutting a metal structure, the structure is literally ‘cutting’ the knife too and blunting its edge. With a nanofiber, there isn’t much mass to wear away.

This is one of the difficulties in scaling up nanotechnology: nanostructures usually become fragile!

Overlooking this fragility issue, one can argue that the process of making this nanofiber yielded a structure that is exceptionally strong and perhaps robust to chemical processes occurring around it. This is presumably what you would want in such a material that would be useful for building orbital elevators. If you want a tether from Earth up into orbit, you could bundle many of these fibers together and add coatings on the surface to help render them inert to chemistry. Many materials used in construction of advanced structures work in a manner like this: you’ve certainly heard of “Composites!”

Now then, singling one of these fibers out and stringing it across the Panama canal produces a second major issue. The energy necessary to allow the zither to slice apart the ship comes from the kinetic energy of the ship coasting along the waterway: the ship hits the zither and the monofibers of the zither redirect parts of the ship infinitesimally away from each other so that their tensile strength is not great enough to resist going in different directions… causing them to rend apart microscopically. This redirection is arrested because the parts separated from one another can’t pass through the bulk materials holding them in place. This ‘motion’ is then completely incoherent and can only be tabulated as heat deposited into the material bulk at the location of the nanofilament. So, part of the kinetic energy of the ship’s motion is deposited as heat around the monofilament cut. This might not be such a huge problem except that the monofilament has an intrinsically tiny mass and therefore a minuscule heat capacity: its electrical structure has relatively few valence modes where it can stuff higher energy vibrational states. Moreover, the fiber is located at the origin of the heat and the materials heating up surround it from all sides, so there is no other place where the fiber can dump heat except linearly along its own body. If the heat doesn’t dissipate through the hull of the ship fast enough, how hot can the fiber get before its electrical structure starts sampling continuum states? However tough the fiber is, if it can’t dump the heat somewhere, its temperature might well rise until it literally ionizes into a plasma. For such a tiny mass, only a little heat input is a substantial thing.

This is a difficulty, but one a clever writer can probably still explain away (maybe better left as a black box). You might argue that the fiber can cope with this abuse by conducting the heat along its length and then radiating it into the air or emitting it as light. That might work, I suppose, but it would mean increasing complexity in the structure of the nanomaterial. Not an impossibility, but now the fiber glows at least as a black body and is no longer invisible! As anybody familiar with super-resolution microscopy knows, emission of light can make objects tinier than the optical resolution limit visible.

Maybe the classiest way would be to convert the fiber into a thermoelectric couple of some sort and get rid of the heat using an electrical current. Some of the well known modern nanofibers, the fullerenes and such, are also very good electrical conductors because of their bonding structures. In reality, this would also probably limit the cutting rate: the rate of heat deposition in the line must not exceed the rate at which the cooling mechanism can suck heat away! An unfortunate fact about very thin conductors is that their resistance tends to be high, meaning that the conduction rate goes up as the channel of the conductor is thickened… and you are unfortunately crippled by using a nanofiber, which is very skinny indeed. I won’t mention superconductors except to say that they have a limited range of temperatures where they can superconduct… using a superconductor in a thermoelectric couple is asking for trouble.
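To put a rough number on why a skinny conductor is such a problem, here is a back-of-envelope estimate of R = ρL/A. The material, resistivity, and span are pure assumptions of mine (a copper-like resistivity and a ~300 m stretch of canal), since the story specifies neither:

```python
import math

# Illustrative only: the fiber's real material and length are unknown,
# so assume a copper-like resistivity and a ~300 m span across the canal.
rho = 1.7e-8   # ohm*m, copper resistivity (stand-in value)
L = 300.0      # m, assumed span
d = 100e-9     # m, fiber diameter (~1/1000th the thickness of hair)

A = math.pi * (d / 2) ** 2  # cross-sectional area of the fiber
R = rho * L / A             # R = rho * L / A

print(f"{R:.2e} ohms")  # ~6.5e8 ohms
```

Hundreds of megohms along the whole span: even with an ideal conductor chemistry, a 100 nm channel is a hopeless path for dumping heat electrically at any useful rate.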

My big complaint about the zither boils down to that: heat and wear. Because of the difference of the applications, a material which is suitable to the purpose of building an orbital elevator is not necessarily suitable to building a monofilament cutter. I would also offer that a real monofilament cutter would be specifically engineered to the task and not a windfall of a second technology. The applications are just too different and don’t boil down to merely ‘strength’ and ‘tiny size.’

Having addressed the zither, I’ll talk about a second major point which suffered from too much description and too little plausibility. I’ll try to describe this part of the story without giving away a major plot point.

In this section of the story, someone is trying to use a colossal factory hovering in orbit above the planet to take a proton and expand it from a point-like object into a three dimensional structure. The author makes the case that a simple object, like a proton, which is essentially point-like when viewed from our place in spacetime, is actually an object with extensive higher dimensional structure and that some technological application can be carried out where this higher dimensionality can be expanded so that it can be manipulated in our three dimensional space. He even makes the case that these higher dimensions contain considerable volume and may be big enough to harbor entire universes. As he repeatedly emphasizes, a whole universe of complexity, but only a proton’s worth of mass.

To start with, I have little to say about the string theory. For one thing, I don’t really understand it. A major argument in string theory is that the tiniest bits of space in our universe can actually have seven or eight additional dimensions hidden away where we three dimensional creatures can’t see them. Perhaps that’s true, but as yet, string theory has made no predictions that have been verified by experiment. None!

From the standpoint of a person, it’s certainly true that a proton might seem point-like, but this is actually false! Unlike an electron, which is truly dimensionally point-like for all that physics currently understands of it, a proton has a known structure that occupies a definable three dimensional volume. The size here is tiny, at only about 10^-15 meters, but it is a volume with a few working parts. A proton is constructed of two “Up” quarks and a “Down” quark that are held together by the nuclear strong force (making the proton a baryon with spin 1/2, and so obeying Fermi statistics).

I have considered that perhaps the use of a ‘proton’ in the story is a missed translation and that the author really wanted a dimensionless particle like a quark (which is never observed outside of bound sets of two or three) or an electron (which can be a free particle). After writing the previous sentence, I spent some time looking at translator notes for this book and found that the choice of the Chinese word for ‘proton’ facilitated a word play in the author’s native language that did not quite survive translation into English. I won’t detail this word play because it gives away a plot point of the book that is beyond the scope of what I wish to write about. A lesson here is that the author’s loyalty is definitely toward his literature above scientific truth.

One significant issue that must be brought up here is that ‘point-like’ is a relative description when you start talking about particles like these. An electron is fundamentally point-like, but it is also quantum mechanical, meaning that it tends to occupy a finite volume of space that varies quite strongly depending on the shape and boundaries of that space, as given by the wave function. Reaching in and ‘grabbing’ the electron reveals what appears to be a point, but that ‘point’ can be distributed in non-intuitive ways across the volume it occupies. We have no real capacity to say that it has a shape, and one might certainly consider that ‘point-like’ dimensionless object to be a singularity in exactly the same way that a black hole is a singularity. I have half a mind to say that the only reason an electron is not a black hole is that the diameter of the volume it occupies, as described by the uncertainty principle, is far larger than its Schwarzschild radius. This statement is limited by the fact that quantum mechanics doesn’t play well with general relativity: the Schwarzschild radius and the uncertainty principle are both physically true, but each has a context where it is most valid, and no unifying math exists to link one case directly to the other.
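To put rough numbers on that hunch, here is a back-of-envelope sketch (my own numbers, not from the book), using the Compton wavelength as a crude stand-in for the uncertainty-principle size of an electron:

```python
# Compare an electron's Schwarzschild radius to its Compton wavelength,
# a rough quantum mechanical "size" scale. All values in SI units.

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8        # speed of light, m/s
h = 6.626e-34      # Planck constant, J s
m_e = 9.109e-31    # electron mass, kg

r_schwarzschild = 2 * G * m_e / c**2   # ~1.4e-57 m
compton = h / (m_e * c)                # ~2.4e-12 m

print(f"Schwarzschild radius: {r_schwarzschild:.1e} m")
print(f"Compton wavelength:   {compton:.1e} m")
print(f"ratio:                {compton / r_schwarzschild:.1e}")
```

The quantum size scale wins by some 45 orders of magnitude, which is why the electron fails to be a black hole in this hand-waving picture.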

Now then, in 3BP, a point-like elementary particle with the mass and dimensionality of such a particle is shifted by a machine so that its higher dimensional properties are exhibited as a proportionate volume or geometric shape in three dimensions. In the first flawed experiment, the particle expands into a one dimensional thread which snaps off and comes wafting down everywhere onto the planet in nearly weightless tufts that annoy everybody. After the author spent so much time laboring over the invisible nature of a monofilament wire, he decided that a one dimensional thread could be visible! Note, a monofilament wire has a small but finite width, while a one dimensional line has no width at all. Which is ‘thinner?’ The 1D line is thinner by an infinite degree!

In the next flawed experiment, the higher dimensions of the point-like particle turn out to contain a super-intelligent civilization which realizes that the particle where they reside is about to be destroyed during the experiment. This civilization distends the structure of their particle into a huge mirror which they then use to focus sunlight as a weapon onto the surface of the planet in order to attack their enemy, whom they recognize to be the scientists running the experiment, and they start leveling cities! This is creative writing, but the author makes the explicit point that the mirror-structure formed from the elementary particle, while big, has only the mass of that particle, which is infinitesimal. If you’re versed in physics, you’ll see the first problem: light carries momentum (the Poynting vector!). When you reflect a beam of light, you change the direction of the momentum in that light. Conservation of momentum then requires a force causing the mirror to rebound. Reflecting enough light to thermally combust a city takes a large intensity of light, easily megawatts per square meter. An electron has a minuscule mass of about 10^-31 kilograms (or 10^-27 if you insist on it being a proton). Force equals mass times acceleration, pressure equals force per area, and light intensity converts straightforwardly to pressure and pressure to force. When you rearrange Newton’s second law to solve for acceleration, the big ‘force’ number ends up on top while the tiny ‘mass’ number ends up on the bottom of the ratio, giving a catastrophically huge number for the acceleration (conservatively on the order of 10^18 to 10^21 m/s^2 for an intensity of only a watt per square meter on a square-meter mirror). That’s right, the huge mirror with the mass of a ‘proton’ accelerates away from the planet at a highly relativistic rate the instant light bounces off of it!
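The rebound estimate goes like this (a minimal sketch with my own assumed numbers: a perfectly reflecting one square meter mirror and a deliberately modest one watt per square meter of light):

```python
# Radiation pressure on a perfect mirror is 2*I/c. Give the mirror a
# single proton's mass and solve Newton's second law for acceleration.

c = 2.998e8        # speed of light, m/s
m_p = 1.673e-27    # proton mass, kg
A = 1.0            # mirror area, m^2 (assumed)
I = 1.0            # light intensity, W/m^2 (assumed, deliberately small)

force = 2 * I * A / c   # radiation-pressure force, N
accel = force / m_p     # acceleration, m/s^2

print(f"force: {force:.1e} N")
print(f"acceleration: {accel:.1e} m/s^2")
```

Even at a watt per square meter the mirror picks up roughly 4 x 10^18 m/s^2; at the megawatt intensities needed to burn cities, the number only gets more absurd.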

Yeah, I know, physicists and science fiction authors don’t often get along even though they both pretend to love each other.

I had significant problems with the idea of making a single electric charge into a reflective surface, but I’ve rewritten this point twice without being satisfied that the physics is at all instructive to my actual objection. In a real reflective surface, like a mirror, the existence of the reflected light wave can be understood as coherent bulk scattering from many scattering centers, which are all themselves individual charged particles. In this sort of system, the amount of reflected wave quite obviously depends on the amount of charged surface present to interact with incoming waves; this is why a half-silvered mirror reflects less intensity than a fully silvered mirror. The amount of surface available to reflect is conceptually dodgy when you’re talking about only a single charge, no matter how big an area this charge is spread out to cover. Though I have failed in my own opinion to encapsulate the physical argument well, an individual charge has a finite average rate at which it can exchange information with the universe around it, and reflecting photons en masse is an act of exchanging a great deal of information through such a tiny coupling. Since the quantum mechanics of scattering depends on a probability of overlap, the probability of simultaneously overlapping with many photons is small for only a single charge. The number densities are overwhelmingly different.

All said, the mirror is likely a very transparent mirror unless it has more than one charged particle’s worth of charge.

Despite all this analysis, I don’t believe that it detracts from the story. I really didn’t mind the flight of fancy in a well written piece of fiction. It’s unlikely that the casual reader will ever care.

The Difference Between Trees and Rocks

This post is in response to a Flat Earther youtube video entitled “There are no forests on Flat Earth Wake Up.” I won’t link directly to this video because I refuse to help provide it with traffic.

I first happened across a description of this video in an article from The Atlantic. At the time, I sort of sat there and fulminated as I read it. That article in and of itself was not enough to stimulate a response from me because there’s really not much to say. Flat Earth believers are a train wreck of misconception and arrogance. They do not deserve acknowledgement for their ideas except to say that they are not merely wrong, but willfully contrarian to reality.

There is no arguing with a Flat Earther.

Fact is that such a person is so invested in a bad idea that they cannot be dissuaded from it. There are so many things that happen or are happening around you all the time that provide evidence against the flat earth that you need only open your eyes to see them. It takes a willful investment in the avoidance of reality to believe in a flat earth. You can look back at my response to a set of flat earth claims to know my general thoughts.

The video I mentioned above goes a step beyond the usual flat earth nonsense and makes the rather extravagant claim that there used to be forests on earth where the trees are miles tall and that land features like mesas or volcanic plugs like Devil’s Tower are stumps left from these huge trees. And, further, at some point those trees were all toppled and that the ‘man’ has a conspiracy going to cover up that they ever existed. Scientists are apparently actively complicit in hiding ‘the truth’ by distorting findings about fossils.


Devil’s Tower is a striking piece of landscape. I’ve seen it for myself and it is visceral and impressive. The structure is sort of biological after a fashion, I will admit. It does look like a tree stump. However, making the claim that an object has a biological form is not the same as claiming the object is biological. Nature has an incredible repertoire of mechanisms for producing complicated patterns that are absolutely not biological.

How was the following pattern constructed?


Tell me what you think this is! I know what it is, but I’m not going to identify it right away. Is it biological? Is this in an art museum? What do you think? More than that, how would you go about figuring out what this is? Think about it while you read.

The video I mentioned above goes on and on about things looking like other things actually being the other thing. That video is an hour and a half of blanket assertion. I admittedly could only stomach about 20 minutes of the video before it became completely clear that I wasn’t about to encounter anything resembling reality at any point along the way. Watching it all the way through is a waste of time… it should chill one to the bone that the number of ‘likes’ on this video is in the hundreds of thousands. Do that many people really get stuck on this topic?

The first thing you’ll note about that video is that the narrator very frequently says “This is bullshit” or “That’s bullshit!” Does an assertion of falsehood uproot a truth? He characterizes claims made by scientists using the words “Contrary to all laws of Physics, Chemistry and Biology.” What are those laws? What does science actually say? How do you know when a scientist is contradicting the ‘laws of science?’ You have to know what the science is, right? He goes on at length showing goofy pictures of apparently inept scientists while attacking the notion of fossilization, that a biological relic can be subsumed into a route of decomposition where the carbon structure is replaced by a long-term silicon structure.

Of course, in order to justify his mile-tall trees, he needs to completely throw out the window basically everything known about geology. His mile-tall trees weren’t actually carbon, but silicon (never mind that his entire treatise started out on the assertion that everything that’s left of these trees is carbon trapped in ice: carbon, silicon, carbon, iron, apparently self-consistency isn’t required in the rarefied atmosphere he inhabits)… and that relics of these huge trees are stumps formed by mesa-like mountains or that fossil trees from petrified forests are actually branches from some huge silicon tree. Early on, he makes the claim that trees produce a constant current of electricity (which is false) and that there was a silicon era (never mind that there is no such thing as silicon based life… that we know of on Earth. And, no, diatoms are not silicon based).

Coming back to Devil’s tower, he spends a huge amount of time claiming that there’s no way the structure of the tower could be naturally occurring without the patterning provided by life because it’s far too regular. If you look closely at the tower, it has this fascinating hexagonal columnar structure that almost looks built rather than deposited.


As he was marveling at Devil’s Tower and how the structure is inexplicable, I turned him off…

Let’s consider this one particular claim and distinguish how an actual scientist thinks in contrast to the nonsense put forth by this crank. The claim is that there’s no way a non-biological process can produce regular hexagonal column structures of the size seen at Devil’s tower. Claims by geologists that these structures are rock formed from lava are therefore ‘bullshit.’ I do hear scientists use the word ‘bullshit’ once in a while, but here’s the difference. The crank says ‘the structures are too big and too regular, therefore they had to have been made from a tree.’ On the other hand, a scientist would say this: ‘These structures are very big and very regular, I do not accept that they were made without the patterning provided by life, but I would change my mind about this if I could find an example of this kind of structure where I know the patterning is by a non-living process.’

Jumping to the money shot, one obvious candidate is crystallization. This process is well known to make geometrical inorganic shapes, and it is understood to happen spontaneously. Crystallization has hefty connections to physics, chemistry and biology, and there is a huge literature on it outside of scientific fields. This is, of course, where gemstones come from. The objects in Devil’s Tower look very much like crystals. Can crystals become that large? Can they bend like the fluting of a tree trunk?

With Devil’s Tower in mind, I went to Google and performed an image search looking for ‘large industrially produced crystals.’ How big can crystals be made? This turned up a company by the name of Cleveland Crystals which produces large crystals:


So, first off, crystals can be made that are ‘big.’ How big is big enough? Can it be scaled up without limit? There’s no reason to think not. The website for the company says pretty clearly that there is a correlation between the size of the crystal and the time it took to form.

Now, second, if crystals are ‘made’ by a company, does that mean that nature can’t also make crystals? Certainly a valid question since humans almost certainly caused the structures in the picture above to exist. Maybe nature can’t make them that big.

I therefore did an image search for ‘large natural crystals.’ Which produced this:


This is found in a mine in Mexico.

Do I believe that crystals can be big? Clearly they can be. But, are those things in Devil’s Tower crystals?

I then started to search for natural crystals that are hexagonal in cross section that look like rocks:


This is a mineral called aquamarine. One rapidly descends into mineralogy at some point, necessitating at least some cursory respect for geology.

Now, I have big hexagonal crystals. But do they bend like the gentle curvature seen in Devil’s Tower? I mean, crystals are renowned for their geometric straightness, so maybe this is where the analogy fails, if crystals don’t bend.

A quick search gave me this example in Quartz:


As it turns out, crystal lattices do have the ability to deform their dimensions over long distances.

What I have now is this. There’s a process called ‘crystallization’ which is totally non-living that produces big, patterned objects that can have hexagonal, geometric cross sections that can be slightly bent all while still looking like rock. Crystallization is well known to be spontaneous and to not depend on the presence of life, even if it can occur in a factory. ‘Crystallization’ is a bit of a leap because I was simply fishing for non-living processes that can produce large, geometrically patterned objects. A bundle of crystals could conceivably be piled together into a formation like a tree stump.

So then, is Devil’s Tower a crystal formation? If it’s from a living thing, you should be able to walk over to it and break off a piece to look for biological cells… in reality, if you look at a piece of Devil’s Tower under the microscope, you would find no cells and if you put it into a mass spectrometer, you would find minerals, maybe like the ones above. There is even a testable model for how a structure like Devil’s Tower might form… it would be like a much longer term version of the conditions that happen in the factory at Cleveland Crystals, but just sitting out in the world. You could melt rock of similar chemical composition to Devil’s Tower in a crucible shaped like a tree stump and then set the crucible in conditions that support crystallization. Would it then spontaneously crystallize so that the crystals filled a volume shaped like a stump?

Notice, there are details that can be chased as long as you keep asking logical questions. A scientist will say, “I know this and this and this, but I’m not quite sure about that.”

Here’s the big difference between the scientist and the crank. The crank decided ahead of time that the formation was too *whatever* to have occurred by any means other than his preferred crankery. The scientist may start with a similar idea to the crank, but he’s got to include ‘falsification’ in his process (either directly by his own hand, or by peer review). Falsification is a loophole that you must always add which gives you some way of being able to change your mind if better evidence or explanations come along. What evidence would I have to find in order to prove this theory wrong? A big part of the scientific method is deliberately trying to knock a theory down, to falsify it. In the case of Devil’s Tower, a crystal forming process might well have created the observed pattern, so the Tower isn’t necessarily a biological product. Since other processes exist which can produce the same outcome, the “huge tree” hypothesis is in immediate jeopardy as one among competing theories. Occam’s razor would give an adequate coup de grâce to finish the argument right here, since the “huge tree” theory can’t support all the evidence that the full field of geology can throw at it. But, if you’re stubborn and absolutely certain that the Tower is biological in origin, you would have to look and see if it has a biological fabric… if it has no fundamental biological structure, like evidence of cells, then it can’t be a living product and the hypothesis that it’s the stump of some huge tree must be discarded. Eventually, the combined weights of Biology and Geology would crush this fanciful little pet theory.

This may confuse some people. I’m saying that a necessary core of the scientific method is that you must go out and look for evidence that disproves your thesis. With a lot of science, it doesn’t look like this is happening anymore, which is why certain science is called ‘settled.’ The creationist will say “I’m trying to attack a hypothesis: I’m offering evidence that shows that Evolution is wrong.” The Flat Earther who made the video will say “Everything in geology is bullshit: don’t you see all the explanations I’m offering?” Even an antivaxxer will say “If you’re so confident in vaccines, why aren’t you still testing to see if they cause autism?” To many cranks, science looks like this united party who thoughtlessly discards every challenge to the hallowed orthodoxy. If science is based on tearing down accepted theories, why won’t they test my version?

In some ways, certain parts of science take on the aura of a hallowed ground. This is the result of the last generation of active theories weathering all the assaults waged against them… scientists have tried for decades to knock old theories down and offered modifications to strengthen those theories wherever an attack succeeded. As a result, the old theories became the modern theories and their weaknesses vanished. The fights occurring between scientists to falsify modern theories happen at a level above where most of the public and laymen are competent to contribute. You have to pick your fights, and if you’re smart, you understand not to pick a losing fight! In most cases, cranks are not seeing that the relevant fights have already been long since fought. The young earth creationist is typically attacking science where the fight was settled about a hundred years ago: any scientifically justifiable modification to the modern theories that would work better than Darwin’s evolution inevitably still looks too much like evolution to do anything but offend creationist sensibilities, making it a losing fight. The Flat Earther in the video needs literally to throw out the entire geology textbook and the last five hundred years of human history to get to where he has a competent fight, which means he may as well be headbutting a 10 ton granite rock. Antivaxxers are fighting a science that is more recently settled, ten or twenty years, but settled: at some point, you can’t keep testing a discarded hypothesis. The climatology that global warming deniers question is very fresh and still contains questions, but certain parts are as settled as heliocentrism.

To contribute to science, you must be at the level of the science! Crankery often hinges on not merely willful ignorance, but on someone not understanding the limits of what they understand.

What did you think that pattern was in the mystery picture I posted above? The material depicted is also a kind of crystal, but it’s a cholesteric liquid crystal, meaning that the pattern formed spontaneously and is not biological in nature. Did you guess what it was? How easy is it to look at a pattern and be wrong about what you’re seeing? Human perception is fragile and easily fooled.

Beyond F=ma

Every college student taking that requisite physics class sees Newton’s second law. I saw it once even in a textbook for a martial art: Force equals mass times acceleration… the faster you go, the harder you hit! At least, that’s what they were saying, never mind that the usage wasn’t accurate. F=ma is one of those crazy simple equations that is so bite-sized that all of popular culture is able to comprehend it. Kind of.

Newton’s second law is, of course, one of three fundamental laws. You may even already know all of Newton’s laws without realizing that you do. The first law is “An object in motion remains in motion while an object at rest remains at rest,” which is really just the special case of Newton’s second law where F = 0. Newton’s third law is the ever famous “For every action there is an equal and opposite reaction.” The three laws together are pretty much everything you need to get started on physics.

Much is made of Newton’s Laws in engineering. Mostly, you can comprehend how almost everything in the world around you operates based on a first approximation with Newton’s Laws. They are very important.

Now, as a Physicist, freshman physics is basically the last time you see Newton’s Laws. However important they are, physicists prefer to go other directions.

What? Physicists don’t use Newton’s Laws?!! Sacrilege!

You heard me right. Most of modern physics opens out beyond Newton. So, what do we use?

Believe it or not, in the time before computer games, TVs and social media, people needed to keep themselves entertained. While Newton invented his physics in the 1600s, there were a couple hundred years yet between his developments and the era of modern physics… two hundred years even before electrodynamics and thermodynamics became a thing. In that time, physicists were definitely keeping themselves entertained. They did this by reinventing the wheel repeatedly!

As a field, classical mechanics is filled with the arcane formalisms that gird the structure of modern physics. If you want to understand Quantum Mechanics, for instance, it did not emerge from a vacuum; it was birthed from all this development between Newtonian Mechanics and the Golden years of the 20th century. You can’t get away from it, in fact. People lauding Quantum Mechanics as somehow breaking Classical physics generally don’t know jack. Without the Classical physics, there would be no Quantum Mechanics.

For one particular thread, consider this. The Heisenberg Uncertainty Principle depends on operator commutation relations, or commutators. Commutators, in turn, emerged from an arcanum called Poisson brackets. Poisson brackets emerged from a structure called Hamiltonian formalism. And Hamiltonian formalism is a modification of Lagrangian formalism. Lagrangian formalism, finally, is a calculus of variations readjustment of D’Alembert’s principle, which is a freaky little break from Newtonian physics. If you’ve done any real quantum, you’ll know that you can’t escape the Hamiltonians without tripping over Lagrangians.

This brings us to what I was hoping to talk about. Getting past Newton’s Laws into this unbounded realm of the great Beyond is a non-trivial intellectual break. When I called it a freaky little break, I’m not kidding. Everything beyond that point hangs together logically, but the stepping stone at the doorway is a particularly high one.

Perhaps the easiest way to see the depth of the jump is to see the philosophy of how mechanics is described on either side.

With Newton’s laws, the name of the game is to identify interactions between objects. An ‘interaction’ is another name for a force. If you lean back against a wall, there is an interaction between you and the wall, where you and the wall exert forces on one another. Each interaction corresponds to a pair of forces: the wall pushing against you and you pushing against the wall. Newton’s second law then states that if the sum of all forces acting on an object is not zero, the object will undergo an acceleration in some direction, and the instantaneous forces work together to describe the path the object will travel. The logical strategy is to find the forces and then calculate the accelerations.
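That recipe can be sketched in a few lines of code. This toy example (a hypothetical thrown ball, my own numbers) sums the forces, applies a = F/m, and steps the motion forward in time:

```python
# Newtonian recipe: sum the forces on the object, get the acceleration
# from F = ma, and integrate forward in time to trace the path.

m = 0.5                  # mass of the ball, kg (assumed)
g = 9.81                 # gravitational acceleration, m/s^2
x, y = 0.0, 0.0          # initial position, m
vx, vy = 10.0, 10.0      # initial velocity, m/s (assumed)
dt = 0.001               # time step, s

while y >= 0.0:
    fx, fy = 0.0, -m * g         # sum of forces: only gravity here
    ax, ay = fx / m, fy / m      # Newton's second law
    vx, vy = vx + ax * dt, vy + ay * dt
    x, y = x + vx * dt, y + vy * dt

print(f"the ball lands near x = {x:.1f} m")
```

The analytic range for these numbers is 2·vx·vy/g ≈ 20.4 m, which the crude time-stepping reproduces.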

On the far side of the jump is the lowest level of non-Newtonian mechanics, Lagrangian mechanics. You no longer work with forces at all and everything is expressed instead using energies. The problem proceeds by generating an energy laden mathematical entity called a ‘Lagrangian’ and then pushing that quantity through a differential equation called Lagrange’s equation. Solving Lagrange’s equation gives you expressions for position as a function of time. This tells you ultimately the same information that you gain by working Newton’s laws, which is that some object travels along a path through space.
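For the record, the differential equation in question, written for a generalized coordinate q with Lagrangian L(q, q̇, t), is:

```latex
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) - \frac{\partial L}{\partial q} = 0
```

Solving this for q(t) is what delivers the position as a function of time.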

Reading these two paragraphs side-by-side should give you a sense of the great difference between these two methods. Newtonian mechanics is typically very intuitive since it divides up the problem into objects and interactions while Lagrangian mechanics has an opaque, almost clinical quality that defies explanation. What is a Lagrangian? What is the point of Lagrange’s equation? This is not helped by the fact that Lagrangian formalism usually falls into generalized coordinates, which can hide some facets of coordinate position in favor of expedience. To the beginner, it feels like turning a crank on a gumball machine and hoping answers pop out.

There is a degree of menace to it while you’re learning it the first time. Lagrange’s equation is taught as coming from an opaque branch of mathematics called the “calculus of variations.” How very officious! The calculus of variations is a special calculus where the objective of the mathematics is to optimize paths. This math is designed to answer the question “What is the shortest path between two points?” Intuitively, you could say the shortest path is a line, but how do you know for sure? Well, you compare all the possible paths to each other and pick out the shortest among them. The calculus of variations does this by noting that near the optimal path, small variations barely change the outcome at all: the optimal path is the stationary one, the path that neighboring paths differ from only at second order in the size of the variation.

This is a very weird idea. Why should the behavior of neighboring paths matter? There are an infinite number of possible paths! And what exactly is a variation from the optimal path? It may seem like a rhetorical question, but this variation is precisely the differential that you end up working with.

A recasting of the variational problem shows one place where this kind of calculus was extremely successful.


Roller coasters!

Under the action of gravity, with no sliding friction, what is the fastest path traveling from point A to point B, where point B does not lie directly beneath point A? This is the Brachistochrone problem. Calculus of variations is built to handle this! The strategy is to find, among all candidate paths, the one which gives the shortest time of travel between the two points. As it turns out, by happy mathematical contrivance, the appropriate path satisfies Lagrange’s equation… which is why Lagrange’s equation is important. The optimal path here is called the curve of quickest descent.
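Concretely, the quantity being optimized in the Brachistochrone problem is the travel time, written as a functional of the path y(x) (with y measured downward and the bead starting from rest, so the speed at depth y is the free-fall speed):

```latex
T[y] = \int_{0}^{x_B} \sqrt{\frac{1 + (y')^{2}}{2\,g\,y}} \; dx
```

Demanding that T[y] be stationary against small changes in the path leads straight to Lagrange’s equation, and the winning curve turns out to be a cycloid.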

Now, the jump to Lagrangian mechanics is but a hop! It turns out that if you throw a mathematical golden cow called a “Lagrangian” into Lagrange’s equation, the optimal path that pops out is the physical trajectory that the system described by that Lagrangian tends to follow in reality. And when I say trajectory in the sense of Lagrange’s equation, the ‘trajectory’ is delineated by position, or more generally the coordinate state of the system, as a function of time. If you can express the system of a satellite over the Earth in terms of a Lagrangian, Lagrange’s equation produces the orbits.

This is the very top of a deep physical idea called the “Principle of Least Action.”

In physics, adding up the Lagrangian at every point along some path in time gives a quantity called, most appropriately, “the Action.” The system could conceivably take any possible path among an infinite number of different paths, but physical systems follow paths that minimize the Action. If you find the path that gives the smallest Action, you find the path the system takes.
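This claim can even be checked numerically. The toy script below (my own construction) computes a discretized action for a ball falling in gravity, once along the true free-fall parabola and once along a perturbed path with the same endpoints:

```python
# Least-action check: S = integral of (kinetic - potential) energy.
# The true free-fall path should have a smaller action than any
# perturbed path sharing its endpoints.
import math

m, g, T = 1.0, 9.81, 1.0   # mass (kg), gravity (m/s^2), duration (s)
N = 1000
dt = T / N

def action(path):
    """Discretized action: sum of (0.5*m*v^2 - m*g*y) * dt."""
    S = 0.0
    for i in range(N):
        v = (path[i + 1] - path[i]) / dt      # velocity on the segment
        y = 0.5 * (path[i + 1] + path[i])     # midpoint height
        S += (0.5 * m * v**2 - m * g * y) * dt
    return S

ts = [i * dt for i in range(N + 1)]
# True path: free fall from rest, y(t) = -g*t^2/2.
true_path = [-0.5 * g * t**2 for t in ts]
# Perturbed path: same endpoints plus a bump that vanishes at both ends.
bumped = [y0 + 0.3 * math.sin(math.pi * t / T) for y0, t in zip(true_path, ts)]

print(f"true path action:   {action(true_path):.3f}")
print(f"bumped path action: {action(bumped):.3f}")
```

Any bump you add, of any shape that preserves the endpoints, raises the action above the free-fall value.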

As an aside to see where this reasoning ultimately leads, Quantum Mechanics finds that while objects tend to follow paths that minimize the Action, they actually try to take every conceivable path… but the paths which don’t minimize the Action rapidly cancel each other out because their phases vary so wildly from one another. The minimum Action path, by contrast, does not cancel out against the family of nearby paths since their phases are all similar. From this, a quantum mechanical particle can seem to follow two paths of equal Action at the same time. In a very real way, the weirdness of quantum mechanics emerges directly from the path integral formalism.
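A cartoon of that cancellation (purely schematic, with an invented action function and ħ set to 1) can be played with directly: add up phasors exp(iS/ħ) for a cluster of paths near the stationary one, and for a cluster far from it:

```python
# Stationary-phase cartoon: phasors for paths near the least-action
# path add coherently; phasors for distant paths scatter and cancel.
import cmath

HBAR = 1.0

def S(a):
    # Invented action for a one-parameter family of paths,
    # stationary (minimized) at a = 0.
    return 10.0 + 50.0 * a**2

def phasor_sum(center):
    """Sum exp(i*S/HBAR) over 21 paths clustered around `center`."""
    return sum(cmath.exp(1j * S(center + k * 0.01) / HBAR)
               for k in range(-10, 11))

near = phasor_sum(0.0)   # near the stationary path: phases nearly equal
far = phasor_sum(5.0)    # far away: phases spin wildly between neighbors

print(f"|near| = {abs(near):.1f}, |far| = {abs(far):.1f}")
```

The 21 phasors near the stationary path add up to nearly their full combined length, while the distant cluster collapses toward zero. That survival of the stationary bundle is the path integral in miniature.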

All of this, all of the ability to know this, starts with the jump to Lagrangian formalism.

In that, it always bothered me: why the Lagrangian? The path optimization itself makes sense, but why specifically does the Lagrangian matter? Take this one quantity out of nowhere and throw it into a differential equation that you’ve rationalized as ‘minimizing action’ and suddenly you have a system of mechanics that is equal to Newtonian mechanics, but somehow completely different from it! Why does the Lagrangian work? Through my schooling, I’ve seen the derivation of Lagrange’s equation from path optimization more than once, but the spark of ‘why optimize using the Lagrangian’ always eluded me. Early on, I didn’t even comprehend enough about the physics to appreciate that the choice of the Lagrangian is usually not well motivated.

So, what exactly is the Lagrangian?

The Lagrangian is defined as the difference between kinetic and potential energy. Kinetic energy is the description that an object is moving, while potential energy is the expression that by having a particular location in space, the object has the capacity to gain a certain motion (say by falling from the top of a building). The formalism can be modified to work where energy is not conserved, but typically physicists are interested in cases where it is. Energies emerge in Newtonian mechanics as an adaptation which allows descriptions of motion to be detached from progression through time, and the first version of energy the freshman physicist usually encounters is “Work.” Work is the force applied along a displacement times the length of that displacement; it’s just a product of force times length. And, there is no duration over which the displacement is known to take place, meaning no velocity or acceleration. Potential energy and kinetic energy come next, where kinetic energy is simply a way to connect the physical velocity of the object to the work that has been done on it, and potential energy is a way to connect a physical situation, typically in terms of a conservative field, to how much work that field can enact on a given object.
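Compactly, for a single particle of mass m with speed v, the quantities this paragraph walks through are:

```latex
L = T - V, \qquad T = \tfrac{1}{2} m v^{2}, \qquad W = F\,d, \qquad F = -\frac{dV}{dx}
```

The last relation is the connection running the other direction: a conservative force is recovered as the downhill slope of the potential energy.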

When I say ‘conservative,’ the best example is the gravitational field you see under everyday circumstances. When you lift your foot to take a step, you do a certain amount of work against gravity to pick it up… when you set your foot back down, gravity does an equal amount of work on your foot pulling it down. Energy was invested as potential energy in picking your foot up, which was then released again as you put your foot back down. And, since gravity worked on your foot pulling it down, your foot will have a kinetic energy equal to the potential energy from how high you raised it before it strikes the ground again and stops moving (provided you aren’t using your muscles to slow its descent). It becomes really mind-bending to consider that gravity was also doing (negative) work on your foot while you lifted it up, and that your muscles did work to counteract gravity’s work so that your foot could rise. As a quantity, you can chase energy around in this way. In a system like a spring or a pendulum, there are minimal dissipative interactions, meaning that after you start the system moving, it can trade energy back and forth between potential and kinetic forms pretty much without limit, so that the sum of all energies never changes. That is what we call ‘conservative.’

Energy, as it turns out, is one of the chief tokens of all physics. In fields like thermodynamics, which are considered classical but not necessarily Lagrangian, you only rarely see force directly… usually force is hidden behind pressure. The idea that the quantity of energy can function as a gearbox for attaching interactions to one another conceals Newton’s laws, making it possible to talk about interactions without knowing exactly what they are. ‘Heat of combustion’ is a black box of energy that tells you a way to connect the burning of a fuel to how much work can be derived from the pressure produced by that fuel’s combustion. On one side, you don’t need to know what combustion is; you can tell that it will deliver a stroke of so much energy when the fuel burns and drives the piston. On the other side, you don’t need to know about the engine, just that you have a process that will suck away some of the heat of your fire to do… something.

Because of the importance of energy, two quantities of obvious potential utility are (1) the difference between kinetic and potential energy and (2) the sum of kinetic and potential energy. The first quantity is the Lagrangian, while the second is the so-called Hamiltonian.
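In symbols, with $T$ the kinetic energy and $V$ the potential energy:

$$L = T - V, \qquad H = T + V$$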

There is some clear motivation here for why you would want to explore using a quantity like the Lagrangian in some way. Quantities that can be conserved, like energy and momentum, are convenient ways of characterizing motion because they can tell you what to expect from the disposition of your system without huge effort. But for all of these manipulations, the connection between F = ma and Lagrange’s equation is still a subtle leap.

The final necessary connection to get from F = ma to the Lagrangian is D’Alembert’s Principle. The principle states simply this: for a system in equilibrium (the system need not be static; rather, it is taking in no more energy than it is losing), small perturbing forces ultimately do no net work. So, interactions internal to a system in equilibrium can’t shift it away from equilibrium. This statement turns out to be another variational principle.

There is a way to drop F = ma into D’Alembert’s principle and show directly that the quantity which should be optimized in Lagrange’s equation is the Lagrangian! It may not seem like much, but it turns out to be a convoluted mathematical thread… and so the Lagrangian formalism follows directly as a consequence of a special case of the Newtonian formalism.
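The endpoint of that thread is Lagrange’s equation, and it is worth seeing how, for a single particle in a potential $V(x)$, it hands Newton’s second law right back:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0, \qquad L = \frac{1}{2}m\dot{x}^2 - V(x) \quad\Rightarrow\quad m\ddot{x} = -\frac{dV}{dx} = F$$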

As a parting shot, what does all this path integral, variational stuff mean? The Principle of Least Action has really profound implications for the functioning of reality as a whole. In a way, classical physics observes that reality tends to follow the lazy path: a straight line is the shortest path between two points, and reality operates in such a way that, at macroscopic scales, the world wants to travel in the equivalent of ‘straight lines.’ The world appears to be lazy. At the fundamental quantum mechanical scale, it thinks hard about the peculiar paths and even seems to try them out, but those efforts are counteracted such that only the lazy paths win.

Reality is fundamentally slovenly, and when it tries not to be, it’s self-defeating. Maybe not the best message to end on, but it gives a good reason to spend Sunday afternoon lying in a hammock.

Nonlocality and Simplicity

I just read an article called “How quantum mechanics could be even weirder” in The Atlantic.

The article is actually relatively good in explaining some of how quantum mechanics actually works in terms that are appropriate to laymen.

Neglecting almost everything about ‘super-quantum,’ there is one particular element in this article which I feel somewhat compelled to respond to. It relates to the following passages:

But in 1935, Einstein and two younger colleagues unwittingly stumbled upon what looks like the strangest quantum property of all, by showing that, according to quantum mechanics, two particles can be placed in a state in which making an observation on one of them immediately affects the state of the other—even if they’re allowed to travel light years apart before measuring one of them. Two such particles are said to be entangled, and this apparent instantaneous “action at a distance” is an example of quantum nonlocality.

Erwin Schrödinger, who invented the quantum wave function, discerned at once that what later became known as nonlocality is the central feature of quantum mechanics, the thing that makes it so different from classical physics. Yet it didn’t seem to make sense, which is why it vexed Einstein, who had shown conclusively in the theory of special relativity that no signal can travel faster than light. How, then, were entangled particles apparently able to do it?

This is an outline of the appearance of entanglement. The way it’s detailed here, the implication is that there’s a signal being broadcast between the entangled particles and that it breaks the speed limit imposed by relativity. This is a real argument that is still going on, and, not being an expert, I can’t claim that I’m at the level of the discussion. On the other hand, I feel fairly strongly that it can’t be considered a ‘communication.’ I’ll try to rationalize my stance below.

One thing that is very true is that if you think a bit about the scope of the topic and the simultaneous requirements of the physics in order to assure the validity of quantum mechanics, the entanglement phenomenon becomes less metaphysical overall.

Correcting several common misapprehensions of the physics shrinks the loopiness from gaga bat-shit Deepak Chopra down to real quantum size.

The first tripping stone is highlighted by Schrodinger’s Cat, as I’ve mentioned previously. In Schrodinger’s Cat, as the thought experiment is most frequently constructed, the idea of quantum superposition is imposed on states of “Life” and “Death.” A quantum mechanical event creates a superposition of Life and Death that is not resolved until the box is opened and one state is discovered to dominate. This is flawed because Life and Death are not eigenstates! I’ve said it elsewhere and I’ll repeat it as many times as necessary. There are plenty of brain-dead people whose bodies are still alive. The surface of your skin is all dead, but the basement layer is alive. Your red blood cells live about four months, and then die… but you do not! Death and Life in the biological sense are very complicated states of being that require a huge number of parameters to define. This is in contrast with an eigenstate, which is literally defined by requiring only one number to describe it: the eigenvalue. If you know the eigenvalue of a nondegenerate eigenstate, you know literally everything there is to know about that eigenstate –end of story! I won’t talk about degeneracy because that muddies the water without actually violating the point.

Quantum mechanical things are objects stripped down to such a degree of nakedness that they are simple in a very profound way. For a single quantum mechanical degree of freedom, if you have an eigenvalue to define it, there is nothing else to know about that state. One number tells you everything! For a half-spin magnetic moment, it can exist in exactly two possible eigenstates, either parallel or antiparallel. Those two states can be used together to describe everything that spin can ever do. By the nature of the object, you can’t find it in any other disposition, except parallel or antiparallel… it won’t wander off into some undefined other state because its entire reality is to be pointing in some direction with respect to an external magnetic field… meaning that it can only ever be found as some combination of the two basic eigenstates. There is not another state of being for it. There is no possible “comatose and brain-dead but still breathing” other state.
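In the standard notation, everything such a spin can ever do is captured by a superposition of those two eigenstates:

$$|\chi\rangle = \alpha\,|{\uparrow}\rangle + \beta\,|{\downarrow}\rangle, \qquad |\alpha|^2 + |\beta|^2 = 1$$

where $|\alpha|^2$ is the probability of finding the spin parallel when you measure it.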

This is what it means to be simple. We humans do not live where we can ever witness things that are that simple.

The second great tripping stone people never quite seem to understand about quantum mechanics is exactly what it means to have the system ‘enclosed by a box’ prior to observation. In Schrodinger’s Cat, your intuition is led to think that we’re talking about a paper box closed by packing tape and that the obstruction of our line of vision by the box lid is enough to constitute “closed.” This is not the case… quantum mechanical entities are so infinitesimal or so low in energy that an ‘observation’ usually means nothing more than bouncing a single corpuscle of light off of them. An upshot of this is that, as far as the object is concerned, the ‘observer’ is not really different from the rest of the universe. ‘Closed’ in the sense of a quantum mechanical ‘box’ is the state where information is not being exchanged between the rest of the universe and our quantum mechanical system.

Now, that’s closed!

If a simple system which is so simple that it can’t occupy a huge menu of states is allowed to evolve where it is not in contact with the rest of the universe, can you expect to see anything in that system different from what’s already there? One single number is all that’s needed to define what the system is doing behind that closed door!

The third great tripping stone is decoherence. Decoherence is when the universe slips between the observer and the quantum system and talks to it behind our backs. Decoherence is why quantum computers are difficult to build out of entangled quantum states. So the universe fires a photon into or pulls a photon out of our quantum mechanical system, and suddenly the system doesn’t give the entangled answers we thought that it should anymore. Naturally: information moved around. That is what the universe does.

With these several realizations, while it may still not be very intuitive, the magic of entanglement is tempered by the limits of the observation. You will not find a way to argue that ‘people’ are entangled, for instance, because they lack this degree of utter simplicity and identicalness.

One example of an entangled state is a spin singlet state with angular momentum equal to zero. This is simply two spin one-half systems added together in such a way that their spins cancel each other out. Preparing the state gives you two spins that are not merely in superposition but are entangled together by the spin zero singlet. You could take these objects and separate them from one another and then examine them apart. If the universe has not caused the entanglement to decohere, these spins are so simple and identical that they can both only occupy expected eigenstates. They evolve in exactly the same manner since they are identical, but the overarching requirement –if decoherence has not taken place and scrambled things up– is that they must continue to be a net spin-zero state. Whatever else they do, they can’t migrate away from the prepared state behind closed doors simply because entropy here is meaningless. If information is not exchanged externally, any communication by photons between the members of the singlet can only ever still produce the spin singlet.
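Written out, the singlet is the antisymmetric combination in which neither spin has a definite direction on its own, yet the pair always sums to zero:

$$|0,0\rangle = \frac{1}{\sqrt{2}}\Big(|{\uparrow}\rangle_1|{\downarrow}\rangle_2 - |{\downarrow}\rangle_1|{\uparrow}\rangle_2\Big)$$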

If you then take one of those spins and determine its eigenstate, you find that it is either the parallel or antiparallel state. Entanglement then requires the partner, separated from it no matter how far, to be in the opposite state. They can’t evolve away from that.

What makes this so brain bending is that the Schrodinger equation can tell you exactly how the entangled state evolves as long as the box remains unopened (that is that the universe has not traded information with the quantum mechanical degree of freedom). There is some point in time when you have a high probability of finding one spin ‘up’ while the other is ‘down,’ and the probability switches back and forth over time as the wave function evolves. When you make the observation to find that one spin is up, the probability distribution for the partner ceases to change and it always ends up being down. After you bounce a photon off of it, that’s it, it’s done… the probability distribution for the ‘down’ particle only ever ends up ‘down.’

This is what they mean by ‘non-locality’: that you can separate the entangled states by a great distance and still see this effect where one entangled spin ‘knows’ that the other has decided to be in a particular state. ‘Knowledge’ of the collapse of the state moves between the spins faster than light can travel, apparently.

From this arise heady ideas that maybe this could be the basis of a faster-than-light communication system: as if you could tap out Morse code by flipping entangled spins like a light switch.

Still, what information are we asking for?

The fundamental problem is that when you make the entangled state, you can’t set a phase which tells you which partner starts out ‘up’ and which starts out ‘down.’ They are in a superposition of both states, and the jig is up if you stop to see which is which. One is up and one is down in order to be the singlet state, but you can’t set which. You make a couplet that you can’t look at, by definition! The wave function evolves without there being any way of knowing. When you stop and look at them, you get one up and one down, but no way of being able to say “that one was supposed to be ‘up’ and the other ‘down.'”

On only a single trial, you could argue that they started out exactly as they ended up. As I understand it, the only way to know about entanglement is literally by running the experiment enough times to know the statistical distributions of the outcome: that ‘up’ and ‘down’ are correlated. If you’re separated by light years, one guy finds that his partner particle is ‘up’… he can’t know that the other guy looked at his particle three days ago to find ‘down’ and was expecting the answer in the other party’s hands to be ‘up.’ So much for flipping a spin like a switch and sending a message! When was it that the identities of ‘up’ and ‘down’ were even picked?

But these things are very simple, uncomplicated things! If neither party does anything to disrupt the closed box you started out with, you can argue that the choice of which particle ends with which spin was decided before they were ever separated from one another and that they have no need after the separation to be anything but very identical and so simple that you can’t find them in anything but two possible states. No ‘communication’ was necessary and the outcome observed was preordained to be observed. You didn’t look and can’t look, so you can’t know if they always would have given the same answer that they ultimately give. If the universe bumps into them before you can look, you scream ‘decoherence’ and any information preserved from the initial entanglement becomes unknowable. Without many trials, how do you ever even know with one glance if the particles decohered before you could look, or if a particle was still in coherence? That’s the issue with simple things that are in a probability distribution. Once you build up statistics, you see evidence that spins are correlated to a degree that requires an answer like quantum entanglement, but it’s hard to look at them beforehand and know what state they’re in –nay: by definition, it’s impossible. The entangled state gives you no way of knowing which is up or down, and that’s the point!

As such, being unable to pick a starting phase and biasing that one guy has ‘up’ and the other ‘down,’ there is no way to transmit information by looking –or not– at set times.

Since I’m not an experimentalist who works with entangled states, there is some chance that I’ve misunderstood something. In the middle of writing this post, I trawled around looking for information about how entanglement is examined in the lab. As far as I could tell, the information about entanglement is based upon statistics for the correlation of entangled states with each other. The statistics ultimately tell the story.
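As a toy illustration of why the statistics are the whole story, here is a small sketch (my own construction, not a real experiment) that mimics same-axis measurements on singlet pairs: each party alone sees a fair coin, yet the joint outcomes are perfectly anticorrelated.

```python
import random

def measure_singlet_pair():
    # Measuring both members of a singlet along the same axis always
    # yields opposite spins; which member comes up +1 is a coin flip.
    a = random.choice([+1, -1])
    return a, -a

trials = [measure_singlet_pair() for _ in range(10_000)]

# Each observer alone sees an unbiased 50/50 stream: no message here.
alice_up_fraction = sum(1 for a, b in trials if a == +1) / len(trials)

# Only the joint statistics reveal the entanglement-style correlation.
correlation = sum(a * b for a, b in trials) / len(trials)

print(alice_up_fraction)  # close to 0.5
print(correlation)        # exactly -1.0 for same-axis measurements
```

Note that this naive ‘decided in advance’ picture only reproduces the same-axis correlations; measurements along different axes produce correlations that no such local model can match, which is exactly what Bell’s inequality quantifies.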

I won’t say that it isn’t magical. But, I feel that once you know the reality, the wide-eyed extravagance of articles like the one that spawned this post seem ignorant. It’s hard not to crawl through the comments section screaming at people “No, no, no! Dear God, no!”

So then, to take the bull by the horns, I made an earlier statement that I should follow up on explicitly. Why doesn’t entanglement violate relativity? The conventional answer is that the knowledge of the wave function collapse is useless information! The guy who looked first can’t tell the guy holding the other particle that he can look now. Even if the particles know that the wavefunction has collapsed, the parties holding those particles can’t be sure whether the state collapsed or decohered. Since the collapse can’t carry information from one party to the other, it doesn’t break relativity. That’s the standard physicist party line.

My own personal feeling is that it’s actually a bit stiffer than that. Once the collapse occurs, the particles in hand seem as if they’ve _always_ made the choice you finally find them to contain. They don’t talk: it’s just the concrete substrate of reality, determined before they’re separated. The on-line world talks about this in two ways: either information can be written backward in time (yes, they actually do say that), or reality is so deterministic as to eliminate all free will: as if the experiment you chose to carry out were foreordained at the time the spin singlet was created, meaning that the particles know what answer they’ll give before you know that you’ve been predestined to ask.

This is not necessarily a favored interpretation. People don’t like the idea that free will doesn’t exist. I personally am not sure why it matters: life and death aren’t eigenstates, so why must free will exist? Was it necessary that your mind choose to be associated with your anus or tied to a substrate in the form of your brain? How many fundamental things about your existence do you inherit by birth which you don’t control? Would it really matter in your life if someone told you that you weren’t actually choosing any of it when there’s no way at all to tell the difference from if you were? Does this mean that Physics says that it can’t predict for you what direction your life will go, but that your path was inevitable before you were born?

At some level one must simply shrug. What I’m suggesting is not a nihilistic stance, or that people should just give up because they have no say… I’m suggesting that, beyond the scope of your own life and existence, you are not in a position to make any claims about your own importance in the grand scheme of the universe. The whirr and tick of reality is not in human hands.

If you wish to know more about entanglement, the EPR paradox and this stuff about non-locality and realism, I would recommend learning something about Bell’s inequality.
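As a teaser for that recommendation: in the CHSH form of Bell’s inequality, correlations $E$ measured along pairs of detector axes must, for any local hidden-variable account, satisfy

$$|E(a,b) - E(a,b') + E(a',b) + E(a',b')| \le 2$$

while quantum mechanics predicts $E(a,b) = -\cos\theta_{ab}$ for the singlet, which can push the left-hand side up to $2\sqrt{2}$. Experiments side with quantum mechanics.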