The Quantum Mechanics in the Gap

A big hobby horse of mine is trying to fill the space between my backgrounds, to understand how one thing leads to another.

When a biochemist learns quantum mechanics (QM), it happens from a background where little mathematical sophistication is required: maybe particle-in-a-box appears in the middle of a lower-level Physical Chemistry class, and many results of QM are encountered qualitatively in General Chemistry, or perhaps in greater detail in Organic Chemistry. A biochemist does not need to be expert at these things, since the meat of biochemistry is a highly specialized corner of organic chemistry dealing with a relatively small number of molecule types, where the complexity of the molecules tends to force the details into profound abstraction. Proteins and DNA, membranes and so on are all expressed mostly as symbols, sequences or structural motifs. Reactions occur symbolically, where chemists have worked out the details of how a reaction proceeds (or not) without really saying anything very profound about it. This created a situation of deep frustration for me once upon a time, because it always seemed like I was relying on someone else to tell me the specifics of how something actually worked. I always felt helpless. Enzymatic reaction mechanisms always drove me crazy because they seem very ad hoc; no reason they shouldn’t, since evolution is always ad hoc, but the symbology used always made it opaque to me as to what was actually happening.

When I was purely a biochemist, an undergraduate once asked me whether they could learn QM in chemistry and I honestly answered “Yes,” that everything was based on QM, but withheld the small disquiet I felt that I really didn’t believe that I understood how it fit in. Given the background I had in QM at that point, I didn’t truly know a quantum dot from a deviled egg. Yes, quantum defines everything, but what does a biochemist know of quantum? Where does bond geometry come from? Everything seems like arbitrary tinker toys using O-Chem models. Why is it that these things stick together as huge, solid ball-and-stick masses when everything is supposed to be puffy wave clouds? Where is this uncertainty principle thing people vaguely talk about in hushed tones when referring to the awe-inspiring weirdness that is QM? You certainly would never know such details looking at model structures of DNA. This frustration eventually drove me to multiple degrees in physics.

In physics, QM takes on a whole other dimension. The QM that a physicist learns is concerned with gaining the mathematical skill to deal with the core of QM while retaining the flexibility to specialize in a needed direction. Quantum Theory is a gigantic topic which no physicist knows in its entirety. There are two general cousins of theory which move in different directions, the Hamiltonian formalisms diverging from the Lagrangian ones. They connect, but have power in different situations. Where you get very specific on a topic is sometimes not well presented –you have to go a long way off the beaten path to hit either the Higgs Boson or General Relativity. Physicists in academia are interested in the weird things lying at the limits of physics and focus their efforts on pushing to and around those weirdnesses; you only focus efforts on specializations of quantum mechanics as they are needed to get to the untouched things physicists actually care to examine. This means that physicists sometimes focus little effort on tackling topics that are interesting to other fields, like chemistry… and the details of the foundations of chemistry, like the specifics of the electronic structure of the periodic table, are under the husbandry of chemists.

If you read my post on the hydrogen atom radial equation, you saw the most visible model atom. The expanded geometries of this model inform the structure of the periodic table. Most of the superficial parts of chemistry can be qualitatively understood from examining this model. S, P, D, F and so on orbitals are assembled from hydrogenic wave equations… at least they can be on the surface.

Unfortunately, the hydrogenic orbitals can only be taken as an approximation to all the other atoms. There are basically no analytic solutions to the wave functions of any atom beyond hydrogen.

Fine structure, hyperfine structure and other atomic details emerge from perturbations of the hydrogenic orbitals. Perturbation is a powerful technique, except that it’s not an exact solution. Perturbation theory approaches solutions by assuming that some effect is a small departure from a much bigger situation that is already solved. You then do an expansion in which successive terms approach the perturbed solution more and more closely. Hydrogenic orbitals can be used as a basis for this. Kind of. If the “perturbation” becomes too big relative to the basis situation, the expansion necessary to approximate it becomes too big to express. Technically, you can express any solution for any situation from a “complete” basis, but if the perturbation is too large compared to the context of the basis, the fraction of the basis required for an accurate expression becomes bigger than the “available” basis before you know it.
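The breakdown of perturbation theory is easy to see in a toy model. The two-level Hamiltonian below is entirely made up for illustration: second-order perturbation theory tracks the exact ground energy beautifully while the coupling is small, and falls apart when the “perturbation” is as big as the level spacing.

```python
import math

# Toy Hamiltonian H = H0 + lam*V with H0 = diag(0, 1) and V = [[0, 1], [1, 0]];
# "lam" sets the size of the perturbation relative to the level spacing of 1.

def exact_ground(lam):
    # exact lower eigenvalue of the 2x2 matrix [[0, lam], [lam, 1]]
    return 0.5 - math.sqrt(0.25 + lam**2)

def perturbative_ground(lam):
    # second-order perturbation theory: E ~ E0 + lam^2 |V01|^2 / (E0 - E1) = -lam^2
    return -(lam**2)

for lam in (0.05, 0.2, 1.0):
    print(lam, abs(exact_ground(lam) - perturbative_ground(lam)))
```

For a coupling of 0.05 the two answers agree to a few parts in a million; for a coupling equal to the level spacing, the “small correction” misses by nearly 40% of the answer.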

When I refer to “basis” here, I’m talking about Hilbert spaces. This is the use of orthogonal function sets as a method to compose wave equations. This works like Fourier series, which is one of the most common Hilbert space basis sets. Many Hilbert spaces contain infinitely many basis functions, which is bigger than the biggest number of functions any computer can use. The reality is that you can only ever actually use a small portion of a basis.
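To see what using only a small portion of a basis looks like in practice, here is a little experiment with the most familiar Hilbert space basis, the Fourier sine series of a square wave. The target function and the sample point are arbitrary choices of mine; the point is only that keeping more of the (infinite) basis shrinks the error of the truncated expansion.

```python
import math

def f(x):
    # target function on [-pi, pi]: a square wave
    return 1.0 if x >= 0 else -1.0

def fourier_partial_sum(x, n_terms):
    # truncated Fourier series of the square wave: only odd sine harmonics survive,
    # each with coefficient 4/(pi*k)
    total = 0.0
    for k in range(1, 2 * n_terms, 2):
        total += (4 / (math.pi * k)) * math.sin(k * x)
    return total

x = 1.0
errors = [abs(fourier_partial_sum(x, n) - f(x)) for n in (1, 5, 50)]
print(errors)   # the error shrinks as more of the basis is kept
```

Even fifty terms of an infinite basis leaves a residual error; a computer always works with a truncation like this.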

The hydrogen situation is merely a prototype. If you want to think about helium or lithium or so on, the hydrogenic basis becomes merely one choice of how to approach the problem. The Hamiltonians of other atoms are structures that can in some cases be bigger than is easily approachable by the hydrogenic basis. Honestly, I’d never really thought very hard about the other basis sets that might be needed, but technically they are a very large subject, since they are needed for the hundred-odd other atoms on the periodic table beyond hydrogen. These other atoms have wave functions that are kind of like those of hydrogen, but are different. The S-orbital of hydrogen is a good example of the S-orbitals found in many atoms, even though the functional form for other atoms is definitely different.

This all became interesting to me recently on the question of how to get to molecular bonds as more than the qualitative expression of hydrogenic orbital combinations. How do you actually calculate bond strengths and molecular wave functions? These are important to understanding the mechanics of chemistry… and to poking a finger from quantum mechanics over into biochemistry. My QM classes brushed on it, admittedly, deep in the quagmire of other miscellaneous quantum necessary to deal with a hundred different things. I decided to take a sojourn into the bowels of Quantum Chemistry and develop a competence with the Hartree-Fock method and molecular orbitals.

The quantum mechanics of quantum chemistry is, surprisingly enough, mechanically simpler than one might immediately expect. This is splitting hairs, considering that all quantum is difficult, but it is actually somewhat easier than the jump from no quantum to some quantum. Once you know the basics, you pretty much have everything needed to get started. Still, as with all QM, this is not to scoff at; there are challenges in it.

This form of QM is a Hamiltonian formalism whose first mathematics originated in the 1930s. The basics revolve around the time independent Schrodinger equation. Where it jumps to being modern QM is in the utter complexity of the construct… simple individual parts, just crazily many of them. This type of QM is referred to as “Many Body theory” because it involves wave equations containing dozens to easily hundreds of interactions between individual electrons and atomic nuclei. If you thought the Hamiltonian I wrote in my hydrogen atom post was complicated, consider that it was only for one electron being attracted to a fixed center… and did not even include the components necessary to describe the mechanics of the nucleus. The many body theory used to build up atoms with many electrons works for molecules as well, so learning generalities about the one case is learning about the other case too.

As an example of how complicated these Schrodinger equations become, here is the time independent Schrodinger equation for Lithium.

Lithium Schrodinger

This equation is simplified to atomic units to make it tractable. The part describing the kinetic energy of the nucleus is left in. All four of those Del-squared operators open up into 3D differential operators like the single one present in the hydrogen atom. The next six terms describe the electrostatic interactions of the three electrons among themselves and with the nucleus. And this is for only one nucleus and three electrons.
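In case the image doesn’t render, the equation can be written out. This is a reconstruction from the description above (in atomic units, with the nuclear mass M in units of the electron mass), not a copy of the figure:

```latex
\left[ -\frac{1}{2M}\nabla_N^2
       -\frac{1}{2}\left(\nabla_1^2 + \nabla_2^2 + \nabla_3^2\right)
       -\frac{3}{r_1} - \frac{3}{r_2} - \frac{3}{r_3}
       +\frac{1}{r_{12}} + \frac{1}{r_{13}} + \frac{1}{r_{23}}
\right]\psi = E\,\psi
```

Here r_i is the distance from electron i to the nucleus, r_ij is the distance between electrons i and j, and the factors of 3 come from lithium’s +3 nuclear charge.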

As I already mentioned, there are no closed-form analytical solutions for structures more complicated than hydrogen, so many body theory is about figuring out how to make useful approximations. And, because of the complexity, it must make some very elegant approximations.

One of the first useful features of QM for addressing situations like this I personally overlooked when I initially learned it. With QM, most situations that you might encounter have no exact solutions. Outside of a scant handful of cases, you can’t truly “solve” anything. But, for all the histrionics that go along with that, the true ground state –the lowest-energy eigenstate– has a special property: if you make a totally random guess about the form of the wave function which solves a given Hamiltonian, you are assured that the actual solution has an energy at or below the energy of your guess. Since that’s the case, you can play a game: if I make some random guess about the form of the solution, another guess that has a lower energy is a better guess regarding the actual form. You can minimize on this, always adjusting the guess so that it achieves a lower energy, until eventually it won’t go any lower. The actual solution still ends up lower, but maybe not by much. Designing such energy-minimizing guesses converges toward the actual solution and is usually accomplished by systematic mathematical minimization. This method is called “Variation” and is one of the major methods for constructing approximations of an eigenstate. Also, as you might expect, this is a numerical strategy and it makes heavy use of computers in the modern day, since the guesses are generally very big, complicated mathematical functions. Variational strategies are responsible for most of our knowledge of the electronic structure of the periodic table.
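The game can be played in a few lines for the textbook case: a single Gaussian trial function for the hydrogen ground state. The energy expression below is the standard analytic expectation value for that trial function in atomic units; the minimizer is a crude scan-and-shrink of my own rather than anything sophisticated. The exact ground energy is -0.5 hartree, and the best a lone Gaussian can do is -4/(3π) ≈ -0.424 hartree –above the true answer, exactly as the variational principle promises.

```python
import math

def energy(alpha):
    # <H> for a normalized Gaussian trial function exp(-alpha*r^2) in the
    # hydrogen atom (atomic units): kinetic 3*alpha/2 plus Coulomb -2*sqrt(2*alpha/pi)
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)

# crude scan-and-shrink minimization over the one variational parameter
alpha, step = 0.5, 0.1
for _ in range(80):
    candidates = [a for a in (alpha - step, alpha, alpha + step) if a > 0]
    alpha = min(candidates, key=energy)   # keep whichever guess has lower energy
    step *= 0.7

E_min = energy(alpha)
print(alpha, E_min)   # approaches alpha = 8/(9*pi), E = -4/(3*pi) hartree
```

Every improvement of the guess lowers the energy, and the floor of the process is the exact eigenvalue.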

Using computers to make guesses has been elevated to a high art. Literally, a trial function with a large number of unknown constants is tried against the Hamiltonian; you then take a derivative of the energy to see how it varies as a function of any one constant, and adjust that constant until the energy is at a minimum, where the first derivative is near zero and the second derivative is positive, indicating a minimum. Do this over and over again with all the available constants in the function and eventually the trial wave function converges toward the actual solution.

Take that in for a moment. We understand the periodic table mainly by guessing at it! A part of what makes these wave functions so complicated is that the state of any one electron in any system more complicated than hydrogen is dependent on every other electron and charged body present, as shown in the Lithium equation above. The basic orbital shapes are not that different from hydrogen, even requiring spherical harmonics to describe the angular shape, but the specific radial scaling and distribution is not solvable. These electrons influence each other in several ways. First, they place plain old electrostatic pressure on one another –all electrons push against each other by their charges and shift each other’s orbitals in subtle ways. Second, they exert what’s called “exchange pressure” on one another. In this, every electron in the system is indistinguishable from every other and electrons specifically deal with this by requiring that the wave function be antisymmetric such that no electron can occupy the same state as any other. You may have heard this called the Pauli Exclusion Principle and it is just a counting effect. In a way, this may be why quantum classes tend to place less weight on the hydrogen atom radial equation: even though it holds for hydrogen, it works for nothing else.

Multi-atom molecules stretch the situation even further. Multiple atoms, unsolvable in and of themselves, are placed together in some regularly positioned array in space, with unsolvable atoms now compounded into unsolvable molecules. Electrons from these atoms are then all lumped together collectively in some antisymmetric wave function where the orbitals are dependent on all the bodies present in the system. These orbitals are referred to in quantum chemistry as molecular orbitals and describe how an electron cloud is dispersed among the many atoms present. Covalent electron bonds and ionic bonds are forms of molecular orbital, where electrons are dispersed between two atoms and act to hold these atoms in some fixed relation with respect to one another. The most basic workhorse method for dealing with this highly complicated arrangement is a technique referred to as the Hartree-Fock method. Modern quantum chemistry is all about extensions beyond Hartree-Fock, which often use this method as a spine for producing an initial approximation and then switch to other variational (or perturbative) techniques to improve the accuracy of the initial guess.

Within Hartree-Fock, molecular orbitals are built up out of atomic orbitals. The approximation postulates, in part, that each electron sits in some atomic orbital which has been contributed to the system by a given atom where the presence of many atoms tends to mix up the orbitals among each other. To obey exchange, each electron literally samples every possible contributed orbital in a big antisymmetric superposition.

Hartree-Fock is sometimes referred to as Self Consistent Field theory. It uses linear superpositions of atomic orbitals to describe the molecular orbitals that actually contain the valence electrons. In this, the electrons don’t really occupy any atomic orbital, but some combination of many orbitals all at once. For example, a version of the stereotypical sigma covalent bond is actually a symmetric superposition of two atomic S-orbitals. The sigma bond contains two electrons and is made antisymmetric by the solitary occupancy of electron spin states so that the spatial part of the S-orbitals from the contributing atoms can enter in as a symmetric combination –this gets weird when you consider that you can’t tell which electron is spin up and which is spin down, so they’re both in a superposition.

Sigma bond

The sigma bond shown here in Mathematica was actually produced from two m=0 hydrogenic p-orbitals. The density plot reflects probability density. The atom locations were marked afterward in PowerPoint. The length of the bond here is arbitrary, and not energy minimized to any actual molecule. This was not produced by Hartree-Fock (though it would occur in Hartree-Fock) and is added only to show what molecular bonds look like.
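The superposition picture is easy to play with numerically. Below is a sketch of a sigma bond built from two hydrogen-like 1s orbitals, using the textbook analytic formula for the 1s–1s overlap integral; like the figures, it is illustrative and not a Hartree-Fock result. The symmetric combination piles probability density up between the nuclei, while the antisymmetric (antibonding) combination has a node exactly at the midpoint.

```python
import math

def s1(x, y, z, cx):
    # hydrogen-like 1s orbital centered at (cx, 0, 0), atomic units
    r = math.sqrt((x - cx)**2 + y**2 + z**2)
    return math.exp(-r) / math.sqrt(math.pi)

R = 1.4  # internuclear distance, roughly the H2 bond length in bohr
S = math.exp(-R) * (1 + R + R**2 / 3)   # analytic 1s-1s overlap integral

def sigma(x, y, z):
    # normalized symmetric (bonding) combination of the two 1s orbitals
    N = 1.0 / math.sqrt(2.0 * (1.0 + S))
    return N * (s1(x, y, z, -R / 2) + s1(x, y, z, +R / 2))

def antisigma(x, y, z):
    # normalized antisymmetric (antibonding) combination
    N = 1.0 / math.sqrt(2.0 * (1.0 - S))
    return N * (s1(x, y, z, -R / 2) - s1(x, y, z, +R / 2))

# constructive interference: bonding density at the midpoint beats the average
# density the two isolated atoms would place there
rho_bond = sigma(0, 0, 0)**2
rho_avg = 0.5 * (s1(0, 0, 0, -R / 2)**2 + s1(0, 0, 0, +R / 2)**2)
print(rho_bond, rho_avg, antisigma(0, 0, 0))
```

The charge build-up between the nuclei is the electrostatic glue of the covalent bond; the antibonding node is why occupying that orbital pulls atoms apart.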

For completeness, here is a pi bond.

Pi bond

At the start of Hartree-Fock, the molecular orbitals are not known, so the initial wave function guess is that every electron is present in a particular atomic orbital within the mixture. Electron density is then determined throughout the molecule and used to furnish the repulsion and exchange terms among the electrons. This is then solved for energy eigenvalues and spits out a series of linear combinations describing the orbitals where the electrons are actually located, which turn out to be different from the initial guess. These new linear combinations are then thrown back into the calculation to determine electron density and exchange, which is once more used to find energy eigenvalues and orbitals, which are once again different from the previous guess. As the crank is turned repeatedly, the output orbitals converge onto the orbitals used to calculate the electron density and exchange. When these no longer particularly change between cycles, the states describing the electron density will be equal to those associated with the eigenvalues –the input becomes self consistent with the output, hence the name of the technique: it produces a self-consistent field.
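The cycle above can be sketched in miniature. To be clear, this is a toy and not real Hartree-Fock: the 2×2 “Fock” matrix, the core matrix H and the coupling U below are made-up stand-ins chosen only so that the iteration converges. What it shows is the structure of the loop –build the field from the density, diagonalize, rebuild the density, and repeat until the input matches the output.

```python
import math

def lowest_eigpair(F):
    # lowest eigenvalue and eigenvector of a symmetric 2x2 matrix [[a,b],[b,c]]
    (a, b), (_, c) = F
    lam = 0.5 * (a + c) - math.sqrt((0.5 * (a - c))**2 + b**2)
    vx, vy = b, lam - a                 # solves (F - lam*I) v = 0
    if abs(vx) + abs(vy) < 1e-14:       # matrix was already diagonal
        vx, vy = 1.0, 0.0
    n = math.hypot(vx, vy)
    return lam, (vx / n, vy / n)

H = [[-2.0, -0.3], [-0.3, -1.0]]   # toy fixed one-electron ("core") part
U = 0.1                             # toy strength of the density-dependent field

P = [[0.0, 0.0], [0.0, 0.0]]        # initial guess: zero electron density
for cycle in range(500):
    # "Fock"-like matrix: core part plus a mean field built from the density
    F = [[H[i][j] + U * P[i][j] for j in range(2)] for i in range(2)]
    eps, c = lowest_eigpair(F)
    # new density matrix: two paired electrons in the lowest orbital
    P_new = [[2.0 * c[i] * c[j] for j in range(2)] for i in range(2)]
    delta = max(abs(P_new[i][j] - P[i][j]) for i in range(2) for j in range(2))
    P = P_new
    if delta < 1e-13:               # input density matches output density
        break

print(cycle, eps)
```

When `delta` stops changing, the density that built the field is the density the field produces: a self-consistent field in two dimensions.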

Once the self consistent electron field is reached, the atomic nuclei can be repositioned within it in order to minimize the electrostatic stresses on the nuclei. Typically, the initial locations of the nuclei must be guessed, since they are themselves not usually exactly known. A basic approximation of the Hartree-Fock method is the Born-Oppenheimer approximation, where the massive atomic nuclei are expected to move on a much slower time scale than the electrons, meaning that the atomic nuclei create a stationary electrostatic field which arranges the electrons, but are then later moved according to the average dispersion of the electrons around them. Moving the atomic positions necessitates re-calculation of the electron field, which in turn may require that the atomic positions again be readjusted, until eventually the electron field no longer alters the atomic positions and the atomic positions are consistent with the configuration of the surrounding electrons. With the output energy of the Hartree-Fock method minimized by rearranging the nuclei, this gives the relaxed configuration of the molecule. And, from this, you automatically know the bonding angles and bond lengths.

The Born-Oppenheimer approximation is a natural simplification of the real wave function which splits the wave functions of the nuclei away from the wave functions of the electrons; it can be considered valid predominantly because of the huge difference in mass (a factor of ~100,000) between electrons and nuclei, where the nuclei are essentially not very wave-like relative to the electrons. In Lithium, above, it would simply mean removing the first term of the Schrodinger equation involving the nuclear kinetic energy and understanding that the total energy of the molecule is not E. Most of the shape of a molecule can treat atomic nuclei as point-like while electrons and their orbitals constitute pretty much all of the important molecular structure.

As you can see by the description, there are a huge number of calculations required; I’ve described them only very topically. Figuring out the best way to run Hartree-Fock has been an ongoing process since the 1930s and has been raised to a high art in the nearly 90 years since. At the superficial level, the Hartree-Fock approximation is hampered by not placing the nuclei directly in the wave function and by not allowing full correlation among the electrons. This weakness is remedied by use of variational and perturbative post-Hartree-Fock techniques that have come to flourish with the steady increase of computational power during the advancement of Moore’s Law in transistors. That said, the precise calculation of overlap integrals is so computationally demanding on the scale of molecules that the hydrogen atom eigenstate solutions are impractical as a basis set.

This actually really caught me by surprise. Hartree-Fock has a very weird and interesting basis set type which is used in place of the hydrogen atom orbitals. And, the reason for the choice is predominantly to reduce a completely intractable computational problem to an approachable one. When I say “completely intractable,” I mean that even the best supercomputers available today still cannot calculate the full, completely real wave functions of even small molecules. With how powerful computers have become, this should be a stunning revelation. This is actually one of the big motivating factors toward using quantum computers to make molecular calculations; the quantum mechanics arise naturally within the quantum computer enabling the approximations to strain credulity less. The approximation used for the favored Hartree-Fock basis sets is very important to conserving computational power.

The orbitals built up around the original hydrogen atom solution to approximate higher atoms have a radial structure that has come to be known as Slater orbitals. Slater orbitals are variational functions that resemble the basic hydrogen atom orbital which, as you may be aware, is an exponential-Laguerre polynomial combination. Slater orbitals are basically combinations of exponentials which are minimized by variation to fit the Hamiltonians of higher atoms. As I understand it, Slater orbitals can be calculated through at least the first two rows of the periodic table. These orbitals, which are themselves approximations, are actually not the preferred basis set for molecular calculations, but ended up being one jumping off point for producing early versions of the preferred basis set.

The basis set that is used for molecular calculations is the so-called “Gaussian” orbital basis set. The Gaussian radial orbitals were first produced by simple least-squares fits to Slater orbitals. In this, the Slater orbital is taken as a prototype and several Gaussian functions in a linear combination are fitted to it until Chi-square becomes as small as possible… while the Slater orbital can only be exactly reproduced by an infinite number of Gaussians, it can be fairly closely reproduced by typically just a handful. Later Gaussian basis sets were also produced by skipping the Slater orbital prototype and applying Hartree-Fock directly to atomic Hamiltonians (as I understand it). The Gaussian fit to the Slater orbital is pretty good across most of the volume of the function except at the center, where the Slater orbital has a cusp (from the exponential) whereas the Gaussian is smooth… with an infinite number of Gaussians in the fit, the cusp can be reproduced, but it is a relatively small part of the function.
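The fitting procedure is simple enough to sketch. Below is a toy least-squares fit of a single Gaussian to the 1s Slater prototype e^(-r), which is the STO-1G idea in miniature; the radial grid and scan ranges are my own arbitrary choices, not any standard recipe. The radial weight r² makes the fit criterion equivalent to maximizing the 3D overlap of the two functions, and the fitted Gaussian visibly misses the cusp at the origin.

```python
import math

# radial grid (atomic units) and the Slater 1s prototype exp(-r)
rs = [0.01 * i for i in range(1, 1001)]           # r from 0.01 to 10
slater = [math.exp(-r) for r in rs]
w = [r * r for r in rs]                            # radial volume weight r^2

def fit_coeff(alpha):
    # optimal linear coefficient c for c*exp(-alpha*r^2) by weighted least squares
    g = [math.exp(-alpha * r * r) for r in rs]
    num = sum(wi * gi * si for wi, gi, si in zip(w, g, slater))
    den = sum(wi * gi * gi for wi, gi in zip(w, g))
    return num / den

def sq_residual(alpha):
    c = fit_coeff(alpha)
    return sum(wi * (c * math.exp(-alpha * r * r) - s)**2
               for wi, r, s in zip(w, rs, slater))

# crude scan for the best single-Gaussian exponent
best_alpha = min((0.01 * k for k in range(5, 100)), key=sq_residual)
c = fit_coeff(best_alpha)
mismatch_at_zero = abs(c - 1.0)   # Gaussian is flat at r=0; the Slater has a cusp
print(best_alpha, c, mismatch_at_zero)
```

One Gaussian already captures the bulk of the orbital; real basis sets like STO-3G stack a few of these primitives to close most of the remaining gap, cusp aside.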

Orbitals comparison

Here is a comparison of a Gaussian orbital with the equivalent Slater orbital from my old hydrogen atom post. The scaling of the Slater orbital is specific to the hydrogen atom while the Gaussian scaling is not specific to any atom.

The reason that the Gaussian orbitals are the preferred model is strictly a computational efficiency issue. Within the application of Hartree-Fock, there are several integral calculations that must be done repeatedly. Performing these integrations is computationally very costly on functions like the original hydrogen atom orbitals. With Gaussian radial orbitals, products of Gaussians are themselves Gaussians (the Gaussian product theorem) and the integrals all end up having the same closed forms, meaning that one can simply transfer constants from one formula to another without doing any numerical busy work at all. Further, the Gaussian orbitals can be expressed in straightforward cartesian forms, allowing them to be translated around space with little difficulty and generally making them easy to work with (I dare you: try displacing a hydrogen orbital away from the origin while it remains in spherical-polar form. You’ll discover you need the entire Hilbert space to do it!). As such, with Gaussians, very big calculations can be performed extremely quickly on a limited computational budget. The advantage here is a huge one.
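The closed forms come from the Gaussian product theorem: the product of two Gaussians centered at different points is itself a single Gaussian centered at an intermediate point. A quick numerical check in one dimension, with exponents and centers chosen arbitrarily:

```python
import math

def g(x, a, A):
    # 1D Gaussian primitive centered at A with exponent a
    return math.exp(-a * (x - A)**2)

a, A = 0.7, -0.5
b, B = 1.3, 0.9

# Gaussian product theorem: g(a, A) * g(b, B) = K * g(a + b, P), where the new
# center P is the exponent-weighted average and K is a constant prefactor
p = a + b
P = (a * A + b * B) / p
K = math.exp(-a * b / p * (A - B)**2)

for x in (-2.0, -0.3, 0.0, 0.8, 1.7):
    assert abs(g(x, a, A) * g(x, b, B) - K * g(x, p, P)) < 1e-12
print("product of two Gaussians is one Gaussian centered at", P)
```

This is why a two-center overlap integral of Gaussians collapses into a one-center integral with a known answer –the busy work really does reduce to shuffling constants between formulas.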

One way to think about it is like this: Gaussian orbitals can be used in molecular calculations roughly the same way that triangles are used to build polyhedral meshes in computer graphics renderings.

Gaussians are not the only basis set used with Hartree-Fock. I’ve learned only a little so far about this alternative implementation, but condensed matter folk also use the conventional Fourier basis of sines and cosines while working on a crystal lattice. Sines and cosines are very handy in situations with periodic boundaries, which is what you find in the regimented array of a crystal lattice.

Admittedly, as far as I’ve read, Hartree-Fock is an imperfect solution to the whole problem. I’ve mentioned some aspects of the approximation above, and it must always be remembered that it fails to capture certain aspects of the real phenomenon. That said, Hartree-Fock provides predictions that are remarkably close to actual measured values, and the approximation lends itself well to post-processing that further improves the outcomes to an impressive degree (if you have the computational budget).

I found this little project a fruitful one. This is one of those rare times when I actually blew through a textbook as if I was reading a novel. Some of the old citations regarding self-consistent field theory are truly pivotal, important papers: I found one from about the middle 1970s which had 10,000 citations on Web of Science! In the textbook I read, the chemists goofed up an important derivation necessary to produce a workable Hartree-Fock program and I was able to hunt down the 1950 paper detailing said calculation. Molecular Orbital theory is a very interesting subject and I think I’ve made some progress toward understanding where molecular bonds come from and what tools are needed to describe how QM produces molecules.

(Edit 11-6-18):

One cannot walk away from this problem without learning exactly how monumental the calculation is.

In Hartree-fock theory, the wave equations are expressed in the form of determinants in order to encapsulate the antisymmetry of the electron wave equation. These determinants are an antisymmetrized sum of permutations over the orbital basis set. Each permutation ends up being its own term in the total wave equation. The number of such terms goes as a factorial of the number of electrons contained in the wave. Moreover, probability density is the square of the wave equation.

Factorials become big very quickly.

Consider a single carbon atom. This atom contains 6 electrons. From this, the total wave equation for carbon has 6! terms. 6! = 720. The probability density then is 720^2 terms… which is 518,400 terms!

That should make your eyes bug out. You cannot ever write that in its full glory.

Now, for a simple molecule, let’s consider benzene. That’s six carbons and six hydrogens. So, 6×6+6 = 42 electrons. The determinant would contain 42! terms. That is 1.4 ×10^51 terms!!!! The probability density is about 2×10^102 terms…
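These counts are quick to verify; Python’s integer factorial is exact, so the numbers above fall right out:

```python
import math

for name, n_electrons in [("carbon", 6), ("benzene", 6 * 6 + 6)]:
    terms = math.factorial(n_electrons)   # permutation terms in the determinant
    print(f"{name}: {n_electrons} electrons, {terms:.3e} wave-function terms, "
          f"{float(terms)**2:.3e} probability-density terms")
```

Carbon’s 720 terms are already unwritable in practice; benzene’s 10^51 are unwritable in principle.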

Avogadro’s number is only 6.02×10^23.

If you are trying to graph the probability density as a function of position, the cross terms are important to determining the value of the density at any location, meaning that you have all ~2×10^102 terms to contend with. This assures that you can never graph it in order to visualize it! If you instead integrate the density across all of space in the coordinates of each electron (an integral with 42 3D measures), every cross term placing an electron in two different states dies by orthogonality. Because no term can survive if it has even one zero among its 42 3D integrals, only the diagonal terms survive, and the normalized probability evaluates simply in terms of the number of electrons in the wave function. Integrating the wave function totally cleans up the mess, meaning that you can basically still do integrals to find expectation values while thinking only about sums across the 42 electrons. This orthogonality is why you can do quantum chemistry at all: for an operator working in a single electron space, every overlap that doesn’t involve that electron must be exactly 1 for a given term to survive, which leaves only a vanishing minority of the terms.

For purposes of visualization, these equations are unmanageably huge. Not merely unmanageably, but unimaginably so. So huge, in fact, that they cannot be expressed in full except in the most simplistic cases. Benzene is only six carbons and it’s effectively impossible to tackle in the sense of the total wave equation. The best you can do is look for expressions for the molecular orbitals… which may only contain N terms (as many as 42 for benzene). Molecular orbitals can be considered the eigenstates of the molecule, where each one can be approximated to contain only one electron (or one pair of electrons in the case of closed shell calculations). The fully quantum weirdness here is that every electron samples every eigenstate, which is basically impossible to deal with.

For anyone who is looking, some of the greatest revelations which constructed organic chemistry as you might know it occurred as early as 1930. Linus Pauling wrote a wonderful paper in 1931 where he outlines one way of anticipating the tetrahedral bond geometry of carbon –performed without use of these crazy determinant wavefunctions, with simple consideration of the vanilla hydrogenic eigenstates. These results remain qualitative, though, without resorting to more modern methods.


The Difference between Quantity and Quality

I decided that I felt some need to speak up about a recent Elon Musk interview I saw on YouTube. You probably know the one I mean since it’s been making the rounds for a few days in the media over an incident where Mr. Musk took a puff of weed on camera. This is the interview between Mr. Musk and Joe Rogan.

I won’t focus on the weed. I will instead focus on some overall impressions of the interview and on something that Musk said in the context of AI.

I admit that I watch Joe Rogan’s podcast now and then. I don’t agree with some of his outlooks regarding drug use (had it been me on camera instead of Musk, I would have politely turned down the pot) but I do feel that Rogan is often a fairly discerning thinker; he advocates pretty strongly for rational inquiry when you would expect him to just be another mook. That said, I usually only watch clips rather than entire podcasts. God help me, media content would fill my life more than it already does if I devoted the 2.5 hours necessary to consume it.

Firstly, I must say that I really wasn’t that pleased with how Joe Rogan treated Elon Musk. He might well have just reached across the table and given the poor man a hand job with how much glad-handing he started with. He very significantly played up Musk’s singularity, likening him –not unfavorably– to Nikola Tesla. Later, he said flat out that “it’s as if Musk is an alien,” he’s so singular. Rogan jumped into talking about a dream where there were “a million” Nikola Teslas, or some such, and speculated how unbelievable the world would be if there were a million Elon Musks, how much innovation would be achieved. In response to that, I think he’s overblowing what is possible with innovation and not thinking that clearly about how Elon Musk got into the position he’s in.

I do not diminish Elon Musk as an innovator, to start with. The likelihood of my hitting it the way he has is not good, so I can’t say that he isn’t as singular as one might make him out to be. He is in a rarefied air of earning potential with the money he has to throw around; just a handful of people are in the same room. A part of what made Elon Musk was an innovation that is shared across a few people, namely the money made from creating Paypal, for which Musk can’t take exclusive credit. Where Musk is now depends quite strongly on this foundation: the time which bootstrapped him into the stratosphere he currently occupies was the big tech boom of the Dotcom era, when the internet was rapidly expanding, when many people were trying many new ideas and when the entire industry was in a phase of exponential growth. Big ideas were potentially very low hanging fruit, in a way that is not possible to retread now. For instance, it would take a lot to get somewhere with a Paypal competitor today, since you would have to justify your infrastructure as somehow preferable to Paypal, which has now had twenty years to entrench and fortify. It’s unlikely social networks will ever produce another Mark Zuckerberg without there being some unoccupied space to fill, which is more difficult to find with everyone trying to create yet another network. Musk is not that different; he landed on the field at a time when the getting was very good. Perhaps someone will hit it with an AI built in a garage and make a trillion dollars, but my feeling is that such an AI will emerge from a foundation that is already deep and hard to compete with, such as Google, which is itself an example of an entity that came into being when the soil was very ripe and would be difficult to retread, or compete with, twenty years later. It is this environment that grew Elon Musk.

Elon Musk won his freedom through an innovation that he cannot take exclusive credit for. Having gained a huge amount of money, he’s no longer beholden to the same checks that hold most everyone else in place. I think that were it not for this preexisting wallet, Musk would not be in a position to make the innovations he’s getting credit for today. This isn’t a bad thing, but you must hold it in context. The environment of the Dotcom era produced one Elon Musk and a handful of others, like Pichai and Brin and Bezos, because there were a million people competing for those goals… and the ones that hit at the right time and worked hardest won out. This is why there can’t be a million Elon Musks; there aren’t really a million independent innovations worth that much money which won’t just cannibalize each other in the marketplace. Musk slipped through, as did Bezos, who is wielding as much if not more power for a similar reason (Steve Jobs was another of this scope, but he’s no longer on the field and Apple is simply coasting on what Jobs did). There are not many checks holding Elon Musk back at this point because he has the spending power to more or less do whatever he feels like. This power counts for a lot. I would suggest that there are plenty of people alive right now who are capable of roughly the same thing Musk did, who haven’t hit a hole that lifts them quite so far.

As in the video, one can certainly focus on the idea mill that Elon Musk has in his head, but a distinguishing feature of Musk is not just ideas; he is definable by an incredible work ethic. Would you pull 100-hour work weeks? Somebody who is holding down more than two forty-hour-a-week jobs is probably earning at least twice what you can earn in forty hours a week! I would point out that Elon Musk has five kids and I’ve got to wonder if he even knows their names. My little angel is at least forty hours of my week that I am totally happy to give, but it means I’ve only got like forty hours otherwise to work ;-)

Is he an alien? No. He’s a smart guy who literally worked his ass off at huge personal expense and managed to hit a lucky spot that facilitated his freedom. Maybe he would have made it just as well if displaced in time, say ten years forward or backward, but my feeling is that the space currently occupied by his innovations would likely be occupied by someone else of similar qualities to Musk. The environment would have produced someone by simple selection. The idea mill in his head is also of dubious provenance given that sci-fi novelists had been writing about the things he’s trying to achieve since at least forty years before Musk arrived on the scene: propulsive rocket landings were written about by Robert Heinlein and Ray Bradbury and executed first by NASA in the 1960s to land on the moon… SpaceX is doing something amazing with it now, but it isn’t an original idea in and of itself. Musk’s hard work is amazing hard work to actualize the concept, even though the concept isn’t new. Others should probably get some credit for the inspiration.

Joe Rogan glad-handing Elon Musk for his singularity overlooks all of this. I do not envy Musk his position and I can’t really imagine what he must’ve been thinking being on the receiving end of that.

I feel that Musk has put himself in the unfortunate position of being a popularizer. He’s become a cultural go-to guru for what futurism should be. This has the unfortunate side effect of working in two directions: Musk is in a position where he can say a lot and have people listen, at the expense of having people pay attention to him when he would probably rather they not. Oh dear God, Elon Musk just took a puff of that marijuana! The media is grilling him for that moment. How many people are smoking it up, nailing themselves in an exposed vein with a needle, or otherwise sitting on a street corner somewhere masturbating in public right this very second while the media is not focused on them?

For Musk, in particular, I think the pressure of his position is starting to chafe. He may not even be able to see it in himself. Musk has so much power that he’s subject to Trumpian exclusivism; actual reality has been veiled from him behind a mask of yes-men, personal assistants and sycophants to such a degree that Musk is beginning to buy (or has already completely bought) the glad-handing. Elon Musk can fire anyone who doesn’t completely fit the mold he envisions for that employee. There is a power differential that insulates him most of the time and he’s gotten used to wielding it. For instance, while talking about the dangers of AI with Joe Rogan, Elon Musk relates a story where he says that “nobody listened to him.” Who was he talking to? “Nobody” is Barack Obama. “Nobody” is senators and Capitol Hill. As he said it, you could pretty clearly see that Elon Musk expected that these people should have listened to him! Not to say that someone like Obama should have ignored him about the existential threat posed by AI, but that Elon Musk felt that he personally should have been the standard bearer. Think about that. The mindset there is really rather amazing. The egotism is enormous. Egotism can certainly take you a long way by instilling confidence, but it has a nasty manner of insulating a person from his or her own shortcomings. As a man who works 100-hour work weeks, one has to wonder if Musk is ever anyone but the CEO. Can he deal with reality not bending to his will when he says “You’re fired”? Musk decided to play superhero during the Thai soccer team cave crisis when he built the personal-sized submarine to try to help out. Is it any wonder that he didn’t respond too well to being told that the concept wouldn’t have worked? I have no doubt he was being magnanimous and I feel bad that he certainly feels slighted for offering the help and being rebuffed.
I don’t know that he was actually seeking the spotlight in the news so much as that he felt obligated to be the superhero that glad-handers are conditioning him to believe that he is. Elon Musk has gotten used to the notion that when he breathes, the wind is felt on the other side of the world, and he draws sustenance from people telling him on Twitter that they feel the air moving somewhere over there.

Beware the dangers of social media. It will intrinsically surface the extreme responses because it is designed to do exactly that. If you can’t handle the haters, stay clear of the lovers. Some fraction of the peanut gallery that you will never meet will always have something to say that you won’t like hearing…

(Yes, I am aware of the irony of being a member of the anonymous internet peanut gallery heckling the stage. Who will listen? Who knows; I’m comfortable with my voice being small. If Barack Obama reads what I’m saying, maybe he’ll read it to completion. If so, thanks!)

All that said, I think that Elon Musk is in a very difficult position psychologically. He spends nights sleeping on the floor of his office at Tesla, (supposedly) working very, very hard at managing people and projects, expecting that the things he says to do and is busy implementing go exactly as he says they should. For a 100-hour work week, this is tremendous isolation. He’s at the top, locked in a box where his outlet, social media, always tells him that he is the man sitting on top of the mountain, then heckles him when he takes a second out to… do X, help rescue some children, take a puff on a joint, look away from the job at hand. Would you break? I’m happy I spend forty hours a week with my little angel. I’m happy my wife tells me when I’m full of shit. I couldn’t handle Elon Musk’s position. Can you imagine the fear of having the whole world looking over your shoulder, just waiting for one of your ideas to completely implode? Social isolation is profoundly dangerous in all its forms.

In answer to Joe Rogan, Elon Musk is not an alien and he isn’t singular. Maybe you don’t believe me, but I actually say this as a kindness to Elon Musk, in some hope that he finds a way around his isolation. He should find a better outlet than what he currently uses, or the pressure is going to break him. There are other people in this world whose minds are absolutely always exploding, who lie awake at night and struggle to keep it under control. I have no doubt that this takes different shapes for different people who feel it, but I definitely understand it as a guy who lies awake at night struggling to turn off the music, turn off the equations, turn off the visions. Some people do see things that lie just beyond where everyone else does and you don’t hear from them. They may work much smaller jobs and may not have a big presence on social media, but this doesn’t mean they don’t have clear vision. Poor old Joe Rogan, toking up on his joint, turns off the parts of himself that might work that way… he more or less admits that he can’t face himself and smokes the pot to shed the things he doesn’t like! Mr. Rogan went cold turkey on pot for a month and related a story about having vivid dreams during that time. What is your chance at vision? Is it like mine? Do you shuffle it under the rug?

Anyway, that’s a part of my response to how the interview was carried out. I want also to respond a little bit to some of the content that was said. For reference, here’s the relevant clip that has them talking about AI.

There is a section of that clip that has Elon Musk talking about some of the rationale for the startup Neuralink. He speaks about what he calls the “human bandwidth problem.” The idea here, as he relates it, is that one of the reasons humans can’t compete with AI is because we don’t acquire the breadth of information that a computer-based AI can as quickly. In this, a picture is worth more than a thousand words because a picture can deliver more information to the human brain in a much shorter space of time than other possible means by which a human can import information. The point of Neuralink, then, is to increase human bandwidth. An example that Musk gives is that smartphones imbue their users with superhuman abilities and information access; the ability to navigate traffic or find hotels or restaurants without previously knowing of these things. He asserts that possession of a smartphone already makes people cyborgs. He then reasons that by making a link that circumvents the five senses and places remote information access and control straight into the human mind, humans gain some parity with AI, since AI will be able to gain access to information without the delay associated with seeing or hearing an input.

I think Elon Musk is being somewhat naive about this. Bandwidth is not the only problem we face in light of what AI might potentially be capable of. Yes, AI in a computer has a tremendous advantage in being able to parse information with speed; this is fundamentally what computers are good for: taking huge amounts of information and quickly executing a simple, repetitious and very fast methodology in order to sort the depths. A smart computer program starts with the advantage of being faster than people. Elon Musk more or less asserts that humans can become better than we are by breaking the plane and putting essentially a smartphone interface straight into our heads, that speeding up our ability to get hold of information would put us at an advantage.

I don’t really agree with him.

Having access to a smartphone has revealed a number of serious problems with the capacity of humans to deal with greater bandwidth. Texting while driving has become a way for people to die since the advent of cellphones. Filter silos occur because people simply don’t have enough time to absorb (and I mean “absorb” in the sense of “to grok” rather than in the sense of to read or watch, and the subtlety means the universe in this case) the amount of information that the internet places at our disposal. Musk has voiced the assessment that if only we could get past our meagre rate of information uptake, we might somehow be better off. Having access to all the information in the world has not stopped fake news from becoming a problem; it has made people confident that they can get answers quickly without instilling in them an awareness that maybe they don’t understand the answers they got. Getting to answers ever more quickly won’t change this problem.

Humans are saddled with a fundamental set of limits in our ability to process the information that we uptake. Getting to information faster does not guarantee that anyone makes better decisions with that information once they have it. Would people spend all day stuck in social media, doing nothing of use but literally contemplating their own navel lint in the next big time waster app-game, if they could get to that app more quickly? I don’t think they would. Getting to garbage information faster does not assure anything but reaching the outcomes of bad decisions more quickly.

AI has the fundamental potential to simply circumvent this entire cognitive problem by getting rid of everything that is human from the outset. In fact, the weight of what we currently judge as “valuable AI” is a machine that fundamentally makes good decisions based on the data it acquires in a computer’s time frame. By definition, the AI we’re trying to construct doesn’t make bad decisions that a human would otherwise make and would self-optimize to make better decisions than it initially started out making.

What Elon Musk is essentially suggesting with Neuralink is that a computer could be made to regulate the bandwidth of what is going into someone’s skull without there being a tangible intermediary, but that says nothing about the agent that is necessary on the outside to pick and choose what information is sent down the pipe into someone’s head by the hypothetical link. Even if you replaced the soft matter in someone’s head with a monolithic computer chip that does exactly the same thing as a wet brain, you are saddled with the fact that the brain you duplicated is only sometimes making good decisions. The AI we might create, from inception, is going to be built to make more good decisions than the equivalent human brain. Why include a brain at all?

This reveals part of the problem with Neuralink. The requirement that we make better decisions than we do suggests that, in placing links into our brains from the outside, we need to include some artificial agent that ultimately judges for us whether our brain will make the best decision based upon whatever information the agent might pipe to that brain –time is money and following a wrong path is wasted time. This is required in order for us to remain competitive. That is fundamentally a super-intelligence that circumvents our ability to decide what is in our own best interest, since people are verifiably not always capable of deciding that: would people be ODing on pain meds so frequently if they made better decisions? Moreover, our brain doesn’t even necessarily need to know what decisions the super-intelligence governing our rate of information uptake is making on our behalf. The company that employs the stripped-down super-intelligence is more efficient than the one that might make bad decisions based upon the brain that super-intelligence is plugged into. The logical extent of this reasoning is that the computer-person interface is reduced to a person’s brain more or less just being kept occupied and happy while an overarching machine makes all the decisions.

I don’t really like what I see there. It’s a very happy pleasurable little prison which more or less just ultimately says that we’re done. If this kind of super intelligence is created, very likely, we won’t be in a position to stop it, even if we plug our brains into it and pretend we’re catching a ride on the rocket.

I don’t believe that Elon Musk hasn’t thought of it this way. If we are just a boot drive for something better at our niche than us, I don’t see that as different from how things have been throughout the advent of life. If humans as we are go extinct, maybe the world our successor inhabits will be a green, clean heaven. Surely, it will make better decisions than us.

I do understand why Musk is making the effort with Neuralink. Maybe something can be done to place us in a position where, if we create this thing, we will be able to benefit at some level. I suppose that would be the next form of the Bill and Melinda Gates Foundation…

(Edit 9-12-18)

As I am wont to do, I’ve been thinking about this post a bit for several days since I posted it. I feel now that I have a relevant extension.

When I responded to what Elon Musk had said about Neuralink, I interpreted his implication in such a way that it would definitely not place a living brain on the same page as AI. It seemed to me, and still seems on looking back, that there is a distinct architectural division between the entity of the brain and the link being placed into it.

I think there is perhaps one way to blur the line a bit more. The internal machine link must be flexible and broadly parallel enough in interacting with the brain that the external component can become interleaved at the level of a neural network. It cannot be a separate neural network; there can be no apparent division for it to work. As such, the training of the brain itself would have to happen in parallel with an external neural network in such a way that the network map smoothly spans the two. In this case, “thinking together” would have no duality. What this means is that you could probably only do it at this level with an infant whose brain is still rapidly growing and who doesn’t actually have a cohesive enough neural network to really have a full self.

I’m not sure this hybrid has a big advantage over a pure machine. The one possibility that could be open here is that the external part of the amalgamated neural network is open-ended; even though there is finite flexibility in the adult flesh-and-blood brain, awareness would have to be decentralized across the whole network, where the machine part continues to be flexible later in that person’s life. In this way, awareness could smoothly transition to additions into the machine neural network later.

The problem here is that I don’t know of any technology currently available that could build this sort of physical network. The interlinking of neurons in the brain is so casually parallel and flexible that it does not resemble the means by which neural networks are achieved in computers. I don’t believe it can happen with monolithic silicon; there would need to be something new. Given maturity of the technology, could such a thing be expanded to adults? I don’t know.

Science fiction is all well and good, but I think we’re probably not there yet. Maybe at the end of the century of biology, using a combination of genetically tamed bacteria and organic semiconductors.

(edit 9-30-18):

One thing to add that I learned a bit earlier this week, which maybe pokes another little hole in the Cult of Elon. Please note that I never refer to him as “Elon”; I’ve never met him, I’m not on a first-name basis with him and I definitely do not know him –to me, he’s Elon Musk or Mr. Musk, but not Elon. I will give him respect by not pretending familiarity with him. I do respect him, in as much as I can respect a celebrity whose exploits I hear and read about in the popular media, but I’m not a member of the Cult of Elon.

Elon Musk gets tremendous credit for Tesla the car company. He runs the company and is given a huge amount of credit for their existence. He does deserve credit for his hard work and his role in Tesla, but beware thinking of Tesla as his child or his creation. Elon Musk did not found Tesla.

Tesla was founded by Martin Eberhard and Marc Tarpenning. Elon Musk was apparently among the major round-one investors of the company and ended up as chairman of the company board since he put down a controlling investment share. Musk did not become CEO of Tesla until he helped oust Martin Eberhard from that role when Tesla apparently floundered. Eberhard and Tarpenning have since both departed from Tesla and it sounds as if the relationship is an acrimonious one, with Eberhard claiming that Musk was rewriting history.

Who can say what claims are completely true, but if you read about Elon Musk, it seems like he doesn’t play very well with others if he isn’t in charge. And, being in charge, he gets the lion’s share of the credit for the vision and execution. Stan Lee gets this kind of credit too and is perhaps imbued with similar vision. It definitely overwrites the creativity of the other talented people who also had a hand in actualizing the creation.

The fact of Tesla is that someone other than Musk started the vision, and Musk used his tremendous financial leverage to buy that vision. He now gets credit for it. I’ll let the reader decide how much credit he actually deserves.

Another thing I thought to spend a moment writing about is the reason why I chose the original title of this post. Why “Quality versus Quantity”? In the last part of the original blog post, I mentioned the dichotomy between humans being able to access information as quickly as AI and humans being able to make decisions as good as AI’s. I think that making people faster does not equate to making people better. This is one of the potentially powerful (and dangerous) aspects of AI: the point is that AI could be made ab initio to convey human-like intelligence without incorporating the intrinsic, baked-in flaws in human reasoning that are the result of us being the evolved inhabitants of the African savanna rather than the engineered product of a lab.

The tech industry may not be thinking too carefully about this, but the AI being created right now is very savant-like; it incorporates mastery acquired in a manner that humans can also “sort of” achieve. Note, I say “sort of” because this superhumanity is achieved by humans at the expense of the parts of humanity that are recognizably human: autistic savants are not typical people and do not relate to typical people as a typical person would. I believe this kind of intelligence is valuable because many people exhibit qualities of it to the benefit of the rest of the human race, but I think these people are often weak in other regards that place them out of sorts with what is otherwise “human.” Machines duplicating this intelligence are not headed toward being more human, because the human parts in the equation slow down the genius. There is an intrinsic advantage to building the AI without the humanity, because the parts that are recognizable as human fundamentally do not make the choices that would be a coveted characteristic of a high-quality AI. This is not to say that such an AI would be unable to relate to people in a manner that humans would regard as “human-like”… to the contrary, I think these machines can be made so that the human talking to one would be unable to tell the difference, but it would be a mistake to claim that the AI thinks as a human does just because it sounds like a person.

If people given cybernetic interfaces with computers are able to make deep decisions many times more quickly than unaltered humans, does this make them as good as an AI? The quantity of decisions attempted will be offset by the number of times those quickly made decisions turn out to be failures. On the other hand, the AI that people aspire to create is defined by the specifically selected capacity to make successful decisions more frequently than people can. You can see this in the victory of AlphaGo over human opponents: the person and the machine made choices at the same rate, alternating turns so that their decision rate was 1:1, but the machine made right choices more frequently and tended to win. Would the person have been better off making choices faster? If the AI makes one decision of sufficient foresight and quality that humans are required to stumble through ten decisions just to keep up, what point is there in humans being faster than they are? While the AI is intrinsically faster just by being a machine, this does not even touch the point that the AI need not be intrinsically faster. It just needs to be able to make the one decision that the fastest person had no hope of ever seeing. Smarter is not always faster.

That’s what I mean by quality versus quantity. Put another way, would Elon Musk have made his notorious “funding secured” Tweet, which has since gotten him sued by the SEC and cost him his position as chairman of the Tesla board, if he had a smartphone plugged straight into his brain? His out-of-control interface with his waistband-mounted internet box is what caused him problems in the first place; would an even more intimate interface have improved matters? Where an AI could’ve helped is by interceding: recognizing that the decision would run afoul of the SEC two months later and preventing the Tweet from being carried out.

Think about that. It should scare the literal piss out of you.

Magnets, how do they work? (part 4)

(footnote: here lies Quantum Mechanical Spin)

This post continues the discussion of how ferromagnets work as considered in part 1, part 2 and part 3. The previous parts dealt with the basics of electromagnetism, introducing the connections from Maxwell’s equations to the magnetic field, illustrating the origin of the magnetic dipole and finally demonstrating how force is exerted on a magnetic dipole by a magnetic field.

In this post, I will extend in a totally different direction. All of the previous work highlighted magnetism as it occurs with electromagnets: how electric currents create magnetic fields and respond to those fields. The magnetic dipoles I’ve outlined to this point are loops of wire carrying electric current. Bar magnets have no actual electrical wires in them and possess no batteries or circuitry, so the magnetic field coming from them must be generated by some other means. The source of this is a cryptic phenomenon that is quantum mechanical in nature. I hinted at it in part 3, but I will now address it head-on.

In 1922, Walther Gerlach and Otto Stern published an academic paper where they brought to light a weird new phenomenon which nobody had seen before (it’s actually the third paper in a series describing the development of the experiment, with the first appearing in 1921). That paper may be found here if you aren’t stuck behind a paywall. Granted, the paper is in German and will require you to find some means of translation, but that is the original paper. The paper containing the full hypothesis is here.

In their experiment, Stern and Gerlach built an evaporator furnace to volatilize silver. Under vacuum, as good as could be attained at the time, silver atomized from the furnace was piped through a series of slits to collimate a beam of flying silver atoms. This beam of silver atoms was then passed through the core of a magnetic field generated by an electromagnet, in a situation much as mentioned previously in the context of the Lorentz force.


As illustrated here, one would expect a flying positive charge ‘q’ with velocity ‘v’ to bend one way upon entering magnetic field ‘B’, while a negative charge bends the other. Without charge, there is no deflection due to Lorentz force. In the Stern-Gerlach experiment, the silver atom beam passing through the magnetic field then impinges on a plate of glass, where the atoms are deposited. This glass plate could be taken and subjected to photographic chemistry to “develop” and enhance the intensity of the silver deposited on the surface, enabling the experimenters to see more clearly any deposition on the surface of the glass. According to the paper, the atom beam was cast through the magnetic field for 8 hours in a stretch before the glass plate was developed to see the outcome.
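For reference, the deflection rule at work here is the Lorentz force law; for a particle of charge q moving with velocity v through electric and magnetic fields,

```latex
\vec{F} = q\left(\vec{E} + \vec{v}\times\vec{B}\right)
```

A neutral atom has q = 0, so this force vanishes identically, which is exactly why the silver beam should sail straight through if the conventional Lorentz force were the whole story.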

The special thing about the magnetic field in the Stern Gerlach experiment is that, unlike the one in the figure above, it was intended to have inhomogeneity… that is, to be very non-uniform.

By classical expectations, a silver atom is a silver atom is a silver atom: all such atoms are identical to one another. From the evaporated source, the atoms are expected to have no charge and would be undeflected by a magnetic field due to conventional Lorentz force, as depicted above. So, what was the Stern-Gerlach experiment looking for?

Given the new quantum theory that was emerging at the time, Stern and Gerlach set out to examine quantization of angular momentum in a single atom. Silver is an interesting case because it has a full d-orbital, but only a half-filled s-orbital. In retrospect, s-orbitals are special because they have no orbital angular momentum themselves. This, in addition to the other closed shells in the atom, would suggest no orbital angular momentum for this atom. In 1922, the de Broglie matter wave had not yet been proposed and Schrödinger and Heisenberg had not yet produced their mathematics; quantum mechanics was still “the old quantum” involving ideas like the Bohr atom. In the Bohr atom, electron orbits are allowed to have angular momentum because they explicitly ‘go’ around, exactly like the current loop that was used for calculations in the previous parts of this series. The idea then was to look for quantized angular momentum by trying to detect magnetic dipole moments. A detection would be exactly as detailed in part 3 of this series; magnetic moments are attracted or repelled depending on their orientation with respect to an external magnetic field.
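To make the connection between quantized angular momentum and a detectable dipole moment concrete, treat the Bohr orbit as a tiny current loop (a standard textbook reduction, not a calculation from the Stern-Gerlach paper itself):

```latex
\mu = I A = \left(\frac{-e\,v}{2\pi r}\right)\pi r^{2} = -\frac{e v r}{2} = -\frac{e}{2 m_e}\,L,
\qquad L = n\hbar \;\Rightarrow\; \mu = -n\,\mu_B,
\qquad \mu_B \equiv \frac{e\hbar}{2 m_e}
```

Quantized orbital angular momentum therefore implies magnetic moments coming only in integer steps of the Bohr magneton, which is exactly the sort of discreteness a deflection experiment could hope to see.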

In their experiment, Stern and Gerlach did what scientists do: they exposed a glass plate to the silver beam with the electromagnet turned off, and then they turned around and did the same experiment with the magnet turned on. It produced the following set of figures:

Stern gerlach figure 2 and 3

The first circle, seen at left, is Figure 2 from the paper, where there is no magnetic field exerted on the beam. The second circle, with the ruler in it, is Figure 3, where a magnetic field has now been turned on. In the region at the center of the image, the atom beam is clearly split into two bands relative to the control exposure. The section of field in the middle of the image contains a deliberate gradient, where the field points horizontally with respect to the image and changes strength going from left to right. One population of silver atoms diverts left under the influence of the magnetic field while a second population diverts right.

Why do they deviate?

What this observation means is that the s-orbital electron in an evaporated silver atom, having no magnetic dipole moment due to orbital angular momentum from going around the silver nucleus, has an intrinsic dipole moment in and of itself that can feel force under the influence of an external magnetic field gradient. This is very special.
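The force involved is the one worked out in part 3 for a dipole in a non-uniform field. Starting from the dipole energy,

```latex
U = -\vec{\mu}\cdot\vec{B}, \qquad
\vec{F} = -\nabla U = \nabla\!\left(\vec{\mu}\cdot\vec{B}\right)
\;\Rightarrow\;
F_z \approx \mu_z\,\frac{\partial B_z}{\partial z}
```

A uniform field only exerts torque on the dipole; it is the gradient that produces a net force and physically separates the beam.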

The figure above is an example of a quantum mechanical “observation” where what has appeared are “eigenstates.” As I’ve repeated many times, when you make an observation in quantum mechanics, you only ever actually see eigenstates. In this case, it is a very special eigenstate with no fully classical analog: spin. For fundamental spin, especially the spin of a silver atom with a single unpaired s-orbital electron, there are only two possible spin states, now called spin-up and spin-down. Spin appears by providing a magnetic dipole moment to a “spinning” quantum mechanical object. The electron, having a charge and a spin, has a magnetic dipole moment and is therefore responsive to a magnetic field gradient. The population of silver atoms passing into the magnetic field deflects according to this tiny electron dipole moment, where the nucleus is dragged along by the s-orbital electron state due to the electrostatic interaction between the electrons and the nucleus. The dipole moment is repelled or attracted in the magnetic field gradient exactly as described in part 3, and since this dipole is quantum mechanical, it samples only two possible states: oriented with the external field or oriented against it, giving the two bands in the figure above.
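As a rough sanity check on the size of the effect, here is a back-of-the-envelope estimate of the splitting. The field gradient, furnace temperature and magnet length below are typical textbook figures that I am assuming for illustration; they are not taken from the 1922 paper.

```python
import math

# Physical constants
MU_B = 9.274e-24        # Bohr magneton, J/T
K_B = 1.381e-23         # Boltzmann constant, J/K
M_AG = 108 * 1.661e-27  # mass of a silver atom, kg

# Assumed, illustrative apparatus parameters (not from the original paper)
dBdz = 1.0e3            # field gradient, T/m (~10 T/cm)
T_oven = 1300.0         # furnace temperature, K
L_magnet = 0.035        # length of the field region, m

# Transverse force on one spin state, with mu_z of order one Bohr magneton
force = MU_B * dBdz

# Typical thermal speed of the atoms leaving the furnace
v = math.sqrt(3 * K_B * T_oven / M_AG)

# Time spent inside the field region
t = L_magnet / v

# Transverse deflection of one band; the two bands separate by about twice this
z = 0.5 * (force / M_AG) * t**2
print(f"deflection per band: {z*1e3:.3f} mm, band separation: {2*z*1e3:.3f} mm")
```

The answer comes out on the order of a tenth of a millimetre per band, which is consistent with a splitting fine enough to need the ruler and the photographic development step to see clearly.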

The conventional depiction of the magnetic dipole formed by a wire loop can be adapted to the quantum mechanical phenomenon of spin by adding a scale adjustment called the gyromagnetic ratio. This number allows the angular momentum actually associated with the spin quantum number to be scaled slightly to account for the strength of the magnetic dipole produced by that spin. This is necessary because a particle carrying spin is not actually a wire loop. The great peculiarity of spin is that if it is postulated as the internal rotation of a given particle, the calculated distribution of the object in question tends to break relativity in order to generate the appropriate angular momentum, leading most physicists to consider spin a quantum mechanical phenomenon that is not actually the object ‘spinning.’ For all intents and purposes, though, spin behaves very much like actual rotational spin, and it shows up in a way that is very similar to electric charges running around a wire loop.

spin magnetic moment

The math in this figure is quick and fairly painless; it converts the magnetic dipole moment of a wire loop into a magnetic dipole moment due to spin angular momentum. The equation at the start is classical. The equation at the end is quantum mechanical. One thing you often see in non-relativistic quantum mechanics is that classical quantities carry over into quantum mechanics as operators, so the object at the very end is the magnetic dipole moment operator. This quantity can be recast in various ways, including in terms of the Bohr magneton and with various adjustments of g, while the full operator is useful in Zeeman splitting and in NMR.
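Numerically, this conversion lands on some famous constants. A small sketch computing the Bohr magneton from CODATA values, and the electron’s spin moment using the measured electron g-factor as the gyromagnetic adjustment:

```python
# The scale factors appearing in the dipole-moment conversion, from CODATA constants.
e = 1.602176634e-19       # elementary charge, C
hbar = 1.054571817e-34    # reduced Planck constant, J*s
m_e = 9.1093837015e-31    # electron mass, kg

mu_B = e * hbar / (2 * m_e)   # Bohr magneton: the naive wire-loop scale
g_s = 2.00231930436           # electron spin g-factor (the gyromagnetic adjustment)

# Spin magnetic moment magnitude for the spin projection S_z = hbar/2:
mu_spin = g_s * mu_B / 2

print(f"Bohr magneton:        {mu_B:.4e} J/T")
print(f"electron spin moment: {mu_spin:.4e} J/T")
```

The g-factor is the part a wire-loop picture cannot give you; for the electron it is almost exactly 2, with the small remainder being the famous QED correction.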

The existence of spin gives us a very interesting quantity; this is a magnetic dipole moment that is intrinsic to matter in the same way as electric charge. It simply exists. You don’t have to create it, as in the wire loop electromagnet, because it is already just there. There is no requirement for batteries or wires. Spin is one candidate source for the magnetic dipole moment that is required to produce a bar magnet.

It is completely possible to attribute the magnetism of bar magnets to spin, but saying it this way is actually something of a cop-out. How are atoms organized so that the spin present in atoms of iron becomes large enough to create a field that can cause a piece of metal to literally jump out of your hand and go sliding across the table? Individual electronic and atomic spins are really very tiny and getting them to organize in such a way that many of them can reinforce each other’s strengths is difficult. I’ve said previously that chemistry is wholly dependent on angular momentum closures and one will note that atomic orbitals fill or chemically bond in such a way as to negate angular momentum: for example, S-orbitals (and each and every available orbital) are filled by two electrons, one spin-up and one spin-down, so that no individual orbital is left with angular momentum. Sigma bonds and Pi bonds are formed so that unpaired electrons in any atom may be shared out to other atoms in order for participants to cancel their spin angular momentum. While there are exceptions, like radicals, nature generally abhors exposed spin. Even silver, the atoms of which are understood to have detectable spin, is not ferromagnetic: you can’t make a bar magnet out of silver! What conspires to make it possible for spin to become macroscopically big in bar magnets? This is the one big puzzle left unanswered.

As an interesting aside, in their paper, Stern and Gerlach add an acknowledgement thanking a “Mr. A. Einstein” for helping provide them with the electromagnet used in their experiment from the academic center he headed at the time.

Magnets, how do they work? (part 3)

In this section I intend to detail the source of magnetic force, particularly as experienced by loops of wire in the form of magnetic dipoles. The intent here is to address ultimately how compass needles turn and how ferromagnets attract each other.

I should start by asking for forgiveness. I’ve recently defended my PhD. While the weight is off now, the experience has hobbled my writing voice. It really should be easier at this point but there’s a hollowness that gnaws at me every time I sit down to write. Please forgive the listless undercurrent I’m trying to shake off. The protracted effort of finishing an advanced degree is not small by itself, but it was combined in this case with the first couple months of my daughter’s life. If you’ve ever tried to finish a PhD and survive the first six weeks of an infant’s life simultaneously, you will perhaps know the scope of this strain. I feel thin. But, I’m surviving. This post has lingered for a few months with me going back and forth trying to find the strength to soldier through.

If you will recall the previous sections I posted, part 1 and part 2, you’ll remember that I’m pursuing the lofty goal of explaining how magnets work. In part 1, I detailed some of the very basic equations for magnetism, including connections from Biot-Savart to Ampere’s Law, producing some of the basic definitions of the magnetic field. In part 2, I tackled the construct of the magnetic dipole in the form of a loop of wire. My ultimate goal is to explain how it is that an object like a compass needle can possess and respond to magnetic fields without anything like a loop of wire present. The goal today is to tackle where magnetic force comes from, that is how an object like a magnetic dipole can be dragged through space or rotated so that it changes its orientation in a magnetic field.

As you may already know, the fundamental equation describing magnetic force is the Lorentz force equation.

Lorentz force

This particular version combines electric force (from the E-field) with magnetic force (from the B-field). In this equation ‘F’ is force, ‘q’ is electric charge, ‘v’ is the velocity of that charge, ‘E’ is the E-field and ‘B’ is the B-field. The electric field part of the equation is not needed and we can focus solely on the magnetic part. Magnetic force is a cross product, signified by the ‘x’, which means that the force occurs at right angles to both the magnetic field acting on the object and the path that object is traveling. If you stop and think about it, this is kind of weird, since it means that an electrically charged object must be moving in order to feel a magnetic force. But magnets appear to feel force even when they aren’t moving, right?
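Before answering that, the cross-product behavior itself is easy to play with numerically. A minimal sketch of the Lorentz force with illustrative unit values (vector names and helper functions here are mine, just for demonstration):

```python
def cross(a, b):
    """Cross product of two 3-vectors given as (x, y, z) tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def lorentz_force(q, v, E, B):
    """F = q(E + v x B), all vectors as (x, y, z) tuples."""
    vxB = cross(v, B)
    return tuple(q * (E[i] + vxB[i]) for i in range(3))

# A charge moving along +x through a field along +y feels a force along +z:
F = lorentz_force(q=1.0, v=(1.0, 0.0, 0.0), E=(0.0, 0.0, 0.0), B=(0.0, 1.0, 0.0))
print(F)   # (0.0, 0.0, 1.0)

# A stationary charge feels no magnetic force at all:
F0 = lorentz_force(q=1.0, v=(0.0, 0.0, 0.0), E=(0.0, 0.0, 0.0), B=(0.0, 1.0, 0.0))
print(F0)  # (0.0, 0.0, 0.0)
```

The second call is the point of the rhetorical question: zero velocity means zero magnetic force, no matter how strong the field.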

A fundamental part of what makes electronics special is that, while the mass of the circuitry stays firmly fixed in position, the electric charges within the wires are able to move. The electricity inside moves even while the computer sits stupidly on the desk. I know this comes as a surprise to no one, but electricity is definitely something that moves even though the object it moves through appears to remain stationary.

One typical way to deal with the magnetic part of the Lorentz force equation is to cast it in a form conducive to electric current (defined as ‘moving charge’) rather than directly considering ‘a charge that is moving.’ To do this, you break the total force into the small piece of force exerted on a fragment of the charge present in the electric current.

equation 1 lorentz eqn rejigger

In this recast, the force is considered to be due to that tiny fraction of charge. Velocity opens up into length traveled per time, where the length contains the fragment of charge ‘dq’. The differential for time is shifted from the length to the charge, creating a current within the length, “electric current” being defined as “the amount of charge passing a point of measurement during a length of time.” In the final form, the fragment of force is due to a current in a length of wire crossed into the B-field. You could add up all the lengths of a wire containing the current and find the sum of all magnetic force on that wire. One thing to note is that the sign on the current by convention follows the vector direction associated with the length, where the current is considered to be positive charges traveling along the length. The direction on the differential length is residual from the velocity. In reality, for real electric current, the current ‘I’ carries a negative sign for the ‘minus’ value of electric charge, creating a negative sign on the force. Negative current behaves as if it were positive current traveling backward.
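Summing the force fragments along a wire can be sketched numerically. Here a straight wire along z, broken into segments in an assumed uniform field along x, recovers the textbook total F = I·L·B at right angles to both (all numbers are illustrative):

```python
# Sum dF = I dl x B over segments of a straight wire of length Lz along z,
# sitting in a uniform field B along x. Exact answer: F = I*Lz*B along +y.
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

I_current = 2.0          # amperes; positive sign follows +z (assumed)
Lz = 1.0                 # wire length, m
B = (0.5, 0.0, 0.0)      # uniform field along x, tesla (assumed)
N = 1000                 # number of wire segments

dl = (0.0, 0.0, Lz / N)  # each differential segment points along +z
F = [0.0, 0.0, 0.0]
for _ in range(N):
    dF = cross(dl, B)
    for i in range(3):
        F[i] += I_current * dF[i]

print(F)   # approximately [0.0, 1.0, 0.0], i.e. I*Lz*B = 2 * 1 * 0.5 along +y
```

Flipping the sign of the current flips the force, exactly the “negative current behaves as positive current traveling backward” point above.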

From previous work, all the elements now exist for dealing with an electromagnet: where the magnetic field comes from and how force is exerted. As illustrated in the previous post, a magnetic field is a mathematical object which is produced in a region of space around a moving charge. As demonstrated here, a basic force is felt by a moving charge when it passes through a magnetic field. These are empirical observations which can be combined to say simply that one moving charge has a way of exerting force on a second moving charge, where the force between them is strictly dependent on their movement. If neither charge were moving, they would feel no such force, and if only one charge or the other were moving, they would also feel no such force. The idea of magnetic force is in this way a profoundly alien thing: we can understand it only as a result of the basic precondition of having electric charges moving with respect to one another.

The savvy, science-literate reader may stop and think hard about this and say “Wait a minute, neutron stars, objects made of material lacking any electric charge, have a very powerful magnetic field.” To this, I would smile and refer you to quantum mechanics. The fact that an electrically uncharged neutron can possess or respond to a magnetic field is one of the pieces of evidence that suggests that the protons and neutrons in atomic nuclei are themselves divisible into smaller objects, quarks. One of the great successes of Quantum Electrodynamics was precision calculation of the gyromagnetic ratio of the electron, connecting the magnetic dipole moment of a stationary electron to that electron’s quantum mechanical spin. Spin can be regarded very simply as true to its name: a motion undertaken by an object that does not shift the location of that object’s center of mass. Therefore, magnetic field resulting from spin is still a product of some sort of motion. I probably will never talk very deeply about Quantum Electrodynamics because I don’t believe I have a very good understanding of it.

There is also a mathematical trick that one can play using Einstein’s Special Relativity to unify electric force and magnetic force, showing that magnetic field is a frame of reference effect and that electric fields are essentially the same thing as magnetic fields, but I will speak no more of this in the current post.

The bottom line, though, is simply this: magnetic force and field, in terms put forward by the Lorentz force law written above and the Biot-Savart Law written previously, are due to the motion of charges as currents, whether fractional (quarks) or integer (electrons, protons and ions). This motion can be either translational, such that the charge moves in some direction, or rotational, such that an apparently stationary charge sits there “spinning,” sort of like a top.

How these moving currents exert force can be illustrated using the math derived above. The most basic assembly that usually appears in physics classes is the example of two metal wires, each conducting an electric current.

two wires

In this image, I’ve sketched the basic situation where two wires exist in a Cartesian space. The arrangement is in forced perspective because I felt like trying to be artistic. These wires are parallel to each other and the separation between them is constant everywhere along their lengths. Both wires contain an electric current of positive sign that is moving parallel to the z-direction with both currents moving in the same direction. We will assume for simplicity that the separation is much larger than the cross-sectional width of the wire so that we don’t have to do more math than is necessary… in other words, the current is traveling along a line placed along the center of the wire. Here, both wires will produce magnetic fields and, conversely, the currents inside both wires will feel force exerted on them by the magnetic field produced by the other wire.

Electric currents remain trapped within wires because these objects stay electrically neutral: a moving electron is held from leaving the wire by the force of the oppositely charged atoms arranged in the crystal lattice of the wire. Force exerted on the current by the magnetic field is transferred to the mass of the wire by these electrical interactions. In a metal wire, “loose” electrons reside in a quantum mechanical structure called a “conduction band” that only exists within the lattice of the crystalline host. Electrons are able to flow freely within this conduction band and cannot leave unless they have been provided with enough energy to jump out of the crystal, an amount of energy called the work function, as illustrated –for instance– by the photoelectric effect. Even under magnetic force, which is felt by the moving charges within the wire and not directly by the mass of the wire, moving charges don’t suddenly jump out of the stationary wire. Magnetic forces on such a current-carrying wire can cause the entire wire to move, where the magnetically responsive current drags the entire mass of the wire with it by electrostatic interactions. If enough energy is supplied to loose charges within the wire bulk, these charges can be forced to jump out of the wire, but they usually won’t since most interactions do not provide them with sufficient energy to exceed the work function. Einstein won his Nobel prize for essentially predicting this in the form of the photoelectric effect.

These details notwithstanding, the magnetic field produced by one wire can be calculated using the Ampere’s Law derived in the previous post.

Loop integral

This magnetic field is the magnetic field of the wire. The only thing you truly need to know here is that the magnetic field will wrap around the wire in the direction of the arrow in the figure above, assuming that the current with positive sign is coming straight out of the page at you. It is noteworthy that the field strength falls off like 1/distance as you move away from the wire.
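That 1/distance falloff follows directly from the Ampere’s Law result for a long straight wire, B = μ0·I/(2π·r). A quick sketch with an assumed current:

```python
import math

MU0 = 4 * math.pi * 1e-7   # permeability of free space, T*m/A

def wire_field(I, r):
    """Field magnitude around a long straight wire: B = mu0*I / (2*pi*r)."""
    return MU0 * I / (2 * math.pi * r)

I = 10.0   # amperes (assumed illustrative value)
print(wire_field(I, 0.01))  # field at 1 cm
print(wire_field(I, 0.02))  # field at 2 cm: half as strong, the 1/distance falloff
```

Doubling the distance halves the field, which is exactly the 1/r behavior stated above.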

Here is the force on the second wire given the magnetic field (from above) imposed on it from the first wire.

attracting wires

With the currents pointed parallel, the wires will tend to experience forces that are directed inward between them. They will tend to pull together.

Suppose we flip the direction of the current in wire 2…

repelling wires

Here, the situation is reversed. The forces are outward such that the wires tend to repel each other. Consider that I’ve done a very soft calculation to see this: all I did was use the direction of the magnetic field at one wire as generated by the opposite wire, filled in the direction of the current for the relevant wire and worked the cross product in my head. There is a subtlety due to the fact that real currents in real wires have the negative charge of real electrons, but the result doesn’t change: parallel currents going the same direction tend to attract while parallel currents going in opposite directions tend to repel.
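The standard closed form for this two-wire setup is a force per unit length, F/L = μ0·I1·I2/(2π·d). A sketch with assumed currents, where encoding the current directions as signs reproduces the attract/repel conclusion (the sign convention here, positive meaning attraction, is my choice for the demo):

```python
import math

MU0 = 4 * math.pi * 1e-7   # permeability of free space, T*m/A

def force_per_length(I1, I2, d):
    """Force per unit length between long parallel wires separated by d.
    Signed currents encode direction; with this convention, positive = attraction."""
    return MU0 * I1 * I2 / (2 * math.pi * d)

print(force_per_length(10.0, 10.0, 0.01))    # same direction: positive (attract)
print(force_per_length(10.0, -10.0, 0.01))   # opposite directions: negative (repel)
```

Note that flipping the signs of BOTH currents (real electrons going the other way in both wires) leaves the product, and thus the conclusion, unchanged, which is the subtlety mentioned above.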

With the simple construct of two parallel wires, we have the basic tools necessary to go crazy and build us one of these:


Here, we’ve got two parallel wires with current running in opposite directions, and we place a third wire perpendicular between them, forming a bridge that carries the current across. Here is the arrangement:

railgun diagram

In this case, the Lorentz force on the third wire is directed parallel to the first two wires. If the third wire is just a sliding bridge, the magnetic force will accelerate it parallel to the direction of the first two wires: given very high currents and a long accelerating path, this could produce very high velocities.

The advantage is actually quite remarkable in the case of a railgun. For a conventional gun, the muzzle velocity is limited by the detonation rate of the gunpowder, so that the projectile can’t ever go faster than the explosion of the gunpowder expands. For a railgun, there is no such limit. Further, this suggests some architectural requirements of the railgun: the two rails are parallel to each other and have current running in opposite directions, meaning that the rails push outward against each other, so that the railgun wants to explode apart. The barrel of the railgun must therefore be built strongly enough to prevent this explosion from occurring. This device is ridiculously simple, but has been militarily difficult to realize because nobody has built an electrical generator compact and powerful enough to achieve velocities beyond what gunpowder can manage while still being transportable with the mechanism.

The railgun is really just a momentary curiosity in this post to show that the basic idea of magnetic force has a tangible realization. The next objective is to pursue the compass needle…

For this, we come back to the notion of a current loop as seen in the magnetic dipole post. To begin with, you could fabricate a simplified version of the current loop by simply expanding the model used for the railgun.

self force of loop

In this construction, the wires are all physically connected to each other with the current of wire 1 spilling into wire 4, then from 4 into 2 and so on, going around. The currents in each wire would therefore all be equal. Further, the magnetic field would also be equal on each wire and pointed upward normal to the plane of the loop –if you look back at the images of the magnetic field produced by a wire loop as in the previous post, you can convince yourself that this is the case. The cross product would therefore cause the force to be pointing outward at every location in the plane of the loop. Since the magnitudes of the forces are all equal and the directions are all in opposition, there would be no net force on the object. This is not to say there is no force; the forces just all balance. For a current loop, as in the railgun, the self-forces are making the loop want to explode outward. The magnetic field of a loop on itself therefore can’t cause that object to translate, but if you increase the current enough, the force would exceed the tensile strength of the loop and cause it to explode apart.

As I’ve previously mentioned, the wire loop is an analog to the magnetic dipole. I will once again assert totally without proof that a compass needle is essentially a magnetic dipole and will have the same behaviors as a magnetic dipole. If we learn how a current carrying wire loop moves, we will have shown how a compass needle also moves.

Consider first the wire loop immersed in an external magnetic field. This magnetic field will be at an angle to the loop and will be uniform everywhere, which is to say that the strength of the external field is the same on all parts of the loop. Once again, the loop will carry a circulating current of ‘I’.

Current loop in field at angle

First, we could calculate the net force exerted on this wire loop by the external field. You may have an intuition about it, but I’ll calculate it anyway.

Here, I will set up the Lorentz force so that I can calculate each element of the loop and then sum them up by integral. This will ultimately lead me to finding the net force of a uniform magnetic field on a current loop.

Integrating force on loop p1

This converts the force into a cartesian form that can be calculated in a polar geometry, integrating only over the angle Phi in the x-y plane.

Integrating force on loop p2

After working through the cross product, of which only four terms survive, careful examination of these terms shows that there are only two unique integrals in terms of Phi. When you see which they are, since I’m integrating over the full circle, you should know instantly what will happen…

Integrating force on loop p3

Despite the fact that there’s an angle in this calculation, a uniform magnetic field on a current loop will not cause the loop to translate since there is no net force, meaning that the loop cannot be dragged in any direction.

Even though I explicitly ran the calculation so that the astute observer notes where the structure collapses to zero, a little bit of simple logic should also reveal the truth. For the ring of current, there are always two points along the ring which can be selected which are diametrically opposed: these points always experience the same magnitude of force, but in opposite directions. Therefore, for any pair of such points selected on the ring –and such pairs cover every location along the ring– the forces cancel to zero, even though the magnetic field is at an angle to the ring. This depends on the fact that the magnetic field is everywhere uniform. If the strength of the B-field had been dependent on Phi in the calculation above, there could have been four unique terms, of which maybe none would have integrated to zero.
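The cancellation can also be checked numerically: summing I·dl×B around a circular loop sitting in a tilted uniform field (illustrative values assumed throughout) gives a net force that is zero to machine precision.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

R, I = 1.0, 2.0        # loop radius (m) and current (A), assumed values
B = (0.3, 0.0, 0.4)    # uniform field at an angle to the loop's plane (assumed)
N = 100000
dphi = 2 * math.pi / N

F = [0.0, 0.0, 0.0]
for k in range(N):
    phi = k * dphi
    # segment of the ring in the x-y plane; dl is tangent to the circle
    dl = (-R * math.sin(phi) * dphi, R * math.cos(phi) * dphi, 0.0)
    dF = cross(dl, B)
    for i in range(3):
        F[i] += I * dF[i]

print(max(abs(f) for f in F))   # a tiny number: numerically zero net force
```

Every surviving integrand is a sine or cosine over a full period, so each component dies, just as the diametrically-opposed-points argument predicts.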

I’ve concluded here that the ring cannot be dragged in any direction. Note, I did not say that the ring doesn’t move! A more interesting case is to consider what happens if we look instead for torque on the ring. Remember that torque is the rotational equivalent of force, which can cause an object to turn without actually dragging it in any direction.

For convenience, I will calculate the torque from an origin at the center of the ring. I can place my origin anywhere in space that I like, but I’ll fix it to a location which removes a few mathematical steps. I would also note that the magnetic field and the differential length element for a section of ring also have the same forms that I found for them above.

Integrating torque on loop p1

The vector identity I’ve used here is a very simple one which removes the intricacy of the cross product and leaves me with just a vector dot product. I’ve used the fact that the vector describing the location of the unit length of the ring is perpendicular to that unit length at every location where this calculation would ever be made, so long as I calculate torque from the center of the ring.

I already found ‘B’ and ‘dl’ above, so I just need to find a compatible form for the position vector ‘r.’

Integrating torque on loop p2

With this I can finally put all the elements together and start integrating.

Integrating torque on loop p3

After cleaning up the vectors and performing a bit of algebra to consolidate terms, we see that there are only three integrals sitting inside that mess. I chose the limits of integration because I want to work the integral through 360 degrees of the current loop, so 0 to 2Pi. I will work each in turn, but they are easy integrals.

torq integral 1

The first integral simply goes to zero, meaning that the first term in the torque will die. What about the next integral?

torq integral 2

This integral didn’t die. It gave me a piece of pi. The next integral works in a similar manner.

torq integral 3

So, we substitute these three results into the torque equation.

Integrating torque on loop p4

If you squint at the vector portion of that final result there, you might realize that it looks very much like a cross product.

Integrating torque on loop p5

So, a current loop does experience torque when immersed in a magnetic field. Moreover, the vector quantity in that cross product that I left unpacked should look eerily familiar. You might look back at that previous post I did on the magnetic dipole in order to recognize the magnetic dipole moment.

Integrating torque on loop p6

I have achieved a compact expression that says that the current loop will experience a torque within a magnetic field. If the magnetic field is uniform in strength everywhere over the loop, the loop will not be dragged in any direction, but it can be rotated since it will experience a torque. The nature of this rotation can be predicted from the form of the cross product.
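The whole derivation can be verified numerically: integrating r × (I dl × B) around the ring and comparing against m × B, with m = IπR² pointing normal to the loop (all values assumed for illustration):

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

R, I = 1.0, 2.0        # loop radius (m) and current (A), assumed values
B = (0.3, 0.0, 0.4)    # uniform field at an angle to the loop's plane (assumed)
N = 100000
dphi = 2 * math.pi / N

tau = [0.0, 0.0, 0.0]
for k in range(N):
    phi = k * dphi
    r = (R * math.cos(phi), R * math.sin(phi), 0.0)                  # point on ring
    dl = (-R * math.sin(phi) * dphi, R * math.cos(phi) * dphi, 0.0)  # tangent segment
    dF = cross(dl, B)          # dl x B (current factored in below)
    dtau = cross(r, dF)        # r x dF, torque about the ring's center
    for i in range(3):
        tau[i] += I * dtau[i]

# Compare with tau = m x B, where m = I * (pi R^2) z-hat for this loop:
m = (0.0, 0.0, I * math.pi * R**2)
print(tau)           # the numerical integral
print(cross(m, B))   # m x B: the compact expression; the two should agree
```

Only the component of B lying in the loop’s plane contributes to the torque, which is the sine dependence of the cross product showing up numerically.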


If a plane is formed between the magnetic dipole moment of the loop and the magnetic field, the loop will tend to rotate around an axis perpendicular to that plane. Also, because of the form of the cross product, the torque is maximum if the angle between the dipole and the field is 90 degrees; if the vectors point in the same direction (or in exactly opposite directions), the torque goes to zero given the sine. So, the magnetic dipole will tend to want to oscillate around pointing the same direction as the magnetic field and if the action involves friction –so that energy imparted by work done from the torque can be dispersed– these vectors will tend to point in the same direction.

Does this description remind you of anything?


Image from wikipedia

If the needle of a magnetic compass contains a magnetic dipole that points along the needle’s axis, this equation perfectly describes how that needle behaves.

Magnetic dipoles tend to rotate to point along magnetic fields.

There is a non-trivial provision in this statement. The rotation effect I’ve described will occur if the current or moving charge has a trivially small angular momentum with respect to the total rotational inertia of the rotating object. If the angular momentum is large, something very different will happen: the magnetic dipole moment will actually try to precess around the axis of the magnetic field… that is, it will tend to move more like a gyroscope instead of a compass needle. I won’t back this statement up right now, but I hope instead to write a bit more about NMR, of which the classical view involves magnetic precession (magnetic precession fits into the quantum mechanical view of NMR as well, but the effect is much more difficult to see).

This bit of physics also explains why bar magnets tend to rotate in magnetic fields, which is one of the original objectives of this series of posts. This is how magnets (and I use that in the ICP sense of the word) tend to rotate.

How bar magnets move in a magnetic field can be accessed with just a bit more work.

After having collapsed away the directionality of the vectors to produce a scalar version of magnetic torque that shows only the magnitude of torque (so that you can see the sine in the equation), it’s possible to construct a magnetic potential energy involving the magnetic dipole moment and the field by simply finding the work performed in rotation. The rotational analog of work is torque integrated over a rotation, yielding another integral.
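That work integral can be written out explicitly. The choice of θ = π/2 as the zero of potential is my convention for the sketch; the restoring torque comes from the cross-product result above:

```latex
\tau = -\,m B \sin\theta \quad \text{(restoring)}
\qquad\Longrightarrow\qquad
U(\theta) = -\int_{\pi/2}^{\theta} \tau \, d\theta'
          = \int_{\pi/2}^{\theta} m B \sin\theta' \, d\theta'
          = -m B \cos\theta
          = -\vec{m} \cdot \vec{B}
```

The endpoint, U = −m·B, is the standard result: lowest energy when dipole and field are aligned.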


The potential here is a very special one because it’s also the Hamiltonian for spin in a magnetic field in quantum mechanics. I’ll stop short of jumping into the quantum and simply manipulate classical physics. One thing to note here is that I earlier stated that a magnetic dipole experiences no net force if the magnetic field is uniform. What if the magnetic field is no longer uniform?

This sort of potential depends not only on the angle between the vectors, but on the form of the vectors themselves. One way to return to a directional force from a potential is to take the (spatial) gradient of the potential. It’s important to note that the vectors above are in a dot product, reducing the combination to a scalar; working the gradient of this dot product goes backward through the calculus which produces work from force, instead producing a vectorial force from a scalar potential.

Force on dipole

It’s initially difficult to see what this will do, so I’m going to create a situation of simple constructs to demonstrate it. Suppose we have a magnetic dipole sitting in a magnetic field where the dipole and field are pointing in the same direction. Now, suppose that the intensity of this magnetic field gets weaker in some direction, conveniently along the axis that is shared by both vectors.

force on dipole 2

In this particular case, the magnetic dipole will tend to feel a force, as indicated, running opposite the z-axis. It is literally running toward where the magnetic field gets stronger. Note, if you flip the direction of the magnetic dipole, you also flip the sign on the force, making the dipole want to accelerate toward where the magnetic field is weaker. What is this in terms of “toward” or “away from” when considering a real magnetic dipole? Recall my fancy picture of the dipolar magnetic field from three neighboring dipoles:


Here, the colors show the intensity of the field, with red as strong and blue as weak. The fields are red where the dipoles are located and blue further away, meaning that the intensity of the magnetic field decreases as you go away from a magnetic dipole. In the demonstration of magnetic force above, if the dipole is oriented in the same direction as the field, it will want to accelerate toward stronger field… or toward the source of that field if that field is from another dipole. Conversely, if the dipole is oriented against the field, it will be pushed toward weaker field. In the case of a dipole pointed parallel to the z-axis and positioned at (0,0), the directions of the field look like this:

magnetic dipole

The intensity of the field decreases going away from the origin. A second dipole positioned at (0,5) and pointed along the z-axis will want to accelerate toward the origin (be attracted), but if rotated to point along -z, it will accelerate away (be repelled).
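A one-dimensional sketch makes the sign concrete. Assume a field profile B_z(z) = B0 − kz that weakens along +z (an illustrative stand-in for “moving away from the source dipole”), and take the force from the gradient of U = −m_z·B_z:

```python
# 1-D sketch of the gradient force F_z = d(m_z * B_z)/dz on a dipole.
# The field profile and all numbers are illustrative assumptions.
B0, k = 1.0, 0.2   # field at z=0 (T) and linear falloff rate (T/m), assumed
m_z = 0.5          # dipole moment pointed along +z, A*m^2 (assumed)

def B_field(z):
    return B0 - k * z   # field weakens in the +z direction

def force(z, dz=1e-6):
    """Force from the potential U(z) = -m_z * B_field(z), by central difference."""
    U = lambda zz: -m_z * B_field(zz)
    return -(U(z + dz) - U(z - dz)) / (2 * dz)

F_aligned = force(0.0)
print(F_aligned)       # about -0.1: toward -z, where the field is STRONGER

m_z = -0.5             # flip the dipole...
F_flipped = force(0.0)
print(F_flipped)       # about +0.1: now pushed toward weaker field
```

Aligned dipoles get pulled toward the field source, anti-aligned ones get pushed away: attraction and repulsion of bar magnets in one sign flip.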

This actually sums up all the behaviors of bar magnets. In the case of bar magnets, the ends are assigned polarity as the north and south poles. If the magnets are faced with their north ends pointed at each other, the magnets tend to repel, while with a north end facing a south end, they tend to attract. If two magnets are allowed to accelerate toward each other with a south end pointed to a north end, they impact and stick. Meanwhile, if they are positioned to repel, north to north, they tend to accelerate away from one another, unless the orientation of one is bumped, whereupon one magnet abruptly rotates around 180 degrees (given the non-zero torque mentioned above) and both magnets attract each other again, possibly accelerating toward each other to stick.

Wow, huh? That sums up how bar magnets work.

So, why doesn’t a compass needle jump out of your hand and accelerate toward one of the poles of planet Earth? Both are dipoles, right? It’s mainly because the field of the Earth is nearly uniform at the location where the compass needle experiences it; with such a small gradient, the field can’t pull the compass out of your hand.

One subtlety a physics student may note here is that magnetic fields are universally understood to do no work. But, two magnets accelerating across the table and sticking to each other sounds a lot like work. The force I’ve provided as the source of this work is actually due to a spatial derivative of the magnetic field, a gradient, which turns out to be an electric field of a sort. What? Yeah, I know. Weird, but true.

Keep in mind that I haven’t actually solved the final problem of the original post: all of my magnetic dipoles to this point are generated by electrical currents in wires. I still need to show where the magnetic dipole comes from in a metal like iron since there aren’t any batteries in a bar magnet or a compass needle. This is actually a very hard question that dips directly into quantum mechanics and I will end this post here because quantum is its own arena.

Disagreeing with “Our Mathematical Universe”

My wife and I have been listening to Max Tegmark’s book “Our Mathematical Universe: My Quest for the Ultimate Nature of Reality” as an audiobook during our trips to and from work lately.

When he hit his chapter explaining Quantum Mechanics and his “Level 3 multiverse,” I found that I profoundly disagree with this guy. It’s clear that he’s a grade A cosmologist, but I think he skirts dangerously close to being a quantum crank when it comes to multi-universe theory. I’ve been disagreeing with his take for the last couple driving sessions and I will do my best to summarize from memory the specific issues that I’ve taken. Since this is a physicist making these claims, it’s important that I be accurate about my disagreement. In fact, I’ll start with just one and see whether I feel like going further from there…

The first place where I disagree is where he seems to show physicist Dunning-Kruger when regarding other fields in which he is not an expert. Physicists are very smart people, but they have a nasty habit of overestimating their competence in neighboring sciences… particularly biology. I am in a unique position in that I’ve been doubly educated; I have a solid background in biochemistry and cell molecular biology in addition to my background in quantum mechanics. I can speak at a fair level on both.

Professor Tegmark uses an anecdote (got to be careful here; anecdotes inflate mathematical imprecision) to illustrate how he feels quantum mechanics connects to events at a macroscopic level in organisms. There are many versions, but essentially he says this: when he is biking, the quantum mechanical behavior of an atom crossing through a gated ion channel in his brain affects whether or not he sees an oncoming car, which then may or may not hit him. By quantum mechanics, whether he gets hit or not by the car should be a superposition of states depending on whether or not the atom passes through the membrane of a neuron and enables him to have the thought to save himself or not. He ultimately elaborates this by asserting that “collapse free” quantum mechanics states that there is one universe where he saved himself and one universe where he didn’t… and he uses this as a thought experiment to justify what he calls a “level 3” multiverse with parallel realities that are coherent to each other but differ by the direction that a quantum mechanical wave function collapse took.

I feel his anecdote is a massive oversimplification that more or less throws the baby out with the bath water. Illustration of the quantum event in question is “Whether or not a calcium ion in his brain passes through a calcium gate” as connected to the macroscopic biological phenomenon of “whether he decides to bike through traffic” or alternatively “whether or not he decides to turn his eye in the appropriate direction” or alternatively “whether or not he sees a car coming when he starts to bike.”

You may notice this as a variant of the Schrodinger “Cat in a box” thought experiment. In this experiment, a cat is locked in a perfectly closed box with a sample of radioactive material and a Geiger counter that will dump acid onto the cat if it detects a decay; as long as the box is closed, the cat remains in some superposition of states, conventionally considered “alive” or “dead,” as connected with whether or not the isotope emitted a radioactive decay. I’ve made my feelings about this thought experiment known here before.

The fundamental difficulty comes down to what the superposition of states means when you start connecting an object with a very simple spectrum of states, like an atom, to an object with a very complex spectrum of states, like a whole cat. You could suppose that the cat and the radioactive emission become entangled, but I feel that there’s some question whether you could ever actually know they were entangled, simply because you can’t discretely figure out what the superposition should mean: “alive” and “dead” for the cat are not a binary on-off difference from one another the way “emitted or not” is for the radioactive atom. There are a huge number of states the cat might occupy that are very similar to one another in energy, and the spectrum spanning “alive” to “dead” is so complicated that it might as well just be a thermal universe. Whether the entanglement actually happened or not, classical thermodynamics and statistical mechanics should be enough to tell you, in classically “accurate enough” terms, what you find when you open the box. If you wait one half-life of a bulk radioactive sample, when you open the box you’ll find a cat that is burned by acid to some degree or another. At some point, quantum mechanics does give rise to classical reality, but where?

The “but where” is always where these arguments hit their wall.

In the anecdote Tegmark uses, as I’ve written above, the “whether a calcium ion crossed through a channel or not” is the quantum mechanical phenomenon connected to “whether an oncoming car hit me or not while I was biking.”

The problem that I have with this particular argument is that it loses scale. This is where quantum flapdoodle comes from. Does the scale make sense? Is all the cogitation associated with seeing a car and operating a bike on the same scale as where you can actually see quantum mechanical phenomena? No, it isn’t.

First, all the information coming to your brain from your eyes telling you that the car is present originates from many, many cells in your retina, involving billions of interactions with light. The muscles that move your eyes and your head to see the car are instructed by thousands of nerves firing simultaneously, and these nerves fire from gradients of calcium and other ions… molar scale quantities of atoms! A nerve doesn’t fire or not based on the collapse of possibilities for a single calcium ion. It fires based on thermodynamic quantities of ions flowing through many gated ion channels all at once. The net effect of one particular atom experiencing quantum mechanical ambivalence is swamped under statistically large quantities of atoms picking all of the choices they can pick from the whole range of possibilities available to them, giving rise to the bulk phenomenon of the neuron firing. Let’s put it this way: for the nerve to fire or not based on quantum mechanical superposition of calcium ions would demand that the nerve visit that single thermodynamic state where all the ions fail to flow through all the open ion gates in the membrane of the cell all at once… and there are statistically few states where this has happened compared to the statistically many states where some ions or many ions have passed through the gated pore (this is what underpins the chemical potential that drives the functioning of the cell). If you’ve learned any stat mech at all, you know that this state is so rare that it would probably not be visited even once in the entire age of the universe. Voltage gradients in nerve cells are established and maintained through copious application of chemical energy, which is ultimately constructed from quantum mechanics but mainly expressed in bulk by plain old classical thermodynamics.
And this is merely the question of whether a single nerve “fired or not,” taken together with the fact that your capacity for “thought” doesn’t hinge on any one nerve –if a single nerve in your retina failed to fire, all the sister nerves around it would still deliver an image of the car speeding toward you to your brain.
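To put a toy number on the “rare state” claim above, here is a hedged back-of-envelope sketch in Python. The channel counts and the 50% per-ion pass chance are my own illustrative assumptions, not physiology; the point is only how fast the probability dies with N:

```python
import math

# Toy model (illustrative numbers only, not physiology): suppose each of
# N open ion channels independently passes an ion with probability p
# during a firing event. The chance that ALL N simultaneously fail to
# pass is (1 - p)**N.
def log10_prob_all_fail(n_channels, p_pass=0.5):
    # work in log space, since the probability underflows a float for large N
    return n_channels * math.log10(1.0 - p_pass)

for n in (100, 1000, 100000):
    print(f"N = {n:>6}: P(all ions fail) ~ 10^{log10_prob_all_fail(n):.0f}")
```

Even at a mere hundred channels the odds are roughly one in 10^30; at thermodynamic counts the exponent is astronomically negative, which is the sense in which the “all ions fail at once” state is never visited.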

Do atoms like a single calcium ion subsist in quantum mechanical ambivalence when left to their own devices? Yes, they do. But, when you put together a large collection of these atoms, it is physically improbable that every single atom will make the same choice all at once. At some point you get bulk thermodynamic behavior, and the decisions that your brain makes are based on bulk thermodynamic behaviors, not isolated quantum mechanical events.

Pretending that a person made a cognitive choice based on the quantum mechanical outcomes of a single atom is a reductio ad absurdum, and it is profoundly disingenuous to start talking about entire parallel universes where you swerved right on your bike instead of left based on that single calcium ion (regardless of how liberally you wave around the butterfly effect). The nature of physiology in a human being at all levels is about biasing fundamentally random behavior into directed, ordered action, so focusing on one potential speck of randomness doesn’t mean that the aggregate should fail to behave as it always does. All the air in the room where you’re standing right now could suddenly pop into the far corner, leaving you to suffocate (there is one such state in the statistical ensemble), but that doesn’t mean that it will… Closer to home, you might win a $500 million Power Ball Jackpot, but that doesn’t mean you will!

I honestly do not know what I think about the multiverse or about parallel universes. I would say I’m agnostic on the subject. But, if all parallel universe theory is based on such breathtaking Dunning-Kruger as Professor Tegmark exhibits when talking about the connection between quantum mechanics and actualization of biological systems, the only stance I’m motivated to take is that we don’t know nearly enough to be speculating. If Tegmark is supporting multiverse theory based on such thinking, he hasn’t thought about the subject deeply enough. Scale matters here and neglecting the scale means you’re neglecting the math! Is he neglecting the math elsewhere in his other huge, generalizing statements? For the scale of individual atoms, I can see how these ideas are seductive, but stretching it into statistical systems is just wrong when you start claiming that you’re seeing the effects of quantum mechanics at macroscopic biological levels when people actually do not. It’s like Tegmark is trying to give Deepak Chopra ammunition!

Ok, just one gripe there. I figure I probably have room for another.

In another series of statements that Tegmark makes in his discussion of quantum mechanics, I think he probably knows better, but by adopting the framing he has, he risks misinforming the audience. After a short discussion of the origins of Quantum Mechanics, he introduces the Schrodinger Equation as the end-all, be-all of the field (despite speaking briefly of Lagrangian path integral formalism elsewhere). One of the main theses of his book is that “the universe is mathematical” and therefore the whole of reality is deterministic based on the predictions of equations like Schrodinger’s equation. If you can write the wave equation of the whole universe, he says, Schrodinger’s equation governs how all of it works.

This is wrong.

And, I find this to miss most of the point of what physics is and what it actually does. Math is valuable to physics, but one must always be careful that the math not break free of its observational justification. Most of what physics is about is making measurements of the world around us and fitting those measurements to mathematical models, the “theories” (small ‘t’) provided to us by the Einsteins and the Sheldon Coopers… if the fit is close enough, the regularity of a given equation will sometimes make predictions about further observations that have not yet been made. Good theoretical equations have good provenance in that they predict observations that are later made, but the opposite can be said for bad theory, and the field of physics is littered with a thick layer of mathematical theories which failed to account for the observations in one way or another. The process of physics is a big selection algorithm: smart theorists write every possible theory they can come up with, experimentalists take those theories and see if the data fit them, and if a theory does accommodate observation, it is promoted to a Theory (big ‘T’) and explored to see where its limits lie. On the other hand, small ‘t’ “theories” are discarded if they don’t accommodate observation, at which point they are replaced by a wave of new attempts that try to accomplish what the failure didn’t. As a result, new theories fit over old theories and push back predictive limits as time goes on.

For the specific example of Schrodinger’s equation, the mathematical model that it offers fits over the Bohr model by incorporating deBroglie’s matter wave. Bohr’s model itself fit over a previous model, and the previous models fit over still earlier ideas had by the ancient Greeks. Each later iteration extends the accuracy of the model, where the development settles depending on whether or not a new model has validated predictive power –this is literally survival of the fittest applied to mathematical models. Schrodinger’s equation itself has a limit where its predictive power fails: it cannot handle Relativity except as a perturbation… meaning that it can’t exactly predict outcomes that occur at high speeds. The deficiencies of the Schrodinger equation are addressed by the Klein-Gordon equation and by the Dirac equation, and the deficiencies of those in turn are addressed by the path integral formalisms of Quantum Field Theory. If you knew the state equation for the whole universe, Schrodinger’s equation would not accurately predict how time unfolds because it fails to work under certain physically relevant conditions. The modern Quantum Field Theories fail at gravity, meaning that even with the modern quantum, there is no assured way of predicting the evolution of the “state equation of the universe” even if you knew it. There are a host of follow-on theories, String Theory, loop quantum gravity and so on and so forth, that vie for being The Theory That Fills The Holes but, given history, will probably only extend our understanding without fully answering all the remaining questions. That String Theory has not made a single prediction that we can actually observe right now should be lost on no one –there is a grave risk that it never will. We cannot at the moment pretend that the Schrodinger equation perfectly satisfies what we actually know about the universe from other sources.

It would be most accurate to say that reality seems to be quantum mechanical at its foundation, but that we have yet to derive the true “fully correct” quantum theory. Tegmark makes a big fuss about how “wave function collapse” doesn’t fit within the premise of Schrodinger’s equation, but argues that the equation could hold as good quantum regardless if a “level three multiverse” is real. The opposite is also true: we’ve known Schrodinger’s equation is incomplete since the 1930s, so “collapse” may simply be another place where it’s incomplete, in ways we don’t yet understand. A multiverse does not necessarily follow from this. Maybe pilot wave theory is correct quantum, for all I know.

It might be possible to masturbate over the incredible mathematical regularity of physics in the universe, but beware of the fact that it wasn’t particularly mathematical or regular until we picked out those theories that fit the universe’s behavior very closely. Those theories have predictive power because that is the nature of the selection criteria we used to find them; if they lacked that power, they would be discarded and replaced until a theory emerged meeting the selection criteria. To be clear, mathematical models can be written to describe anything you want, including the color of your bong haze, but they only have power because of their self consistency. If the universe does something to deviate from what the math says it should, the math is simply wrong, not the universe. Every time you find neutrino mass, God help your massless neutrino Standard Model!

Wonderful how the math works… until it doesn’t.

Edit 12-19-17:

We’re still listening to this book during our car trips and I wanted to point out that Tegmark uses an argument very similar to my argument above to suggest why the human brain can’t be a quantum computer. He approaches the matter from a slightly different angle. He says instead that a coherent superposition of all the ions either inside or outside the cell membrane is impossible to maintain for more than a very very short period of time because eventually something outside of the superposition would rapidly bump against some component of the superposition and that since so many ions are involved, the frequency of things bumping on the system from the outside and “making a measurement” becomes high. I do like what he says here because it starts to show the scale that is relevant to the argument.

On the other hand, it still fails to necessitate a multiverse. The simple fact is that human choice is decoupled from the scale of quantum coherence.

Edit 1-10-18:

As I’m trying desperately to recover from stress in the process of thesis writing, I thought I would add a small set of thoughts in this subject in an effort to defocus and defrag a little. My wife and I have continued to listen to this book and I think I have another fairly major objection with Tegmark’s views.

Tegmark lives in a version of quantum mechanics that fetishizes the notion of wave function collapse where he views himself as going against the grain by offering an alternative where collapse does not have to happen.

For a bit of context, “collapse” is a side effect of the Copenhagen interpretation of quantum mechanics. In this way of looking at the subject, the wave function will remain in superposition until something is done to determine what state the wave function is in… at that point, the wave function will cease to be coherent and will drop into some allowed eigenstate, after which it will remain in that eigenstate. This is a big, dominant part of quantum mechanics, but I would suggest that it misses some of the subtlety of what actually happens in quantum mechanics by trying to interpret, perhaps wrongly, what the wave function is.

Fact of the matter is that you can never observe a wave function. When you actually look at what you have, you only ever find eigenstates. But, there is an added subtlety to this. If you make an observation, you find an object somewhere, doing something. That you found the object is indisputable, and you can be pretty certain what you know about it at the time slice of the observation. Unfortunately, you only know exactly what you found; from this –directly– you actually have no idea either what the wave function was or even really what the eigenstates are. A definite location is clearly an eigenstate of the position operator, as quantum mechanics operates, but from finding a particle “here” you really don’t know what the spectrum of locations it was potentially capable of occupying actually was. In order to learn this, the experiment is to set up the situation in a second instance, put time in motion, see that you find the new particle ending up “there,” then tabulate the results together. This is repeated a number of times until you get “here,” “there” and “everywhere.” Binning each trial together, you start to learn a distribution of how the possibilities could have played out. From this distribution, you can suddenly write a wave function, which tells the probability of making some observation across the continuum of the space you’re looking at… the wave function says that you have “this chance of finding the object ‘here’ or ‘there’.”

The wave function, however you try to pack it, is fundamentally dependent on the numerical weight of a statistically significant number of observations. From one observation, you can never know anything about the wave function.
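As an illustration of the “binning” procedure described above, here is a hedged Monte Carlo sketch. The system is my own toy choice (a particle in a unit box in its ground state, so the true density is 2 sin²(πx)); the point is that one measurement is just one number, while many binned measurements reconstruct the shape of the distribution:

```python
import math
import random

random.seed(0)

# Toy "measurement": rejection-sample a position from p(x) = 2*sin(pi*x)^2,
# the ground-state density of a particle in a unit box (my illustrative choice).
def measure_position():
    while True:
        x = random.random()
        if random.random() * 2.0 <= 2.0 * math.sin(math.pi * x) ** 2:
            return x

# A single observation tells you almost nothing about the distribution...
print("single shot:", round(measure_position(), 3))

# ...but binning many observations recovers the density's shape.
N = 100_000
bins = [0] * 10
for _ in range(N):
    bins[min(int(measure_position() * 10), 9)] += 1
density = [10 * b / N for b in bins]   # per-bin density estimate
print([round(d, 2) for d in density])  # humps in the middle, near zero at the walls
```

Only the histogram, never any single hit, resembles the wave function’s square.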

The same thing holds true for coherence. If you make one observation, you find what you found that one time; you know nothing about the spectrum of possibilities. For that one hit, the particle could have been in coherence, or it could have been collapsed to an eigenstate. You don’t know. You have to build up a battery of observations, which gives you the ability to say “there’s a xx% chance this observation and that observation were correlated, meaning that coherence was maintained to yy degree.”

This comes back to Feynman’s old double slit experiment anecdote. For one BB passing through the system and striking the screen, you only know that it did, and not anything about how it did. The wave function written for the circumstances of the double slit provides a forecast of what the possible outcomes of the experiment could be. If you start measuring which slit a BB went through, the system becomes fundamentally different based upon how the observation is made and different things are knowable, giving the chance that the wave function will forecast different statistical outcomes. But, you cannot know this unless you make many observations in order to see the difference. If you measure the location of 1 BB at the slit and the location of 1 BB at the screen, that’s all you know.

In this way, the wave function is a bulk phenomenon, a beast of statistical weight. It can tell you observations that you might find… if you know the set up of the system. An interference pattern at the screen tells that the history was muddy and that there are multiple possible histories that could explain an observation at the screen. This doesn’t mean that a BB went through both slits, merely that you don’t know what history brought it to the place where it is. “Collapse” can only be known after two situations have been so thoroughly examined that the chances for the different outcomes are well understood. In a way, it is as if the phenomenon of collapse is written into the outcome of the system by the set-up of the experiment and that the types of observations that are possible are ordained before the experiment is carried out. In that way, the wave function really is basically just a forecast of possible outcomes based on what is known about a system… sampling for the BB at the slit or not, different information is present about the system, creating different possible outcomes, requiring the wave function to make a different forecast that includes that something different is known about the system. The wave function is something that never actually exists at all except to tell you the envelope of what you can know at any given time, based upon how the system is different from one instance to the next.
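The two competing “forecasts” can be written down explicitly for an idealized two-slit setup. This is a hedged sketch using the standard far-field interference formula with toy parameters of my own (unit amplitudes, a slit separation of ten wavelengths), not anything specific to Feynman’s telling: with the path unknown, the two amplitudes add before squaring and the cross term produces fringes; with the slit measured, the probabilities add and the fringes vanish.

```python
import cmath
import math

# Two unit-amplitude paths to a screen angle theta (toy parameters:
# wavelength 1, slit separation 10; only the far-field phase difference kept).
def intensities(theta, wavelength=1.0, slit_sep=10.0):
    phase = 2 * math.pi * slit_sep * math.sin(theta) / wavelength
    coherent = abs(1 + cmath.exp(1j * phase)) ** 2  # |psi1 + psi2|^2: fringes
    which_path = 1 + 1                               # |psi1|^2 + |psi2|^2: flat
    return coherent, which_path

for theta in (0.0, 0.05, 0.1):
    c, w = intensities(theta)
    print(f"theta = {theta:.2f}: path unknown {c:.3f}, path measured {w}")
```

Which forecast applies is fixed by how the experiment is set up; either way, you need many hits on the screen before the two cases become distinguishable.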

This view directly contradicts the notion in Tegmark’s book that individual quantum mechanical observations at “collapse” allow two universes to be created based upon whether the wave function went one way or another. On a statistical weight of one, it cannot be known whether the observed outcome was drawn from a collection of different possibilities or not. The possible histories or futures are unknown on a data point of one; that one is what it is, and it can’t be known that there may have been other choices without a large campaign of repeated trials to learn what other choices could have happened. What that campaign gives you is the ability to say “there’s a sixty percent chance this observation matches this eigenstate and a forty percent chance it’s that one,” which is fundamentally not the same as the decisiveness that would be required for a collapse of one data point to claim “we’re definitely in the universe where it went through the right slit.”

I guess I would say this: Tegmark’s level 3 multiverse is strongly contradicted by the Uncertainty Principle. Quantum mechanics is structurally based on indecisiveness, while Tegmark’s multiverse is based on a clockwork decisiveness. Tegmark is saying that the history of every particle is always known.

This is part of the issue with quantum computers: the quantum computer must run its processing experiment repeatedly, multiple times, in order to establish knowledge about coherence in the system. On a sampling of one, the wave function simply does not exist.

Tegmark does this a lot. He routinely puts the cart ahead of the horse, saying that math implies the universe rather than that math describes the universe (Tegmark: Math, therefore Universe. Me: Universe, therefore Math). The universe is not math; math is simply so flexible that you can pick out descriptions that accurately tell what’s going on in the universe (until they don’t). For all his cherry-picking of the “mathematical regularity of the universe,” Tegmark quite completely turns a blind eye to where math fails to work: most problems in quantum mechanics are not exactly solvable, and most quantum advancement is based strongly on perturbation… that is, approximations and infinite expansions that are cranked through computers to churn out compact numbers that are close to what we see. In this, the math that ‘works’ is so overloaded with bells and whistles to make it approach the actual observational curve that one can only ever say that the math is adopting the form of the universe, not that the universe arises from the math.

edit 1-17-18:

Still listening to this book. We listened through a section where Tegmark admits that he’s putting the cart ahead of the horse by putting math ahead of reality. He simply refers to it as a “stronger assertion” which I think is code for “where I know everyone will disagree with me.”

Tegmark slipped gently out of reality again when he started into a weird observer-observation duality argument about how time “flows” for a self-aware being. You know he’s lost it when his description fails to even once use the word “entropy.” Tegmark is under the impression that the quantum mechanical choice of every distinct ion in your brain is somehow significant to the functioning of thought. This shows an unbelievable lack of understanding of biology, where mass structures and mass action form behavior. Fact of the matter is that biological thought (the awareness of a thinking being) is not predictable from the quantum mechanical behavior of its discrete underpinning parts. In reality, quantum mechanics supplies the bulk steady state from which a mass effect like biological self-awareness is formed. Because of the difference in scale between the biological level and the quantum mechanical level, biology depends only on the prevailing quantum mechanical average… fluctuations away from that average, the weirdness of quantum, are almost entirely swamped out by simple statistical weight. A series of quantum mechanical arguments designed to connect the macroscale of thought to the quantum scale is fundamentally broken without taking this into account.

Consider this: the engine of your gas fueled car is dependent on a quantum mechanical behavior. Molecules of gasoline are mixed with molecules of oxygen in the cylinder and are triggered by a pulse of heat to undergo a chemical reaction where the atoms of the gas and oxygen reconfigure the quantum mechanical states of their electrons in order to organize into molecules of CO2 and CO. After the reorganization, the collected atoms in these new molecules of CO2 and CO are at a different average state of quantum mechanical excitation than they were prior to the reconfiguration –you could say that they end up further from their quantum mechanical zero point for their final structure as compared to prior to the reorganization. In ‘human baggage’ we call this differential “heat” or “release of heat.” The quantum mechanics describes everything about how the reorganization would proceed, right down to the direction a CO2 molecule wants to speed off in after it has been formed. What the quantum mechanics does not directly tell you is that 10^23 of these reactions happen, and for all the different directions that CO2 molecules are moving after they are formed, the average distribution of their expansion is all that is needed to drive the piston… that this molecule speeds right or that one speeds left is immaterial: if it didn’t, another would, and if that one didn’t still another would, and so on and so forth until you achieve a bulk behavior of expansion in the CO2 atmosphere that can push the piston. The statistics are important here. That the gasoline is 87 octane versus 91 octane, two quantum mechanically different approaches to the same thing, does not change that both drive the piston… you could use ethanol or kerosene or RP-1 to perform the same action, and the specifics of the quantum mechanics result in an almost indistinguishable state where an expanding gas pushes back the piston to produce torque on the crankshaft to drive the wheels around.
The quantum mechanics is watered down to a simple average where the quantum mechanical differences between one firing of the piston and the next are indistinguishable. But, to be sure, every firing of the piston is not quantum mechanically exactly the same as the one before it. In reality, that piston moves despite these differences. There is literally an unthinkably huge ensemble of quantum mechanical states that result in the piston moving, and you cannot distinguish any of them from any other. There is literally no choice but to group them all together by what they hold in common and to treat them as if they are the same thing, even though at the basement layer of reality, they aren’t. Without what Tegmark refers to as “human baggage” there would be no way to connect the quantum level to the one we can actually observe in this case. That this particular molecule of fuel reacted or not based on fluctuations of the quantum mechanics is pretty much immaterial.
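The “indistinguishable firings” point is just the law of large numbers, and a hedged toy simulation makes it visible (the exponential energy spread and the molecule count are my own stand-ins, not combustion chemistry): every firing is microscopically unique, but the bulk average the piston responds to is the same every time.

```python
import random

random.seed(1)

# Toy "firing": sample a per-molecule energy release for n molecules
# (exponential spread around a mean of 1.0, arbitrary units) and return
# the bulk average, which is all the piston ever feels.
def firing(n_molecules=1_000_000):
    total = sum(random.expovariate(1.0) for _ in range(n_molecules))
    return total / n_molecules

# Three microscopically different firings, one indistinguishable average:
for i in range(3):
    print(f"firing {i}: mean energy per molecule = {firing():.4f}")
```

The fluctuation of the mean shrinks like 1/√n, so at 10^23 molecules the firings are bulk-identical for all practical purposes.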

The brain is no different. If you were to consider “thought” to be a quantum mechanical action, the specific differences between one thought and the next are themselves huge ensembles of different quantum mechanical configurations… even the same thought twice is not the same quantum mechanical configuration twice. The “units” of thought are in this way decoupled from the fundamental level, since two versions of the “same thing” are actually so statistically removed from their quantum mechanical foundation as to be completely unpredictable from it.

This is a big part of the problem with Tegmark’s approach; he basically says “Quantum underlies everything, therefore everything should be predictable from quantum.” This is a fool’s errand. The machineries of thought in a biological person are simply at a scale where the quantum mechanics has salted out into Tegmark’s “human baggage”… named conceptual entities, like neuroanatomy, free energy and entropy, that are not mathematically irreducible. He gets to ignore the actual mechanisms of “thought” and “self-awareness” in order to focus on things he’s more interested in, like what he calls the foundation structure of the universe. Unfortunately, he’s trying to attach to levels of reality that are not naturally associated… thought and awareness are by no means associated with fundamental reality –time passage as experienced by a human being, for instance, has much more in common with entropy and statistical mechanics than it does with anything else, and Tegmark totally ignored it in favor of a rather ridiculous version of the observer paradox.

One thing that continues to bother me about this book is something that Tegmark says late in it. The man is clearly very skilled and very capable at what he does, but he dedicates the last part of his book to all the things he will not publish on for fear of destroying his career. He feels the ideas deserve to be out (and, as an arrogant theorist, he feels that even the dross in his theories is gold), but by publishing a book about them, he gets to circumvent peer review and scientific discussion and bring these ideas straight to an audience that may not be able to sort which parts of what he says are crap from those few trinkets which are good. I don’t mean that he should be muzzled, he has the freedom of speech, but if his objective is to favor dissemination of scientific education, he should be a model of what he professes. If Tegmark truly believes these ideas are useful, he should damned well be publishing them directly into the scientific literature so that they can be subjected to real peer review. Like all people, he should face his hubris –starting with his incredible weakness at stat mech and biology.

Flat Earth “Research”

You no doubt heard about this fellow in the last week with the steampunk rocket with “Flat Earth Research” written on the side. In my opinion, he was pretty clearly trolling the media; there’s not much likelihood of resolving any issues about the shape of the Earth if the peak altitude of your rocket is only a fraction of the altitude of a commercial airline jet. He said a number of anti-science things and sort of repurposed mathematical formulae from aeronautics and fluid mechanics as “not science,” as if physics is anything other than physics. The guy claimed he was using the flight as a test bed for a bigger rocket and wanted to create a media circus to announce his run for a seat in the California legislature. Not bad for a limo driver, I’ll give him that.

Further in the background, I think it’s clear he was just after a publicity stunt; his do-it-yourself rocket cost a great deal of money, and his conversion to flat eartherism obviously helped to pay the bill. It really did make me wonder what exactly flat earthers think “research” is given that they were apparently willing to pony up a ton of money for this rocket, which won’t go high enough to resolve anything an airline ticket won’t resolve better.

My general feelings about flat earth nonsense are well recorded here and here.

A part of why I decided to write anything about this is that the guy wants to run for congress in California. This should be concerning to everyone: someone who is trusted to make decisions for a whole community had better be doing so based on a sound understanding of reality. Higher positions currently filled in the Federal government notwithstanding, a disconnect seems to be forming in our self-governance which is allowing people to unhinge their decision-making processes from what is actually known about the world. I think that’s profoundly dangerous.

In my opinion also, this is not to heap blame on those who actually hold office now, but on everybody who elected to put them there. Our government is both by the people and for the people: anybody in power is at some level representative of the electorate, possessing all the same potentially fatal flaws. If you want to bitch about the government, the place to start is society itself.

Now, Flat Eartherism is one of those pastimes that is truly incredibly past its time. There are two reasons it subsists; the first is people trolling other people for kicks online, while the second is that some people are so distrusting and conspiracy-minded that they’re willing to believe just about anything if it feeds into their biases. There are some people who truly believe it. A part of why people have the ability to believe the conspiracy theories is that what they consider visual evidence of the Earth’s roundness comes through sources that they define as questionable because of their connection to ostensibly corrupt power –NASA, for all its earnest effort to keep space science accessible to the common man, has not been perfect. Further, not just anybody can go to a place where the roundness of the Earth is unambiguously visible given exactly how hard it is to get to very high altitudes over Earth in the first place. For all of SpaceX’s success, space flight still isn’t a commodity that everyone can sample. Travel into space is held under lock and key by the few and powerful.

Knowing and having worked a bit around scientists associated with space flight projects, I understand the mindset of the scientists, and it offends me very deeply to see their trustworthiness questioned when I know that many of them value honesty very highly. Part of why the conspiracy garbage circulates at all is because our society is so big that “these people” never meet “those people” and the two sides have little chance of bumping into one another. It’s easy to malign people who are faceless, and it’s really easy to accuse someone of lying if they aren’t present to defend themselves. Neither treatment is deserved. This comes back to my old argument about the constitutionally defended right to spout lies in the form of “Freedom of Speech” being a very dangerous social norm.

Now, that said, another of the primary reasons I decided to write this post is because I saw a Youtube video of Eddie Bravo facing down two scientists and more or less humiliating them over their inability to defend “round eartherism.”

You may or may not know of him, but Eddie Bravo is a modern hero to the teenage boy; he’s another of these podcaster/micro-celebrity types who is widely accessible with a few keystrokes in an environment with basically zero editorial control. He’s a visible face of the UFC (Ultimate Fighting Championship) movement along with Joe Rogan. He’s attained wide acclaim for being a “Gracie Killer,” which is a big thing if you know anything about the UFC… the Gracies being the renowned Brazilian Jiu-Jitsu family who dominated the grappling world early in the UFC and brought the art of Jiu-Jitsu in its Brazilian form to the whole world. From this little history, you can easily guess why Bravo is a teenage boy hero: he’s a brash, cocky badass. He’s a world class Jiu-Jitsu fighter, hands down. Unfortunately, as with many celebrities, his Jiu-Jitsu street cred affords him the opportunity to open his mouth about whatever he feels like. Turns out he’s a bit of a crank magnet too, including being a flat earther.

To begin with, I don’t believe Mr. Bravo –or any other crank, for that matter– is stupid. I’ve long since seen that great intelligence can exist in people who for one reason or another don’t know better or choose not to “believe” in something. If he weren’t talented at some level, he wouldn’t be a hard enough worker to have developed the acclaim he has attained. But he conflates being able to shout over whoever he feels like with being able to beat them, which absolutely isn’t true in an intellectual debate.

In the Youtube clip I saw, Mr. Bravo confronts two scientists in a room full of people friendly to him. The first scientist is brought to the forefront, where he introduces himself as an “Earth Scientist”… much to the rolling eyes and derision of the audience. Eddie Bravo then demands that he give the one piece of evidence which proves that the “Earth is round.” Put on the spot, this poor fellow makes the mistake of trying to tell Mr. Bravo that science is a group of people who specialize in many different disciplines, across many different lines of research, and fails to provide Mr. Bravo with a direct answer to his question. It’s true that science is distributed, but by not answering the question, he gives the appearance of not having an answer –and Eddie Bravo was completely aware that the man had said nothing to the point! When the second scientist comes forward, Eddie Bravo demands (in a poorly worded way, in my opinion) that since most people hold the disappearance of a ship’s mast over the horizon as the “proof” that the world is round, “why is it that people are able to take pictures of ships after they’re supposedly over the horizon?” This second scientist really did step up, I think: he tried to explain that light doesn’t necessarily travel in straight lines (which is true) and that the atmosphere can work like a fiber optic to bring images around the curve of the earth. Mr. Bravo derided this explanation, basically saying “Oh, please, that’s garbage, everybody knows you can’t see around corners.” And, at a superficial level, that will be regarded as the true response, despite the fact that the numbers always fall out the bottom of the strainer in a rhetorical confrontation. The second scientist ended up sounding like he was talking over everybody’s head with his too-intricate explanation, and Eddie Bravo was able to use that to make him out as “other,” winning the popular argument at that point.
Combine these incidents with a lot of shouting over the other guy, and Eddie Bravo came off well… the video is listed as a “debate,” never mind that it was anything but.

If you are a science educator, I would recommend watching that video. Scientist #1 comes off as stupid and scientist #2 comes off as pompous.

You’ll love me for saying this, but that was all preface to the purpose of this blog post. Most modern flat earthers are Youtube trolls; they castrate their opposition by relying on the fact that evidence of the Earth’s roundness is provided by a source that they have defined as intrinsically tainted and questionable. And, the truth is that many people who believe the Earth is round really only understand this fact based on a line of evidence that people like Eddie Bravo will not accept. How do you straighten out a guy who will not accept the satellite images?

Well, how is it that we know the earth is round? We knew it before there were satellites, computer graphics and Photoshop. In the globalized information society, these knowable, observable things are only easier to check. Flat earthers prove they are incompetent researchers every time they open their mouths and say “Well, have you researched it? I did, and the earth is flat!”

Now, suppose I was a flat earth researcher, how would I go about the science of establishing the shape of the earth using a series of modern, readily available, cheap tools?

Hypothesis: The Earth is flat! It’s the stable, unmoving center of the universe and the sun and sky move over it.

1 flat earth model

One thing that we can immediately see about this model is simple: when the sun is in the sky, every point on the plane can see it at the same time, since there is nothing to obstruct the line of sight anywhere. Before modern travel and communication, nobody could really compare notes quickly enough to tell whether or not this was the case: for every person of that time, it was enough to suppose that everybody on Earth wakes up from the night at the same time and goes about their day. For this flat earth model, seen from the side, the phenomenon of sunrise (a phenomenon as old as the beginning of the Earth, by the way) would look like this:

2 simple sunrise model

We have all seen this: the sun starts below the edge of the Eastern Horizon and pops up above it. For a majority of people on Earth, this is what the sun seems to do in the morning.

There are a number of simple tests of this model, but the simplest question to ask is this: Does everybody on Earth see the sun appear at the same time? Everybody is standing on that flat plane: when the sun comes up from below the horizon, does everybody on Earth see it at once?

3 simple sunrise model at sunrise

Notice, this is a requirement: if the Earth is flat, people all across the plane of the Earth will be able to see something big coming over the edge of that plane almost simultaneously, barring nearby impediments like mountains.

So, here’s the experiment! If you live in California, grab your smart phone, buy an airplane ticket and fly to New York. The government has no control at all over where you fly in the continental US of A and they really won’t care if you take this trip. New York, New York is actually a kind of fun place to visit, so I recommend going and maybe catching a Broadway show while you’re there. When you get to New York, find someplace along the waterline where you can look east over the ocean and go there in the morning before sunrise. After the sun rises, wait 30 minutes and then place a phone call back to one of your buddies in California and ask him if the sun is up.

This experiment can be repeated with any two east-west separated locations on Earth, though the time delay will depend on the separation; pick a wait long enough for the sun to have risen in both places. Any real flat earth “researcher” should be running this experiment.

For the set-up written above, the sun actually comes up in New York about three hours before it comes up in California! A Californian’s view of the sun stays blocked below the horizon of the Earth for roughly three hours after the sun has become visible in New York.
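You can sanity-check the size of that gap yourself with nothing but the two cities’ longitudes (the values below are approximate, and the 15-degrees-per-hour figure is just the sun’s 360 degrees of apparent travel divided by 24 hours):

```python
# Approximate longitudes in degrees west (illustrative values):
ny_lon = 74.0    # New York
la_lon = 118.2   # Los Angeles

# The sun's apparent position sweeps 360 degrees of longitude in 24 hours,
# i.e. 15 degrees per hour, so the sunrise gap is the longitude gap over 15:
gap_hours = (la_lon - ny_lon) / 15
print(round(gap_hours, 1))  # ~2.9 hours
```

On an equinox, when sunrise happens at the same local solar time everywhere, this simple estimate agrees with the observed gap to within minutes. No satellites required.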

Now, you might argue, New York is on the east side of the US and is much closer to where the sun comes up on our hypothetical plane, so maybe the Rocky Mountains are obstructing some view of the sun in LA.

4 mountain occlusion

And suppose that this blocking effect lasts three hours.

So, here’s the new experiment. Drive your car from LA to NY and watch the odometer; you can even get a mechanic you trust to assure you that the government hasn’t fiddled with it. You now know the approximate distance from LA to NY by the odometer read-out. Next, you buy a barometer and use the pressure change of the air to measure how high the Rocky Mountains are… or, you could just use a surveying scope to measure the angular height of the mountains and your car to check distances, then work a bit of trig to estimate the height of the mountains.

5 measure mountain height

The Rockies are well understood to be just a bit taller than 14,000 ft.
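The survey-scope version of the measurement is one line of trig. The 50-mile sighting distance and ~3 degree elevation below are made-up illustration numbers, not measured values; the point is only that an angle plus a distance gives a height:

```python
import math

# Hypothetical sighting: a peak viewed from 50 miles away on (assumed)
# level ground subtends about 3 degrees of elevation in the scope.
distance_ft = 50 * 5280                      # sighting distance in feet
elevation_deg = 3.04                         # angle read off the survey scope
height_ft = distance_ft * math.tan(math.radians(elevation_deg))
print(round(height_ft))                      # roughly 14,000 ft
```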

With these distances available, you do the following experiment with surveying scopes. When the sun appears above the horizon in LA, your friend measures the angle above ground level where it is visible (surveying scopes have bubble levels for leveling the scope). You measure the angle above the horizon at the same time using a survey scope of your own in New York. Remember, you’ve got smartphones, you can talk to each other and coordinate these measurements.

For the flat earth, the position of the sun in the sky should obey the following simple triangular model:

6 flat earth trig model

This technique is as old as the hills and is called “triangulation.” Notice, I’ve used three measurements made with cheap modern equipment: the angle at LA, the angle at NY and the distance from LA to NY (approximated by the odometer). What I have in hand from this is the ability to determine the approximate altitude of the sun using a bit of high school trig. Use the law of sines and it’s easy to compute the altitude of the sun from these measurements:

7 height of sun

I won’t do the derivation just this once, but you plug in the distance and the angles and, voila, out comes the height of the sun over the flat earth. (I’m not being snide here: flat earthers don’t even seem to try to use trig.)

What we know so far is that the sun comes up about three hours earlier in New York than in LA, and that we would expect the sun to be visible everywhere on the flat earth at the same time as it comes over the horizon. Maybe the Rockies are blocking LA from seeing the sun for those three hours. This would give rise to the following situation:

9 mountain triangle

You end up with similar triangles: the triangle from LA to the Rocky Mountains and the triangle from LA to the sun. Knowing the height of the mountains and the distance from LA to the mountains, you get the angle the sun must sit at when it first appears in LA, since the angle from LA to the top of the mountains must be the same as the angle from LA to the sun at that moment. We would expect this angle to be very small, since the Rockies are really not that high, so finding it nearly zero to within the noise of the instrument would be expected.

Now, LA to New York is about 2,800 miles and the distance from LA to Denver is 1,020 miles. The mountains are 14,000 feet tall. After three hours of morning, from New York, the sun will appear to be at an angle of roughly 45 degrees over the horizon, since its apparent position sweeps about 15 degrees per hour (neglecting latitude effects… leave those for later). If you start plugging these figures into the equations, the altitude of the sun must be about 7.3 miles up in the sky, or 38,500 ft.
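Here’s a minimal sketch of that triangulation in Python, using the figures quoted above. Because the LA sight line is pinned nearly flat by the Rockies, the answer barely depends on the exact New York angle, which makes the ~7.3 mile result robust:

```python
import math

FEET_PER_MILE = 5280.0

# Elevation angle of 14,000 ft peaks as seen from LA, 1,020 miles away:
la_angle = math.degrees(math.atan(14000 / (1020 * FEET_PER_MILE)))  # ~0.15 deg

def triangulated_height(baseline_mi, near_deg, far_deg):
    """Law-of-sines triangulation over a flat plane: height of the sun from
    the elevation angles at the nearer (NY) and farther (LA) observers."""
    a, b = math.radians(near_deg), math.radians(far_deg)
    slant = baseline_mi * math.sin(b) / math.sin(a - b)  # NY-to-sun distance
    return slant * math.sin(a)                           # vertical height

print(round(triangulated_height(2800, 45, la_angle), 1))  # ~7.3 miles
print(round(triangulated_height(2800, 60, la_angle), 1))  # also ~7.3 miles
```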


You can fly at 40,000 ft in an airliner. Easy hypothesis to test. If the sun is only 7.3 miles up and visible at 45 degrees of inclination in New York, you could go fly around it with an airplane.

Has anybody ever done that?

A good scientist would keep looking at the sun through the whole day and might notice that the difference between the sun’s inclinations observed in the spotting scopes at New York and in LA does not change. Both inclinations increase at the same rate, and there is always something like a 45 degree difference in inclination between the two places (again, neglecting latitude effects; this argument will seem a tiny bit janky since New York and Los Angeles are not at the same latitude, but the effect should be very close to what I described).

For this flat earth model to be true, the sun would need to radically and aphysically change altitude from one part of the day to the next in order for the reported angles to be real. We know with pretty good accuracy that the sun does not just pop out of the Atlantic ocean a few miles off the coast every morning when it rises over the United States, whatever the flat earthers want to tell you. And, this is pretty much observable without any NASA satellites. Grab yourself a boat and go see! The other possibility is that the sun is much further away than 7 miles and that the physical obstruction between LA and New York is much larger than just the height of the Rocky Mountains over sea level –and also maybe that the angles on the levels of the spotting scopes somehow don’t agree with each other.

For this alone, the vanilla flat earth model must be discarded. You cannot validate any of the predictions in the model above: LA and New York do not see the sunrise at the same time and the sun clearly is not only 7 miles high in New York. To give them some credit, most modern flat earthers, including Eddie Bravo, do not subscribe directly to this model.

As a final point here, I would mention that every flat earth model struggles with the observable phenomena of time zones and jet lag. If a flat earther ever asks you what convinced you of a round Earth, just say “time zones” in order to forestall him or her without looking like you’re avoiding the question. Generally speaking, time zones exist because the curve of the Earth (something that flat earthers claim shouldn’t exist) prevents the sun from lighting every point on the surface of the Earth at the same time.

So then, now that we’ve made basically two tests of a flat earther hypothesis and seen that it fails rather dramatically in the face of simple modern do-it-yourself measurements, what model do these people actually believe in?


Most modern flat earthers believe in some version of the model above (one of the major purveyors of this is Eric Dubay. I won’t link his site because I won’t give him traffic.) In this model, you can think about the Earth as a big disc centered on an axle that passes through the north pole. The sun, the moon and the night sky spin around this axle over the Earth (or maybe the Earth spins like a record beneath the sky). The southern tips of South America, Africa and Australia are placed at extreme distances from one another and Antarctica is expanded into an ice wall that surrounds the whole disc. The model here is actually not a new one and originated some time in the 1800s.

For the image depicted here, I would point out once again that if the sun is an emissive sphere, projecting light in all directions, the model above gives a clear line of sight for every location on Earth to see the sun at all times. For this reason, the flat earthers usually insist that the sun is more like a flashlight or a street lamp which projects light in a preferred direction so that light from it can’t be seen at locations other than where the light is being projected (never mind that this prospect immediately begins to suffer for trying to generate the appropriate phases of the moon).

To generate this model, the flat earthers have actually cherry-picked a few rather interesting observations about the sky. You can find a Youtube video where Eddie Bravo tries to articulate these observations to Joe Rogan. Central among them is that the North Star, Polaris, seems to not move in the night sky and that all the stars and even the sun seem to pivot around this point. In particular, during the season of white nights above the arctic circle, the sun seems to travel around the horizon without really setting (never mind that during the winter months, the sun disappears below the horizon for weeks on end… again with that pesky horizon thing; on the flat earth, the sun is not allowed to drop below the horizon and still be visible elsewhere on the same longitude since that intrinsically implies that the Earth’s surface must curve to accomplish said feat).


This image, taken from an outside source, demonstrates the real observation of what the sun does during the season of white nights as viewed at the arctic circle. The flat earth model amplifies this into the depiction given above.

If this is our hypothetical model, we could say that the sun is suspended over the flat Earth so that it sits on a ring at the radius of the equator in its revolution around the pole.

10 disc model

This image shows you right away the first thing to test. Seen from a distance of 3/4 of the disc’s diameter away, the sun can never appear in the sky at a lower angle of inclination than its altitude over the surface allows. In other words, it can never go down below the horizon or come up over it.

11 min angle of inclination

Here, theta is the minimum angle of inclination that the sun will visit in the sky. I’ve heard flat earthers quote ~3,000 miles for the height of the sun, and the maximum ground distance to the point beneath it works out to (3/4)×24,000 miles = 18,000 miles, which gives a minimum inclination angle of about 9.5 degrees over the horizon. And that’s as seen from the maximum possible distance across the width of the disc, where the flat earthers claim the sunlight can’t be seen at all. As a result, the sun would always have to *appear* in the sky at some inclination greater than 9 degrees –just suddenly start making light– at the time when the sun supposedly rises.

The truth of that is directly observable: do you ever see the sun just appear in the sky when day breaks? I certainly haven’t.
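That 9-degree floor is a one-liner to check, taking the flat earthers’ own ~3,000 mile sun altitude and the 18,000 mile viewing distance at face value:

```python
import math

sun_altitude_mi = 3000       # altitude flat earthers commonly quote
ground_distance_mi = 18000   # maximum distance across the disc, from above
min_inclination = math.degrees(math.atan(sun_altitude_mi / ground_distance_mi))
print(round(min_inclination, 1))  # ~9.5 degrees: on the disc, the sun could never appear lower
```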

This failure to ever reach the horizon, mixed with the requirement for time zones, is enough to kill the flat earth model above: it can’t reproduce observations available from the world around us with just the tiniest bit of leg work! The model can’t handle sunrises, period. There’s a reason the round earth was postulated as early as around 500 BC; it’s based on a series of clever but damn easy measurements. And I reiterate: those measurements are even easier to make with modern technology.
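For the record, the classic measurement of this kind is Eratosthenes’, from around 240 BC: at noon on the solstice the sun stood overhead at Syene while casting a shadow of about 7.2 degrees at Alexandria, roughly 500 miles to the north (modern units used for convenience). Those two numbers alone fix the circumference of a round Earth:

```python
# The 7.2 degree shadow angle is the slice of the full 360 degree circle
# subtended by the ~500 mile Syene-to-Alexandria arc along the surface.
shadow_angle_deg = 7.2
arc_mi = 500
circumference_mi = (360 / shadow_angle_deg) * arc_mi
print(circumference_mi)  # 25000.0 miles, within about 1% of the modern value
```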

It is inevitable that this logic won’t satisfy someone. The altitude number for the sun, 3,000 miles, was cribbed from flat earth chatter. Suppose that this number is actually different and that they don’t actually know what it is (surprise, surprise, I don’t think I’ve ever seen evidence of any one of them doing something other than making YouTube videos or staring through big cameras trying to see ships disappear over the horizon and not understanding why they don’t. Time to get to work, guys, you need to measure the altitude of the sun over the flat earth or you’ll all just keep looking like a bunch of dumbasses staring at tea leaves!)

Now, then, in some attempt to justify this model, a measurement needs to be made of the altitude of the sun (again). You can do it basically in the same way you did it before; you mark out a base length along the surface of the Earth and station two guys with surveying scopes at either end: you count “1,2,3” over the smartphone and then both of you report the angle you measure for the inclination of the sun. In this case, I recommend that one guy be stationed south of the equator and the other guy stationed north, both off the equator by the same distance along a longitude line. The measurement should be made on either the Vernal or Autumnal equinox and it should be made at noon during the day when the sun is at its highest point in the sky. This should make calculations easier by producing an isosceles triangle. How do you know you’re on the same longitude line? The sun should rise at the same time for both of you on the equinox. And, I specify equinox because I would rather not get into effects caused by the Earth’s axial tilt, like the significance of the tropics of Cancer and Capricorn (you want to know about those, go learn about them yourself).

12 height of sun ver 2

From this measurement how do you get the height of the sun? You use the following piece of very easy trig:

13 trig height

And, note, this trig will not work unless both angles measured above are the same… but you can orchestrate this with a couple spotters, an accurate clock and a couple surveying scopes.
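In code, the isosceles set-up reduces to one line; the 2,000 mile baseline and 71.6 degree reading below are invented numbers, chosen only to show how a measurement pair would reproduce the flat earthers’ favorite ~3,000 mile figure:

```python
import math

def flat_earth_sun_height(baseline_mi, elevation_deg):
    """Two spotters straddle the point under the sun, equally far from it,
    and both read the same elevation angle; the isosceles triangle then
    gives the sun's height over the plane as h = (d / 2) * tan(theta)."""
    return (baseline_mi / 2) * math.tan(math.radians(elevation_deg))

print(round(flat_earth_sun_height(2000, 71.6)))  # ~3,000 miles
```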

If you do this very close to the equator, where d is small, you will find that the sun is at some crazily high altitude. You may not be able to pin the number down at all, because so close to the point under the sun both scopes read within the sun’s sizeable angular width of 90 degrees, and the computed height runs away toward enormous values. This by itself pushes the minimum allowed angular height of the sun up, not down, because it’s larger than the 3,000 miles taken for the calculation above. To handle the horizon problem, where the sun can only ever appear higher than about 9 degrees in the sky and never cross the horizon, the height of the sun must be lower than 3,000 miles, not higher. The ancients, who could not coordinate simultaneous measurements over long distances, used a different set of triangles to try to estimate the height of the sun.

If you are a good scientist, you will repeat this measurement a number of times with different base distances between the spotters. If the Earth is flat, every base length you choose between the spotters should produce the same height for the sun (this is an example of the scientific concept of Replication).

Here’s what you will actually find:

14 three measurements

At a latitude close to the equator, during the first measurement, the sun will appear to be very far away at a really high altitude. With the second measurement, at mid latitudes on either side of the equator, the sun will appear to be at a significantly lower altitude. During the final measurement, at distant latitudes, as far north and south as you can get, the sun will appear to actually sit down on the face of the Earth. If you coordinate this experiment with six people on group chat all at once, this is what they will all see simultaneously. Could I coordinate the measurement locations so that the sun appears to be 3,000 miles high? Sure, but who in the hell would ever take that as honest? Flat earthers blame scientists for being dishonest… what if the flat earthers are the ones being dishonest? Does it not count for them somehow?

Since the sun suddenly appears to be speeding toward the Earth, does this mean that it’s about to crash down onto the experimenters you have stationed at the equator? No. It just means that your model is completely wrong because it hasn’t produced a self-consistent measurement. A mature scientist would consider the flat earth a dead hypothesis at this point.

Why does the round earth manage to succeed at explaining this series of observations? For one thing, the round earth doesn’t assume that the spotting scopes are stationed at the same angular level.

15 round earth contrast

The leveling bubble on the spotting scope can only assume the local level. And, the angle that you end up measuring is the one between the local horizon and the sight line. On the equinox (very important) the sun will only appear to be directly overhead at noon on the equator.
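This local-leveling effect is the whole story, and you can see the inconsistency coming without leaving your desk. A rough simulation (ignoring the sun’s finite distance and atmospheric refraction): on a round Earth a spotter at latitude φ reads a noon-equinox elevation of (90 − φ) degrees, and feeding those perfectly good readings into the flat-plane formula gives a different “height” for every baseline:

```python
import math

R_MI = 3959.0                            # radius of the round Earth, miles
MI_PER_DEG = 2 * math.pi * R_MI / 360    # ~69.1 surface miles per degree

def inferred_flat_sun_height(lat_deg):
    """Noon-equinox readings by two spotters at +/- lat_deg latitude.
    On a round Earth each locally-leveled scope reads an elevation of
    (90 - lat) degrees; interpreting that on a flat plane via
    h = (d / 2) * tan(theta) returns a different 'height' per baseline."""
    baseline_mi = 2 * lat_deg * MI_PER_DEG   # odometer distance between spotters
    elevation = math.radians(90 - lat_deg)   # what each leveled scope reads
    return (baseline_mi / 2) * math.tan(elevation)

for lat in (5, 30, 60, 80):
    print(lat, round(inferred_flat_sun_height(lat)))
```

Near the equator the inferred “height” creeps toward the Earth’s radius in disguise (~3,950 miles at ±5 degrees), and it collapses as the spotters move poleward. No single flat-earth sun altitude fits all the baselines, which is exactly the self-inconsistency described above.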

If you’re still unconvinced that the flat earth is a dead hypothesis which doesn’t live up to testing and continue to focus on strange mirages seen over the surface of the ocean on warm days as evidence that the round earth can’t be right, consider the following observations.

Flat earthers use Polaris as the pivot around which the sky spins. Why is it that Polaris is not visible in the sky from latitudes south of the equator? Why is it that the Southern Cross constellation is not visible from the northern hemisphere? Eddie Bravo, as a Gracie hunter, surely must have visited Brazil: did he ever go outside and look for the north star during a visit? Failing that, did he look for the Southern Cross from Las Vegas?

Flat earthers use the observation that the stars in the sky rotate counterclockwise around Polaris as evidence that the sky is rotating around the disc of the Earth. Have they ever observed, at night from the tip of Argentina in South America, that the sky seems to rotate clockwise around some axis to the south? How can the sky rotate both clockwise and counterclockwise at the same time? In the flat earth model, it can’t, but in reality, it does! As an extension, why in the hell does the sun come straight up from the east and set straight in the west on the equinox at the equator, while simultaneously, seen from the North Pole on that same equinox day, the sun rolls around the horizon at the level of the ground and never quite rises? Use your smartphone and take the trip to see! Send a friend to Panama while you go to Juneau, Alaska, and talk on the smartphone to see that it happens this way in both places at once.

Don’t take my word for it, go and make the observations yourself!

How is this all possible?

I’ll tell you why.

It’s because flat earthers never test the models they put forward with the tools that are at their flipping fingertips. “Flat Earth ‘Research'” my ass.

Do I need NASA satellite pictures or rocket launches to know that the Earth is round? Pardon my French, but fucking hell, no! Give me the combination of time zones with the fact that the sun actually pops up over the horizon when it rises, and your ass is grass. Flat earth models can’t explain these observations simultaneously; they can only do one or the other.

Edit 11-28-17

Yeah, I have a tiny bit more to say.

If all of what I’ve said still does not convince you, likely you’re hopeless. But, here’s a comparison between what the sun does in the sky over the disc shaped flat earth and what it actually does.

Here’s how the sun travels across the sky on the disc-shaped earth:

16 flat earth sun track

Here’s what the sun really does depending on latitude:

17 earth sun track

This particular set of sun behaviors in the sky is actually visible year round, but the latitude where the sun travels from East, straight over the apex, to West varies North to South depending on the season when you look. At equinox, the observation is symmetric at the equator, but it shifts north and south of there as the months move on, producing the same general pattern above. In the winter, the axial tilt of the Earth prevents the sun from rising over the north pole –ever– while the same is true at the south pole during the summer of the northern hemisphere. Flat earthers seem to never make any observations about what happens in the sky to the sun south of the equator. Do they not go to Australia or South America to take a look?

As an extra, I have made the mistake of rooting through Eric Dubay’s “200 proofs” gallop. I once even thought about writing a blog post about the experience, but decided it was too exhausting. For one thing, quantity does not assure quality. Many of the 200 proofs are taken from accounts of 19th century navigation errors, and one must wonder whether such accounts hold any validity in the 21st century world.

Further, some of the proofs are simple, flat-out lies: among them is an exhaustive observation of the lack of airline flight routes in the southern hemisphere, twisting route information to show that flights must pass through the northern hemisphere to reach destinations as far separated as the tip of South America and the tip of South Africa. This simply ignores the fact that flight routes exist between these destinations which never touch the northern hemisphere. Are there more flight routes in the northern hemisphere than in the southern hemisphere? Yes: most of the human population lives at or north of the equator, so most of the places anybody would want to go are in the northern hemisphere. If you doubt that such a flight route exists, go to the southern hemisphere and take an airline flight from Argentina to South Africa, and use a stopwatch during the flight to see if it’s a fraction of the length Dubay would claim –commercial airline jets have a known flight profile that would be impossible to hide; the rate at which they cross distance is well characterized. Did Dubay do this experiment? Nope.

What should stun a person about Dubay is not merely that he makes wrong claims, it’s that he repeats the same wrong claims 60 times in a row to an audience that not only fawns over it, but fails to point out the giant logical gaps detailed above. How hard is it to see that you not only need to cope with time zones, but with sunrises too?

Pointing out a tiny detail, like not understanding how mirages work on the surface of the ocean, does not somehow validate a model that can’t handle the big ticket items, like time zones and sunrises. It only shows that you can’t understand how the small details work. I can also sort of understand that people are losing touch with the world around them as they grow more and more entrenched in the online world, but if you fail to understand that the online world does not dictate the physics of the real world, you are in big trouble.

(Edit 3-26-18:)

The steam rocket dude finally shot himself 1,800 ft into the air. Oh yeah, and “flat earth and stuff.” Tell me again how his little stunt was supposed to test anything. His interest was in launching himself in a steam powered rocket, it had nothing to do with finding out the roundness (or lack thereof) of the Earth.

If you vote for him for Governor, you deserve what you get.

For anybody actually interested in a test that did something, check this out. For the record, there are aberrations in the lenses here which do affect exactly what you see along the edges of the image, but ask yourself how the rocket can appear straight while the background appears curved. Further, if you doubt it, that test is something that can be done by someone with the limo driver’s means.