Chemical Orbitals from Eigenstates

A small puzzle I recently set for myself was finding out how the hydrogenic orbital eigenstates give rise to the S-, P-, D- and F- orbitals in chemistry (and where s, p, d and f came from).

The reason this puzzle matters to me is that many of my interests straddle the transition from the angstrom scale to the nanometer scale. There is a cross-over where physics becomes chemistry, but chemists and physicists often look at things very differently. I was not directly trained as a P-chemist; I was trained separately as a Biochemist and a Physicist. Remarkably, the Venn diagrams describing the education for these pursuits only overlap slightly. When Biochemists and Molecular Biologists talk, the basic structures below their scale of interest (>1 nm) are frequently just assumed, while Physicists tend to focus their efforts toward going more and more basic (<1 angstrom). This leaves a gap at roughly 1 angstrom, the scale where chemistry and P-chem are relevant. Quite ironically, the whole periodic table of the elements lies there. I have been through P-chem and I’ve gotten hit with it as a Chemist, but this remains an inconvenient scale gap for me. So, a cat’s paw of mine has been understanding, and I mean really understanding, where quantum mechanics transitions into chemistry.

One place is understanding how to get from the eigenstates I know how to solve to the orbitals structuring the periodic table.

[Image: the periodic table of the elements]

This assemblage is pure quantum mechanics. You learn a huge amount about this in your quantum class. But, there are some fine details which can be left on the counter.

One of those details for me was the discrepancy between the hydrogenic wave functions and the orbitals on the periodic table. If you aren’t paying attention, you may not even notice that the s-, p- and d- orbitals are not all directly hydrogenic eigenstates (or perhaps you were paying closer attention in class than I was and didn’t miss it when this detail was brought up). The discrepancy is a subtle one because, when you start looking for images of the orbitals, sources tend to freely mix superpositions of eigenstates with pure eigenstates without saying why the mixtures were chosen…

For example, here are the S, P and D orbitals for the periodic table:

[Image: the S, P and D orbitals of chemistry]

This image is from http://www.chemcomp.com. Focusing on the P row, how is it that these functions relate to the pure eigenstates? Recall the images that I posted previously of the P eigenstates:

[Images: probability densities of the P210 eigenstate (left) and the P21-1 eigenstate (right)]

In the image of the S, P and D orbitals above, all three of the Px, Py and Pz orbitals look like some variant of P210, the pure state on the left, rather than P21-1, the state on the right. In chemistry, you get the orbitals directly without really being told where they came from, while in physics, you get the eigenstates and are told somewhat abstractly that the s-, p- and d- orbitals are all superpositions of them. I recall a professor in an undergraduate quantum class briefly deriving Px and Py, but I really didn’t understand why he selected the combinations he did! Rationally, it makes sense that Pz is identical to P210 and that Px and Py are superpositions with the same probability distribution as Pz, but rotated into the X-Y plane and ninety degrees from one another. How do Px and Py arise from superpositions of P21-1 and P211? P21-1 and P211 have identical probability distributions despite having opposite angular momentum!

Admittedly, the intuitive rotations that produce Px and Py from Pz make sense at a qualitative level, but if you try to extend that qualitative understanding to the D-row, you’re going to fail. Four of the D orbitals look like rotations of one another, but one doesn’t. Why? And why are there four that look identical? I mean, there are only three spatial dimensions to fill, presumably. How do these five fit together three dimensionally?

Except for the Dz^2, none of the D-orbitals are pure eigenstates: they’re all superpositions. But what logic produces them? What is the common construction algorithm which unites the logic of the D-orbitals with that of the P-orbitals (which are all intuitive rotations)?

I’ll actually hold back on the math in this case because it turns out that there is a simple revelation which can give you the jump.

As it turns out, all of chemistry is dependent on angular momentum. When I say all, I really do mean it. The stability of chemical structures depends on cases where angular momentum has tended in some way to cancel out. Chemical reactivity in organic chemistry arises from valence choices that form bonds between atoms in order to “complete an octet,” which is shorthand for saying that species combine in such a way that enough electrons are present to fill in or empty out eight orbitals (roughly pushing the number of electrons orbiting one type of atom across its row of the periodic table to match the noble gas column). For example, in forming the salt crystal sodium chloride, sodium possesses only one electron in its valence shell while chlorine contains seven: if sodium gives up one electron, it reaches a state with no need to complete the octet (the electronic equivalent of neon), while chlorine, gaining an electron, is pushed into a state electronically equal to argon, with eight valence electrons. From a physicist’s standpoint, this is called “angular momentum closure,” where the filled orbitals are sufficient to completely cancel out all angular momentum in that valence level. As another example, one highly reactive chemical structure you might have heard about is a “radical” or a “free radical,” which is simply chemist shorthand for what a physicist would recognize as an electron with uncancelled spin and orbital angular momentum. Radical-driven chemical reactions are about passing around this angular momentum! Overall, reactions tend to be driven by the need to cancel out angular momentum. The stoichiometry of a molecular species always revolves around angular momentum closure: you may not see it in basic chemistry, but this determines how many of each atom can be connected, in most cases.

From the physics, what can be known about an orbital is essentially the total angular momentum present and how much of that angular momentum points along a particular direction, conventionally the Z-axis. Angular momentum lying in the X-Y plane is, by definition, not in either the X or Y direction alone, but in some superposition of both. Without preparing a packet of angular momentum, the distribution ends up having to be uniform, meaning that it is in no particular direction except not along Z. For the P-orbitals, the eigenstates have either all of their directed angular momentum along Z or none along Z. For the D-orbitals, the five states run through the combinations: two with angular momentum fully along Z, two with half in the X-Y plane and half along Z, and one with all of it in the X-Y plane.
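To hang numbers on that statement (this bit of bookkeeping is mine, but it is completely standard quantum mechanics): the eigenstates ψnlm satisfy

$$\hat{L}^{2}\,\psi_{nlm} = \hbar^{2}\,l(l+1)\,\psi_{nlm} \qquad\qquad \hat{L}_{z}\,\psi_{nlm} = m\hbar\,\psi_{nlm}$$

For the P-orbitals, l = 1 and m runs over -1, 0, +1; for the D-orbitals, l = 2 and m runs over -2 through +2. The m quantum number is exactly the “angular momentum along Z” I keep talking about.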

What I’ve learned is that, for chemically relevant orbitals, the general rule is “minimal definite angular momentum.” What I mean by this is that you want to minimize situations where the orbital angular momentum points along a particular direction. The orbitals present on the periodic table are states which have canceled out the angular momentum located along the Z-axis. This is somewhat obvious for the homology between P210 and Pz: P210 points all of its angular momentum perpendicular to the Z-axis. It locates the electron, on average, somewhere along the Z-axis in a pair of lobes shaped like a peanut, but the orbital direction is undefined. You can’t tell which way the electron goes around.

As it turns out, Px and Py can both be obtained by making simple superpositions of P21-1 and P211 that cancel out z-axis angular momentum… literally adding together these two states so that their angular momentum along the z-axis goes away. Px is the symmetric superposition while Py is the antisymmetric version. For the two states obtained by this method, if you look for the expectation value of the z-axis angular momentum, you’ll find it missing! It cancels to zero.
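Written out explicitly (my notation; the exact signs and any overall factors of i depend on the phase convention you adopt, which is the complex-phase subtlety mentioned below):

$$p_x = \frac{1}{\sqrt{2}}\left(\psi_{211} + \psi_{21-1}\right) \qquad\qquad p_y = \frac{1}{\sqrt{2}}\left(\psi_{211} - \psi_{21-1}\right)$$

The cancellation is quick to check: ψ211 and ψ21-1 carry Lz = +ħ and -ħ with equal weight, the cross terms vanish by orthogonality, and so

$$\langle p_x|\hat{L}_z|p_x\rangle = \tfrac{1}{2}\,(+\hbar) + \tfrac{1}{2}\,(-\hbar) = 0$$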

It’s as simple as that.

The D-orbitals all follow. D320 already has no angular momentum along the z-axis, so it is directly Dz^2. You then find the four additional combinations by simply adding states that cancel the z-axis angular momentum: the symmetric and antisymmetric combinations of D321 with D32-1, and then the symmetric and antisymmetric combinations of D322 with D32-2.
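Spelled out in the same spirit (again mine, with the usual caveat that which combination ends up with which label, dxz versus dyz and dx²-y² versus dxy, depends on the phase convention):

$$d_{z^2} = \psi_{320} \qquad d_{xz},\,d_{yz} = \frac{1}{\sqrt{2}}\left(\psi_{321} \pm \psi_{32-1}\right) \qquad d_{x^2-y^2},\,d_{xy} = \frac{1}{\sqrt{2}}\left(\psi_{322} \pm \psi_{32-2}\right)$$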

Notice, all I’m doing to make any of these states is looking at the last index (the m-index) of the eigenstates and making a linear combination where the first m plus the second m gives zero: 1 − 1 = 0, 2 − 2 = 0. That’s it. Admittedly, the symmetric combination sums these with a (+) sign and a 1/sqrt(2) weighting constant so that Px = (1/sqrt(2))(P211 + P21-1) is normalized, and the antisymmetric combination sums with a (−) sign, as in Py = (1/sqrt(2))(P211 – P21-1), but nothing more complicated than that! The D-orbitals can be generated in exactly the same manner. I found one easy reference online that loosely corroborated this observation, but stated it instead as the periodic table orbitals all being written so that the wave functions have no complex parts… which is also kind of true, but somewhat misleading, because you sometimes have to multiply by a complex phase to put the azimuthal dependence genuinely in the form of sines (and since the azimuthal coordinate is integrated over 360 degrees, expectation values on this coordinate, as z-axis angular momentum would contain, cancel themselves out; sines and cosines integrated over a full period, or multiples of a full period, integrate to zero).
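If you’d like to see the cancellation happen without trusting my algebra, here’s a quick check (mine, not part of the original post) using sympy’s built-in spherical harmonics, which are the angular parts of the P eigenstates:

```python
import sympy as sp
from sympy import I, pi, sin, conjugate, integrate, simplify, sqrt
from sympy.functions.special.spherical_harmonics import Ynm

theta, phi = sp.symbols('theta phi', real=True)

# The symmetric m = +1 / m = -1 combination (the angular part of Px)
psi = (Ynm(1, 1, theta, phi) + Ynm(1, -1, theta, phi)) / sqrt(2)
psi = psi.expand(func=True)  # write the harmonics out explicitly

# L_z in the position representation is -i * d/dphi (taking hbar = 1)
Lz_psi = -I * sp.diff(psi, phi)

# <psi| L_z |psi>, integrated over the unit sphere
expval = integrate(integrate(conjugate(psi) * Lz_psi * sin(theta),
                             (theta, 0, pi)), (phi, 0, 2 * pi))
print(simplify(expval))  # prints 0: no net z-axis angular momentum
```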

Before I wrap up, I want to quickly touch on where S-, P-, D- and F- came from. “Why did they pick those damn letters?” I wondered one day. Why not A-, B-, C- and D-? The nomenclature emerged from how groups of spectral lines appeared visually and were named: (S)harp, (P)rincipal, (D)iffuse and (F)undamental. (A second interesting bit of “why the hell???” nomenclature is the X-ray lines… you may hate this notation as much as I do: K, L, M, N, O… “stupid machine uses the K-line… what does that mean?” These letters simply match the n quantum number, the energy level, as n = 1, 2, 3, 4, 5… The carbon K-edge, for instance, is the amount of energy between the n=1 orbital level and the ionized continuum for a carbon atom.) The sharpness of the lines tends to reflect the complexity of the structure in these groups.

As a quick summary of the structuring of the periodic table: S-, P-, D- and F- group the vertical columns, while the horizontal rows give the associated relative energy (but not necessarily the n-number). The element is determined by the number of protons present in the nucleus, which sets the chemical character of the atom by requiring an equal number of electrons to cancel out the total positive charge of the nucleus. Electrons, as fermions, are forced to occupy distinct orbital states, meaning that each electron has a distinct orbit from every other (fudging for the antisymmetry of the wave function containing them all). As electrons are added to cancel protons, they fall into the available orbitals in the order depicted on the periodic table going from left to right, which can be a little confusing because they don’t necessarily close one level of n before starting to fill the S-orbitals of the next level of n; for example, at n=3, l can equal 0, 1 and 2… but the S-orbitals for n=4 fill before the D-orbitals for n=3 (which is why those D-orbitals are found in row 4). The reason is that S-orbitals have lower energy than P-orbitals, which have lower energy than D-orbitals, by enough that the S-orbital for a higher n can lie below the D-orbital for n−1: the levels fill by order of energy and not necessarily by order of angular momentum closure, even though angular momentum closure influences the chemistry. S-, P-, D- and F- all have double degeneracy to contain the up and down spin of each orbital, so that S- contains 2 electrons instead of 1, P- contains 6 instead of 3, and D- contains 10 instead of 5. If you start to count, you’ll see that this produces the numerics of the periodic table.
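The fill order just described is usually summarized as the Madelung or n + l rule: fill by increasing n + l, breaking ties with the smaller n. A tiny script (my sketch; the real table has a handful of exceptions, chromium and copper among them) reproduces the ordering:

```python
# Madelung (n + l) rule: subshells fill by increasing n + l,
# ties broken by smaller n; each (n, l) subshell holds 2*(2l + 1) electrons.
letters = 'spdf'
subshells = [(n, l) for n in range(1, 8) for l in range(min(n, 4))]
subshells.sort(key=lambda nl: (nl[0] + nl[1], nl[0]))
print(' '.join(f'{n}{letters[l]}' for n, l in subshells))
# -> 1s 2s 2p 3s 3p 4s 3d 4p 5s 4d 5p 6s 4f 5d 6p 7s 5f 6d 7p
```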

The periodic table is a fascinating construct: it contains a huge amount of quantum mechanical information which really doesn’t look much like quantum mechanics. And everybody has seen the thing! An interesting test of the depth of a conversation about the periodic table is to ask those conversing if they understand why the word “periodic” is used in the name “periodic table of the elements.” The choice of that word is pure quantum mechanics.


Powerball Probabilities

If you’ve read anything else in this blog, you’ll know I write frequently about my playing around with Quantum Mechanics. As a digression away from a natural system that is all about probabilities, an interesting little toy problem I decided to tackle is figuring out how the “win” probabilities are determined in the lottery game Powerball.

Powerball is actually quite intriguing to me. They have a website here which details, by prize level, all the winners across the whole country in any given drawing. You may have looked at this chart at some point while trying to figure out if your ticket won something useful. Part of what intrigues me about this chart is that it tells you, in a given drawing, just about exactly how much money was spent on Powerball and how many people bought tickets. How does it tell you this? Because probability is an incredibly reliable gauge of behavior with big sample sizes. And Powerball quite willingly lays all the numbers out for you to do their bookkeeping for them by telling you exactly how many people won… particularly at the high-probability-to-win levels, which push into the regime of Gaussian statistics. For big samples, like the millions of people buying Powerball tickets, the absolute fluctuations grow only as sqrt(N), so the relative error on an average shrinks as 1/sqrt(N) and becomes insignificant. And the probabilities reveal what those average values are.

The game is doubly intriguing to me because of the psychological component that drives it. As the pot becomes big, people’s willingness to play becomes big even though the probabilities never change. It suddenly leaps into the national consciousness every time the size of the pot becomes big and people play more aggressively as if they had a greater chance of winning said money. It is true that somebody ultimately walks away with the big pot, but what’s the likelihood that somebody is you?

But, as a starter, what are the probabilities that you win anything when you buy a ticket? To understand this, it helps to know how the game is set up.

As everybody knows, powerball is one of these games where they draw a bunch of little balls printed with numbers out of a machine with a spinning basket and you, as the player, simply match the numbers on your ticket to the numbers on the balls. If your ticket matches all the numbers, you win big! And, as an incentive to make people feel like they’re getting something out of playing, the powerball company awards various combinations of matching numbers and adds in multipliers which increase the size of the award if you do get any sort of match. You might only match a number or two, but they reward you a couple bucks for your effort. If you really want, you can pick the numbers yourself, but most people simply grab random numbers spat out of a computer… not like I’m telling you anything you don’t already know at this point.

One of the interesting qualities of the game is that the probabilities of prizes are very easy to adjust. The whole apparatus stays the same; they just add or subtract balls from the basket. In powerball, as currently run, there are two baskets: the first basket contains 69 balls while the second contains 26. Five balls are drawn from the first basket while only one, the Powerball, is drawn from the second. There is actually an entire record available of how the game has been run in the past, how many balls were in either the first or second baskets and when balls were added or subtracted from each. As the game has crossed state lines and the number of players has grown, the number of balls has also steadily swelled. I think the choice in numbering has been pretty careful to make the smallest prize attainably easy to get while pushing the chances for the grand prize to grow enticingly larger and larger. Prizes are mainly regulated by the presence of the Powerball: if your ticket manages to match the Powerball and nothing else, you win a small prize, no matter what. Prizes get bigger as a larger number of the other five balls are matched on your ticket.

The probabilities at a low level work almost exactly as you would expect: if there are 26 balls in the powerball basket, at any given drawing you have 1 chance in 26 of matching the powerball, and therefore roughly 1 chance in 26 of winning some prize determined by the presence of the powerball. There are also prizes for matching three or more balls from the main basket without the powerball, which pushes the overall chance of winning anything slightly better than 1 in 26.

For the number savvy, this begins to reveal the economics of Powerball: a win by these means requires you to spend, on average, $52. That’s 26 tickets at $2 apiece, among which you are likely to have one that matches the powerball. Note that the prize for matching that number is $4: $52 spent to net only $4 is a big overall loss, leaving you $48 down. But this 26-ticket buy-in hides the fact that you have a small chance of matching some sequence of other numbers and obtaining a bigger prize… and it would certainly not be an economic loss if you matched the powerball and then the 5 other balls, yielding you a profit in the hundreds of millions of dollars (and this is usually what people tell themselves as they spend $2 for each number).

The probability to win the matched powerball prize only, that is to match just the powerball number, is actually somewhat worse than 1 in 26. The probability is attenuated by the requirement that you hit no matches on any other of the five possible numbers drawn.

Finding the actual probability goes as follows: (1/26)*(64/69)*(63/68)*(62/67)*(61/66)*(60/65). If you multiply that out and invert it, you get 1 hit in 38.32 tries. The first factor is, of course, the chance of hitting the powerball, while the other five are the chances that each main-basket ball drawn fails to match your ticket… each of those probabilities is naturally quite close to 1, so they are likely outcomes, but all five are required for the powerball-only prize and together they attenuate the odds.
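In code, the same product looks like this (my sketch of the arithmetic above):

```python
from math import prod

# Match the powerball (1 of 26) and miss all five white balls:
# the five factors are the same ones written out above.
p = (1 / 26) * prod((64 - n) / (69 - n) for n in range(5))
print(1 / p)  # ~38.32 tries per powerball-only win
```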

This number may not be that interesting to you, but lots of people play the game, and that means the number of powerball-only winners lands close to its Gaussian expectation. This is useful to a physicist because it reveals something about the structure of the Powerball playing audience in any given week: that site I gave tells you how many people won with only the powerball, meaning that by multiplying that number by 38.32, you know how many tickets were purchased prior to the drawing in question. For example, as of the August 12 2017 drawing, 1,176,672 numbers won the powerball-only prize, meaning that very nearly 38.32*1,176,672 numbers were purchased: ~45,090,071 numbers +/- 6,715, including error (notice that the error here is well below 1%).

How many people are playing? If people mostly purchase maybe two or three numbers, around 15-20 million people played. Of course, I’m not accounting for the slavering masses who went whole hog and dropped $20 on numbers; if everybody did this, only 4.5 million people played… truly, I can’t know people’s purchasing habits for certain, but I can say with confidence that no more than a few tens of millions of people played.

The number there reveals quite clearly the economics of the game for the period between the 8/12 drawing and the one a couple days prior: $90 million was spent on tickets! This is really quite easy arithmetic since it’s all in factors of 2 over the number of ticket numbers sold. If you look at the total prize pay-out, also on that page I provided, $19.4 million was won. This means that the Powerball company kept ~$70 million made over about three days, of which some got dumped into the grand prize and some went to whatever overhead they keep (I hear at least some of that extra is supposed to go into public works and maybe some also ends up in the Godfather’s pocket). Lucrative business.

If you look at the prize payouts for the game, most of the lower level prizes pay off between $4 and $7. You can’t get a prize that exceeds $100 until you match at least 4 balls. Note, here, that the probability of matching 4 balls (including the powerball) is about 1 in 14,494. This means that to assure yourself a prize of $100, you have to spend about $29,000 on tickets. You might argue that among those 14,494 tickets you’ll also win a bundle of smaller prizes ($4 prizes at odds of 1 in 38 and 1 in 91, and $7 prizes at odds of 1 in 700 and 1 in 580) and maybe break even. Here’s the calculation for how much you’ll likely make on that buy-in: $4*(14,494*(1/38 + 1/91)) + $7*(14,494*(1/700 + 1/580))… I’ve rounded the probabilities a bit… = $2,482.65. For ~$29,000 spent to assure a single $100 win, you are likely to collect at most about $2,600 in total winnings, an overall loss of roughly $26,400. Notice, getting $4 back on a $52 outlay is about 8%, while $2,600 back on $29,000 is about 9%… the payoff does not improve at attainable levels! Granted, there’s a chance at a couple hundred million, but the probability of the bigger prize is still pretty well against you.
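Checking that arithmetic (my sketch, using the same rounded odds):

```python
# Expected lesser winnings over a 14,494-ticket buy-in,
# using the rounded odds quoted above.
tickets = 14_494
lesser = 4 * tickets * (1 / 38 + 1 / 91) + 7 * tickets * (1 / 700 + 1 / 580)
print(f'spend ${2 * tickets:,}, lesser prizes ~${lesser:,.2f}')
# -> spend $28,988, lesser prizes ~$2,482.65
```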

Suppose you are a big spender and you managed to rake up $29,000 in cash to dump into tickets, how likely is it that you will win just the $1 million prize? That’s five matched balls excluding the powerball. The probability is 1 in 11,688,053. By pushing the numbers, your odds of this prize have become 14,500/11,688,053, or about 1 chance in 800. Your odds are substantially improved here, but 1 in 800 is still not a wonderful bet despite the fact that you assured yourself a fourth tier prize of $100! The grand prize is still a much harder bet with odds running at about 1 in 20,000, despite the amount you just dropped on it. Do you just happen to have $30,000 burning a hole in your pocket? Lucky you! Lots of people live on that salary for a year.

Most of this is simple arithmetic and I’ve been bandying about probabilities gleaned from the Powerball website. If you’re as curious about it as me, you might be wondering exactly how all those probabilities were calculated. I gave an example above of the mechanical calculation of the lowest level probability, but I also went and figured out a pair of formulae that calculate any of the powerball prize probabilities. It reminded me a bit of stat mech…

$$P_{\text{no PB}}(Y) \;=\; \underbrace{\frac{K-1}{K}}_{\text{miss the powerball}}\;\underbrace{\frac{X!}{Y!\,Z!}}_{\text{multiplicity}}\;\underbrace{\prod_{m=0}^{Y-1}\frac{X-m}{N-m}}_{\text{hits}}\;\underbrace{\prod_{n=0}^{Z-1}\frac{M-n}{N-Y-n}}_{\text{misses}}$$

$$P_{\text{PB}}(Y) \;=\; \frac{1}{K}\;\frac{X!}{Y!\,Z!}\;\prod_{m=0}^{Y-1}\frac{X-m}{N-m}\;\prod_{n=0}^{Z-1}\frac{M-n}{N-Y-n}$$

$$N_{\text{tries}} \;=\; \frac{1}{P}$$

I’ve annotated the parts of the main equations with underbraces to make them a little clearer. The final relation just shows the number of tries needed, on average, to hit one success, given a probability as calculated with the other two equations. The first equation differs from the second in that it refers to probabilities where you have matched numbers without managing to match the powerball, while the second is the complement, where you match numbers having hit the powerball. Between these two equations, you can calculate all the probabilities for the Powerball prizes. Since probabilities were always hard for me, I’ll try to explain the parts.

If you’re not familiar with the factorial operation, that’s what’s denoted by the exclamation point “!”: a product string counting up from one to the number of the factorial… for example, 5! means 1x2x3x4x5, and the special case 0! should be read as 1. The first factor is the probability of either missing or hitting the powerball, where K = 26 is the number of balls in the powerball basket. The second factor is the multiplicity and tells you how many ways you can draw a certain number of matches (Y) to fill a number of open slots (X) while drawing a number of mismatches (Z) in the process, where X = Y + Z. In Powerball, you draw five balls, so X = 5, Y is the number of matches (anywhere from 0 to 5), and Z is the number of misses. Multiplicity shows up in stat mech and is intimately related to entropy. N = 69 is the number of possible choices in the main basket, and M = N − X = 64 is the number of those that will not be drawn.

The last two factors give the probabilities for the given number of hits (Y) and the given number of misses (Z), where I’ve applied the product operator to spiffy up the notation. The product operator is an iterator much like the summation operator: you repeatedly multiply successive values, much as in a factorial, but where each value is produced from a set form over a given range. In these, the small scripts m and n start at zero and run up through Y − 1 or Z − 1. In the extreme cases of all hits or all misses, the empty product (misses or hits, respectively) must be set equal to one so that it doesn’t count.
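Both equations collapse into the standard “choose” notation if you prefer binomial coefficients. Here’s a compact equivalent (my sketch, not the author’s code) that reproduces every prize tier’s odds quoted in this post:

```python
from math import comb

def powerball_odds(matches: int, powerball: bool) -> float:
    """Probability one ticket matches `matches` of the 5 white balls
    (drawn from 69) and hits or misses the red powerball (1 of 26)."""
    white = comb(5, matches) * comb(64, 5 - matches) / comb(69, 5)
    red = 1 / 26 if powerball else 25 / 26
    return white * red

for m in range(6):
    for pb in (False, True):
        p = powerball_odds(m, pb)
        print(f"{m} white {'+ PB' if pb else '    '}: 1 in {1 / p:,.2f}")
# e.g. 0 white + PB: 1 in 38.32 ... 5 white + PB: 1 in 292,201,338.00
```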

This is one of those rare situations where the American public does a probability experiment with the values all well recorded where it’s possible to see the outcomes. How hard is it to win the grand prize? Well, the odds are one in 292 million. Consider that the population of the United States is 323 million. That means that if everybody in the United States bought one powerball number, about one person would win.

Only one.

Thanks to the power of the media, everybody has the opportunity to know that somebody won. Or not. That this person exists, nobody wants to doubt, but consider that the odds of winning are so scant that you not only won’t win, but you pretty likely will never meet anyone who did. Sort of surreal… everything is above board, you would think, but the rarity is so rare that there’s no assurance that it ever actually happens. You can suppose that maybe it does happen because people do win those dinky $4 prizes, but maybe this is just a red herring and nobody really actually wins! Those winner testimonials could be from actors!

Yeah, I’m not much of a conspiracy theorist, but it is true that a founding tenet of the idea of a ‘limit’ in math is that 99.99999% is effectively 100%. Going to the limit where the discrepancy is so small as to be infinitesimal is what calculus is all about. It is fair to say that winning very nearly never happens! Everybody wants to be the one who beats the odds, which is why Powerball tickets are sold, but the extraordinarily vast majority never will win anything useful… I say “useful” because winning $4 or $7 is always a net loss. You have to win one of the top three prizes for it to be anywhere near worth anything, which you likely never will.

One final fairly interesting feature of the probability is that you can make rough predictions about how frequently the grand prize is won based on how frequently the first prize is won. First prize is matching all five of the balls but not the powerball. This happens about once per 12 million numbers, roughly 25 times more likely than all 5 plus the Powerball. In the report on winnings, a typical frequency is about 2 to 3 such winners per drawing. About 1 time in 26, a person with all five manages to get the powerball too, so, with two drawings per week and about 2.5 first prize winners per drawing, that’s five winners per week… which implies that the grand prize should be won at a frequency of about once every five to six weeks, every month and a half or so. The average here has a very large standard deviation because the number of winners is small, meaning that the error is an appreciable portion of the measurement, which is why there is a great deal of variation in the time between grand prize wins. The incidence is much more Poissonian and stochastic, which allows some prizes to get quite big compared to others and causes their values to disperse across a fairly broad range. Uncertainty tends to dominate, making the game a bit more exciting.

While the grand prize is small, the number of people winning the first prize in a given drawing is small (maybe none or one), but this number grows along with the size of the grand prize (maybe 5 or 6, or as high as 9). When the prize grows large enough to catch the public consciousness, the likelihood that somebody will win goes up simply because more people are playing, and this can be witnessed in the fluctuating frequency of wins of the lower level prizes. It breathes around the pulse of maybe 200 million dollars, lubbing at 40 million (maybe 0 to 1 person winning the first prize) and dubbing at 250 million (with 5 people or more winning the first prize).

Quite a story is told if you’re boring and as easily amused as me.

In my opinion, if you do feel inclined to play the game, be aware that when I say you probably won’t win, I mean that the numbers are so strongly against you that you do not appreciably improve your odds by throwing down $100 or even $1,000. The little $4 wins do happen, but they never pay, and $1,000 spent will likely not get you more than $100 in total winnings. It might as well be a voluntary tax. Cherish the dream your $2 buys, but do not stake your well-being on it. There’s nothing wrong with dreaming as long as you understand where to wake up.

(edit 8-24-17)

There was a grand prize winner last night (Wednesday 8-23-17). The outcomes are almost completely as should be expected: the winner is in Massachusetts… the majority of the country’s population is located in states on either the east or west coast, so this is unsurprising. There were 40 match-5 winners, so you would anticipate at least one to be a grand prize winner, which is exactly what happened (a 1 in 26 difference between 5 with the powerball and 5 without). There were about 5.9 million powerball-only winners, so 38.32 × 5.9 million gives about 226 million total powerball numbers sold in the run-up to last night’s drawing… with grand prize odds of 1 in 292 million, this is approaching parity. It means that more than $452 million was spent since Saturday on powerball lottery numbers (the calculation excludes the extra dollar spent on multipliers). About five times as many ticket numbers were sold for this drawing as when I made my original analysis a week ago. With that many tickets sold, there was almost assuredly going to be a winner last night. This is not to say there shouldn’t have been a winner before this –probability is a fickle mistress– but the numbers were such that it was unlikely, though not impossible, for the prize to grow bigger. The last time the powerball was won was on 6-10-17, about two months and thirteen days ago… you can tell that this is an unusually large jackpot because this period is longer than the usual period between wins (I had generously estimated 6 weeks based on a guess of about 2 match-5 winners per drawing, but I think this might actually be a bit too high).

There was only one grand prize winning number out of 226 million tickets sold (not counting all the drawings that failed to yield a grand prize winner prior to this.) Think on that for a moment.

Parity symmetry in Quantum Mechanics

I haven’t written about my problem play for a while. Since last I wrote about rotational problems, I’ve gone through the entire Sakurai chapter 4, which is an introduction to symmetry. At the moment, I’m reading Chapter 5 while still thinking about some of the last few problems in Chapter 4.

I admit that I had a great deal of trouble getting motivated to attack the Chapter 4 problems. When I saw the first aspects of symmetry in class, I just did not particularly understand it. Coming back to it on my own was not much better. Abstract symmetry is not easy to understand.

In Sakurai chapter 4, the text delves into a few different symmetries that are important to quantum mechanics, and pretty much all of them are difficult to see at first. As it turns out, some of these symmetries are very powerful tools. For example, use of the reflection symmetry operation in a chiral molecule (like the C-alpha carbon of proteins or the hydrated carbons of sugars) can reveal neighboring degenerate ground states which can be accessed by racemization, where an atomic substituent of the molecule tunnels through the plane of the molecule and reverses the chirality of the state at some infrequent rate. Another example is the translation symmetry operation, where a lattice of identical attractive potentials hides a near infinite number of identical states in which a bound particle can hop from one minimum to the next and traverse the lattice… this behavior is essentially a specific model describing the passage of electrons through a crystalline semiconductor.

One of the harder symmetries was time reversal symmetry. I shouldn’t say “one of the harder”; for me, time reversal was the hardest to understand, and I would be hesitant to say that I completely understand it yet. The time reversal operator causes time to run backward, making momenta and angular momenta reverse. Time reversal is really hard because the operator is anti-unitary, meaning that it complex-conjugates the quantities it operates on (flipping the sign of every imaginary part). Nevertheless, time reversal has some interesting outcomes. For instance, if a spinless particle is bound to a fixed center in a non-degenerate state (only one state at the given energy), time reversal says that the state can have no average angular momentum (it can’t be rotating or orbiting). On the other hand, if the particle has half-integer spin, the bound states must be degenerate because the particle can’t have zero angular momentum! (This is Kramers degeneracy.)

A quick digression here for the laymen: in quantum mechanics, the word “degenerate” refers to situations where multiple distinct states share the same energy and cannot be told apart by it. Degeneracy is very important in quantum mechanics because certain situations contain only enough information for an incomplete picture of the model, where more information would be needed to distinguish alternative answers… coexisting alternatives subsist in superposition, meaning that a wave function is a superposition of its degenerate alternative outcomes if there is no way to distinguish among them. This is part of how entanglement arises: you can generate entanglement by creating a situation where discrete parts of the system simultaneously occupy degenerate states encompassing the whole system. The discrete parts become entangled.

Symmetry is important because it provides a powerful tool by which to break apart degeneracy. A set of degenerate states can often be distinguished from one another by exploiting the symmetries present in the system. L- and R- enantiomers of a molecule are related by a reflection symmetry at a stereocenter, meaning that there are two states of indistinguishable energy that are reflections of one another. People don’t often notice it, but chemists are masters of quantum mechanics even though they typically don’t know as much of the math: how you build molecules is totally governed by quantum mechanics, and chemists must understand the qualitative results of the physical models. I’ve seen chemists speak competently of symmetry transformations in places where physicists sometimes have problems.

Another place where symmetry is important is in the search for new physics. The way to discover new physical phenomena is to look for observational results that break the expected symmetries of a given mathematical model. The LHC was built to explore symmetries. Currently known models are said to hold CPT symmetry, referring to Charge, Parity and Time Reversal symmetry… I admit that I don’t understand all the implications of this, but simply put, if you make an observation that violates CPT, you have discovered physics not accounted for by current models.

I held back talking about Parity in all this because I wanted to speak of it in greater detail. Of the symmetries covered in Sakurai chapter 4, I feel that I made the greatest jump in understanding on Parity.

Parity is symmetry under space inversion.

What?

Just saying that sounds diabolical. Space inversion. It sounds like that situation in Harry Potter where somebody screws up trying to disapparate and manages to get splinched… like they space invert themselves and can’t undo it.

The parity operation carries all the cartesian variables in a function to their negative values.

$$\Phi\,\psi(x,\,y,\,z) = \psi(-x,\,-y,\,-z)$$

Here Phi just stands in for the parity operator. By performing the parity operation, all the variables in the function which denote spatial position are turned inside out and sent to their negative value. Things get splinched.

You might note here that applying parity twice gets you back to where you started, unsplinching the splinched. This shows that the parity operator has the special property of being its own inverse. You might appreciate how special this is by noting that we can’t all literally be our own brother, but the parity operator basically is.

$$\Phi^{2}\,\psi(x,\,y,\,z) = \Phi\,\psi(-x,\,-y,\,-z) = \psi(x,\,y,\,z)$$

Applying parity twice is like multiplying by 1… which is how you know parity is its own inverse. Parity is also a unitary operator, since it doesn’t affect the absolute value (the norm) of the function: the parity operation times its inverse is one, so unitary.

$$\Phi^{\dagger}\,\Phi = 1 \qquad\text{or}\qquad \Phi^{-1} = \Phi^{\dagger}$$

Here, the daggered superscript means “Hermitian conjugate,” which is automatically the inverse operation if you’re a unitary operator. Hello linear algebra. Be assured I’m not about to break out the matrices, so have no fear. We will stay in a representation free zone. In this regard, the parity operation is very much like a rotation: the inverse operation is the Hermitian conjugate of the operation, never mind the detail that here the inverse operation is the operation itself.

Parity symmetry is “symmetry under the parity operation.” There are many states that are not symmetric under parity, but we are interested particularly in parity eigenstates: states that the parity operator transforms to give back the same state times some constant eigenvalue. As it turns out, the parity operator can only ever have two eigenvalues, +1 and -1. A parity eigenstate is a state that at most changes its sign when acted on by the parity operator. The parity eigenvalue equations are therefore:

$$\Phi\,\psi_{+} = +\,\psi_{+} \qquad\qquad \Phi\,\psi_{-} = -\,\psi_{-}$$

All this says is that under space inversion, the parity eigenstates will either not be affected by the transformation, or will be negative of their original value. If the sign doesn’t change, the state is symmetric under space inversion (called even). But, if the sign does change, the state is antisymmetric under space inversion (called odd). As an example, in a space of one dimension (defined by ‘x’), the function sine is antisymmetric (odd) while the function cosine is symmetric (even).

[Graph: sine (white) alongside its parity-transformed image (blue)]

In this image, taken from a graphing app on my smartphone, the white curve is plain old sine while the blue curve is the parity transformed sine. As mentioned, cosine does not change under parity.

As you may be aware, sines and cosines are energy eigenstates for the particle-in-the-box problem and so would constitute one example of legit parity eigenstates with physical significance.

Operators can also be transformed by parity. To see the significance, just note that the definition of parity is that position is reversed. So, a parity transformation of the position operator is this:

$$\Phi^{\dagger}\,\hat{x}\,\Phi = -\hat{x}$$

Kind of what should be expected. Position under parity turns negative.

As expressed, all of this is really academic. What’s the point?

Parity can give some insights that have deep significance. The deepest result that I understood is that matrix elements and expectation values are preserved under the parity transformation. Matrix elements are a generalization of the expectation value where the bra and the ket are not necessarily the same eigenfunction. The proof of the statement here is one line:

$$\langle m|V|n\rangle = \langle m|\Phi^{\dagger}\left(\Phi\,V\,\Phi^{\dagger}\right)\Phi|n\rangle = \langle\tilde{m}|\tilde{V}|\tilde{n}\rangle$$

Here the squiggles denote parity transformed values, ‘m’ and ‘n’ are blanket eigenstates with arbitrary parity eigenvalues, and V is some miscellaneous operator. The conjugation that turns a ket into a bra does not upset the parity eigenvalue equation, since parity is its own inverse and its eigenvalues of 1 and -1 are real, so the bra has just the same eigenvalue as if it were a ket. The matrix element therefore does not change under the parity transformation: the combined parity transformation of all these parts is as if you multiplied by the identity a couple of times, which should do nothing but return the original value.

What makes this important is that it sets a requirement on how many -1 eigenvalues can appear within the parity transformed matrix element (which is equal to the original matrix element): it must be an even number (either zero or two). For the element to exist (that is, to have a non-zero value), if the initial and final states connected by the potential are both parity odd or both parity even, the potential connecting them must be even. Conversely, if the potential is parity odd, either the initial or the final state must be odd while the other is even. To sum up: a parity odd operator has non-zero matrix elements only when connecting states of differing parity, while a parity even operator must connect states of the same parity. This restriction follows simply from noting that the sign can’t change between a matrix element and its parity transformed version.
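Symbolically, the counting argument runs like this (my paraphrase of the same one-line proof): if Φ|n⟩ = εn|n⟩, Φ|m⟩ = εm|m⟩ and ΦVΦ† = εV V, with each ε equal to ±1, then

$$\langle m|V|n\rangle = \langle m|\Phi^{\dagger}\left(\Phi\,V\,\Phi^{\dagger}\right)\Phi|n\rangle = \varepsilon_m\,\varepsilon_V\,\varepsilon_n\,\langle m|V|n\rangle$$

which forces the element to zero unless εm εV εn = +1, i.e. unless the -1’s come in pairs.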

Now, since an expectation value (average position, for example) is always a matrix element connecting an eigenket to itself, expectation values can only be non-zero for operators of even parity. For example, in a system defined across all space, average position ends up being zero because the position operator is odd, while both eigenbra and eigenket are of the same function, and therefore have the same parity. For average position to be non-zero, the wavefunction would need to be a superposition of eigenkets of opposite parity (and therefore not an eigenstate of parity at all!)

A tangible, far reaching result of this symmetry, related particularly to the position operator, is that no pure parity eigenstate can have an electric dipole moment. The dipole moment operator is built around the position operator, so a situation where the position expectation value goes to zero requires the dipole moment to be zero also. Any observed electric dipole moment must come from a mixture of states.

If you stop and think about that, it’s really pretty amazing. It tells you whether an observable must vanish, depending on which eigenkets are present and on how the operator for that observable behaves under inversion.

Hopefully I got that all correct. If anybody more sophisticated than me sees holes in my statement, please speak up!

Welcome to symmetry.

(For the few people who may have noticed, I still have it in mind to write more about the magnets puzzle, but I really haven’t had time recently. Magnets are difficult.)

Magnets, how do they work? (part 1)

Subtitle: Basic derivation of Ampere’s Law from the Biot-Savart equation.

Know your meme.

It’s been a while since this became a thing, but I think it’s actually a really good question. Truly, the original meme exploded from an unlikely source who wanted to relish those things that seem magical without appreciating how mind-bending and thought-expanding the explanation to this seemingly earnest question actually is.

As I got on in this writing, I realized that the scope of the topic is bigger than can be tackled in a single post. What is presented here will only be the first part (though I haven’t yet had a chance to write the later parts!). The succeeding posts may end up being as mathematical as this one, but perhaps less so. Moreover, as I got to writing, I realized that I haven’t posted a good bit of math here in a while: what good is the mathematical poetry of physics if nobody sees it?

Magnets do not get less magical when you understand how they work: they get more compelling.

[Image: a bar magnet and its field, from a website selling magnet therapy]

This image, taken from a website that sells quackery, highlights the intriguing properties of magnets. A solid object with apparently no moving parts has this manner of influencing the world around it. How can that not be magical? Lodestones have been magic forever and they do not get less magical with the explanation.

Truthfully, I’ve been thinking about the question of how they work for a couple days now. When I started out, I realized that I couldn’t just answer this out of hand, even though I would like to think that I’ve got a working understanding of magnetic fields –this is actually significant to me because the typical response to the Insane Clown Posse’s somewhat vacuous pondering is not really as simple as “Well, duh, magnetic fields you dope!” Someone really can explain how magnets work, but the explanation is really not trivial. That I got to a level in asking how they work where I said, “Well, um, I don’t really know this,” got my attention. How the details fit together gets deep in a hurry. What makes a bar magnet like the one in the picture above special? You don’t put batteries in it. You don’t flick a switch. It just works.

For most every person, that pattern above is the depth of how it works. How does it work? Well, it has a magnetic field. And, everybody has played with magnets at some point, so we sort of all know what they do, if not how they do it.

[Photo: a stack of ring magnets levitating along a vertical rod, from penguin labs]

In this picture from penguin labs, these magnets are exerting sufficient force on one another that many of them apparently defy gravity. Here, the rod simply keeps the magnets confined so that they can’t change orientations with respect to one another and they exert sufficient repulsive force to climb up the rod as if they have no weight.

It’s definitely cool, no denying. There is definitely a quality to this that is magical and awe inspiring.

But, is it better knowing how they work, or just blindly appreciating them because it’s too hard to fill in the blank?

The central feature of how magnets work is quite effortlessly explained by the physics of Electromagnetism. Or, maybe it’s better to say that the details are laboriously and completely explained. People rebel against how hard it is to understand the details, but no true explanation is required to be easily explicable.

The forces which hold those little pieces of metal apart are relatively understandable.

$$\mathbf{F} = q\,\mathbf{E} + q\,\mathbf{v}\times\mathbf{B}$$

Here’s the Lorentz force law. It says that the force (F) on an object with charge q is equal to the sum of the electric force on the object (qE) plus the magnetic force (qv×B). Magnets interact solely by the magnetic force, the second term.

[Image: positive and negative charges curving in opposite directions as they enter a magnetic field, from Wikipedia]

In this picture from Wikipedia, if a charge (q) moving with speed (v) passes into a region containing this thing we call a “magnetic field,” it will tend to curve in its trajectory depending on whether the charge is negative or positive. We can ‘see’ this magnetic field thing in the image above with the bar magnet and iron filings. What is it, how is it produced?

The fundamental observation of magnetic fields is tied up into a phenomenological equation called the Biot-Savart law.

$$\mathbf{B}(\mathbf{r}) = \frac{\mu_{0}}{4\pi}\int \frac{\mathbf{J}(\mathbf{r}')\times\left(\mathbf{r}-\mathbf{r}'\right)}{\left|\mathbf{r}-\mathbf{r}'\right|^{3}}\,d^{3}r'$$

This equation is immediately intimidating. I’ve written it in all of its horrifying Jacksonian glory. You can read this equation like a sentence. It says that all the magnetic field (B) you can find at a location in space (r) is proportional to a sum over all the electric currents (J) at all possible locations where you can find any current (r’), and inversely proportional to the square of the distance between where you’re looking for the magnetic field and where all the electrical currents are –it may say ‘inverse cube’ in the equation, but it’s actually an inverse square, since there’s a full power of length in the numerator. Yikes, what a sentence!

Additionally, the equation says that the direction of the magnetic field is at right angles to both the direction the current is traveling and the direction given by the line between where you’re looking for the magnetic field and where the current is located. These directions are all wrapped up in the arrows over every quantity in the equation and are determined by the cross product, denoted by the ‘x’. The difference between the two ‘r’ vectors in the numerator gives the direction between the location of a particular current element and where you’re looking for the magnetic field. The differential volume at the end confines the electric currents and simply means that you’re adding up locations in 3D space. The scaling constants outside the integral are geometrical and control strength; the 4 and Pi relate to the dimensionality of the field source radiated out into a full solid angle (it covers a singularity in the field due to the location of the field source), and the ‘μ’ essentially tells how space broadcasts the magnetic field… where the constant ‘μ’ is closely tied to the speed of light. This equation has the structure of a propagator: it takes an electric current located at r’ and propagates it into a field at r.

It may also be confusing that I’m calling the current ‘J’ when nearly every basic physics class calls it ‘I’… well, get used to it. J is the current density, the vector field version of a current.

I looked for some diagrams to help depict Biot-Savart’s components, but I wasn’t satisfied with what Google coughed up. Here’s a rendering of my own with all the important vectors labeled.

[Diagram: the Biot-Savart vectors labeled: the field point r, the current location r’, the difference r − r’, the current J and the resulting B]

Now, I showed the crazy Biot-Savart equation, but I can tell you right now that it is a pain in the ass to work with. Very few people wake up in the morning and say “Boy oh boy, Biot-Savart for me today!” For most physics students this equation comes with a note of dread. Directly using it to analytically calculate magnetic fields is not easy. That cross product and all the crazy vectors pointing in every which direction make this equation a monster. There are some basic features here which are common to many fields, particularly the inverse square, which you can find in the Newtonian gravity formula or Coulomb’s law for electrostatics, and the field being proportional to some source, in this case an electric current, where gravity has mass and electrostatics has charge.
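Numerically, though, Biot-Savart is just a big sum, and that’s easy. Here’s a small sketch (mine, not from the post) that chops a current loop into segments, applies the integrand element by element, and recovers the textbook field at the center of the loop, B = μ0I/2R:

```python
import numpy as np

# Chop a circular loop (radius R, current I) into segments and sum the
# Biot-Savart integrand at the loop's center; compare to B = mu0*I/(2R).
mu0 = 4e-7 * np.pi          # T*m/A
I, R = 1.0, 0.05            # 1 A through a 5 cm radius loop

phi = np.linspace(0.0, 2.0 * np.pi, 100_000, endpoint=False)
dphi = phi[1] - phi[0]
r_prime = np.stack([R * np.cos(phi), R * np.sin(phi), np.zeros_like(phi)], axis=1)
dl = np.stack([-R * np.sin(phi), R * np.cos(phi), np.zeros_like(phi)], axis=1) * dphi

sep = -r_prime                         # r - r' with the field point r = 0
dist = np.linalg.norm(sep, axis=1)     # |r - r'|
B = mu0 / (4 * np.pi) * np.sum(np.cross(I * dl, sep) / dist[:, None]**3, axis=0)

print(B[2], mu0 * I / (2 * R))  # both ~1.2566e-05 tesla
```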

The magnetic field becomes extraordinary because of that flipping (God damned, effing…) cross product, which means that it points in counter-intuitive directions. With electrostatics and gravity, the field is usually going toward or away from the source, while with magnetism the field seems to go ‘around’ the source. Moreover, unlike electrostatics and gravity, the source isn’t exactly a something, like a charge or a mass; it’s dynamic… as in a change of state: electric charges are present in a current, but if those charges sit stationary, even though they are still present, they can’t produce a magnetic field. Conversely, if you neutralize the charge, a magnetic field can still be present if those now invisible charges are moving to produce a current: current flowing in a copper wire is electric charges moving along the wire, and this produces a magnetic field around the wire, but the presence of positive charges fixed to the metal atoms of the wire neutralizes the negative charges of the moving electrons, resulting in a state of otherwise net neutral charge. So, no electrostatic field, even though you have a magnetic field. It might surprise you to know that neutron stars have powerful magnetic fields, even though their matter is almost entirely neutral. The requirement of moving charges to produce a magnetic field mirrors the requirement of a moving charge to feel force from a magnetic field. Admittedly, there’s more to it than just ‘currents,’ but I’ll get to that in another post.

With a little bit of algebraic shenanigans, Biot-Savart can be twisted around into a slightly more tractable form called Ampere’s Law, which is one of the four Maxwell’s equations that define electromagnetism. I had originally not intended to show this derivation, but I had a change of heart when I realized that I’d forgotten the details myself. So, I worked through them again just to see that I could. Keep in mind that this is really just a speed bump along the direction toward learning how magnets work.

For your viewing pleasure, the derivation of the Maxwell-Ampere law from the Biot-Savart equation.

In starting to set up for this, there are a couple fairly useful vector identities.

$$\text{(I1)}\quad \frac{\mathbf{r}-\mathbf{r}'}{\left|\mathbf{r}-\mathbf{r}'\right|^{3}} = -\nabla\!\left(\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}\right) \qquad \text{(I2)}\quad \nabla^{2}\!\left(\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}\right) = -4\pi\,\delta^{3}\!\left(\mathbf{r}-\mathbf{r}'\right) \qquad \text{(I3)}\quad \nabla\!\left(\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}\right) = -\nabla'\!\left(\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}\right)$$

This trio contains several basic differential identities which can be very useful in this particular derivation. Here, the variables r are actually vectors in three dimensions. For those of you who don’t know these things, all it means is this:

$$\mathbf{r} = x\,\hat{x} + y\,\hat{y} + z\,\hat{z} \qquad\qquad \mathbf{r}' = x'\,\hat{x} + y'\,\hat{y} + z'\,\hat{z}$$

These can be diagrammed like this:

[Diagram: r and r’ drawn from the origin, treating the origin as the corner of a 3D box]

This little diagram just treats the origin like the corner of a 3D box and each distance is a length along one of the three edges emanating from the corner.

I’ll try not to get too far afield with this quick vector tutorial, but it helps to understand that this is just a way to wrap up a 3D representation inside a simple symbol. The hatted symbols of x,y and z are all unit vectors that point in the relevant three dimensional directions where the un-hatted symbols just mean a variable distance along x or y or z. The prime (r’) means that the coordinate is used to tell where the electric current is located while the unprime (r) means that this is the coordinate for the magnetic field. The upside down triangle is an operator called ‘del’… you may know it from my hydrogen wave function post. What I’m doing here is quite similar to what I did over there before. For the uninitiated, here are gradient, divergence and curl:

$$\nabla f = \frac{\partial f}{\partial x}\,\hat{x} + \frac{\partial f}{\partial y}\,\hat{y} + \frac{\partial f}{\partial z}\,\hat{z}$$

$$\nabla\cdot\mathbf{A} = \frac{\partial A_{x}}{\partial x} + \frac{\partial A_{y}}{\partial y} + \frac{\partial A_{z}}{\partial z}$$

$$\nabla\times\mathbf{A} = \left(\frac{\partial A_{z}}{\partial y} - \frac{\partial A_{y}}{\partial z}\right)\hat{x} + \left(\frac{\partial A_{x}}{\partial z} - \frac{\partial A_{z}}{\partial x}\right)\hat{y} + \left(\frac{\partial A_{y}}{\partial x} - \frac{\partial A_{x}}{\partial y}\right)\hat{z}$$

Gradient works on a scalar function to produce a vector, divergence works on a vector to produce a scalar function and curl works on a vector to produce a vector. I will assume that the reader can take derivatives and not go any further back than this. The operations on the right of the equal sign are wrapped up inside the symbols on the left.

One final useful bit of notation here is the length operation. Length operation just finds the length of a vector and is denoted by flat braces as an absolute value. Everywhere I’ve used it, I’ve been applying it to a vector obtained by finding the distance between where two different vectors point:

$$\left|\mathbf{r}-\mathbf{r}'\right| = \sqrt{\left(x-x'\right)^{2} + \left(y-y'\right)^{2} + \left(z-z'\right)^{2}}$$

As you can see, notation is all about compressing operations away until they are very compact. The equations I’ve used to this point all contain a great deal of math lying underneath what is written, but you can muddle through by the examples here.

Getting back to my identity trio:

$$\text{(I1)}\quad \frac{\mathbf{r}-\mathbf{r}'}{\left|\mathbf{r}-\mathbf{r}'\right|^{3}} = -\nabla\!\left(\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}\right) \qquad \text{(I2)}\quad \nabla^{2}\!\left(\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}\right) = -4\pi\,\delta^{3}\!\left(\mathbf{r}-\mathbf{r}'\right) \qquad \text{(I3)}\quad \nabla\!\left(\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}\right) = -\nabla'\!\left(\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}\right)$$

The first identity here (I1) takes the vector object written on the left and produces a gradient from it… the quantity in the quotient of that function is the length of the difference between the two vectors, which is simply a scalar number without a direction, as shown in the length operation written above.

The second identity (I2) here takes the divergence of the gradient and reveals that it’s the same thing as a Dirac delta (incredibly easy way to kill an integral!). I’ve not written the operation as divergence on a gradient, but instead wrapped it up in the ‘square’ on the del… you can know it’s a divergence of a gradient because the function inside the parenthesis is a scalar, meaning that the first operation has to be a gradient, which produces a vector, which automatically necessitates the second operation to be a divergence, since that only works on vectors to produce scalars.

The third identity (I3) shows that the gradient with respect to the unprimed coordinates is equal to a negative sign times the same gradient with respect to the primed coordinates… which is a very easy way to switch between a derivative with respect to r and the same form of derivative with respect to r’.

To be clear, these identities are tailor-made to this problem (and similar electrodynamics problems) and you probably will never ever see them anywhere but the *cough cough* Jackson book. The first identity can be proven by working the gradient operation and taking derivatives. The second identity can be proven by using the vector divergence theorem in a spherical polar coordinate system and is the source of the 4*Pi that you see everywhere in electromagnetism. The third identity can also be proven by the same method as the first.
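
For readers who can’t see the images, my reconstruction of the trio in standard notation (these are the usual forms found in Jackson, so I’m confident they match) is:

```latex
\text{(I1)}\quad \frac{\vec{r}-\vec{r}\,'}{|\vec{r}-\vec{r}\,'|^{3}} = -\nabla\!\left(\frac{1}{|\vec{r}-\vec{r}\,'|}\right)
\qquad
\text{(I2)}\quad \nabla^{2}\!\left(\frac{1}{|\vec{r}-\vec{r}\,'|}\right) = -4\pi\,\delta^{3}(\vec{r}-\vec{r}\,')
\qquad
\text{(I3)}\quad \nabla\!\left(\frac{1}{|\vec{r}-\vec{r}\,'|}\right) = -\nabla'\!\left(\frac{1}{|\vec{r}-\vec{r}\,'|}\right)
```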

There are two additional helpful vector identities that I used, which I produced in the process of working this derivation. I will derive them here because, why not! If the math scares you, you’re on the wrong blog. To produce these identities, I used the component decomposition of the cross product and a useful Levi-Civita/Kronecker delta identity –I’m really bad at remembering vector identities, so I put a great deal of effort into learning how to construct them myself: my Levi-Civita technique is crude, but it works well enough. For those of you who don’t know the ol’ Levi-Civita symbol, it’s a pretty nice tool for constructing things in a component-wise fashion: εijk. To make this work, you just have to remember it as I just wrote it… if any two indices are equal, the symbol is zero; if they are all different, it is 1 or -1. If you take it as ijk, with the indices all different as I wrote, it equals 1 and becomes -1 if you swap two of the indices: ijk=1, jik=-1, jki=1, kji=-1 and so on and so forth. Here are the useful Levi-Civita identities as they relate to the cross product:

levicivita
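
In standard index notation, the two Levi-Civita facts I lean on are the textbook ones (written here in case the image is unavailable):

```latex
(\vec{A}\times\vec{B})_i = \sum_{j,k}\epsilon_{ijk}\,A_j B_k
\qquad
\sum_{k}\epsilon_{ijk}\,\epsilon_{klm} = \delta_{il}\,\delta_{jm} - \delta_{im}\,\delta_{jl}
```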

Using these small tools, the first vector identity that I need is a curl of a curl. I derive it here:

vector id 1

Let’s see how this works. I’ve used colors to show the major substitutions and tried to draw arrows where they belong. If you follow the math, you’ll note that the Kronecker deltas have the intriguing property of trading out indices in these sums. A Kronecker delta works on a finite sum the same way a Dirac delta works on an integral, which is nothing more than an infinite sum. Also, the Einstein index convention says that if you see duplicated indices without an explicit sum on that index, you associate a sum with that index… this is how I located the divergences in that last step. This identity is a soft stopping point for the double curl: I could have used the derivative product rule to expand it further, but that isn’t needed (if you want to see it get really complex, go ahead and try it! It’s doable.) One will note that I have double del applied on a vector here… I said above that it only applies to scalars… in this form, it would only act on the scalar portion of each vector component, meaning that you would end up with a sum of three terms multiplied by unit vectors! Double del only ever acts on scalars, but you actually don’t need to know that in the derivation below.

This first vector identity I’ve produced I’ll call I4:

useful vector id 1
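
Written out in standard notation, I4 is the familiar curl-of-a-curl identity (a textbook result, so I’m confident this matches the image):

```latex
\text{(I4)}\quad \nabla\times(\nabla\times\vec{A}) = \nabla(\nabla\cdot\vec{A}) - \nabla^{2}\vec{A}
```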

Here’s a second useful identity that I’ll need to develop:

useful vector id 2

This identity I’ll call I5:

vector id 2
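
I can’t reproduce the image here, but based on where I5 gets used below (pulling a del through the product of a scalar function and the current), my best guess is that it is the standard product rule for a divergence. Treat this as my reconstruction, not a transcription:

```latex
\text{(I5, my reconstruction)}\quad \nabla\cdot(f\vec{A}) = f\,(\nabla\cdot\vec{A}) + \vec{A}\cdot(\nabla f)
```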

*Pant Pant* I’ve collected all the identities I need to make this work. If you don’t immediately know something off the top of your head, you can develop the pieces you need. I will use I1, I2, I3, I4 and I5 together to derive the Maxwell-Ampere Law from Biot-Savart. Most of the following derivation comes from Jackson Electrodynamics, with a few small embellishments of my own.

first line amp dev

In this first line of the derivation, I’ve rewritten Biot-Savart with the constants outside the integral and everything variable inside. Inside the integral, I’ve split the meat so that the different vector and scalar elements are clear. In what follows, it’s very important to remember that unprimed del operators are in a different space from the primed del operators: a value (like J) that is dependent on the primed position variable is essentially a constant with respect to the unprimed operator and will render a zero in a derivative by the unprimed del. Moreover, the unprimed del can be moved into or out of the integral, which is taken with respect to the primed position coordinates. This observation is profoundly important to this derivation.
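
For orientation, the starting point in standard SI notation is the Biot-Savart law as Jackson writes it (reproduced here in case the image doesn’t load):

```latex
\vec{B}(\vec{r}) = \frac{\mu_0}{4\pi}\int \vec{J}(\vec{r}\,')\times\frac{\vec{r}-\vec{r}\,'}{|\vec{r}-\vec{r}\,'|^{3}}\, d^3r'
```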

BS to amp 1

The usage of the first two identities here manages to extract the cross product from the midst of the function and puts it into a manipulable position where the del is unprimed while the integral is primed, letting me move it out of the integrand if I want.

BS to amp 2

This intermediate contains another very important magnetic quantity in the form of the vector potential (A) –“A” here not to be confused with the alphabetical placeholder I used while deriving my vector identities. I may come back to vector potential later, but this is simply an interesting stop-over for now. From here, we press on toward the Maxwell-Ampere law by acting in from the left with a curl onto the magnetic field…

BS to amp 3

The Dirac delta I end with in the final term allows me to collapse r’ into r at the expense of that last integral. At this point, I’ve actually produced the magnetostatic Ampere’s law if I feel like claiming that the current has no divergence, but I will talk about this later…

BS to amp 4

This substitution switches del from being unprimed to primed, putting it in the same terms as the current vector J. I use integration by parts next to switch which element of the first term the primed del is acting on.

BS to amp 5

Were I being really careful about how I depicted the integration by parts, there would be a unit vector dotted into the J in order to turn it into a scalar sum in that first term ahead of the integral… this is a little sloppy on my part, but nobody ever cares about that term anyway because it’s presupposed to vanish at the limits where it’s being evaluated. This is a physicist trick similar to pulling a rug over a mess on the floor –I’ve seen it performed in many contexts.

BS to amp 6

This substitution is not one of the mathematical identities I created above; this is purely physics. In this case, I’ve used conservation of charge to connect the divergence of the current vector to the change in charge density over time. If you don’t recognize the epic nature of this particular substitution, take my word for it… I’ve essentially inverted magnetostatics into electrodynamics, asserting that a ‘current’ is actually a form of moving charge.
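
In equation form, this is just the continuity equation, written in the primed coordinates where J and the charge density live:

```latex
\nabla'\cdot\vec{J}(\vec{r}\,',t) = -\frac{\partial \rho(\vec{r}\,',t)}{\partial t}
```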

BS to amp 75

In this line, I’ve switched the order of the derivatives again. Nothing in the integral is dependent on time except the charge density, so almost everything can pass through the derivative with respect to time. On the other hand, only the distance is dependent on the unprimed r, meaning that the unprimed del can pass inward through everything in the opposite direction.

BS to amp 8

At this point something amazing has emerged from the math. Pardon the pun; I’m feeling punchy. The quantity I’ve highlighted blue is a form of Coulomb’s law! If that name doesn’t tickle you at the base of your spine, what you’re looking at is the electrostatic version of the Biot-Savart law, which makes electric fields from electric charges. This is one of the reasons I like this derivation and why I decided to go ahead and detail the whole thing. This shows explicitly a connection between magnetism and electrostatics where such connection was not previously clear.
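
For reference, the blue-highlighted quantity should be Coulomb’s law in its field-integral form (the standard expression, written here in case the image doesn’t load):

```latex
\vec{E}(\vec{r}) = \frac{1}{4\pi\epsilon_0}\int \rho(\vec{r}\,')\,\frac{\vec{r}-\vec{r}\,'}{|\vec{r}-\vec{r}\,'|^{3}}\, d^3r'
```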

BS to amp 9

And thus ends the derivation. In this casting, the curl of the magnetic field is dependent both on the electric field and on currents. If there is no time varying electric field, that first term vanishes and you get the plain old magnetostatic Ampere’s law:

Ampere's law

This says simply that the curl of the magnetic field is proportional to the current density. There are some interesting qualities to this equation because of how the derivation leaves only a single positional dependence. As you can see, there is no separate position coordinate to describe the magnetic field independently from its source. And, really, it isn’t describing the magnetic field as ‘generated’ by the current, but rather saying that a local curling of the magnetic field is due to the presence of a current at that location… which is an interesting way to relate the two.
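
In standard notation, the two endpoints of the derivation are (textbook forms, so I’m confident they match the images):

```latex
\nabla\times\vec{B} = \mu_0\vec{J} + \mu_0\epsilon_0\frac{\partial\vec{E}}{\partial t}
\qquad\longrightarrow\qquad
\nabla\times\vec{B} = \mu_0\vec{J}
\quad\text{when}\quad \frac{\partial\vec{E}}{\partial t} = 0
```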

This relationship tends to cause magnetic lines to orbit around the current vector.

magcur

This image from hyperphysics sums up the whole situation –I realize I’ve been saying something similar since way back at the top, but this equation is the proof. If you have current passing along a wire, magnetic field will tend to wrap around the wire in a right handed sense. For all intents and purposes, this is all Ampere’s law says, neglecting that you can manipulate the geometry of the situation to make the field do some interesting things. But, this is all.

Well, so what? I did a lot of math. What, if anything, have I gained from it? How does this help me along the path to understanding magnets?

Ampere’s law is useful for generating very simple magnetic field configurations that can be used in the Lorentz force law, ultimately showing a direct dynamical connection between moving currents and magnetic fields. I have it in mind to show a freshman-level example of how this is done in the next part of this series. Given the length of this post, I will do more math in a different post.
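
For reference, the Lorentz force law in question is the standard one, which I’ll come back to in that later post:

```latex
\vec{F} = q\left(\vec{E} + \vec{v}\times\vec{B}\right)
```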

This is a big step in the direction of learning how magnets work, but it should leave you feeling a little unsatisfied. How exactly do the forces work? In physics, it is widely known that magnetic fields do no work, so why is it that bar magnets can drag each other across the counter? That sure looks like work to me! And if electric currents are necessary to drive magnets, why is it that bar magnets and horseshoe magnets don’t require batteries? Where are the electric currents that animate a bar magnet and how is it that they seem to be unlimited or unpowered? These questions remain to be addressed.

Until the next post…

What is a qubit?

I was trolling around in the comments of a news article presented on Yahoo the other day. What I saw there has sort of stuck with me and I’ve decided I should write about it. The article in question, which may have been by an outfit other than Yahoo itself, was about the recent decision by IBM to direct a division of people toward the task of learning how to program a quantum computer.

Using the word ‘quantum’ in the title of a news article is a surefire way to incite click-bait. People flock in awe to quantum-ness even if they don’t understand what the hell they’re reading. This article was a prime example. All the article really talked about was that IBM has decided that quantum computers are now a promising enough technology that they’re going to start devoting themselves to the task of figuring out how to compute with them. Note, the article spent a lot of time kind of masturbating over how marvelous quantum computers will be, but it really actually didn’t say anything new. Another tech company deciding to pretend to be in quantum computing by figuring out how to program an imaginary computer is not an advance in our technology… digital quantum computers are generally agreed to be at least a few years off yet, and they’ve been a few years off for a while now. There’s no guarantee that the technology will suddenly emerge into the mainstream –and I’m neglecting the D-Wave quantum computer because it is generally agreed among experts that D-Wave hasn’t even managed to prove that their qubits remain coherent through a calculation (which they would need in order to be a useful quantum computer), let alone that they achieved anything at all by scaling it up.

The title of this article was a prime example of media quantum click-bait. The title boldly declared that “IBM is planning to build a quantum computer millions of times faster than a normal computer.” Now, that title was based on an extrapolation in the midst of the article where a quantum computer containing a mere 1,000 qubits suddenly becomes the fastest computing machine imaginable. We’re very used to computers that contain gigabytes of RAM now, which amounts to several billion on-off switches on the chip, so a mere 1,000 qubits seems like a really tiny number. This should be weighed against the general concern in the physics community that an array of even 100 entangled qubits may exceed what’s physically possible… and it neglects that the difficulty of dealing with entangled systems increases exponentially with the number of qubits to be entangled. Scaling up normal bits doesn’t bump into the same difficulty. I don’t know if it’s physically possible or not, but I am aware that IBM’s declaration isn’t a major break-through so much as splashing around a bit of tech gism to keep the stockholders happy. All the article really said was that IBM has happily decided to hop on the quantum train because that seems to be the thing to do right now.

I really should understand that trolling around in the comments on such articles is a lost cause. There are so many misconceptions about quantum mechanics running around in popular culture that there’s almost no hope of finding the truth in such threads.

All this background gets us to what I was hoping to talk about. One big misconception that seemed to be somewhat common among commenters on this article is that two identical things in two places actually constitute only one thing magically in two places. This may stem from a conflation of what a wave function is versus what a qubit is and it may also be a big misunderstanding of the information that can be encoded in a qubit.

In a normal computer we all know that pretty much every calculation is built around representing numbers using binary. As everybody knows, a digital computer switch has two positions: we say that one position is 0 and the other is 1. An array of two digital on-off switches then can produce four distinct states: in binary, to represent the on-off settings of these states, we have 00, 01, 10 and 11. You could easily map those four settings to mean 1, 2, 3 and 4.

Suppose we switch now to talk about a quantum computer where the array is not bits anymore, but qubits. A very common qubit to talk about is the spin of an atom or an electron. This atom can be in two spin states: spin-up and spin-down. We could easily map the state spin-up to be 1, and call it ‘on,’ while spin-down is 0, or ‘off.’ For two qubits, we then get the states 00, 01, 10 and 11 that we had before, where we know what states the bits are in, but we can also turn around and invoke entanglement. Entanglement is a situation where we create a wave function that contains multiple distinct particles at the same time, such that the states those particles occupy are interdependent, based upon what we can’t know about the system as a whole. Note, these two particles are separate objects, but they are both present in the wave function as separate objects. For two spin-up/spin-down type particles, this gives access to the so-called singlet and triplet states in addition to the normal binary states that the usual digital register can explore.

The quantum mechanics works like this. For the system of spin-up and spin-down, the usual way to look at this is in increments of spinning angular momentum: spin-up is a 1/2 unit of angular momentum pointed up while spin-down is -1/2 unit of angular momentum, pointed the opposite direction because of the negative sign. For the entangled system of two such particles, you can get three different values of entangled angular momentum: 1, 0 and -1. Spin 1 has both spins pointing up, but not ‘observed,’ meaning that it is completely degenerate with the 11 state of the digital register since it can’t fall into anything but 11 when the wave function collapses. Spin -1 is the same way: both spins are down, meaning that they have 100% probability of dropping into 00. The spin 0 state, on the other hand, is kind of screwy, and this is where the extra information encoding space of quantum computing emerges. The 0 states could be the symmetric combination of spin-up with spin-down or the anti-symmetric combination of the same thing. Now, these are distinct states, meaning that the size of your register just expanded from (00, 01, 10 and 11) to (00, 01, 10, 11 plus anti-symmetric 10-01 and symmetric 10+01). So, the two qubit register can encode 6 possible values instead of just 4. I’m still trying to decide if the spin 1 and -1 states could be considered different from 11 and 00, but I don’t think they can since they lack the indeterminacy present in the different spin 0 states. I’m also somewhat uncertain whether you have two extra states to give a capacity in the register of 6 or just 5, since I’m not certain what the field has to say about the practicality of determining the phase constant between the two mixed spin-up/spin-down eigenstates, since this is the only way to determine the difference between the symmetric and anti-symmetric combinations of spin.
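
For concreteness, the standard way to write the four entangled states of two spin-1/2 particles (the triplet trio plus the singlet) is:

```latex
|1,1\rangle = |\uparrow\uparrow\rangle
\qquad
|1,0\rangle = \tfrac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right)
\qquad
|1,-1\rangle = |\downarrow\downarrow\rangle
\qquad
|0,0\rangle = \tfrac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right)
```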

As I was writing here, I realized also that I made a mistake myself in the interpretation of the qubit as I was writing my comment last night. At the very unentangled minimum, an array of two qubits contains the same number of states as an array of two normal bits. If I consider only the states possible by entangled qubits, without considering the phasing constant between 10+01 and 10-01, this gives only three states, or at most four states with the phase constant. I wrote my comment without including the four purely unentangled cases, giving fewer total states accessible to the device, or at most the same number.

Now, the thing that makes this incredibly special is that the number of extra states available to a register of qubits grows exponentially with the number of qubits present in the register. This means that a register of 10 qubits can encode many more numbers than a register of ten bits! Further, this means that fewer bits can be used to make much bigger calculations, which ultimately translates to a much faster computer if the speed of turning over the register is comparable to that of a more conventional computer –which is actually somewhat doubtful since a quantum computer would need to repeat calculations potentially many times in order to build up quantum statistics.

One of the big things that is limiting the size of quantum computers at this point is maintaining coherence. Maintaining coherence is very difficult and proving that the computer maintains all the entanglements that you create 100% of the time is exceptionally non-trivial. This comes back to the old cat-in-the-box difficulty of truly isolating the quantum system from the rest of the universe. And, it becomes more non-trivial the more qubits you include. I saw a seminar recently where the presenting professor was expressing optimism about creating a register of 100 Josephson junction type qubits, but was forced to admit that he didn’t know for sure whether it would work because of the difficulties that emerge in trying to maintain coherence across a register of that size.

I personally think it likely that we’ll have real digital quantum computers in the relatively near future, but I think the jury is still out as to exactly how powerful they’ll be when compared to conventional computers. There are simply too many variables yet which could influence the power and speed of a quantum computer in meaningful ways.

Coming back to my outrage at reading comments in that thread, I’m still at ‘dear god.’ Quantum computers do not work by teleportation: they do not have any way of magically putting a single object in multiple places. The structure of a wave function is defined simply by what you consider to be a collection of objects that are simultaneously isolated from the rest of the universe at a given time. A wave function quite easily spans many objects all at once since it is merely a statistical description of the disposition of that system as seen from the outside, and nothing more. It is not exactly a ‘thing’ in and of itself, inasmuch as collections of indescribably simple objects tend to behave in absolutely consistent ways among themselves. Where it becomes wave-like and weird is that we have definable limits to how precisely we can understand what’s going on at this basic level, and our inability to directly ‘interact’ with that level more or less assures that we can’t ever know everything about that level or how it behaves. Quantum mechanics follows from there. It really is all about what’s knowable; building a situation where certain things are selectively knowable is what it means to build a quantum computer.

That’s admittedly pretty weird if you stop and think about it, but not crazy or magical in that wide-eyed new agey smack-babbling way.

A Physicist Responds to “The Three Body Problem” part 2

To start with, this post will be almost pure spoiler. I’m assuming, if you got through part 1, that you’ve read Cixin Liu’s book.

I’ve gotten partway through the second book in the trilogy myself, meaning that I’ve had some additional time to think about the contents of this post, but that I don’t know the ultimate outcome of the series.

This post is addressing a central conclusion of the first book, a major piece of science fiction that I didn’t address in the previous post because it is so intrinsic to the plot. This is about the idea of the Sophon induced ‘science lock-down.’ An alien race is going to invade the planet Earth in 400 years and this race is concerned that Human technology will advance in that time to be more powerful than the alien race’s own technology, so the aliens have played a trick to prevent humans from performing fundamental scientific research in order to prevent human technology from developing.

The key to this is the idea of the “Sophon.” As mentioned in the previous post, the word ‘proton’ was chosen over the name of an actual fundamental particle in order to facilitate a wordplay in Chinese… particularly the Chinese word that got translated into English as “Sophon.” This word was chosen as a modification of the word “Sophont.” As any science fiction aficionado can tell you, this word means “intelligent creature.” A Sophon is intended to be an intelligent proton, a robot the size and mass of a subatomic particle. These Sophons are capable to some extent of changing their size and shape and can communicate back to the aliens instantaneously. Sophons can also travel, as subatomic particles, at very nearly the speed of light.

You can see right from that paragraph the first place where the Sophon (and therefore the idea of science lock-down) is broken. Sophons communicate with the aliens instantaneously by means of quantum entanglement. If you’ve read anything else I’ve written, you know how I feel about the cliche of the ‘Ansible.’ Entanglement can’t be used to pass information: quantum mechanics doesn’t allow for it, no matter how you misinterpret the effect. Entanglement means correlation, not necessarily communication. This quantum mechanical effect is an interesting and very real phenomenon, but to understand what it actually means, you need to understand more about the rest of what quantum is… the story of ‘Three Body Problem’ never goes there. I won’t go there either, except to suggest learning about the Bell Inequality.

The reason that Sophons are capable of producing science lock-down is that they can falsify data coming out of particle accelerators. Sophons can fly through the sensors in particle detectors and trigger them falsely, creating intelligently designed noise. On the surface, this is a horrible prospect, making it impossible for Humans to probe the deep structure of matter and therefore attain the understandings necessary to build Sophons ourselves. Do not pass go; no ‘correct’ results means no good science!

Obviously, this looks really bad. Very interesting science fiction idea. On the other hand, it also demands a bit of discussion, both about how particle accelerators work and on how science works.

Particle accelerators are the wrecking ball of the scientific enterprise. They generate data almost entirely by accelerating charged particles up to substantial fractions of the speed of light and slamming them into each other and into stationary targets. Particle physicists are all about impact cross sections and statistical probabilities of outcomes. The gold standard of a discovery in particle physics is a 5-sigma observation. ‘Sigma’ is, of course, standard deviation, a statistical standard by which scientists use the Gaussian distribution to judge probability of occurrence –it’s the Bell Curve. Average is the peak of this curve, while one standard deviation is either one sigma to the left or right of average. Particle physics is set up around a simple statistical weight tabulation which can be couched as a question: “How likely is it that my observation is false/true?” If an event observed in the accelerator is spurious –that is, if the event is noise– the statistical machinery of particle physics places it close to the peak of the Bell Curve, that is at the average, which is to say that the event observed is ‘not different’ from noise. A 5-sigma event is an event which has been so well observed statistically that the difference from noise is five standard deviations from the peak of the Bell curve out into the tail (99.99994% of the curve’s area is captured within five sigma of the mean!) This is essentially like saying that a conclusion is better than 99.9999% certain to be NOT false.
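
If you want to make those sigma numbers concrete, a couple of lines of Python will do it (my own sketch; the post itself contains no code, and scipy’s norm.sf is just the one-sided Gaussian tail area):

```python
# Quick check of what an N-sigma observation means on the Bell curve.
# norm.sf(k) gives the one-sided tail area beyond k standard deviations.
from scipy.stats import norm

for k in (1, 2, 3, 5):
    tail = norm.sf(k)            # probability of noise fluctuating this far up
    inside = 1 - 2 * norm.sf(k)  # area captured within +/- k sigma of the mean
    print(f"{k}-sigma: one-sided tail = {tail:.2e}, within +/-{k} sigma = {inside:.7%}")
```

Running this shows that 5-sigma corresponds to a one-sided tail of about 2.9e-7, which is where the 99.99994% figure above comes from.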

Do you know how big a particle accelerator data set is? They include billions of events. Particle accelerators run for months to years on end, collecting data automatically 24 hours a day. And, the whole enterprise is based on the assumption that every observation independently might be a false outcome. Statistical weight determines the correctness of an observation. Physical theory exists to model both the trends and noise of an experiment.

As I said above, the purpose of the Sophons is to produce false results within the sensors of an accelerator’s detector apparatus. The major detection devices in modern systems are calorimeters and photomultipliers. Calorimeters simply detect heat deposition within the sensor volume while photomultipliers give a small current pulse when they catch the light kicked up by a passing electric charge. Usually, detector assemblies contain layers of sensors wrapped around the collision target, where photomultipliers form multiple inner layers and calorimeters reside around the outside of the whole assembly. There are usually also magnetic fields applied through the detector so that charged particles will tend to follow curving paths as they pass outward through the different layers away from the collision site. There are other detector technologies and refinements of these ideas, but this gives a basic taste.

Here is the ATLAS detector at the Large Hadron Collider:

atlasdet

Using this layered design, photomultipliers can resolve the path of outward flying particles, determining their charges based upon their path curvature through the magnetic fields established by the solenoids, and then the calorimeters determine how much energy was in each particle when that particle heats the calorimeter upon crashing into it. Certain particle types penetrate shields differently, necessitating layers of calorimeters with different structural characteristics in order to resolve different particle types. Computers correlate detection traces between the layers and tabulate what heat depositions relate to which flight paths. Particle physicists can then do simple arithmetic to count up all the heats and all the charges on all the particles detected for one collision event and deduce which subatomic particles appeared during a particular collision. Momentum and energy/mass get conserved relativistically while charge is directly conserved, and you simply add up what went in in order to account for what comes out during a collision.
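
In equation form, that accounting is nothing more than relativistic four-momentum conservation plus charge conservation across the collision:

```latex
\sum_{\text{in}} p^{\mu} = \sum_{\text{out}} p^{\mu}
\qquad
\sum_{\text{in}} q = \sum_{\text{out}} q
```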

In order to falsify data within such a detector, the smart subatomic particle, the Sophon, would need to fly back and forth through the detector layers, switching its charge polarity between passes and somehow dumping heat into calorimeters without being destroyed or lost in some way. How the Sophons get their kinetic energy is somewhat opaque in the story, and I spent some time abortively rereading TBP trying to figure this out, but it can be assumed that they possess a self-contained power supply which enables them to either recharge themselves from their surroundings or simply dip into a long term battery reserve whenever they need it. They are clearly able to accelerate to highly relativistic velocities in a self-contained manner, since they flew across the void from the alien homeworld to Earth and then slowed down without external assistance at Earth. You could presume that they are able to write completely fake collision events into the detector, pretending to travel at wrong velocities and masquerading as false charges and masses.

Now, like I said, this is terrible! The experiments can’t always give reliable results. Never mind that the real experiments must always be filtered for the fact that false results exist in the data set anyway.

In the paragraph above, I said “can’t always give reliable results” because the real data set of collision events still exists behind the fake data set. The Sophon flying back and forth can’t prevent real particle collisions from occurring and also interacting with the detector. The particle physicists would actually know right away that something isn’t right with the systematic structure of the experiment because they know how many particles are in their particle beams and also know the cross-sections of interaction, meaning that they start the experiment knowing statistically how many collision events to expect in a unit of time: Sophon interference with the experiment would only increase over the expected number. What you get is two overlapping data sets, one that is false and one that’s true. If the false data is much different from the true data, you inevitably bin them as distinct results because they would create a bimodal distribution to your data set… some measurements add up to five-sigma toward one result while a distinct set will ultimately add up as five-sigma toward something distinctly different. Then, you just let the theorists work out what’s what.

In the story, the scientists just throw up their hands and declare ‘sophon barrier’ saying that science ‘can’t advance’ because it can’t discern correctness.

This prospect has really kind of sat in the back of my mind, nagging me. I’m not completely certain that the author understands the overall scientific mindset or philosophy. Science starts out assuming that all results might be false! Having a falsehood layered on top of other potential falsehoods is really not that deterring to me, particularly since the scientists know the Sophon interference is present by the end of the story. Science as a process is intrinsically concerned with error checking and finding systematic interference, even intelligent fabrication of data within the scientific community –you think the Sophons are bad: somebody simply altering the data set as they see fit, completely independent of the experiment, is worse. And, we deal with this in reality! At least with the Sophons, a real data set must sit behind the mixture of false events. If the data set is merely bimodal or multimodal with statistics backing up each conclusion, you design experiments to address each… at some point, consistency of a result must ultimately dominate. Sorting out this noise would take time, but it would be unable to stop progress overall, especially since the scientists know the noise is present!

Now, giving false data is actually somewhat different from prohibiting data collection. This facet of the story is somewhat unclear to me –my memory fails. You can imagine that the Aliens realize that the humans know about the tampering and, rather than leaving humans with a data set that contains some good data, they would simply have their Sophons swamp the detectors. In this, the Sophons fly back and forth within the detector giving so many false events that they prohibit the detector from being able to trigger for the resolution of real events. They could simply white us out!

While this would indeed be a bad thing, it would have a sort of perverse effect on a real scientist. Consider: you know how fast your instrument triggers and you know the latency required for it to recover… this gives you a measure of how quickly and at what frequency the Sophon must act! You can just imagine the particle beam physicist salivating at the prospect of his Nobel prize in the nascent field of Sophon physics. Imagine the flood of grant proposals around the subject of baiting a Sophon into a particle beam line by the performance of basic science, only to turn the particle beam against the Sophon in order to smash it apart and see how it works!

Really, if you were a high energy physicist and you knew unequivocally that a smart particle was flying around inside your instrument, how could you not be trying to figure out a way to probe it? It’s like getting Maxwell’s demon handed to you on a shiny platter!

A realistic outcome here is actually not the prohibition of science. It would be an arm-wrestling match with the Aliens: at the very best, leaving us with a partial data set that we can ultimately advance with, or giving us the chance to probe the Sophons directly.

The prospect of probing the Sophons directly contains the danger that it would be hard to distinguish engineered results from real ones, but every demonstration by the Sophons of some other confusing behavior is in fact data itself. The author made a huge argument in “Three Body Problem” that Sophons are typically point-like and would probably subscribe to the notion that they can’t be probed since they would essentially have no collision cross-section; I would resist this idea because it either violates or misunderstands quantum mechanics, which I detailed a bit in the previous post. The author might even suggest that Sophons can’t be probed because they can dodge collisions with other particles in the collider, but I would doubt that simply because of the inability of the Sophon to know things about other particles, due to simple quantum mechanics and the effect of relativity limiting the rates of information flow: the decision would need to be made very quickly and it would have a built-in imprecision from the Uncertainty principle! Moreover, the more time the Sophons spend performing confusing behavior in order to foil their own direct examination, the less time they can spend faking data in the experiments directed at basic research. As you may be aware, machines like the LHC are actually devoted to many lines of research simultaneously and physicists are remarkably adept at piggybacking one experiment on top of another in order to conserve resources and obtain additional bang for the same buck.

One final aspect of the “science lock-down” which I take some umbrage with is the notion that only particle accelerators are responsible for fundamental research. They aren’t. There is a huge branch of physics and chemistry probing quantum mechanics based on spectroscopy. Lasers are unequivocally a quantum mechanical device and much probing into basic quantum mechanics is performed by some variation on the theme of lasing. The Nobel prize winning discovery of the Bose-Einstein condensed matter phase did not occur in a super-collider; it occurred on an optical bench. Most super precise clock mechanisms used by the human race at this point are optical devices and have absolutely nothing to do with particle accelerators –optical gratings and optical metrology are driving the expansion of precision measurement! The leaps which are in the process of producing quantum computers (one device the author specifically prohibits in book 2 under the science lockdown!) are not being made at particle accelerators at all: they are being made in optical lattice traps on lab benches and in photo-etched masks used to produce nano-scale solid state resonators. We are currently in the process of building analog quantum computers for the purposes of simulating quantum chromodynamic systems using optical and nano-resonator devices… and this development has nothing to do with particle accelerators, except as a means of reproducing results! The author made the argument that humans couldn’t build massive super-collider accelerators, synchrotrons and linacs, fast enough to match the production capacity that the Aliens have for making the sophons needed to foil these instruments, but the author never even touched on the rapidly expanding field of plasma wakefield acceleration, which uses lasers to accelerate particles to relativistic speeds in bench-top apparatuses for a fraction of the price of a super-collider.

The bleeding edge of physics is very multi-pronged; the Higgs boson discovery carried out in a synchrotron may someday be reproduced by a bench-top plasma wakefield accelerator for a tiny fraction of the price. Can ‘locking down’ big particle accelerators like the LHC prohibit the extensive physical exploration that is occurring due to a mostly unrelated black swan technological development like lasers? I really don’t think it can. Tying one arm behind your back leaves you with the other arm. It’s true that the mothballing of the superconducting super-collider in the United States prevented humans from definitively discovering the Higgs boson for more than a decade, but that isn’t to say that there aren’t other avenues to the same discovery.

Do I think that science lockdown is possible by the means suggested by the author? Not really. And, especially not for devices like quantum computers, which is one critical development that the author suggests is prohibited by sophon interference in the second book.

Don’t get me wrong, this is a good piece of science fiction and it’s a wonderful thought experiment, but like many thought experiments, it’s arguable.

edit 2-16-17

I saw a physics colloquium yesterday delivered by a Nobel prize winner. His lab is currently working on a molecular spectroscopy experiment directed at measuring the electric dipole moment of the electron. A precision measurement of this value ties directly to the existence (or not) of supersymmetric particle theory… which is one candidate expansion of the Standard Model of particle physics. This experiment is not being done in a super collider, but on an optics bench for a fraction of the price. Experiments like this one completely invalidate the thesis of Three Body Problem: that by locking down colliders, there is no other way for particle physics to advance. There are other ways, comparatively cheap ways requiring fewer resources and less manpower. Physics would find a way.

A Physicist Responds to “The Three Body Problem”

I’ve not had much motivation to post recently: it seems like I read another article every week or so where some fool is drawing the same wrong conclusions about Quantum Mechanics or Relativity or AI, or all of the above, simultaneously. It gets exhausting to read. I also haven’t had time for constructing a post on my recent problem work, in part because I’m prepping for a major exam.

But, I need some time to take a break and change my focus. So, I decided to write a bit about some things I saw in Liu Cixin’s “The Three-Body Problem,” of which I read Ken Liu’s translation. If you’re not familiar with this book, I would highly recommend it. This book deservedly won the Hugo award and was nominated for the Nebula, and it is one piece of science fiction that is truly worth going through.

One of my non-spoiling responses here is that it shows how another culture, namely the Chinese culture, can go to extremes with how it treats Scientists and Intelligentsia and all the different ways that this relationship can oscillate back and forth. It shows too the humanity of scientists, both for better and worse. Based on the structure of the story, it’s clear to me that the author has respect for the scientific disciplines which is usually not so present in western literature anymore. I was also quite happy that characters were not meaninglessly fed to the meat grinder in the way they are too often in many western books in the supposed name of ‘authenticity.’

With that said and my badge of worthiness placed, we will get to the actual purpose of this post… some places where Liu Cixin’s science fiction Author-itis shows through.

header

The great problem with many science fiction writers is that they know just enough to be dangerous, but not enough to be right. Where they fall apart is when they start to over-explain the phenomenology of what’s happening in their stories in order to ‘make it work.’ There are two places I will talk about where this happened in ‘3BP’.

The first is the Zither.

To start with, I loved the idea of the zither. It was a very classy, ingenious use for the cliche of the monofilament wire. Note first that this is a cliche (a ‘trope’ maybe, but I detest that word for its cliche overuse). In the form that appeared in 3BP, nanomaterial monofilament 1/1000th the thickness of hair is strung in strings like a zither between pilings across a straight section of the Panama canal as an ambush trap for an oil tanker being used by the villains. The strings are strung between the banks of the canal attached to chains that can be raised and lowered so that ships which aren’t the target can be allowed through the canal unhindered. When the target ship approaches, the monofilaments are pulled up across the canal by tightening the chains such that the filaments are held in an invisible web of horizontal strands above water line, spaced from each other by only a few feet, like a big hardboiled egg slicer. The author even makes allowances for how the monofilaments can be attached to the chains so as not to shred the anchoring when the target ship pushes against them. When the ship hits the zither, it sails silently through and continues on until the engine of the ship rips itself to pieces and causes the whole boat to slide apart in sections.

You have to admit, it’s a nifty trap. The monofilament in question is described as a material intended for use building orbital elevators and is dubbed ‘nanotechnology’ by the story.

The great stumbling point most people have about nanotechnology is that it is not tiny without limit: it exists in a scale gap of less than 1 micron and more than 1 nanometer. For comparison, hair is about 100 microns thick and the length of a carbon-carbon sigma bond is about 0.1 nanometers; the zither monofilaments are therefore about 100 nanometers. This is a crossover regime where building structures by top-down bulk techniques, like photo etching, becomes hard, while building from the bottom up by chemistry is also hard. In general, this is a big enough scale that quantum mechanical effects become small and statistical mechanics tends to dominate manipulation. At the nanoscale, everything we understand about how the basic level of material stuff holds together remains true. Nanoscale is small, but not so small that objects are markedly quantum mechanical, and not so big that they behave like bulk objects. That’s why ‘nano’ is difficult: it sits at an uncomfortable seam between the classical and quantum universes where the tools for one or the other aren’t quite right for doing what needs to be done.

Cutting material happens by a process called scission. The act of ‘scission’ is, by definition, the breakage of a long chain molecule into two shorter chain molecules. It means separating at least one chemical bond in order to free a unified mass into two independent parts. And, a chemical bond always has at least two electrons, since the bond state must consist of spin-up and spin-down parts in order to cancel out angular momentum… and that’s pretty much the theme of chemistry: stable states mostly have angular momentum canceled. There are some special exceptions, but these do not define the rule. Still, since you can’t subdivide an electron, splitting a bond means intact electrons residing somewhere, no longer in a quantum mechanical ground state, and also atoms lacking complete valence shells. This means that the system, immediately after scission, will have a strong desire to rearrange by chemical reaction into a more stable state. What will it react with? Whatever is close by… in this case, the monofilament wire! This kind of process is part of why blades dull over time: for a conventional metal knife cutting a metal structure, the structure is literally ‘cutting’ the knife too and blunting its edge. With a nanofiber, there isn’t much mass to wear away.

This is one of the difficulties in scaling up nanotechnology: they usually become fragile!

Overlooking this fragility issue, one can argue that the process of making this nanofiber yielded a structure that is exceptionally strong and perhaps robust to chemical processes occurring around it. This is presumably what you would want in such a material that would be useful for building orbital elevators. If you want a tether from Earth up into orbit, you could bundle many of these fibers together and add coatings on the surface to help render them inert to chemistry. Many materials used in construction of advanced structures work in a manner like this: you’ve certainly heard of “Composites!”

Now then, singling one of these fibers out and stringing it across the Panama canal produces a second major issue. The energy necessary to allow the zither to slice apart the ship comes from the kinetic energy of the ship coasting along the waterway: the ship hits the zither and the monofibers of the zither redirect parts of the ship infinitesimally from each other so that their tensile strength is not great enough to resist going different directions from each other… causing them to rend apart microscopically. This redirection is arrested because the parts separated from one another can’t pass through the bulk materials holding them in place. This ‘motion’ is then completely incoherent and can only be tabulated as heat deposited into the material bulk at the location of the nanofilament. So, part of the kinetic energy of the ship’s motion is deposited as heat around the monofilament cut. This might not be such a huge problem except that the monofilament has an intrinsically tiny mass and therefore a minuscule heat capacity: its electrical structure has relatively few valence modes where it can stuff higher energy vibrational states. Moreover, the fiber is located at the origin of the heat and the materials heating up surround it from all sides, so there is no other place where the fiber can dump heat except linearly along its own body. If the heat doesn’t dissipate through the hull of the ship fast enough, how hot can the fiber get before its electrical structure starts sampling continuum states? However tough the fiber is, if it can’t dump the heat somewhere, its temperature might well rise until it literally ionizes into a plasma. For such a tiny mass, only a little heat input is a substantial thing.

This is a difficulty, but one a clever writer can probably still explain away (maybe better left as a black box). You might argue that the fiber can cope with this abuse by conducting the heat along its length and then radiating it into the air or emitting it as light. That might work, I suppose, but it would mean increasing complexity in the structure of the nanomaterial. Not an impossibility, but now the fiber glows at least as a black body and is no longer invisible! For anybody familiar with super-resolution microscopy, emission of light can make visible objects tinier than the optical resolution limits.

Maybe the classiest way would be to convert the fiber into a thermoelectric couple of some sort and get rid of the heat using an electrical current. Some of the well known modern nanofibers, the fullerenes and such, are also very good electrical conductors because of their bonding structures. In reality, this would also probably limit the cutting rate: the rate of heat deposition in the line must not exceed the rate at which the cooling mechanism can suck heat away! An unfortunate fact about very thin conductors is that their resistance tends to be high, meaning that the conduction rate goes up as the channel of the conductor is thickened… and you are unfortunately crippled by using a nanofiber, which is very skinny indeed. I won’t mention superconductors except to say that they have a limited range of temperatures where they can superconduct… using a superconductor in a thermoelectric couple is asking for trouble.

My big complaint about the zither boils down to that: heat and wear. Because of the difference of the applications, a material which is suitable to the purpose of building an orbital elevator is not necessarily suitable to building a monofilament cutter. I would also offer that a real monofilament cutter would be specifically engineered to the task and not a windfall of a second technology. The applications are just too different and don’t boil down to merely ‘strength’ and ‘tiny size.’

Having addressed the zither, I’ll talk about a second major point which suffered from too much description and too little plausibility. I’ll try to describe this part of the story without giving away a major plot point.

In this section of the story, someone is trying to use a colossal factory hovering in orbit above the planet to take a proton and expand it from a point-like object into a three dimensional structure. The author makes the case that a simple object, like a proton, which is essentially point-like when viewed from our place in spacetime, is actually an object with extensive higher dimensional structure and that some technological application can be carried out where this higher dimensionality can be expanded so that it can be manipulated in our three dimensional space. He even makes the case that these higher dimensions contain considerable volume and may be big enough to harbor entire universes. As he repeatedly emphasizes, a whole universe of complexity, but only a proton’s worth of mass.

To start with, I have little to say about the string theory. For one thing, I don’t really understand it. A major argument in string theory is that the tiniest bits of space in our universe can actually have seven or eight additional dimensions hidden away where we three dimensional creatures can’t see them. Perhaps that’s true, but as yet, string theory has made no predictions that have been verified by experiment. None!

From the standpoint of a person, it’s certainly true that a proton might seem point-like, but this is actually false! Unlike an electron, which is truly dimensionally point-like for all that physics currently understands of it, a proton has a known structure that occupies a definable three dimensional volume. The size here is tiny, at only about 10^-15 meters, but it is a volume with a few working parts. A proton is constructed of two “Up” quarks and a “Down” quark that are held together by the nuclear strong force (making the proton a baryon with spin 1/2, and so obeying Fermi statistics).

I have considered that the application of a ‘proton’ in the story is perhaps a missed translation and that the author really wanted a dimensionless particle like a quark (which is never observed outside of bound sets of two or three) or an electron (which can be a free particle). After writing the previous sentence, I spent some time looking at translator notes for this book and I found that the choice of the Chinese word for ‘proton’ facilitated a word play in the author’s native language that did not quite translate to English. I won’t detail this word play because it gives away a plot point of the book that is beyond the scope of what I wish to write about. A lesson here is that the author’s loyalty is definitely toward his literature above scientific truth.

One significant issue that must be brought up here is that ‘point-like’ is a relative description when you start talking about particles like these. An electron is fundamentally point-like, but it is also quantum mechanical, meaning that it tends to occupy a finite volume in space that varies quite strongly depending on the shape and boundaries of that space, as given by the wave function. Reaching in and ‘grabbing’ the electron reveals what appears to be a point, but that ‘point’ can be distributed in non-intuitive ways across the volume it occupies. We have no real capacity to describe its shape, and one might certainly consider that ‘point-like’ dimensionless object to be a singularity in exactly the same way that a black hole is a singularity. I have half a mind to say that the only reason an electron is not a black hole is because the diameter of the volume it occupies, as described by the uncertainty principle, is larger than its Schwarzschild radius. This statement is limited by the fact that Quantum Mechanics doesn’t play well with General Relativity and the limits of the Schwarzschild radius may not coincide with the limits of the Uncertainty principle –both are physically true, but they each have a context where they are most valid and no unifying math exists to link one case directly to the other.

Now then, in 3BP, a point-like elementary particle with the mass and dimensionality of such a particle is shifted by a machine so that its higher dimensional properties are exhibited as a proportionate volume or geometric shape in three dimensions. In the first flawed experiment, the particle expands into a one dimensional thread which snaps off and comes wafting down everywhere onto the planet in nearly weightless tufts that annoy everybody. After the author spent such a long time laboring over the invisible nature of a monofilament wire, he decided that a one dimensional thread could be visible! Note, a monofilament wire has a small but finite width, while a one dimensional line has no width at all! Which is ‘thinner?’ The 1D line is thinner by an infinite degree!

In the next flawed experiment, the higher dimensions of the point-like particle turn out to contain a super-intelligent civilization which realizes that the particle where they reside is about to be destroyed during the experiment. This civilization distends the structure of their particle into a huge mirror which they then use to focus the sunlight as a weapon onto the surface of the planet in order to attack their enemy, who they recognize to be the scientists running the experiment, and they start leveling cities! This is creative writing, but the author makes the explicit point that the mirror-structure formed from the elementary particle, while big, has only the mass of that particle, which is infinitesimal. If you’re versed in physics, you’ll see the first problem: light has momentum (Poynting vector!). When you reflect a beam of light, you change the direction of the momentum in that light. Conservation of momentum then requires the existence of a force causing the mirror to rebound. Reflecting enough light to thermally combust a city is a large intensity of light, easily megawatts per square meter. An electron has a minuscule mass at about 10^-31 kilograms (or 10^-27 if you insist on it being a proton). Force equals mass times acceleration and pressure equals force per area, where light intensity can be easily converted to pressure and pressure to force. When you rearrange Newton’s second law to solve for acceleration, the big ‘force’ number ends up on top while the tiny ‘mass’ number ends up on the bottom of the ratio, giving a catastrophically huge number for the value of the acceleration (on the order of 10^18 to 10^28 m/s^2, depending on whether you pick the electron or proton mass and watts or megawatts per square meter of intensity on a square-meter mirror). That’s right, the huge mirror with the mass of a ‘proton’ accelerates away from the planet at a highly relativistic rate the instant light bounces off of it!
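
To put numbers on that, here is the back-of-envelope version of the argument as a few lines of Python (my own sketch, not anything from the book, using the standard perfect-reflector radiation pressure of 2I/c at normal incidence):

```python
# Recoil of a proton-mass mirror reflecting light, per the argument above.
# Radiation pressure on a perfect reflector at normal incidence is 2*I/c.
c   = 3.0e8      # speed of light, m/s
I   = 1.0e6      # incident intensity, W/m^2 (the "megawatts per square meter" scale)
A   = 1.0        # mirror area, m^2
m_p = 1.67e-27   # proton mass, kg

F = 2 * I * A / c   # recoil force on the mirror, ~6.7e-3 N
a = F / m_p         # acceleration, ~4e24 m/s^2
print(f"force = {F:.1e} N, acceleration = {a:.1e} m/s^2")
```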

Yeah, I know, physicists and science fiction authors don’t often get along even though they both pretend to love each other.

I had significant problems with the idea of making a single electric charge into a reflective surface, but I’ve rewritten this point twice without being satisfied that the physics is at all instructive to my actual objection. In a real reflective surface, like a mirror, the existence of the reflected light wave can be understood as coherent bulk scattering from many scattering centers, which are all themselves individual charged particles. In this sort of system, the amount of reflected wave quite obviously depends on the amount of charged surface present to interact with incoming waves. The amount of surface available to reflect is conceptually dodgy when you’re talking about only a single charge, no matter how big of an area this charge is spread out to cover. This is why a half-silvered mirror reflects less intensity than a fully silvered mirror. Though I have failed in my own opinion to encapsulate the physical argument well, an individual charge has a finite average rate at which it can exchange information with the universe around it, and reflecting photons en masse is an act of exchanging a great deal of information for such a tiny coupling. Since the quantum mechanics of scattering depends on a probability of overlap, the probability of simultaneously overlapping with many photons is small for only a single charge. The number densities are overwhelmingly different.

All said, the mirror is likely a very transparent mirror unless it has more than one charged particle’s worth of charge.

Despite all this analysis, I don’t believe that it detracts from the story. I really didn’t mind the flight of fancy in a well written piece of fiction. It’s unlikely that the casual reader will ever care.