Chemical Orbitals from Eigenstates

A small puzzle I recently set for myself was finding out how the hydrogenic orbital eigenstates give rise to the S- P- D- and F- orbitals in chemistry (and where s, p, d and f came from).

The reason this puzzle is important to me is that many of my interests sort of straddle how to go from the angstrom scale to the nanometer scale. There is a cross-over where physics becomes chemistry, but chemists and physicists often look at things very differently. I was not directly trained as a P-chemist; I was trained separately as a Biochemist and a Physicist. Remarkably, the Venn diagrams describing the education for these pursuits only overlap slightly. When Biochemists and Molecular Biologists talk, the basic structures below that are frequently just assumed (the scale here is >1nm), while Physicists frequently tend to focus their efforts toward going more and more basic (the scale here is <1 Angstrom). This leads to a clear non-overlap in the scale where chemistry and P-chem are relevant (~1 angstrom). Quite ironically, the whole periodic table of the elements lies there. I have been through P-chem and I’ve gotten hit with it as a Chemist, but this is something of an inconvenient scale gap for me. So, a cat’s paw of mine has been understanding, and I mean really understanding, where quantum mechanics transitions to chemistry.

One place is understanding how to get from the eigenstates I know how to solve to the orbitals structuring the periodic table.

[Image: periodic table of the elements chart]

This assemblage is pure quantum mechanics. You learn a huge amount about this in your quantum class. But, there are some fine details which can be left on the counter.

One of those details for me was the discrepancy between the hydrogenic wave functions and the orbitals on the periodic table. If you aren’t paying attention, you may not even know that the s-, p-, d- orbitals are not all directly the hydrogenic eigenstates (or perhaps you were paying a bit closer attention in class than I was and didn’t miss when this detail was brought up). The discrepancy is a very subtle one because often times when you start looking for images of the orbitals, the sources tend to freely mix superpositions of eigenstates with direct eigenstates without telling why the mixtures were chosen…

For example, here are the S, P and D orbitals for the periodic table:

[Image: shapes of the S, P and D atomic orbitals]

This image is from http://www.chemcomp.com. Focusing on the P row, how is it that these functions relate to the pure eigenstates? Recall the images that I posted previously of the P eigenstates:

[Images: probability densities of the P210 eigenstate (left) and the P21-1 eigenstate (right)]

In the image for the S, P and D orbitals, of the Px, Py and Pz orbitals, all three look like some variant of P210, which is the pure state on the left, rather than P21-1, which is the state on the right. In chemistry, you get the orbitals directly without really being told where they came from, while in physics, you get the eigenstates and are told somewhat abstractly that the s-, p-, d- orbitals are all superpositions of these eigenstates. I recall seeing a professor during an undergraduate quantum class briefly derive Px and Py, but I really didn’t understand why he selected the combinations he did! Rationally, it makes sense that Pz is identical to P210 and that Px and Py are superpositions that have the same probability distribution as Pz, but are rotated into the X-Y plane ninety degrees from one another. How do Px and Py arise from superpositions of P21-1 and P211? P21-1 and P211 have identical probability distributions despite having opposite angular momentum!

Admittedly, the intuitive rotations that produce Px and Py from Pz make sense at a qualitative level, but if you try to extend that qualitative understanding to the D-row, you’re going to fail. Four of the D orbitals look like rotations of one another, but one doesn’t. Why? And why are there four that look identical? I mean, there are only three spatial dimensions to fill, presumably. How do these five fit together three dimensionally?

Except for the Dz^2, none of the D-orbitals are pure eigenstates: they’re all superpositions. But what logic produces them? What is the common construction algorithm which unites the logic of the D-orbitals with that of the P-orbitals (which are all intuitive rotations)?

I’ll actually hold back on the math in this case because it turns out that there is a simple revelation which can give you the jump.

As it turns out, all of chemistry is dependent on angular momentum. When I say all, I really do mean it. The stability of chemical structures is dependent on cases where angular momentum has tended in some way to cancel out. Chemical reactivity in organic chemistry arises from valence choices that form bonds between atoms in order to “complete an octet,” which is short-hand for saying that species combine with each other in such a way that enough electrons are present to fill in or empty out the eight available valence states, that is, one S-orbital and three P-orbitals each holding two electrons of opposite spin (roughly push the number of electrons orbiting one type of atom across the periodic table in its appropriate row to match the noble gases column). For example, in forming the salt crystal sodium chloride, sodium possesses only one electron in its valence shell while chlorine contains seven: if sodium gives up one electron, it goes to a state with no need to complete the octet (with the equivalent electronic completion of neon), while chlorine gaining an electron pushes it into a state that is electronically equal to argon, with eight electrons. From a physicist’s standpoint, this is called “angular momentum closure,” where the filled orbitals are sufficient to completely cancel out all angular momentum in that valence level. As another example, one highly reactive chemical structure you might have heard about is a “radical” or maybe a “free radical,” which is simply chemist shorthand for the situation a physicist would recognize as containing an electron with uncancelled spin and orbital angular momentum. Radical-driven chemical reactions are about passing around this angular momentum! Overall, reactions tend to be driven to occur by the need to cancel out angular momentum. Atomic stoichiometry of a molecular species always revolves around angular momentum closure –you may not see it in basic chemistry, but this determines how many of each atom can be connected, in most cases.

From the physics, what can be known about an orbital is essentially the total angular momentum present and what amount of that angular momentum is in a particular direction, namely along the Z-axis. Angular momentum lost in the X-Y plane is, by definition, not in either the X or Y direction, but in some superposition of both. Without preparing a packet of angular momentum, the distribution ends up having to be uniform, meaning that it is in no particular direction except not in the Z-direction. For the P-orbitals, the eigenstates are purely either all angular momentum in the Z-direction, or none in that direction. For the D-orbitals, the states (of which there are five) can be combinations, two with angular momentum all along Z, two with half in the X-Y plane and half along Z and one with all in the X-Y plane.

What I’ve learned is that, for chemically relevant orbitals, the general rule is “minimal definite angular momentum.” What I mean by this is that you want to minimize situations where the orbital angular momentum is in a particular direction. The orbitals present on the periodic table are states which have canceled out angular momentum located along the Z-axis. This is somewhat obvious for the homology between P210 and Pz. P210 points all of its angular momentum perpendicular to the z-axis. It locates the electron on average somewhere along the Z-axis in a pair of lobes shaped like a peanut, but the orbital direction is undefined. You can’t tell how the electron goes around.

As it turns out, Px and Py can both be obtained by making simple superpositions of P21-1 and P211 that cancel out z-axis angular momentum… literally adding together these two states so that their angular momentum along the z-axis goes away. Px is the symmetric superposition while Py is the antisymmetric version. For the two states obtained by this method, if you look for the expectation value of the z-axis angular momentum, you’ll find it missing! It cancels to zero.

It’s as simple as that.

The D-orbitals all follow. D320 already has no angular momentum on the z-axis, so it is directly Dz^2. You then find the four additional combinations by simply adding states that cancel the z-axis angular momentum: the symmetric and antisymmetric combinations of D321 and D32-1, and then the symmetric and antisymmetric combinations of D322 and D32-2.

Notice, all I’m doing to make any of these states is looking at the last index (the m-index) of the eigenstates and making a linear combination where the m-index of the first state plus that of the second gives zero. 1-1 =0, 2-2=0. That’s it. Admittedly, the symmetric combination sums these with a (+) sign and a 1/sqrt(2) weighting constant so that Px = (1/sqrt(2))(P211 + P21-1) is normalized and the antisymmetric combination sums with a (-) sign as in Py = (1/sqrt(2))(P211 – P21-1), but nothing more complicated than that! The D-orbitals can be generated in exactly the same manner. I found one easy reference online that loosely corroborated this observation, but said it instead as that the periodic table orbitals are all written such that the wave functions have no complex parts… which is also kind of true, but somewhat misleading because you sometimes have to multiply by a complex phase to put it genuinely in the form of sines for the azimuthal coordinate (and as the azimuthal coordinate is integrated over 360 degrees, expectation values on this coordinate, as z-axis momentum would contain, cancel themselves out; sines and cosines integrated over a full period, or multiples of a full period, integrate to zero.)
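Here is a minimal sketch of that construction for the angular part of the P-orbitals, so you can check the cancellation yourself. It assumes the Condon-Shortley sign convention for the spherical harmonics (my choice for illustration, not something fixed by anything above), and the helper names Y1, lz_expectation and evaluate are made up for this example. Under that convention the antisymmetric combination comes out real and proportional to x, while the symmetric one comes out purely imaginary and proportional to y, which is exactly the "multiply by a complex phase" caveat; with a different sign convention the two labels swap.

```python
import cmath
import math

def Y1(m, theta, phi):
    """l = 1 spherical harmonic Y_1^m, Condon-Shortley sign (an assumed convention)."""
    if m == 0:
        return complex(math.sqrt(3 / (4 * math.pi)) * math.cos(theta))
    sign = -1.0 if m == 1 else 1.0
    return sign * math.sqrt(3 / (8 * math.pi)) * math.sin(theta) * cmath.exp(1j * m * phi)

def lz_expectation(coeffs):
    """<Lz> in units of hbar for a state sum_m c_m |l, m>; coeffs maps m -> c_m."""
    norm = sum(abs(c) ** 2 for c in coeffs.values())
    return sum(m * abs(c) ** 2 for m, c in coeffs.items()) / norm

def evaluate(coeffs, theta, phi):
    """Value of an l = 1 superposition at one direction on the unit sphere."""
    return sum(c * Y1(m, theta, phi) for m, c in coeffs.items())

s = 1 / math.sqrt(2)
symmetric = {+1: s, -1: s}        # (P211 + P21-1) / sqrt(2), angular part only
antisymmetric = {+1: s, -1: -s}   # (P211 - P21-1) / sqrt(2), angular part only

# The z-axis angular momentum cancels for both combinations (and for the
# analogous m = +2 / m = -2 pair used for the D-orbitals).
print(lz_expectation(symmetric), lz_expectation(antisymmetric))
print(lz_expectation({+2: s, -2: s}))

# Compare against the Cartesian directions x and y at a few sample angles.
scale = math.sqrt(3 / (4 * math.pi))
samples = [(0.3, 0.2), (1.1, 2.5), (2.0, 4.0), (2.8, 5.9)]
for theta, phi in samples:
    x = math.sin(theta) * math.cos(phi)
    y = math.sin(theta) * math.sin(phi)
    anti = evaluate(antisymmetric, theta, phi)   # real, equals -scale * x
    sym = evaluate(symmetric, theta, phi)        # imaginary, equals -1j * scale * y
    print(f"anti/x = {anti.real / x:+.4f}   sym/(i*y) = {sym.imag / y:+.4f}")
```

The radial part is left out on purpose, since it is common to all three m values and drops out of the comparison.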

Before I wrap up, I had a quick intent to touch on where S-, P-, D- and F- came from. “Why did they pick those damn letters?” I wondered one day. Why not A-, B-, C- and D-? The nomenclature emerged from how spectral lines appeared visually and groups were named: (S)harp, (P)rincipal, (D)iffuse and (F)undamental. (A second interesting bit of “why the hell???” nomenclature is the X-ray lines… you may hate this notation as much as me: K, L, M, N, O… “stupid machine uses the K-line… what does that mean?” These letters simply match the n quantum number –the energy level– as n=1,2,3,4,5… Carbon K-edge, for instance, is the amount of energy between the n=1 orbital level and the ionized continuum for a carbon atom.) The sharpness tends to reflect the complexity of the structure in these groups.

As a quick summary about structuring of the periodic table, S-, P-, D-, and F- group the vertical columns (while the horizontal rows are the associated relative energy, but not necessarily the n-number). The element is determined by the number of protons present in the nucleus, which creates the chemical character of the atom by requiring an equal number of electrons present to cancel out the total positive charge of the nucleus. Electrons, as fermions, are forced to occupy distinct orbital states, meaning that each electron has a distinct orbit from every other (fudging for the antisymmetry of the wave function containing them all). As electrons are added to cancel protons, they fall into the available orbitals depicted in the order on the periodic table going from left to right, which can be a little confusing because they don’t necessarily purely close one level of n before starting to fill S-orbitals of the next level of n; for example at n=3, l can equal 0, 1 and 2… but, the S-orbitals for n=4 will fill before D-orbitals for n=3 (which are found in row 4). This has purely to do with the S-orbitals having lower energy than P-orbitals which have lower energy than D-orbitals, but that the energy of an S-orbital for a higher n may have lower energy than the D-orbital for n-1, meaning that the levels fill by order of energy and not necessarily by order of angular momentum closure, even though angular momentum closure influences the chemistry. S-, P-, D-, and F- all have double degeneracy to contain up and down spin of each orbital, so that S- contains 2 instead of 1, P- contains 6 instead of 3, and D- contains 10 instead of 5. If you start to count, you’ll see that this produces the numerics of the periodic table.
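If you want to do that counting without a pencil, here is a rough sketch. It leans on one assumption I did not spell out above: the empirical n + l ordering (often called the Madelung rule) as a stand-in for "fill by order of energy," with ties going to the lower n. Real atoms have exceptions to this rule, so treat it as a bookkeeping device rather than physics. The function names are just for this example.

```python
LETTERS = "spdf"

def subshells(max_n=8, max_l=3):
    """All (n, l) with l < n, ordered by n + l and then by n (assumed Madelung rule)."""
    shells = [(n, l) for n in range(1, max_n + 1) for l in range(min(n, max_l + 1))]
    return sorted(shells, key=lambda nl: (nl[0] + nl[1], nl[0]))

def capacity(l):
    """Two spin states for each of the (2l + 1) values of m."""
    return 2 * (2 * l + 1)

def rows(num_rows=7):
    """Group the filling order into table rows; each new s-subshell opens a row."""
    table = []
    for n, l in subshells():
        if l == 0:
            table.append([])
        table[-1].append((n, l))
    return table[:num_rows]

for row in rows():
    labels = " ".join(f"{n}{LETTERS[l]}" for n, l in row)
    size = sum(capacity(l) for _, l in row)
    print(f"{size:2d} elements: {labels}")
```

The row sizes that come out are 2, 8, 8, 18, 18, 32 and 32, and row 4 reads 4s 3d 4p, which is the "4s fills before 3d" wrinkle described above.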

The periodic table is a fascinating construct: it contains a huge amount of quantum mechanical information which really doesn’t look much like quantum mechanics. And, everybody has seen the thing! An interesting test to see the depth of a conversation about the periodic table is to ask those conversing if they understand why the word “periodic” is used in the name “Periodic table of the elements.” The choice of that word is pure quantum mechanics.

Powerball Probabilities

If you’ve read anything else in this blog, you’ll know I write frequently about my playing around with Quantum Mechanics. As a digression away from a natural system that is all about probabilities, an interesting little toy problem I decided to tackle is figuring out how the “win” probabilities are determined in the lottery game Powerball.

Powerball is actually quite intriguing to me. They have a website here which details by level all the winners across the whole country who have won a Powerball prize in any given drawing. You may have looked at this chart at some point while trying to figure out if your ticket won something useful. A part of what intrigues me about this chart is that it tells you in a given drawing exactly how much money was spent on Powerball and how many people bought tickets. How does it tell you this? Because probability is an incredibly reliable gauge of behavior with big sample sizes. And, Powerball quite willingly lays all the numbers out for you to do their bookkeeping for them by telling you exactly how many people won… particularly at the high-probability-to-win levels which push into the regime of Gaussian statistics. For big samples, like millions of people buying powerball tickets, where N=big, the errors on average values become relatively insignificant since they grow only as sqrt(N), so the relative error shrinks as 1/sqrt(N). And, the probabilities reveal what those average values are.

The game is doubly intriguing to me because of the psychological component that drives it. As the pot becomes big, people’s willingness to play becomes big even though the probabilities never change. It suddenly leaps into the national consciousness every time the size of the pot becomes big and people play more aggressively as if they had a greater chance of winning said money. It is true that somebody ultimately walks away with the big pot, but what’s the likelihood that somebody is you?

But, as a starter, what are the probabilities that you win anything when you buy a ticket? To understand this, it helps to know how the game is set up.

As everybody knows, powerball is one of these games where they draw a bunch of little balls printed with numbers out of a machine with a spinning basket and you, as the player, simply match the numbers on your ticket to the numbers on the balls. If your ticket matches all the numbers, you win big! And, as an incentive to make people feel like they’re getting something out of playing, the powerball company awards various combinations of matching numbers and adds in multipliers which increase the size of the award if you do get any sort of match. You might only match a number or two, but they reward you a couple bucks for your effort. If you really want, you can pick the numbers yourself, but most people simply grab random numbers spat out of a computer… not like I’m telling you anything you don’t already know at this point.

One of the interesting qualities of the game is that the probabilities of prizes are very easy to adjust. The whole apparatus stays the same; they just add or subtract balls from the basket. In powerball, as currently run, there are two baskets: the first basket contains 69 balls while the second contains 26. Five balls are drawn from the first basket while only one, the Powerball, is drawn from the second. There is actually an entire record available of how the game has been run in the past, how many balls were in either the first or second baskets and when balls were added or subtracted from each. As the game has crossed state lines and the number of players has grown, the number of balls has also steadily swelled. I think the choice in numbering has been pretty careful to make the smallest prize attainably easy to get while pushing the chances for the grand prize to grow enticingly larger and larger. Prizes are mainly regulated by the presence of the Powerball: if your ticket manages to match the Powerball and nothing else, you win a small prize, no matter what. Prizes get bigger as a larger number of the other five balls are matched on your ticket.

The probabilities at a low level work almost exactly as you would expect: if there are 26 balls in the powerball basket, at any given drawing, you have 1 chance in 26 of matching the powerball. This means that you have 1 chance in 26 of winning some prize as determined by the presence of the powerball. There are also prizes for three or more matching balls drawn from the main basket without the powerball, which tends to push the probabilities of winning anything to a slightly higher frequency than 1 in 26.

For the number savvy this begins to reveal the economics of powerball: a win by these means requires you to spend, on average, $52. That’s 26 tickets at $2 each, among which you are likely to have one that matches the powerball. Note, the prize for matching that number is $4. A net $48 lost to win only $4 is a big overall loss. But, this 26 ticket buy-in is actually hiding the fact that you have a small chance of matching some sequence of other numbers and obtaining a bigger prize… and it would certainly not be an economic loss if you matched the powerball and then the 5 other balls, yielding you a profit in the hundreds of millions of dollars (and this is usually what people tell themselves as they spend $2 for each number).

The probability to win the matched powerball prize only, that is to match just the powerball number, is actually somewhat worse than 1 in 26. The probability is attenuated by the requirement that you hit no matches on any other of the five possible numbers drawn.

Finding the actual probability is as follows: (1/26)*(64/69)*(63/68)*(62/67)*(61/66)*(60/65). If you multiply that out and invert it, you get 1 hit in 38.32 tries. The first number is, of course, the chances of hitting the powerball, while the other five are the chance of hitting numbers that aren’t picked… most of these probabilities are naturally quite close to 1, so you are likely to hit them, but they are probabilities that count toward hitting the powerball only.
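That product is easy to check exactly with rational arithmetic. A small sketch, using the basket sizes quoted above (69 main balls, 26 powerballs, 5 drawn); the function name is just for illustration:

```python
from fractions import Fraction

MAIN_BALLS, POWER_BALLS, PICKS = 69, 26, 5

def powerball_only():
    """Probability of matching the powerball and none of the five main numbers."""
    p = Fraction(1, POWER_BALLS)
    undrawn = MAIN_BALLS - PICKS  # 64 main balls that are not drawn
    for k in range(PICKS):
        # Each of your five numbers has to land among the undrawn balls.
        p *= Fraction(undrawn - k, MAIN_BALLS - k)
    return p

p = powerball_only()
print(p)                          # exact fraction
print(round(float(1 / p), 2))     # tries per hit
```

Inverting the exact fraction gives the 1 in 38.32 figure.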

This number may not be that interesting to you, but lots of people play the game and that means that the likelihood of hitting just the powerball is close to Gaussian. This is useful to a physicist because it reveals something about the structure of the Powerball playing audience on any given week: that site I gave tells you how many people won with only the powerball, meaning that by multiplying that number by 38.32, you know how many tickets were purchased prior to the drawing in question. For example, as of the August 12, 2017 drawing, 1,176,672 numbers won the powerball-only prize, meaning that very nearly 38.32*1,176,672 numbers were purchased: ~45,090,071 numbers, give or take about 41,600 (the counting error on the winners, sqrt(1,176,672) or about 1,085, scaled up by 38.32; notice that the error here is well below 1%).

How many people are playing? If people mostly purchase maybe two or three numbers, around 15-20 million people played. Of course, I’m not accounting for the slavering masses who went whole hog and dropped $20 on numbers; if everybody did this, 4.5 million people played… truly, I can’t really know people’s purchasing habits for certain, but I can with certainty say that only a couple tens of millions of people played.

The number there reveals quite clearly the economics of the game for the period between the 8/12 drawing and the one a couple days prior: $90 million was spent on tickets! This is really quite easy arithmetic since it’s all in factors of 2 over the number of ticket numbers sold. If you look at the total prize pay-out, also on that page I provided, $19.4 million was won. This means that the Powerball company kept ~$70 million made over about three days, of which some got dumped into the grand prize and some went to whatever overhead they keep (I hear at least some of that extra is supposed to go into public works and maybe some also ends up in the Godfather’s pocket). Lucrative business.
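The whole back-of-envelope chain fits in a few lines. This sketch uses only the figures quoted above (1,176,672 powerball-only winners, odds of 1 in 38.32, $2 per number, $19.4 million paid out). One assumption is mine: I take the uncertainty to be the Poisson counting error on the number of winners, scaled up by the odds, which is a more conservative figure than the square root of the ticket total.

```python
import math

ODDS_PB_ONLY = 38.32      # tries per powerball-only hit
WINNERS = 1_176_672       # powerball-only winners, August 12, 2017 drawing
PRICE = 2.00              # dollars per number
PAID_OUT = 19.4e6         # total prizes reported for the drawing

tickets = WINNERS * ODDS_PB_ONLY
# Assumed error model: sqrt(winners) counting error, propagated through the odds.
tickets_err = math.sqrt(WINNERS) * ODDS_PB_ONLY
revenue = tickets * PRICE
kept = revenue - PAID_OUT

print(f"tickets sold : {tickets:,.0f} +/- {tickets_err:,.0f}")
print(f"relative err : {tickets_err / tickets:.2%}")
print(f"revenue      : ${revenue:,.0f}")
print(f"kept         : ${kept:,.0f}")
```

Either way you estimate the error, it stays far below 1%, which is the point: the winner counts pin down the ticket sales very tightly.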

If you look at the prize payouts for the game, most of the lower level prizes pay off between $4 and $7. You can’t get a prize that reaches $100 until you match at least 4 balls. Note, here, that the probability of matching 4 balls (including the powerball) is about 1 in 14,494. This means that, to expect a prize of $100, you have to spend ~$29,000. You might argue that in 14,494 tickets, you’ll win a couple smaller prizes ($4 prizes are 1 in 38, 1 in 91, and $7 prizes are 1 in 700 and 1 in 580) and maybe break even. Here’s the calculation for how much you’ll likely make for that buy-in: $4*(14,494*(1/38 + 1/91)) + $7*(14,494*(1/700 + 1/580))… I’ve rounded the probabilities a bit… =$2482.65. For $29,000 spent to expect a single $100 win, you can expect about $2,500 from lesser winnings, for a total loss of about $26,400 once the $100 is counted. Notice, $4 on a $48 loss is about 8%, while $2,600 on $26,400 is about 10%… the payoff does not improve at attainable levels! Granted, there’s a chance at a couple hundred million, but the probability of the bigger prize is still pretty well against you.
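Here is that same tally as a small sketch, using the rounded odds from the paragraph above and ignoring the larger prizes and any multiplier (both are assumptions that keep the example short):

```python
TICKETS = 14_494          # tickets needed, on average, for one $100 prize
PRICE = 2                 # dollars per ticket
# (prize in dollars, rounded "1 in N" odds) for the four smallest prizes
SMALL_PRIZES = [(4, 38), (4, 91), (7, 700), (7, 580)]

# Expected winnings from the small prizes over the whole buy-in.
expected_small = sum(prize * TICKETS / odds for prize, odds in SMALL_PRIZES)
spent = TICKETS * PRICE
net = expected_small + 100 - spent    # include the one expected $100 prize

print(f"spent          : ${spent:,}")
print(f"small prizes   : ${expected_small:,.2f}")
print(f"net            : ${net:,.2f}")
print(f"returned share : {(expected_small + 100) / spent:.1%}")
```

The small prizes come to $2,482.65 on $28,988 spent, so roughly nine cents of every dollar comes back.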

Suppose you are a big spender and you managed to rake up $29,000 in cash to dump into tickets, how likely is it that you will win just the $1 million prize? That’s five matched balls excluding the powerball. The probability is 1 in 11,688,053. By pushing the numbers, your odds of this prize have become 14,500/11,688,053, or about 1 chance in 800. Your odds are substantially improved here, but 1 in 800 is still not a wonderful bet despite the fact that you assured yourself a fourth tier prize of $100! The grand prize is still a much harder bet with odds running at about 1 in 20,000, despite the amount you just dropped on it. Do you just happen to have $30,000 burning a hole in your pocket? Lucky you! Lots of people live on that salary for a year.
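Scaling the odds by the number of tickets is one line each. The sketch assumes every one of the 14,500 tickets carries a different set of numbers, so the chances simply add; the jackpot odds of 1 in 292,201,338 are the full version of the "292 million" figure.

```python
TICKETS = 14_500
ODDS_MATCH_FIVE = 11_688_053   # five main balls, no powerball ($1 million)
ODDS_JACKPOT = 292_201_338     # five main balls plus the powerball

# With distinct tickets, "1 in N" per ticket becomes "1 in N / TICKETS" overall.
match_five_one_in = ODDS_MATCH_FIVE / TICKETS
jackpot_one_in = ODDS_JACKPOT / TICKETS

print(f"$1 million prize: about 1 in {match_five_one_in:,.0f}")
print(f"grand prize     : about 1 in {jackpot_one_in:,.0f}")
```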

Most of this is simple arithmetic and I’ve been bandying about probabilities gleaned from the Powerball website. If you’re as curious about it as me, you might be wondering exactly how all those probabilities were calculated. I gave an example above of the mechanical calculation of the lowest level probability, but I also went and figured out a pair of formulae that calculate any of the powerball prize probabilities. It reminded me a bit of stat mech…

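[Equation image: probability of a given number of matches without the powerball]

[Equation image: probability of a given number of matches with the powerball]

[Equation image: number of tries needed for one hit]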

I’ve colored the main equations and annotated the parts to make them a little clearer. The final relation just shows how you can see the number of tries needed in order to hit one success, given a probability as calculated with the other two equations. The first equation differs from the second in that it refers to probabilities where you have matched numbers without managing to match the powerball, while the second is the complement, where you match numbers having hit the powerball. Between these two equations, you can calculate all the probabilities for the powerball prizes. Since probabilities were always hard for me, I’ll try to explain the parts of these equations. If you’re not familiar with the factorial operation, this is what is denoted by the exclamation point “!” and it denotes a product string counting up from one to the number of the factorial… for example 5! means 1x2x3x4x5. The special case 0! should be read as 1. The first part, in blue, is the probability relating to either hitting or missing the powerball, where K = 26, the number of balls in the powerball basket. The second part (purple) is the multiplicity and tells you how many ways that you can draw a certain number of matches (Y) to fill a number of open slots (X), while drawing a number of mismatches (Z) in the process, where X=Y+Z. In powerball, you draw five balls, so X=5 and Y is the number of matches (anywhere from 0 to 5), while Z is the number of misses. Multiplicity shows up in stat mech and is intimately related to entropy. The totals drawn (green) is perhaps mislabeled… here I’m referring to the number of possible choices in the main basket, N=69, and the number of those that will not be drawn M = N – X, or 64. I should probably have called it “Main basket balls” or something. The last two parts determine the probabilities related to the given number of hits (Y) (orange) and the given number of misses (Z) (red) and I have applied the product operator to spiffy up the notation.
The product operator is another iterated operation much like the summation operator and means that you repeatedly multiply successive values, much like a factorial, but where the value you are multiplying is produced from a particular range and given a set form. In these, the small script m and n start at zero (my bad, this should be under the Pi) and iterate up to one less than the number of hits or misses (ending at Y – 1 or Z – 1, not at Y or Z). At the extreme cases of either all hits or all misses, the relevant product operator (either Miss or Hit respectively) must be set equal to one in order to not count it.
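Since the equations themselves are images, here is a sketch of one way to write them out in code, following the description above piece by piece with the same symbols (K, N, M, X, Y, Z). The exact form of the hit and miss products is my reconstruction from that description, not a transcription, so I cross-check it against the standard binomial-coefficient (hypergeometric) way of counting the same thing. The list of which match counts earn a prize (any match with the powerball, three or more without) is taken from the prize levels discussed earlier.

```python
from fractions import Fraction
from math import comb, factorial

K, N, X = 26, 69, 5      # powerball basket, main basket, balls drawn
M = N - X                # main balls that are not drawn (64)

def prize_probability(Y, powerball):
    """Probability of exactly Y main-ball matches, with or without the powerball."""
    Z = X - Y                                            # misses
    ball = Fraction(1, K) if powerball else Fraction(K - 1, K)
    multiplicity = Fraction(factorial(X), factorial(Y) * factorial(Z))
    hits = Fraction(1)                                   # empty product is 1
    for n in range(Y):                                   # n = 0 .. Y - 1
        hits *= Fraction(X - n, N - n)
    misses = Fraction(1)
    for m in range(Z):                                   # m = 0 .. Z - 1
        misses *= Fraction(M - m, N - Y - m)
    return ball * multiplicity * hits * misses

def tries_per_hit(p):
    """Average number of tickets needed for one success."""
    return 1 / p

def check(Y, powerball):
    """Same probability from binomial coefficients, as a cross-check."""
    ball = Fraction(1, K) if powerball else Fraction(K - 1, K)
    return ball * Fraction(comb(X, Y) * comb(M, X - Y), comb(N, X))

winning = [(Y, True) for Y in range(X + 1)] + [(Y, False) for Y in (3, 4, 5)]
for Y, pb in sorted(winning, key=lambda t: prize_probability(*t)):
    label = f"{Y} + PB" if pb else f"{Y}     "
    print(f"{label}: 1 in {float(tries_per_hit(prize_probability(Y, pb))):,.2f}")

any_prize = sum(prize_probability(Y, pb) for Y, pb in winning)
print(f"any prize: 1 in {float(tries_per_hit(any_prize)):.2f}")
```

Setting the empty products to one falls out automatically here, because a loop over zero terms leaves the starting value of 1 untouched.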

This is one of those rare situations where the American public does a probability experiment with the values all well recorded where it’s possible to see the outcomes. How hard is it to win the grand prize? Well, the odds are one in 292 million. Consider that the population of the United States is 323 million. That means that if everybody in the United States bought one powerball number, about one person would win.

Only one.

Thanks to the power of the media, everybody has the opportunity to know that somebody won. Or not. That this person exists, nobody wants to doubt, but consider that the odds of winning are so scant that you not only won’t win, but you pretty likely will never meet anyone who did. Sort of surreal… everything is above board, you would think, but the rarity is so rare that there’s no assurance that it ever actually happens. You can suppose that maybe it does happen because people do win those dinky $4 prizes, but maybe this is just a red herring and nobody really actually wins! Those winner testimonials could be from actors!

Yeah, I’m not much of a conspiracy theorist, but it is true that a founding tenet of the idea of a ‘limit’ in math is that 99.99999% is effectively 100%. Going to the limit where the discrepancy is so small as to be infinitesimal is what calculus is all about. It is fair to say that it very nearly never happens! Everybody wants to be the one who beats the odds, which is why Powerball tickets are sold, but the extraordinarily vast majority never will win anything useful… I say “useful” because winning $4 or $7 is always a net loss. You have to win one of the top three prizes for it to be anywhere near worth anything, which you likely never will.

One final fairly interesting feature of the probability is that you can make some rough predictions about how frequently the grand prize is won based on how frequently the first prize is won. First prize is matching all five of the balls, but not the powerball. This frequency is about once per 12 million numbers, which is about 25 times more likely than all 5 plus the Powerball (25 of the 26 powerballs are misses). In the report on winnings, a typical frequency is about 2 to 3 winners per drawing. About 1 time in 26 a person with all five manages to get the powerball too, so, with two drawings per week and about 2.5 first prize winners per drawing, that’s five winners per week… which implies that the grand prize should be won at a frequency of about once every five to six weeks –every month and a half or so. The average here will have a very large standard deviation because the number of winners is small, meaning that the error is an appreciable portion of the measurement, which is why there is a great deal of variation in period between times when the grand prize is won. The incidence becomes much more Poissonian and stochastic, and allows some prizes to get quite big compared to others and causes their values to disperse across a fairly broad range. Uncertainty tends to dominate, making the game a bit more exciting.
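A rough sketch of that estimate, with the spread included. The inputs are the ones quoted above (about 2.5 first-prize winners per drawing, two drawings per week); the ratio of 25 first-prize winners per grand-prize winner comes from the odds, 292,201,338 / 11,688,053. Treating jackpots as a Poisson process with a constant rate is my simplifying assumption, and it ignores the fact that ticket sales climb with the size of the pot.

```python
import math

FIRST_PRIZE_PER_DRAWING = 2.5
DRAWINGS_PER_WEEK = 2
FIRST_PER_JACKPOT = 25    # first-prize winners per grand-prize winner, from the odds

jackpots_per_week = FIRST_PRIZE_PER_DRAWING * DRAWINGS_PER_WEEK / FIRST_PER_JACKPOT
mean_weeks_between = 1 / jackpots_per_week

def prob_no_jackpot(weeks):
    """Chance of going this many weeks with no grand prize (assumed Poisson process)."""
    return math.exp(-jackpots_per_week * weeks)

print(f"mean gap: {mean_weeks_between:.1f} weeks")
for weeks in (2, 5, 10, 15):
    print(f"no jackpot for {weeks:2d} weeks: {prob_no_jackpot(weeks):.0%}")
```

For a Poisson process the standard deviation of the waiting time equals its mean, so a five week average comes with a five week spread, which is the large variation described above.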

While the grand prize is small, the number of people winning the first prize in a given week is small (maybe none or one), but this number grows in proportion to the size of the grand prize (maybe 5 or 6 or as high as 9). When the prize grows large enough to catch the public consciousness, the likelihood that somebody will win goes up simply because more people are playing it and this can be witnessed in the fluctuating frequency of the wins of lower level prizes. It breathes around the pulse of maybe 200 million dollars, lubbing at 40 million (maybe 0 to 1 person winning the first prize) and dubbing at 250 million (with 5 people or more winning the first prize).

Quite a story is told if you’re boring and as easily amused as me.

In my opinion, if you do feel inclined to play the game, be aware that when I say you probably won’t win, I mean that the numbers are so strongly against you that you do not appreciably improve your odds by throwing down $100 or even $1,000. The little $4 wins do happen, but they never pay and $1,000 spent will likely not get you more than $100 in total of winnings. It might as well be a voluntary tax. Cherish the dream your $2 buys, but do not stake your well-being on it. There’s nothing wrong with dreaming as long as you understand where to wake up.

Parity symmetry in Quantum Mechanics

I haven’t written about my problem play for a while. Since last I wrote about rotational problems, I’ve gone through the entire Sakurai chapter 4, which is an introduction to symmetry. At the moment, I’m reading Chapter 5 while still thinking about some of the last few problems in Chapter 4.

I admit that I had a great deal of trouble getting motivated to attack the Chapter 4 problems. When I saw the first aspects of symmetry in class, I just did not particularly understand it. Coming back to it on my own was not much better. Abstract symmetry is not easy to understand.

In Sakurai chapter 4, the text delves into a few different symmetries that are important to quantum mechanics and pretty much all of them are difficult to see at first. As it turns out, some of these symmetries are very powerful tools. For example, use of the reflection symmetry operation in a chiral molecule (like the C-alpha carbon of proteins or the hydrated carbons of sugars) can reveal neighboring degenerate ground states which can be accessed by racemization, where an atomic substituent of the molecule tunnels through the plane of the molecule and reverses the chirality of the state at some infrequent rate. Another example is the translation symmetry operation, where a lattice of identical attractive potentials serves to hide a near infinite number of identical states where a bound particle can hop from one minimum to the next and traverse the lattice… this behavior is essentially a specific model describing the passage of electrons through a crystalline semiconductor.

One of the harder symmetries was time reversal symmetry. I shouldn’t say “one of the harder;” for me time reversal was the hardest to understand and I would be hesitant to say that I completely understand it yet. The time reversal operator causes time to translate backward, making momenta and angular momenta reverse. Time reversal is really hard because the operator is anti-unitary, meaning that the operation complex-conjugates the quantities it operates on (it switches the sign on their imaginary parts). Nevertheless, time reversal has some interesting outcomes. For instance, if a spinless particle is bound to a fixed center where the state in question is not degenerate (only one state at the given energy), time reversal says that the state can have no average angular momentum (it can’t be rotating or orbiting). On the other hand, if the particle has half-integer spin, as an electron does, the bound state must be degenerate because the particle can’t have no angular momentum!
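For what it’s worth, the no-average-angular-momentum claim can be sketched in a few lines. This is my condensed version of the standard textbook argument, written with Θ for the time reversal operator and |α⟩ for the nondegenerate bound state (symbols chosen here for illustration), and assuming the Hamiltonian is time reversal invariant:

```latex
% Assumptions: [H, \Theta] = 0, and |\alpha\rangle is a nondegenerate energy eigenstate.
\begin{align*}
  H\,\Theta|\alpha\rangle &= \Theta H|\alpha\rangle = E\,\Theta|\alpha\rangle
    && \text{so } \Theta|\alpha\rangle \text{ has the same energy,} \\
  \Theta|\alpha\rangle &= e^{i\delta}|\alpha\rangle
    && \text{nondegenerate: same state up to a phase,} \\
  \Theta\,\mathbf{L}\,\Theta^{-1} &= -\mathbf{L}
    && \text{angular momentum is odd under time reversal,} \\
  \langle\alpha|\mathbf{L}|\alpha\rangle
    &= -\,\langle\tilde{\alpha}|\mathbf{L}|\tilde{\alpha}\rangle
    && \text{with } |\tilde{\alpha}\rangle = \Theta|\alpha\rangle
       \text{ (anti-unitarity of } \Theta\text{),} \\
  \langle\alpha|\mathbf{L}|\alpha\rangle
    &= -\,\langle\alpha|\mathbf{L}|\alpha\rangle = 0
    && \text{the phase } e^{i\delta} \text{ cancels in the expectation value.}
\end{align*}
```

The phase dropping out in the last step is what makes nondegeneracy matter: with degenerate partners available, the time-reversed state could be a different state, and the argument no longer closes.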

A quick digression here for the laymen: in quantum mechanics, the word “degenerate” is used to refer to situations where multiple states lie on top of one another and are indistinguishable. Degeneracy is very important in quantum mechanics because certain situations contain only enough information to know an incomplete picture of the model where more information is needed to distinguish alternative answers… coexisting alternatives subsist in superposition, meaning that a wave function is in a superposition of its degenerate alternative outcomes if there is no way to distinguish among them. This is part of how entanglement arises: you can generate entanglement by creating a situation where discrete parts of the system simultaneously occupy degenerate states encompassing the whole system. The discrete parts become entangled.

Symmetry is important because it provides a powerful tool by which to break apart degeneracy. A set of degenerate states can often be distinguished from one another by exploiting the symmetries present in the system. L- and R- enantiomers in a molecule are related by a reflection symmetry at a stereo center, meaning that there are two states of indistinguishable energy that are reflections of one another. People don’t often notice it, but chemists are masters of quantum mechanics even though they typically don’t know as much of the math: how you build molecules is totally governed by quantum mechanics and chemists must understand the qualitative results of the physical models. I’ve seen chemists speak competently of symmetry transformations in places where the physicists sometimes have problems.

Another place where symmetry is important is in the search for new physics. The way to discover new physical phenomena is to look for observational results that break the expected symmetries of a given mathematical model. The LHC was built to explore symmetries. Currently known models are said to hold CPT symmetry, referring to Charge, Parity and Time Reversal symmetry… I admit that I don’t understand all the implications of this, but simply put, if you make an observation that violates CPT, you have discovered physics not accounted for by current models.

I held back talking about Parity in all this because I wanted to speak of it in greater detail. Of the symmetries covered in Sakurai chapter 4, I feel that I made the greatest jump in understanding on Parity.

Parity is symmetry under space inversion.

What?

Just saying that sounds diabolical. Space inversion. It sounds like that situation in Harry Potter where somebody screws up trying to disapparate and manages to get splinched… like they space invert themselves and can’t undo it.

The parity operation carries all the Cartesian variables in a function to their negative values.

parity operation

Here Phi just stands in for the parity operator. By performing the parity operation, all the variables in the function which denote spatial position are turned inside out and sent to their negative value. Things get splinched.

You might note here that applying parity twice gets you back to where you started, unsplinching the splinched. This shows that the parity operator has the special property that it is its own inverse operation. You might understand how special this is by noting that we can’t all literally be our own brother, but the parity operator basically is.

parity2.jpg

Applying parity twice is like multiplying by 1… which is how you know parity is its own inverse. This also makes parity a unitary operator since it doesn’t affect the absolute value of the function. Parity operation times inverse parity is one, so unitary.

parity3 or parity4

Here, the daggered superscript means “Hermitian conjugate” (complex conjugate plus transpose), which is automatically the inverse operation if you’re a unitary operator. Hello linear algebra. Be assured I’m not about to break out the matrices, so have no fear. We will stay in a representation free zone. In this regard, parity operation is very much like a rotation: the inverse operation is the Hermitian conjugate of the operation, never mind the detail that the inverse operation is also the operation itself.

Parity symmetry is “symmetry under the parity operation.” There are many states that are not symmetric under parity, but we would be interested in searching particularly for parity operation eigenstates, which are states that parity operator will transform to give back that state times some constant eigenvalue. As it turns out, the parity operator can only ever have two eigenvalues, which are +1 and -1. A parity eigenstate is a state that only changes its sign (or not) when acted on by the parity operator. The parity eigenvalue equations are therefore:

parity5a
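Collected in one place, the parity relations described so far can be sketched in LaTeX like this. I’m writing the operator as \Phi to match the text above and calling the two eigenstates |+> and |->; both of those labels are my assumption about the notation, not a transcription of the original images.

```latex
\begin{align}
  \Phi\,\psi(x,y,z) &= \psi(-x,-y,-z)
      && \text{(space inversion)} \\
  \Phi^{2} &= 1 \;\Rightarrow\; \Phi^{-1} = \Phi
      && \text{(parity is its own inverse)} \\
  \Phi^{\dagger}\Phi &= 1
      && \text{(unitary)} \\
  \Phi\,|\pm\rangle &= \pm\,|\pm\rangle
      && \text{(eigenvalues } {+1} \text{ even, } {-1} \text{ odd)}
\end{align}
```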

All this says is that under space inversion, the parity eigenstates will either not be affected by the transformation, or will be negative of their original value. If the sign doesn’t change, the state is symmetric under space inversion (called even). But, if the sign does change, the state is antisymmetric under space inversion (called odd). As an example, in a space of one dimension (defined by ‘x’), the function sine is antisymmetric (odd) while the function cosine is symmetric (even).

Graph

In this image, taken from a graphing app on my smartphone, the white curve is plain old sine while the blue curve is the parity transformed sine. As mentioned, cosine does not change under parity.

As you may be aware, sines and cosines are energy eigenstates for the particle-in-the-box problem and so would constitute one example of legit parity eigenstates with physical significance.
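If you would rather poke at this numerically than take the graph on faith, here is a minimal Python sketch. The helper names (`parity`, `parity_eigenvalue`) are made up for this illustration, and the check only samples a handful of points, so treat it as a sanity check rather than a proof.

```python
import math

def parity(f):
    """Return the parity-transformed function: (P f)(x) = f(-x)."""
    return lambda x: f(-x)

def parity_eigenvalue(f, samples, tol=1e-12):
    """Return +1 if f looks even, -1 if it looks odd, None if neither.

    This only tests the sample points given, so it is a numerical
    sanity check and not a proof.
    """
    pf = parity(f)
    if all(abs(pf(x) - f(x)) <= tol for x in samples):
        return 1
    if all(abs(pf(x) + f(x)) <= tol for x in samples):
        return -1
    return None

samples = [0.1 * k for k in range(1, 50)]

cos_parity = parity_eigenvalue(math.cos, samples)   # cosine is even
sin_parity = parity_eigenvalue(math.sin, samples)   # sine is odd
# A superposition of an even and an odd function is not a parity eigenstate.
mixed_parity = parity_eigenvalue(lambda x: math.sin(x) + math.cos(x), samples)

# Applying parity twice should hand back the original function.
twice = parity(parity(math.sin))
unsplinched = all(abs(twice(x) - math.sin(x)) <= 1e-12 for x in samples)

print(cos_parity, sin_parity, mixed_parity, unsplinched)
```

The mixed case is the interesting one: sine plus cosine is a perfectly good function, but it comes back as neither itself nor its negative, so it has no parity eigenvalue at all.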

Operators can also be transformed by parity. In order to see the significance, you just note that the definition of parity is that the position operation is reversed. So, a parity transformation of the position operator is this:

parity6

Kind of what should be expected. Position under parity turns negative.

As expressed, all of this is really academic. What’s the point?

Parity can give some insights that have deep significance. The deepest result that I understood is that matrix elements and expectation values are conserved under parity transformation. Matrix elements are a generalization of the expectation value where the bra and ket do not necessarily belong to the same eigenfunction. The proof of the statement here is one line:

parity7

At the end, the squiggles all denote parity transformed values, ‘m’ and ‘n’ are blanket eigenstates with arbitrary parity eigenvalues and V is some miscellaneous operator. First, the complex conjugation that turns a ket into a bra does not affect the parity eigenvalue equation, since parity is its own inverse operation and since the eigenvalues of 1 and -1 are not complex, so the bra above has just the same eigenvalue as if it were a ket. So, the matrix element does not change with the parity transformation –the combined parity transformation of all these parts are as if you just multiplied by identity a couple times, which should do nothing but return the original value.

What makes this important is that it sets a requirement on how many -1 eigenvalues can appear within the parity transformed matrix element (which is equal to the original matrix element): it must always be an even number (either zero or two). For the element to exist (that is, for it to have a non-zero value), if the initial and final states connected by the potential are both parity odd or parity even, the potential connecting them must be symmetric. Conversely, if the potential is parity odd, either the initial or final state must be odd, while the other is even. To sum up, a parity odd operator has non-zero matrix elements only when connecting states of differing parity while a parity even operator must connect states of the same parity. This restriction is observed simply by noting that the sign can’t change between a matrix element and the parity transformed matrix element.
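The sign counting can be written out as a short LaTeX sketch. The symbols \varepsilon_m, \varepsilon_n and \varepsilon_V for the parity eigenvalues of the two states and of the operator are labels I am introducing here for illustration, and \Phi is again the parity operator.

```latex
\begin{align}
  \Phi\,|n\rangle &= \varepsilon_n\,|n\rangle, \qquad
  \Phi\,V\,\Phi^{\dagger} = \varepsilon_V\,V, \qquad
  \varepsilon_m,\ \varepsilon_n,\ \varepsilon_V \in \{+1,-1\} \\
  \langle m|V|n\rangle
    &= \langle m|\,\Phi^{\dagger}\Phi\;V\;\Phi^{\dagger}\Phi\,|n\rangle
     = \varepsilon_m\,\varepsilon_V\,\varepsilon_n\,\langle m|V|n\rangle \\
  \langle m|V|n\rangle \neq 0
    &\;\Rightarrow\; \varepsilon_m\,\varepsilon_V\,\varepsilon_n = +1
\end{align}
```

For the position operator, \Phi x \Phi^{\dagger} = -x, so \varepsilon_V = -1 and the two states have to carry opposite parity for the element to survive.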

Now, since an expectation value (average position, for example) is always a matrix element connecting an eigenket to itself, expectation values can only be non-zero for operators of even parity. For example, in a system defined across all space, average position ends up being zero because the position operator is odd, while both eigenbra and eigenket are of the same function, and therefore have the same parity. For average position to be non-zero, the wavefunction would need to be a superposition of eigenkets of opposite parity (and therefore not an eigenstate of parity at all!)
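Here is a rough numerical check of that selection rule, using the particle-in-a-box states mentioned earlier. I am assuming a box centered on the origin with width 1, so that the energy eigenfunctions are honest parity eigenstates (cosines for odd n, sines for even n); the function names are just for this sketch.

```python
import math

L = 1.0  # box width; the box spans -L/2 .. +L/2 (an assumed, centered geometry)

def psi(n, x):
    """Energy eigenfunction n = 1, 2, 3, ... of the centered box.

    Odd n gives a cosine (parity even); even n gives a sine (parity odd).
    """
    k = n * math.pi / L
    amp = math.sqrt(2.0 / L)
    return amp * math.cos(k * x) if n % 2 == 1 else amp * math.sin(k * x)

def matrix_element(m, n, op, steps=4000):
    """<m| op(x) |n> by the midpoint rule on a grid symmetric about x = 0."""
    h = L / steps
    total = 0.0
    for i in range(steps):
        x = -L / 2 + (i + 0.5) * h
        total += psi(m, x) * op(x) * psi(n, x)
    return total * h

def position(x):      # parity odd operator
    return x

def x_squared(x):     # parity even operator
    return x * x

x_11 = matrix_element(1, 1, position)    # expectation value: same parity, vanishes
x_13 = matrix_element(1, 3, position)    # even-even with odd operator, vanishes
x_12 = matrix_element(1, 2, position)    # even-odd with odd operator, survives
x2_12 = matrix_element(1, 2, x_squared)  # even-odd with even operator, vanishes
x2_13 = matrix_element(1, 3, x_squared)  # even-even with even operator, survives

for label, value in [("<1|x|1>", x_11), ("<1|x|3>", x_13), ("<1|x|2>", x_12),
                     ("<1|x^2|2>", x2_12), ("<1|x^2|3>", x2_13)]:
    print(f"{label:10s} = {value:+.6f}")
```

The surviving position element should land near 16/(9π²) for this box, which is the textbook value for the 1-to-2 transition; everything forbidden by parity comes out as rounding noise.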

A tangible, far reaching result of this symmetry, related particularly to the position operator, is that no pure parity eigenstate can have an electric dipole moment. The dipole moment operator is built around the position operator, so a situation where position expectation value goes to zero will require dipole moment to be zero also. Any observed electric dipole moment must be from a mixture of states of opposite parity.

If you stop and think about that, that’s really pretty amazing. It tells you whether an observable is zero or not depending on which eigenkets are present and whether the operator for that observable can be inverted or not.

Hopefully I got that all correct. If anybody more sophisticated than me sees holes in my statement, please speak up!

Welcome to symmetry.

(For the few people who may have noticed, I still have it in mind to write more about the magnets puzzle, but I really haven’t had time recently. Magnets are difficult.)

Magnets, how do they work? (part 1)

Subtitle: Basic derivation of Ampere’s Law from the Biot-Savart equation.

Know your meme.

It’s been a while since this became a thing, but I think it’s actually a really good question. If you stop to think about it, magnets are one of those things where the structure goes deep down and the pieces which drive the phenomenon become quite confusing and mind bending. Truly, the original meme exploded from an unlikely source who wanted to relish in appreciating those things that seem magical without really appreciating how mind-bending and thought-expanding the explanation to this seemingly earnest question actually is.

As I got on in this writing, I realized that the scope of the topic is bigger than can be tackled in a single post. What is presented here will only be the first part. The succeeding posts may end up being as mathematical as this, but perhaps less so. Moreover, as I got to writing, I realized that I haven’t posted a good bit of math here in a while: what good is the mathematical poetry of physics if nobody sees it?

Magnets do not get less magical when you understand how they work: they get more compelling.

magnet-stem-cell-therapy

This image, taken from a website that sells quackery, highlights the intriguing properties of magnets. A solid object with apparently no moving parts has this manner of influencing the world around it. How can that not be magical? Lodestones have been magic forever and they do not get less magical with the explanation.

Truthfully, I’ve been thinking about the question of how they work for a couple days now. When I started out, I realized that I couldn’t just answer this out of hand, even though I would like to think that I’ve got a working understanding of magnetic fields. How the details fit together gets deep in a hurry. What makes a bar magnet like the one in the picture above special? You don’t put batteries in it. You don’t flick a switch. It just works.

For most every person, that pattern above is the depth of how it works. How does it work? Well, it has a magnetic field. When a piece of a certain kind of metal is in a magnetic field, it feels a force due to the magnet and this causes the magnet to pull on it or maybe to stick to it. If you have two magnets together, you can orient them in a certain way and they push each other apart.

KONICA MINOLTA DIGITAL CAMERA

In this picture from penguin labs, these magnets are exerting sufficient force on one another that many of them apparently defy gravity. Here, the rod simply keeps the magnets confined so that they can’t change orientations with respect to one another and they exert sufficient repulsive force to climb up the rod as if they have no weight.

It’s definitely cool, no denying.

But, is it better knowing how they work, or just blindly appreciating them because it’s too hard to fill in the blank?

Maybe we can answer that.

The central feature of how magnets work is quite effortlessly explained by the physics of Electromagnetism. Or, maybe it’s better to say that the details are laboriously and completely explained. People rebel against how hard it is to understand the details, but no true explanation is required to be easily explicable.

The forces which hold those little pieces of metal apart are relatively understandable.

Lorentz force

Here’s the Lorentz force law. It says that the force (F) on an object with a charge (q) is equal to the sum of the electric force on the object (qE) plus the magnetic force (qv × B). Magnets interact solely by magnetic force, the second term.
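To make the cross product in that second term concrete, here is a small Python sketch of the force law, F = q(E + v × B). The numbers (a proton-sized charge, a 0.5 tesla field) are arbitrary values picked for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(q, E, v, B):
    """F = q (E + v x B), everything in SI units."""
    v_cross_B = cross(v, B)
    return tuple(q * (E[i] + v_cross_B[i]) for i in range(3))

q = 1.602e-19            # charge in coulombs (illustrative value)
E = (0.0, 0.0, 0.0)      # no electric field, so only the magnetic term acts
v = (1.0e5, 0.0, 0.0)    # velocity along +x, in m/s
B = (0.0, 0.0, 0.5)      # magnetic field along +z, in tesla

F = lorentz_force(q, E, v, B)
power = dot(F, v)        # rate at which the force does work on the charge

print("F =", F)
print("F . v =", power)
```

With the velocity along x and the field along z, the force comes out along negative y: at right angles to both. Because the magnetic force is always perpendicular to the velocity, F · v is zero, which is the origin of the statement that magnetic fields do no work.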

2000px-lorentz_force-svg

In this picture from Wikipedia, if a charge (q) moving with speed (v) passes into a region containing this thing we call a “magnetic field,” it will tend to curve in its trajectory depending on whether the charge is negative or positive. We can ‘see’ this magnetic field thing in the image above with the bar magnet and iron filings. What is it, how is it produced?

The fundamental observation of magnetic fields is tied up into a phenomenological equation called the Biot-Savart law.

Biotsavart1

This equation is immediately intimidating. I’ve written it in all of its horrifying Jacksonian glory. You can read this equation like a sentence. It says that all the magnetic field (B) you can find at a location in space (r) is proportional to a sum of all the electric currents (J) at all possible locations where you can find any current (r’) and inversely proportional to the square of the distance between where you’re looking for the magnetic field and where all the electrical currents are –it may say ‘inverse cube’ in the equation, but it’s actually an inverse square since there’s a full power of length in the numerator. Yikes, what a sentence! Additionally, the equation says that the direction of the magnetic field is at right angles to both the direction that the current is traveling and the direction given by the line between where you’re looking for magnetic field and where the current is located. These directions are all wrapped up in the arrow scripts on every quantity in the equation and are determined by the cross-product as denoted by the ‘x’. The difference between the two ‘r’ vectors in the numerator creates a pure direction between the location of a particular current element and where you’re looking for magnetic field. The ‘d’ at the end is the differential volume that confines the electric currents and simply means that you’re adding up locations in 3D space. The scaling constants outside the integral sign are geometrical and control strength; the 4 and Pi relate to the dimensionality of the field source radiated out into a full solid angle (it covers a singularity in the field due to the location of the field source) and the ‘μ’ essentially tells how space broadcasts magnetic field… where the constant ‘μ’ is closely tied to the speed of light. This equation has the structure of a propagator: it takes an electric current located at r’ and propagates it into a field at r.
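For anyone who can’t see the equation image, the sentence above corresponds to the standard SI form of Biot-Savart, sketched here in LaTeX. I am writing the constant as \mu_0 and the volume element as d^3r'; those are the usual textbook conventions and an assumption about the exact notation in the image.

```latex
\begin{equation}
  \vec{B}(\vec{r}) \;=\; \frac{\mu_0}{4\pi}
  \int \frac{\vec{J}(\vec{r}\,') \times \left(\vec{r}-\vec{r}\,'\right)}
            {\left|\vec{r}-\vec{r}\,'\right|^{3}} \; d^{3}r'
\end{equation}
```

The numerator carries one power of length, which is why the cube in the denominator still amounts to an inverse square falloff.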

It may also be confusing to you that I’m calling current ‘J’ when nearly every basic physics class calls it ‘I’… well, get used to it. ‘Current vector’ is a subtle variation of current.

I looked for some diagrams to help depict Biot-Savart’s components, but I wasn’t satisfied with what Google coughed up. Here’s a rendering of my own with all the important vectors labeled.

biotsavart diagram

Now, I showed the crazy Biot-Savart equation, but I can tell you right now that it is a pain in the ass to work with. Very few people wake up in the morning and say “Boy oh boy, Biot-Savart for me today!” For most physics students this equation comes with a note of dread. Directly using it to analytically calculate magnetic fields is not easy. That cross product and all the crazy vectors pointing in every which direction make this equation a monster. There are some basic features here which are common to many fields, particularly the inverse square, which you can find in the Newtonian gravity formula or Coulomb’s law for electrostatics, and the field being proportional to some source, in this case an electric current, where gravity has mass and electrostatics has charge.

Magnetic field becomes extraordinary because of that flipping (God damned, effing…) cross product, which means that it points in counter-intuitive directions. With electrostatics and gravity, the field is usually going toward or away from the source, while with magnetism the field seems to be going ‘around’ the source. Moreover, unlike electrostatics and gravity, the source isn’t exactly a something, like a charge or a mass, it’s dynamic… as in a change in state; electric charges are present in a current, but if you have those charges sitting stationary, even though they are still present, they can’t produce a magnetic field. Moreover, if you neutralize the charge, a magnetic field can still be present if those now invisible charges are moving to produce a current: current flowing in a copper wire is electric charges that are moving along the wire and this produces a magnetic field around the wire, but the presence of positive charges fixed to the metal atoms of the wire neutralizes the negative charges of the moving electrons, resulting in a state of otherwise net neutral charge. So, no electrostatic field, even though you have a magnetic field. It might surprise you to know that neutron stars have powerful magnetic fields, even though they are made overwhelmingly of electrically neutral neutrons, with comparatively few electrons or protons present to give actual electric currents. The requirement for moving charges to produce a magnetic field is not inconsistent with the moving charge required to feel force from a magnetic field as well. Admittedly, there’s more to it than just ‘currents’ but I’ll get to that in another post.
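The ‘around the source’ behavior is easy to see if you let a computer do the Biot-Savart sum. This is a minimal sketch, assuming a thin straight wire along the z axis carrying 1 ampere, chopped into short segments; the function name `wire_field`, the wire length and the step count are all choices made for this illustration. The result is compared with the standard long-wire formula B = μ₀I/(2πd).

```python
import math

MU0 = 4.0e-7 * math.pi   # vacuum permeability in SI units (classical value)

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def wire_field(point, current, half_length, steps):
    """Sum Biot-Savart contributions from a straight wire on the z axis.

    The wire runs from z = -half_length to z = +half_length and each short
    segment contributes (mu0 I / 4 pi) dl x (r - r') / |r - r'|^3.
    """
    h = 2.0 * half_length / steps
    B = [0.0, 0.0, 0.0]
    for i in range(steps):
        z = -half_length + (i + 0.5) * h           # midpoint of this segment
        dl = (0.0, 0.0, h)                         # current element direction
        sep = (point[0], point[1], point[2] - z)   # r - r'
        dist = math.sqrt(sum(c * c for c in sep))
        dB = cross(dl, sep)
        for k in range(3):
            B[k] += MU0 * current / (4.0 * math.pi) * dB[k] / dist ** 3
    return tuple(B)

d = 0.1                                   # distance from the wire, in meters
B = wire_field((d, 0.0, 0.0), 1.0, 20.0, 40000)
expected = MU0 * 1.0 / (2.0 * math.pi * d)   # infinitely long wire

print("numerical B =", B)
print("long-wire formula:", expected)
```

Current flows along +z, the field point sits out along +x, and the whole field comes out along +y: not toward the wire, not away from it, but around it in the right handed sense.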

With a little bit of algebraic shenanigans, Biot-Savart can be twisted around into a slightly more tractable form called Ampere’s Law, which is one of the four Maxwell’s equations that define electromagnetism. I had originally not intended to show this derivation, but I had a change of heart when I realized that I’d forgotten the details myself. So, I worked through them again just to see that I could. Keep in mind that this is really just a speed bump along the direction toward learning how magnets work.

For your viewing pleasure, the derivation of the Maxwell-Ampere law from the Biot-Savart equation.

In starting to set up for this, there are a couple fairly useful vector identities.

Useful identities 1

This trio contains several basic differential identities which can be very useful in this particular derivation. Here, the variables r are actually vectors in three dimensions. For those of you who don’t know these things, all it means is this:

vectors

These can be diagrammed like this:

vector example

This little diagram just treats the origin like the corner of a 3D box and each distance is a length along one of the three edges emanating from the corner.

I’ll try not to get too far afield with this quick vector tutorial, but it helps to understand that this is just a way to wrap up a 3D representation inside a simple symbol. The hatted symbols of x,y and z are all unit vectors that point in the relevant three dimensional directions where the un-hatted symbols just mean a variable distance along x or y or z. The prime (r’) means that the coordinate is used to tell where the electric current is located while the unprime (r) means that this is the coordinate for the magnetic field. The upside down triangle is an operator called ‘del’… you may know it from my hydrogen wave function post. What I’m doing here is quite similar to what I did over there before. For the uninitiated, here are gradient, divergence and curl:

gradivcurl

Gradient works on a scalar function to produce a vector, divergence works on a vector to produce a scalar function and curl works on a vector to produce a vector. I will assume that the reader can take derivatives and not go any further back than this. The operations on the right of the equal sign are wrapped up inside the symbols on the left.

One final useful bit of notation here is the length operation. Length operation just finds the length of a vector and is denoted by flat braces as an absolute value. Everywhere I’ve used it, I’ve been applying it to a vector obtained by finding the distance between where two different vectors point:

lengthoperation

As you can see, notation is all about compressing operations away until they are very compact. The equations I’ve used to this point all contain a great deal of math lying underneath what is written, but you can muddle through by the examples here.

Getting back to my identity trio:

Useful identities 1

The first identity here (I1) takes the vector object written on the left and produces a gradient from it… the thing in the quotient of that function is the length of the difference between those two vectors, which is simply a scalar number without a direction as shown in the length operation as written above.

The second identity (I2) here takes the divergence of the gradient and reveals that it’s the same thing as a Dirac delta (incredibly easy way to kill an integral!). I’ve not written the operation as divergence on a gradient, but instead wrapped it up in the ‘square’ on the del… you can know it’s a divergence of a gradient because the function inside the parenthesis is a scalar, meaning that the first operation has to be a gradient, which produces a vector, which automatically necessitates the second operation to be a divergence, since that only works on vectors to produce scalars.

The third identity (I3) shows that the gradient with respect to the unprimed vector coordinate system is actually equal to a negative sign times the gradient with respect to the primed coordinate system… which is a very easy way to switch from a derivative with respect to the first r to the same form of derivative with respect to the second r’.

To be clear, these identities are tailor-made to this problem (and similar electrodynamics problems) and you probably will never ever see them anywhere but the *cough cough* Jackson book. The first identity can be proven by working the gradient operation and taking derivatives. The second identity can be proven by using the vector divergence theorem in a spherical polar coordinate system and is the source of the 4*Pi that you see everywhere in electromagnetism. The third identity can also be proven by the same method as the first.
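In case the image of the trio is missing, the three identities as described above take the following standard forms, sketched here in LaTeX. The labels I1, I2 and I3 follow the text; the exact typography of the original image is my assumption.

```latex
\begin{align}
  \text{(I1)}\qquad
  \frac{\vec{r}-\vec{r}\,'}{\left|\vec{r}-\vec{r}\,'\right|^{3}}
    &= -\nabla\!\left(\frac{1}{\left|\vec{r}-\vec{r}\,'\right|}\right) \\
  \text{(I2)}\qquad
  \nabla^{2}\!\left(\frac{1}{\left|\vec{r}-\vec{r}\,'\right|}\right)
    &= -4\pi\,\delta^{3}\!\left(\vec{r}-\vec{r}\,'\right) \\
  \text{(I3)}\qquad
  \nabla\!\left(\frac{1}{\left|\vec{r}-\vec{r}\,'\right|}\right)
    &= -\nabla'\!\left(\frac{1}{\left|\vec{r}-\vec{r}\,'\right|}\right)
\end{align}
```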

There are two additional helpful vector identities that I used which I produced in the process of working this derivation. I will create them here because, why not! If the math scares you, you’re on the wrong blog. To produce these identities, I used the component decomposition of the cross product and a useful Levi-Civita Kronecker delta identity –I’m really bad at remembering vector identities, so I put a great deal of effort into learning how to construct them myself: my Levi-Civita is ghetto, but it works well enough. For those of you who don’t know the ol’ Levi-Civita symbol, it’s a pretty nice tool for constructing things in a component-wise fashion: εijk . To make this work, you just have to remember it as I just wrote it… if any indices are equal, the symbol is zero, if they are all different, it is 1 or -1. If you take it as ijk, with the indices all different as I wrote, it equals 1 and becomes -1 if you reverse two of the indices: ijk=1, jik=-1, jki=1, kji=-1 and so on and so forth. Here are the useful Levi-Civita identities as they relate to cross product:

levicivita
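If you want to convince yourself that the symbol behaves the way I described, it is small enough to brute force. Below is a minimal Python sketch: the contraction identity it checks (the sum over i of εijk εilm equals δjl δkm minus δjm δkl) is the standard one and I am assuming it matches the one in the image; the function names and test vectors are made up for the illustration.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}.

    The product (i-j)(j-k)(k-i) is +2 for even permutations, -2 for odd
    permutations and 0 when any two indices are equal.
    """
    return (i - j) * (j - k) * (k - i) // 2

def kronecker(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0

def cross(a, b):
    """Cross product built component-wise: c_i = sum_jk eps_ijk a_j b_k."""
    return tuple(sum(levi_civita(i, j, k) * a[j] * b[k]
                     for j in range(3) for k in range(3))
                 for i in range(3))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

R = range(3)

# Contraction identity: sum_i eps_ijk eps_ilm = d_jl d_km - d_jm d_kl
contraction_holds = all(
    sum(levi_civita(i, j, k) * levi_civita(i, l, m) for i in R)
    == kronecker(j, l) * kronecker(k, m) - kronecker(j, m) * kronecker(k, l)
    for j in R for k in R for l in R for m in R
)

# Consequence: a x (b x c) = b (a . c) - c (a . b), tried on integer vectors
a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)
lhs = cross(a, cross(b, c))
rhs = tuple(b[i] * dot(a, c) - c[i] * dot(a, b) for i in range(3))

x_cross_y = cross((1, 0, 0), (0, 1, 0))   # should be the z unit vector

print(contraction_holds, lhs, rhs, x_cross_y)
```

The double cross product check is the same index trade that shows up in the curl of a curl: two Levi-Civita symbols collapse into a pair of Kronecker delta products.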

Using these small tools, the first vector identity that I need is a curl of a curl. I derive it here:

vector id 1

Let’s see how this works. I’ve used colors to show the major substitutions and tried to draw arrows where they belong. If you follow the math, you’ll note that the Kronecker deltas have the intriguing property of trading out indices in these sums. Kronecker delta works on a finite sum the same way a Dirac delta works on an integral, which is nothing more than an infinite sum. Also, the index convention says that if you see duplicated indices, but without a sum on that index, you associate a sum with that index… this is how I located the divergences in that last step. This identity is a soft stopping point for the double curl: I could have used the derivative product rule to expand it further, but that isn’t needed (if you want to see it get really complex, go ahead and try it! It’s do-able.) One will note that I have double del applied on a vector here… I said that it only applies on scalars above… in this form, it would only act on the scalar portion of each vector component, meaning that you would end up with a sum of three terms multiplied by unit vectors! Double del only ever acts on scalars, but you actually don’t need to know that in the derivation below.

This first vector identity I’ve produced I’ll call I4:

useful vector id 1
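For reference, the curl of a curl identity in its standard form, with the index steps described above sketched in LaTeX. Repeated indices are summed, and A is the alphabetical placeholder vector, not the vector potential; the layout is my reconstruction and not a copy of the image.

```latex
\begin{align}
  \left[\nabla\times\left(\nabla\times\vec{A}\right)\right]_i
    &= \epsilon_{ijk}\,\partial_j\left(\epsilon_{klm}\,\partial_l A_m\right)
     = \epsilon_{kij}\,\epsilon_{klm}\,\partial_j\partial_l A_m \\
    &= \left(\delta_{il}\delta_{jm}-\delta_{im}\delta_{jl}\right)
       \partial_j\partial_l A_m
     = \partial_i\left(\partial_j A_j\right)-\partial_j\partial_j A_i \\
  \text{(I4)}\qquad
  \nabla\times\left(\nabla\times\vec{A}\right)
    &= \nabla\left(\nabla\cdot\vec{A}\right)-\nabla^{2}\vec{A}
\end{align}
```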

Here’s a second useful identity that I’ll need to develop:

useful vector id 2

This identity I’ll call I5:

vector id 2

*Pant Pant* I’ve collected all the identities I need to make this work. If you don’t immediately know something off the top of your head, you can develop the pieces you need. I will use I1, I2, I3, I4 and I5 together to derive the Maxwell-Ampere Law from Biot-Savart. Most of the following derivation comes from Jackson Electrodynamics, with a few small embellishments of my own.

first line amp dev

In this first line of the derivation, I’ve rewritten Biot-Savart with the constants outside the integral and everything variable inside. Inside the integral, I’ve split the meat so that the different vector and scalar elements are clear. In what follows, it’s very important to remember that unprimed del operators are in a different space from the primed del operators: a value (like J) that is dependent on the primed position variable is essentially a constant with respect to the unprimed operator and will render a zero in a derivative by the unprimed del. Moreover, unprimed del can be moved into or out of the integral, which is with respect to the primed position coordinates. This observation is profoundly important to this derivation.

BS to amp 1

The usage of the first two identities here manages to extract the cross product from the midst of the function and puts it into a manipulable position where the del is unprimed while the integral is primed, letting me move it out of the integrand if I want.

BS to amp 2

This intermediate contains another very important magnetic quantity in the form of the vector potential (A) –“A” here not to be confused with the alphabetical placeholder I used while deriving my vector identities. I may come back to vector potential later, but this is simply an interesting stop-over for now. From here, we press on toward the Maxwell-Ampere law by acting in from the left with a curl onto the magnetic field…
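Written out, this stop-over is the standard magnetostatic relation between the field and the vector potential, sketched here in LaTeX using SI constants (an assumption about the units in the images):

```latex
\begin{align}
  \vec{B}(\vec{r}) &= \nabla\times\vec{A}(\vec{r}), \\
  \vec{A}(\vec{r}) &= \frac{\mu_0}{4\pi}
      \int \frac{\vec{J}(\vec{r}\,')}{\left|\vec{r}-\vec{r}\,'\right|}\; d^{3}r'
\end{align}
```

The unprimed curl sits outside the integral precisely because J depends only on the primed coordinates.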

BS to amp 3

The Dirac delta I end with in the final term allows me to collapse r’ into r at the expense of that last integral. At this point, I’ve actually produced the magnetostatic Ampere’s law if I feel like claiming that the current has no divergence, but I will talk about this later…

BS to amp 4

This substitution switches del from being unprimed to primed, putting it in the same terms as the current vector J. I use integration by parts next to switch which element of the first term the primed del is acting on.

BS to amp 5

Were I being really careful about how I depicted the integration by parts, there would be a unit vector dotted into the J in order to turn it into a scalar sum in that first term ahead of the integral… this is a little sloppy on my part, but nobody ever cares about that term anyway because it’s presupposed to vanish at the limits where it’s being evaluated. This is a physicist trick similar to pulling a rug over a mess on the floor –I’ve seen it performed in many contexts.

BS to amp 6

This substitution is not one of the mathematical identities I created above, this is purely physics. In this case, I’ve used conservation of charge to connect the divergence of the current vector to the change in charge density over time. If you don’t recognize the epic nature of this particular substitution, take my word for it… I’ve essentially inverted magnetostatics into electrodynamics, assuring that a ‘current’ is actually a form of moving charge.
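The conservation of charge statement being used here is the continuity equation. In LaTeX, with \rho as the charge density (my label for it) and everything evaluated at the primed source coordinates:

```latex
\begin{equation}
  \nabla'\cdot\vec{J}(\vec{r}\,',t) \;=\; -\,\frac{\partial\rho(\vec{r}\,',t)}{\partial t}
\end{equation}
```

It says that current can only flow out of a region if the charge stored in that region is dropping.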

BS to amp 75

In this line, I’ve switched the order of the derivatives again. Nothing in the integral is dependent on time except the charge density, so almost everything can pass through the derivative with respect to time. On the other hand, only the distance is dependent on the unprimed r, meaning that the unprimed del can pass inward through everything in the opposite direction.

BS to amp 8

At this point something amazing has emerged from the math. Pardon the pun; I’m feeling punchy. The quantity I’ve highlighted blue is a form of Coulomb’s law! If that name doesn’t tickle you at the base of your spine, what you’re looking at is the electrostatic version of the Biot-Savart law, which makes electric fields from electric charges. This is one of the reasons I like this derivation and why I decided to go ahead and detail the whole thing. This shows explicitly a connection between magnetism and electrostatics where such connection was not previously clear.
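For comparison with Biot-Savart, the integral form of Coulomb’s law that this refers to looks like the following in SI units. The \epsilon_0 and \rho symbols follow the usual textbook convention and are assumed here.

```latex
\begin{equation}
  \vec{E}(\vec{r}) \;=\; \frac{1}{4\pi\epsilon_0}
  \int \rho(\vec{r}\,')\,
       \frac{\vec{r}-\vec{r}\,'}{\left|\vec{r}-\vec{r}\,'\right|^{3}}\; d^{3}r'
  \;=\; -\nabla\!\left(\frac{1}{4\pi\epsilon_0}
       \int \frac{\rho(\vec{r}\,')}{\left|\vec{r}-\vec{r}\,'\right|}\; d^{3}r'\right)
\end{equation}
```

Same inverse square structure, same propagator shape, just with a charge density as the source and no cross product.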

BS to amp 9

And thus ends the derivation. In this casting, the curl of the magnetic field is dependent both on the electric field and on currents. If there is no time varying electric field, that first term vanishes and you get the plain old magnetostatic Ampere’s law:

Ampere's law
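In LaTeX, the two results just described read as follows in SI units, with the time varying electric field term written first to match the description above. This is the standard textbook form and is my assumption about what the images show.

```latex
\begin{align}
  \nabla\times\vec{B}
    &= \mu_0\,\epsilon_0\,\frac{\partial\vec{E}}{\partial t} + \mu_0\,\vec{J}
    && \text{(Maxwell-Ampere law)} \\
  \nabla\times\vec{B}
    &= \mu_0\,\vec{J}
    && \text{(magnetostatic Ampere's law)}
\end{align}
```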

This says simply that the curl of the magnetic field is proportional to the current density at that point. There are some interesting qualities to this equation because of how the derivation leaves only a single positional dependence. As you can see, there is no separate position coordinate to describe magnetic field independently from its source. And, really, it isn’t describing the magnetic field as ‘generated’ by the current, but rather that a deformation to the linearity of the magnetic field is due to the presence of a current at that location… which is an interesting way to relate the two.

This relationship tends to cause magnetic lines to orbit around the current vector.

magcur

This image from hyperphysics sums up the whole situation –I realize I’ve been saying something similar from way up, but this equation is proof. If you have current passing along a wire, magnetic field will tend to wrap around the wire in a right handed sense. For all intents and purposes, this is all that Ampere’s law says, neglecting that you can manipulate the geometry of the situation to make the field do some interesting things. But, this is all.

Well, so what? I did a lot of math. What, if anything, have I gained from it? How does this help me along the path to understanding magnets?

The Ampere Law is useful in generating very simple magnetic field configurations that can be used in the Lorentz force law, ultimately showing a direct dynamical connection between moving currents and magnetic fields. I have it in mind to show a freshman level example of how this is done in the next part of this series. Given the length of this post, I will do more math in a different post.

This is a big step in the direction of learning how magnets work, but it should leave you feeling a little unsatisfied. How exactly do the forces work? In physics, it is widely known that magnetic fields do no work, so why is it that bar magnets can drag each other across the counter? That sure looks like work to me! And if electric currents are necessary to drive magnets, why is it that bar magnets and horseshoe magnets don’t require batteries? Where are the electric currents that animate a bar magnet and how is it that they seem to be unlimited or unpowered? These questions remain to be addressed.

Until the next post…

Nuclear Toxins

A physicist from Lawrence Livermore Labs has been restoring old nuclear bomb detonation footage. This seems to me to be an incredibly valuable task because all of the original footage was shot on film, which is currently in the process of decaying and falling apart. There have been no open air nuclear bomb detonations on planet Earth since probably the 1960s, which is good… except that people are in the process of forgetting exactly how bad a nuclear weapon is. The effort of saving this footage makes it possible for people to know something about this world-changing technology that wasn’t previously declassified. Nukes are sort of mythical to a body like me who wasn’t even born until about the time that testing went underground: to everybody younger than me, I suspect that nukes are an old-people thing, a less important weapon than computers. That Lawrence Livermore Labs has posted this footage to YouTube is an amazing public service, I think.

As I was reading an article on Gizmodo about this piece of news, I happened to wander into the comment threads to see what the echo chamber had to say about all this. I should know better. Admittedly, I actually didn’t post any comments castigating anyone, but there was a particular comment that got me thinking… and calculating.

Here is the comment:

Nuclear explosions produce radioactive substances that are rare in nature — like carbon-14, a radioactive form of the carbon atom that forms the chemical basis of all life on earth.

Once released into the atmosphere, carbon-14 enters the food chain and gets bound up in the cells of most living things. There’s still enough floating around for researchers to detect in the DNA of humans born in 2016. If you’re reading this, it’s inside you.

This is fear mongering. If you’ve never seen fear mongering before, this is what it looks like. The comment is intended to deliberately inspire fear not just in nuclear weapons, but in the prospect of radionuclides present in the environment. The last sentence is pure body terror. Dear godz, the radionuclides, they’re inside me and there’s no way to clean them out! I thought for a time about responding to this comment. I decided not to because there is enough truth here that anyone should probably stop and think about it.

For anyone curious, the wikipedia article on the subject has some nice details and seems thorough.

It is true that C-14 is fairly rare in nature. The natural abundance is 1 part per trillion of carbon. It is also true that the atmospheric test detonations of nuclear bombs created a spike in the C-14 present in the environment. And, while it’s true that C-14 is rare, it is actually not technically unnatural since it is formed by cosmic rays impinging on the upper atmosphere. For the astute reader, C-14 produced by cosmic rays forms the basis of radiocarbon dating since C-14 is present at a particular known, constant proportion in living things right up until you die and stop taking it up from the environment –a scientist can then determine the date when living matter died based on the radioactive decay curve for C-14.

Since it’s not unnatural, the real question here is whether the spike of radionuclides created by nuclear testing significantly increases the health hazard posed by excess C-14 above and beyond what it would normally be. You have it in your body anyway, is there greater hazard due to the extra amount released? This puzzle is actually a somewhat intriguing one to me because I worked for a time with radionuclides and it is kind of chilling all the protective equipment that you need to use and all the safety measures that are required. The risk is a non-trivial one.

But, what is the real risk? Does having a detectable amount of radionuclide in your body that can be ascribed to atomic air tests constitute an increased health threat?

To begin with, what is the health threat? For the particular case of C-14, one of a handful of radionuclides that can be incorporated into your normal body structures, the health threat would obviously come from the radioactivity of the atom. In this particular case, C-14 is a beta-emitter. This means that C-14 radiates electrons; specifically, one of the neutrons in the atom’s nucleus converts into a proton by giving off an electron and a neutrino, resulting in the carbon turning into nitrogen. The neutrino basically doesn’t interact with anything, but the radiated electron can travel with energies of 156 keV (or about 2.4×10^-14 Joules). This will do damage to the human body in two routes, either by direct collision of the radiated electron with the body, or by a structurally important carbon atom converting into a nitrogen atom during the decay process if the C-14 was part of your body already. Obviously, if a carbon atom turns suddenly into nitrogen, that’s conducive to organic chemistry occurring since nitrogen can’t maintain the same number of valence interactions as carbon without taking on a charge. So, energy deposition by particle collision, or spontaneous chemistry is the potential cause of the health threat.

In normal terms, the carbon-nitrogen chemistry routes for damage are not accounted for in radiation damage health effects simply because of how radiation is usually encountered: you need a lot of radiation in order to have a health effect, and this is usually from an exogenous source, that is, provided by a radiation source that is outside the body rather than incorporated with it, like endogenous C-14. This would be radiation much like the UV radiation which causes a sunburn. Health effects due to radiation exposure are measured on a scale by a dose unit called a ‘rem.’ A rem expresses an amount of radiation energy deposited into body mass, where 1 rem is equal to 1.0×10^-5 Joules of radiation energy deposited into 1 gram of body mass. Here is a table giving the general scale of rem doses which cause health effects. People who work around radiation as part of their job are limited to a full-body yearly dose of 5 rem, while the general public is limited to 0.1 rem per year. Everybody is expected to have an environmental radiation dose exposure of about 0.3 rem per year and there’s an allowance of 0.05 rem per year for medical x-rays. It’s noteworthy that not all radiation doses are created equal and that the target body tissue matters; this is manifest by different radiation doses being allowed to occur to the eyes (15 rem) or the extremities, like the skin (50 rem). A sunburn would be like a dose of 100 to 600 rem to the skin.

What part of an organism must the damage affect in order to cause a health problem? Really, only one is truly significant, and that’s your DNA. Easy to guess. Pretty much everything else is replaceable to the extent that even a single cell dying from critical damage is totally expendable in the context of an organism built of a trillion cells. The problem of C-14 being located in your DNA directly is numerically a rather minor problem: DNA actually only accounts for about 3% of the dry mass of your cells, meaning that only about 3% of the C-14 incorporated into your body is directly incorporated into your DNA, so that most of the damage to your DNA is due to C-14 not directly incorporated in that molecule. This is not to say that chemistry doesn’t cause the damage, merely that most of the chemical damage is probably due to energy deposition in molecules around the DNA which then react with the DNA, say by generation of superoxides or similar paths. This may surprise you, but DNA damage isn’t always a complete all-or-nothing proposition either: to an extent, the cell has machinery which is able to repair damaged DNA… the bacterium Deinococcus radiodurans is able to repair its DNA so efficiently that it’s able to subsist indefinitely inside a nuclear reactor. Humans have some repair mechanisms as well.

Cells handling radiation damage in humans have about two levels of response. For minor damage, the cell repairs its DNA. If the DNA damage is too great to fix, a mechanism triggers in the cell to cause it to commit suicide. You can see the effect of this in a sunburn: critically radiation damaged skin cells commit suicide en masse in the substratum of your skin, ultimately sacrificing the structural integrity of your skin, causing the external layer to slough off. This is why your skin peels due to a sunburn. If the damage is somewhere in between, matters are a little murkier… your immune system has a way of tracking down damaged cells and destroying them, but those screwed up cells sometimes slip through the cracks to cause serious disease. Inevitably cancer. Effects like these emerge for ~20 rem full body doses. People love to worry about superpowers and three-arm, three-eye type heritable mutations due to radiation exposure, but congenital mutations are a less frequent outcome simply because your gonads are such a small proportion of your body; you’re more likely to have other things screwed up first.

One important trick in all of this to notice is that to start having serious health effects that can be clearly ascribed to radiation damage, you must absorb a dose of greater than about 5 rem.

Now, what kind of a radiation dose do you acquire on a yearly basis from body-incorporated C-14 and how much did that dose change in people due to atmospheric nuclear testing?

I did my calculations on the supposition of a 70 kg person (which is 154 lbs). I also adjusted rem into a more easily used physical quantity of Joules/gram (1 rem = 1×10^-5 J/g, see above.) One rem of exposure for a 70 kg person works out to an absorbed dose of 0.7 J/year. An exposure sufficient to hit 5 rem is 3.5 J/year while 20 rem is 14 J/year. Beta-electrons from C-14 maximally hit with 2.4×10^-14 J/strike (150 keV) with about 0.8×10^-14 J/hit on average (50 keV).
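The dose bookkeeping above can be sketched in a few lines. This is only an illustration of the conversion quoted in the text (1 rem = 1×10^-5 J per gram, spread over a 70 kg body); the function names and the electron-volt conversion factor are my own additions.

```python
# Dose bookkeeping for a 70 kg person, using the conversion quoted in the text:
# 1 rem = 1e-5 J of radiation energy per gram of body mass.
REM_IN_J_PER_GRAM = 1.0e-5
BODY_MASS_G = 70_000.0
EV_IN_JOULES = 1.602e-19  # assumed conversion factor, not from the text


def rem_to_joules(dose_rem, body_mass_g=BODY_MASS_G):
    """Whole-body energy in joules for a dose given in rem."""
    return dose_rem * REM_IN_J_PER_GRAM * body_mass_g


def kev_to_joules(energy_kev):
    """Convert a particle energy from keV to joules."""
    return energy_kev * 1.0e3 * EV_IN_JOULES


for dose in (0.1, 0.3, 1, 5, 20):
    print(f"{dose:>4} rem -> {rem_to_joules(dose):.2f} J")

print(f"150 keV = {kev_to_joules(150):.2e} J")
print(f" 50 keV = {kev_to_joules(50):.2e} J")
```

Changing `BODY_MASS_G` rescales every threshold linearly, which is the only place body size enters this estimate.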

In the following part of the calculation, I use radioactive decay and half-life in order to determine the rate of energy transference to the human body on the assumption that all the beta-electron energy emitted by radiation is absorbed by the human body. Radiation rates are a purely probabilistic event where the likelihood of seeing a radiated electron is proportional to the size of the radioactive atom population. The differential equation is a simple one and looks like this:

decay rate differential equation

This just means that the rate of decay (and therefore electron production rate) is proportional to the size of the decaying population where the k variable is a rate constant that can be determined from the half-life. The decay differential equation is solved by the following function:

exponential decay

This is just a simple exponential decay which takes an initial population of some number of objects and reduces it over time. You can solve for the decay constant by plugging the half-life into the time and simply asserting that you have 1/2 of your original quantity of objects at that time. The above exponential rearranges to find the decay constant:

decay constant

Here, Tau is the half-life in seconds (I could have used my time as years, but I’m pretty thoroughly trained to stick with SI units) and I’ve already substituted 1/2 for the population change. With k from half-life, I just need the population of radiation emitters present in the body in order to know the rate given in the first equation above… where I would simply multiply k by N.
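For readers who cannot see the equation images, the three relations described above can be written out as follows. This is my reconstruction from the surrounding description, with N the population of C-14 atoms, N_0 the starting population, k the rate constant and Tau the half-life; the sign convention is an assumption.

```latex
% Rate of decay is proportional to the remaining population
\frac{dN}{dt} = -kN

% Solution: simple exponential decay from an initial population N_0
N(t) = N_0 \, e^{-kt}

% Half-life condition N(\tau) = N_0/2 gives the decay constant
\frac{1}{2} = e^{-k\tau}
\quad\Longrightarrow\quad
k = \frac{\ln 2}{\tau}
```

The size of the decay rate at any moment is then just k multiplied by N.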

To do this calculation, the half-life of C-14 is known to be 5730 years, which I then converted into seconds (ick; if I only care about years, next time I only calculate in years). This gives a decay constant of 3.836×10^-12 emissions/sec per atom. In order to get the decay rate, I also need the population of C-14 emitters present in the human body. We know that C-14 has a natural prevalence of 1 per trillion and also that a 70 kg human body is 16 kg carbon after a little google searching, which gives me 1.6×10^-8 g of C-14. With C-14’s mass of 14 g/mole and Avogadro’s number, this gives about 6.88×10^14 C-14 atoms present in a 154 lb person. This population together with the rate constant gives me the decay rate by the first equation above, which is 2.639×10^3 decays per second. Energy per beta-electron absorbed times the decay rate gives the rate of energy deposited into the body per second on the assumption that all beta-decay energy is absorbed by the target: 2.639×10^3 decays/sec * 2.4×10^-14 Joules/decay = 6.33 x 10^-11 J/s. For the course of an entire year, the amount of energy works out to about 0.002 Joules/year.
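The chain of arithmetic in this paragraph is easy to recheck numerically. The sketch below uses the inputs given in the text (5730 year half-life, 16 kg of carbon, 1 part per trillion, 14 g/mole, 2.4×10^-14 J per decay); the 365-day year and the value of Avogadro's number are my assumptions, and the 1 per trillion is treated as a mass fraction as the text does.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # assumed 365-day year
AVOGADRO = 6.022e23                 # atoms per mole (assumed value)

# Decay constant from the half-life: k = ln(2) / tau
half_life_s = 5730 * SECONDS_PER_YEAR
k = math.log(2) / half_life_s       # per second, per atom

# Population of C-14 atoms in a 70 kg body holding 16 kg of carbon
carbon_mass_g = 16_000.0
c14_fraction = 1.0e-12              # 1 part per trillion, used as a mass fraction
c14_mass_g = carbon_mass_g * c14_fraction
n_atoms = c14_mass_g / 14.0 * AVOGADRO

# Decay rate is k * N; assume every decay deposits the maximum beta energy
decays_per_s = k * n_atoms
max_energy_j = 2.4e-14
power_w = decays_per_s * max_energy_j
energy_per_year_j = power_w * SECONDS_PER_YEAR

print(f"k               = {k:.3e} per second")
print(f"C-14 atoms      = {n_atoms:.3e}")
print(f"decay rate      = {decays_per_s:.0f} per second")
print(f"energy per year = {energy_per_year_j:.4f} J")
```

Note that the year length cancels in the final figure, so working in years from the start would give the same yearly energy.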

This gets me to a place where I can start making comparisons. The exposure limit for any old member of the general public to ‘artificial’ radiation is 0.1 rem, or 0.07 J/year. The maximum… maximum… contribution due to endogenous C-14 is 35 times smaller than the allowed public exposure limits (for mean energy, it’s more like 100 times smaller). On average, endogenous C-14 gives 1/100th of the permitted artificial radiation dose.

But, I’ve actually fudged here. Note that I said above that humans normally get a yearly environmental radiation dose of about 0.3 rem (0.21 J/year)… meaning that endogenous C-14 only provides about 1/300th of your natural dose. Other radiation sources that you encounter on a daily basis provide radiation exposure that is 300 times stronger than C-14 directly incorporated into the structure of your body. And, keep in mind that this is way lower than the 5 rem where health effects due to radiation exposure begin to emerge.
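The ratios quoted in the last two paragraphs follow directly from the numbers already on the table. A short sketch, taking the 0.002 J/year maximum figure from the text and scaling it by the mean-to-maximum beta energy (0.8 versus 2.4, both ×10^-14 J); the variable names are mine.

```python
REM_IN_J_PER_GRAM = 1.0e-5
BODY_MASS_G = 70_000.0

# Yearly energy from endogenous C-14, from the text
c14_max_j = 0.002                    # every decay at maximum beta energy
c14_mean_j = c14_max_j * 0.8 / 2.4   # every decay at mean beta energy

# Reference doses converted to joules per year for a 70 kg body
public_limit_j = 0.1 * REM_IN_J_PER_GRAM * BODY_MASS_G   # 0.07 J
background_j = 0.3 * REM_IN_J_PER_GRAM * BODY_MASS_G     # 0.21 J

limit_over_max = public_limit_j / c14_max_j
limit_over_mean = public_limit_j / c14_mean_j
background_over_mean = background_j / c14_mean_j

print(f"public limit / C-14 (max energy):  {limit_over_max:.0f}")
print(f"public limit / C-14 (mean energy): {limit_over_mean:.0f}")
print(f"background   / C-14 (mean energy): {background_over_mean:.0f}")
```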

How does C-14 produced by atmospheric nuclear testing figure into all of this?

The wikipedia article I cited above has a nice histogram of detected changes in the environmental C-14 levels due to atmospheric nuclear testing. At the time of such testing, C-14 prevalence spiked in the environment by about 2 fold and has decayed over the intervening years to be less than 1.1-fold. This has an effect on C-14 exposure specifically of changing it from 1/300th of your natural dose to 1/150th, or about 0.5%, which then tapers to less than a tenth of a percent above natural prevalence in less than fifty years. Detectable, yes. Significant? No. Responsible for health effects…… not above the noise!

This is not to say that a nuclear war wouldn’t be bad. It would be very bad. But, don’t exaggerate environmental toxins. We have radionuclides present in our bodies no matter what and the ones put there by 1950s nuclear testing are only a negligible part, even at the time –what’s 100% next to 100.5%? A big nuclear war might be much worse than this, but this is basically a forgettable amount of radiation.

For anybody who is worried about environmental radiation, I draw your attention back to a really simple fact:

depositphotos_9985842_s-199x300

The woman depicted in the picture above has received a 100 to 600 rem dose of very (very very) soft X-rays by deliberately sitting out in front of a nuclear furnace. You can even see the nuclear shadow on her back left by her scant clothing. Do you think I’m kidding? UV light, which is lower energy than x-rays, but not by that much… about 3 eV versus maybe 500 eV, is ionizing radiation which is absorbed directly by skin DNA to produce real radiation damage, which your body treats indistinguishably from how it treats damage from particle radiation of radionuclides or X-rays or gamma-rays. The dose which produced this effect is something like two to twelve times higher than the federally permitted dose that radiation workers are allowed to receive in their skin over the course of an entire year… and she did it to herself deliberately in a matter of hours!

Here’s a hint, don’t worry about the boogieman under the bed when what you just happily did to yourself over the weekend among friends is much much worse.

Calculating Molarity (mole/L)

As a preface to this post, I want to make doubly clear my stance on vaccines. There is no good scientific evidence to support the notion that vaccination is in any way an unsafe practice or that it is responsible for any manner of health problem above and beyond the diseases that vaccines protect against. Vaccination is the single most powerful health intervention created in the last 150 years of medicine. There is, in my opinion, some potential for this post to be used to damage the credibility of a person who I believe to be a necessary positive force in the Healthcare scene and I want to make it clear that this was not the intention of my writing here. Orac is a tireless advocate for science and for clear, skeptical thought in general and I respect him quite deeply for the time he puts in and for putting up with the static he puts up with.

That said, I believe that science advocacy is a double edged sword: if you didn’t get it right, it can come back to bite you.

I love Respectful Insolence, but I’ve got to ding Orac for failing to calculate molarity correctly. He is profoundly educated, but I think he’s a surgeon and not a physicist. We all have our weak points! (Thank heaven above I’m not ever in the operating room with the knife!)

In this post, which he may now have edited for correctness (and it seems he has), he makes the following statement:

More importantly, look at the numbers of precipitates found per sample. It ranges from two to 1,821.

O.M.G.! 1,821 particles! Holy crap! That’s horrible! The antivaxers are right that vaccines are hopelessly contaminated!

No. They. Are. Not.

Look at it this way. This is what was found in 20 μl (that’s microliters) of liquid. That’s 0.00002 liters. That means, in a theoretical liter of the vaccine, the most that one would find is 91,050,000 (9.105 x 10^7) particles! Holy hell! That’s a lot. We should be scared, shouldn’t we? well, no. Let’s go back to our homeopathy knowledge and look at Avogadro’s number. One mole of particles = 6.023 x 10^23. So divide 91,050,000 by Avogadro’s number, and you’ll get the molarity of a solution of 91,050,000 particle in a liter, as a 1 M solution would contain 6.023 x 10^23 particles. So what’s the concentration:

1.512 x 10^-16 M. that’s 0.15 femtomolar (fM) (or 150 altomolar), an incredibly low concentration. And that’s the highest amount the investigators found.

Anybody see the mistake? Let’s start here: Avogadro’s number is a scaling constant for a linear relationship and it has a unit! The units on this number are atoms(or molecules) per mole. It converts a number of atoms or molecules into a number of moles.
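For reference, the arithmetic in the quoted passage reproduces exactly as stated. The sketch below uses the quote's own inputs, including its value of Avogadro's number. What it computes is a count of objects per liter expressed in moles of objects, and nothing in it knows how large or how heavy each object is.

```python
AVOGADRO = 6.023e23            # value used in the quoted passage

particles_counted = 1821       # largest count reported per sample
sample_volume_l = 20e-6        # 20 microliters

# Scale the count up to one liter, then express it as moles of particles
particles_per_liter = particles_counted / sample_volume_l
moles_of_particles_per_liter = particles_per_liter / AVOGADRO

print(f"particles per liter: {particles_per_liter:.4e}")
print(f"moles of particles per liter: {moles_of_particles_per_liter:.3e}")
```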

‘Moles’ is a convenient person-sized number that is standardized around ‘molecular weight,’ which is a weight unit that arbitrarily says that a single carbon atom has a weight of ’12’ and results in atomic hydrogen having a weight of ‘1.’ That’s atomic mass units (or AMU), which is usually very convenient for calculating relative weights of molecules by adding up all the AMU of their atomic constituents. To use molarity, we usually need a molecular weight in the form of Daltons, or grams/mole. Grams per mole says that it takes this many grams in mass of a substance for that substance to contain a single mole’s worth of molecules (or atoms) where it is then implicit that the number of molecules or atoms is Avogadro’s number.

‘Mole’ is extremely special. It refers to a collection of objects that are atomically identical! If you have a mole of a kind of protein, it means that you have 6.02 x 10^23 number of this kind of identical object. If you make a comparison between two proteins, the same molar number of each with a different molecular weight is a different overall mass. Consider Insulin (5808 g/mole) compared to the 70S Ribosome (2,500,000 g/mole)… one mole of Insulin would weigh 5.8 kg while one mole of 70S Ribosome would weigh 2.5 metric tons!!! If they have roughly the average density of proteins, what would be the volume of 1 mole of 70S ribosome as compared to 1 mole of Insulin? It would be 430 times greater for the Ribosome; 2900 L for 70S Ribosome while Insulin is about 6 L!

Notice something here: an object with a big molecular weight occupies a bigger volume than an object with a smaller molecular weight… regardless of the fact that they are at the same molarity. Molarity as a number depends strongly on the molecular weight of the substance in question in order to mean anything at all. For the Ribosome, the same molar concentration as for Insulin means a solution containing a much larger amount of solute.
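To put a number on that, here is a small sketch comparing the mass of solute per liter at one and the same molarity. The molecular weights are the ones quoted above; the 1 micromolar concentration is an arbitrary choice of mine for illustration.

```python
INSULIN_MW = 5808.0        # g/mole, from the text
RIBOSOME_70S_MW = 2.5e6    # g/mole, from the text


def grams_per_liter(molarity_mol_per_l, molecular_weight_g_per_mol):
    """Mass of solute per liter at a given molar concentration."""
    return molarity_mol_per_l * molecular_weight_g_per_mol


concentration = 1.0e-6  # 1 micromolar, chosen only for illustration

for name, mw in (("Insulin", INSULIN_MW), ("70S Ribosome", RIBOSOME_70S_MW)):
    print(f"{name:>12}: {grams_per_liter(concentration, mw):.4g} g/L")

print(f"mass ratio at equal molarity: {RIBOSOME_70S_MW / INSULIN_MW:.0f}")
```

The ratio is independent of the concentration chosen, which is the whole point: the molar number alone does not say how much material is present.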

In the post in question on Respectful Insolence, Orac is talking about a paper which observes particulate matter derived from vaccine specimens in an SEM. It is clear from the authorship and publication of the paper that the intent is to find fault in vaccines based upon the contents of materials examined by this probing… from what little I know about the paper, it does not seem to be producing any information that is truly that informative. But, you can’t fault a paper on a point that may not actually be as flawed as an initial interpretation would imply. The paper reports number of particles observed per 20 uL of a solvent. They find as many as 1,821 particles per 20 uL. We are not told for certain what these particles are composed of except that the investigators aren’t sure and shot an overpowered EDS at everything and reported even the spurious results. Orac scales up this number to 1L to get 9.105 x 10^7 particles and then divides by Avogadro’s number to find what proportion this is of one mole of these particles, never mind that we don’t know how big the particles are in terms of molecular weight or how dense in volume per mass. He declares it to be about 0.15 femtomolar and runs on with how tiny the concentration is. As I initially wrote this, I focused on the gleeful way in which Orac does his deconstruction in large part because it really isn’t a valid thing to laugh at when the deconstruction is not properly done.

Here is how someone of my background approaches the same series of observations. I can see from the micrograph in the blog post that the scale bar is something like 2 mm (2000 microns)… the objects in question are maybe tens to hundreds of microns in size. Let’s make a physicist supposition here and think about it: pulling this out of my ass, I’ll claim these are 1,821 approximately spherical identical particles of sodium chloride, each of 40 microns diameter. That gives a volume of 4/3*Pi*20^3 um^3 or 3.35 x 10^-14 m^3 per particle and 6.1 x 10^-11 m^3 for the whole collection of particles. Now, density usually is given in terms of g/cm^3 or g/mL… there are 100 cm per meter and you must convert three times to cube it, so 6.1 x 10^-11 x 100^3 = 6.1 x 10^-5 cm^3. A cubic centimeter is a mL, so that is 0.061 uL of solid salt, or about 0.3% of the original 20 uL sample volume. What molarity is this? The density of sodium chloride is 2.16 g/mL or 2.16 mg/uL… which is 0.132 mg. That’s 0.132 mg of salt dissolved in 20 uL. The molecular weight of sodium chloride is 58.44 g/mole or 58.44 mg/mmole, which gives 0.00226 mmole. From this, 0.00226 mmole in .02 mL is 0.113 mmole/mL.

That’s 0.113 mole/L……. 0.11 M!!!!
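Since the estimate hinges on guessed particle size and composition, it helps to have it in a form where the guesses can be changed. A minimal sketch, assuming identical solid spheres that dissolve completely in the sample volume; the function name and argument names are hypothetical, and the inputs are the guesses made above (40 micron sodium chloride spheres, 2.16 g/cm^3, 58.44 g/mole, 20 uL).

```python
import math


def molarity_of_dissolved_spheres(count, diameter_um, density_g_per_cm3,
                                  molecular_weight_g_per_mol, sample_volume_ul):
    """Molarity (mol/L) if `count` identical solid spheres dissolve in the sample."""
    radius_cm = (diameter_um / 2.0) * 1.0e-4          # 1 micron = 1e-4 cm
    sphere_volume_cm3 = 4.0 / 3.0 * math.pi * radius_cm ** 3
    total_mass_g = count * sphere_volume_cm3 * density_g_per_cm3
    moles = total_mass_g / molecular_weight_g_per_mol
    sample_volume_l = sample_volume_ul * 1.0e-6
    return moles / sample_volume_l


nacl_molarity = molarity_of_dissolved_spheres(
    count=1821,
    diameter_um=40.0,
    density_g_per_cm3=2.16,
    molecular_weight_g_per_mol=58.44,
    sample_volume_ul=20.0,
)
print(f"molarity of dissolved salt: {nacl_molarity:.3f} M")
```

Because the sphere volume goes as the cube of the diameter, doubling the guessed particle size multiplies the answer by eight. Any plausible guess in the tens of microns lands many orders of magnitude above femtomolar.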

Let’s pause for a second. Is that femtomolar?

Orac missed the science here! I initially wrote that he should be apologizing for it, but I’ve revised this so that my respect for his work is more apparent. The volume of these particles and their composition is everything. A single particle with a molecular weight in the gigadaltons or teradaltons range is suddenly a very substantial mass in low particle number. If these particles are as I specified and composed of simple salt, they are at a molarity that is abruptly appreciable. If we make these into tiny balls of Ricin, that’s unquestionably a fatally toxic quantity!

As with all things, dose makes the poison and there’s no Ricin in evidence, but this argument Orac has made about concentration, in this particular case is catastrophically wrong. A femtomole of a big particle that can be dissolved could be a large dose!

I forgive him and I love his blog, but let this be a lesson… you don’t just divide by Avogadro’s number in order to get meaningful concentrations!

Hydrogen atom radial equation

In between the Sakurai problems, I decided to tackle a small problem I set for myself.

The Sakurai quantum mechanics book is directed at about graduate student level, meaning that it explicitly overlooks problems that it deems too ‘undergraduate.’ When I started into the next problem in the chapter, which deals with the Wigner-Eckart relation, I decided to direct myself at a ‘lower level’ problem that demands practice from time to time. I worked in early January solving the angular component of the hydrogen atom by deriving the spherical harmonics and much of my play time since has been devoted to angular and angular momentum type problems. So, I decided it would be worth switching up a little and solving the radial portion of the hydrogen atom electron central force problem.

One of my teachers once suggested that deriving the hydrogen atom was a task that any devoted physicist should play with every other year or so. Why not, I figured; the radial solution is actually a bit more mind boggling to me than the angular parts because it requires some substitutions that are not very intuitive.

The hydrogen atom problem is a classic problem mainly because it’s one of the last exactly solvable quantum mechanics problems you ever encounter. After the hydrogen atom, the water gets deeper and the field starts to focus on tools that give insight without actually giving exact answers. The only atomic system that is exactly solvable is the hydrogen atom… even helium, with just one more electron, demands perturbation in some way. It isn’t exactly crippling to the field because the solutions to all the other atoms are basically variations of the hydrogen atom and all, with some adjustment, have hydrogenic geometry or are superpositions of hydrogen-like functions that are only modified to the extent necessary to make the energy levels match. Solving the hydrogen atom ends up giving profound insight to the structure of the periodic table of the elements, even if it doesn’t actually solve for all the atoms.

As implied above, I decided to do a simplified version of this problem, focusing only on the radial component. The work I did on the angular momentum eigenstates was not in context of the hydrogen electron wave function, but can be inserted in a neat cassette to avoid much of the brute labor of the hydrogen atom problem. The only additional work needed is solving the radial equation.

A starting point here is understanding spherical geometry as mediated by spherical polar coordinates.

A hydrogen atom, as we all know from the hard work of a legion of physicists coming into the turn of the century, is a combination of a single proton with a single electron. The proton has one indivisible positive charge while the electron has one indivisible negative charge. These two charges attract each other and the proton, being a couple thousand times more massive, pulls the electron to it. The electron falls in until the kinetic energy it gains forces it to have enough momentum to be unlocalized to a certain extent, as required by quantum mechanical uncertainty. The system might then radiate photons as the electron sorts itself into a stable orbiting state. The resting combination of proton and electron has neutral charge with the electron ‘distributed’ around the proton in a sort of cloud as determined by its wave-like properties.

The first approximation of the hydrogen atom is a structure called the Bohr model, proposed by Niels Bohr in 1913. The Bohr model features classical orbits for the electron around the nucleus, much like the moon circles the Earth.

atom

This image, from duckster.com, is a crude example of a Bohr atom. The Bohr atom is perhaps the most common image of atoms in popular culture, even if it isn’t correct. Note that the creators of this cartoon didn’t have the wherewithal to make a ‘right’ atom, giving the nucleus four plus charges and the shell three minus… this would be a positively charged ion of Beryllium. Further, the electrons are not stacked into a decent representation for the actual structure: cyclic orbitals would be P-orbitals or above, where Beryllium has only S-orbitals for its ground state, which possess either no orbital angular momentum, or angular momentum without any defined direction. But, it’s a popular cartoon. Hard to sweat the small stuff.

The Bohr model grew from the notion of the photon as a discrete particle, where Bohr postulated that the only allowed stable orbits for the electron circling the nucleus are at integer quantities of angular momentum delivered by single photons… as quantized by Planck’s constant. ‘Quantized’ is a word invoked to mean ‘discrete quantities’ and comes back to that pesky little feature Deepak Chopra always ignores: the first thing we ever knew about quantum mechanics was Planck’s constant –and freaking hell is Planck’s constant small! ‘Quantization’ is the act of parsing into discrete ‘quantized’ states and is the word root which loaned the physics field its name: Quantum Mechanics. ‘Quantum Mechanics’ means ‘the mechanics of quantization.’

Quantum mechanics, as it has evolved, approaches problems like the hydrogen atom using descriptions of energy. In the classical sense, an electron orbiting a proton has some energy describing its kinetic motion, its kinetic energy, and some additional energy describing the interaction between the two masses, usually as a potential source of more kinetic energy, called a potential energy. If nothing interacts from the outside, the closed system has a non-varying total energy which is the sum of the kinetic and potential energies. Quantum mechanics evolved these ideas away from their original roots using a version of Hamiltonian formalism. Hamiltonian formalism, as it appears in quantum, is a way to merely sum up kinetic and potential energies as a function of position and momentum –this becomes complicated in Quantum because of the restriction that position and momentum cannot be simultaneously known to arbitrary precision. But, Schrodinger’s equation actually just boils down to a statement of kinetic energy plus potential energy.

Here is a quick demonstration of how to get from a statement of total energy to the Schrodinger equation:

5-12-16 schrodinger

After ‘therefore,’ I’ve simply multiplied in from the right with a wave function to make this an operator equation. The first term on the left is kinetic energy in terms of momentum while the second term is the Gaussian CGS form of potential energy for the electrical central force problem (for Gaussian CGS, the constants of permittivity and permeability are swept under the rug by collecting them into the speed of light and usually a constant of light speed appears with magnetic fields… here, the charge is in statcoulombs, which take coulombs and wrap in a scaling constant of 4*Pi.) When you convert momentum into its position space representation, you get Schrodinger’s time independent equation for an electron under a central force potential. The potential, which depends on the positional expression of ‘radius,’ has a negative sign to make it an attractive force, much like gravity.
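Since the handwritten image may not be visible, the steps described here can be written out as follows. This is my reconstruction from the description, in Gaussian CGS units, with m standing for the electron mass and e for the elementary charge; the exact layout of the original image is an assumption.

```latex
% Total energy: kinetic plus (attractive) Coulomb potential, Gaussian CGS
\frac{p^2}{2m} - \frac{e^2}{r} = E

% Multiply in from the right with a wave function to get an operator equation
\left( \frac{p^2}{2m} - \frac{e^2}{r} \right) \psi = E \, \psi

% Position-space representation of momentum
p \;\rightarrow\; -i\hbar\nabla

% Time-independent Schrodinger equation for the central force problem
-\frac{\hbar^2}{2m} \nabla^2 \psi - \frac{e^2}{r} \, \psi = E \, \psi
```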

Now, the interaction between a proton and an electron is a central force interaction, which means that the radius term could actually be pointed in any direction. Radius would be some complicated combination of x, y and z. But, because the central force problem is spherically symmetric, if we could move out of Cartesian coordinates and into spherical polar, we get a huge simplification of the math. The inverted triangle that I wrote for the representation of momentum is a three dimensional operator called ‘del’; squared, as it appears in the kinetic energy, it becomes the Laplace operator, or ‘double del.’ Picking the form of del ends up casting the dimensional symmetry of the differential equation… as written above, it could be Cartesian or spherical polar or cylindrical, or anything else.

A small exercise I sometimes put myself through is defining the structure of del. The easiest way that I know to do this is to pull apart the divergence theorem of vector calculus in spherical polar geometry, which means defining a differential volume and differential surfaces.

5-12-16 central force 2

Well, that turned out a little neater than my usual meandering crud.

This little bit of math is defining the geometry of the coordinate variables in spherical polar coordinates. You can see the spherical polar coordinates in the Cartesian coordinate frame and they consist of a radial distance from the origin and two angles, Phi and Theta, that act at 90 degrees from each other. If you pick a constant radius in spherical polar space, you get a spherical surface where lines of constant Phi and Theta create longitude and latitude lines, respectively, making a globe! You can establish a right handed coordinate system in spherical polar space by picking a point and considering it to be locally Cartesian… the three dimensions at this point are labeled as shown, along the outward radius and in the directions in which each of the angles increases.

If you were to consider an infinitesimal volume of these perpendicular dimensions, at this locally Cartesian point, it would be a volume that ‘approaches’ cubic. But then, that’s the key to calculus: recognizing that 99.999999 effectively approaches 100. So then, this framework allows you to define the calculus occurring in spherical polar space. The integral performed along Theta, Phi and Rho would be adding up tiny cubical elements of volume welded together spherically, while the derivative would be with respect to each dimension of length as locally defined. The scaling values appear because I needed to convert differentials of angle into linear length in order to calculate volume, which can be accomplished by using the definition of the radian angle, which is arc length per radius. A curve is effectively linear when an arc becomes so tiny that its bend is negligible along the edges of an infinitesimal cube, like thinking about the curvature of the Earth affecting the flatness of the sidewalk outside your house.

The divergence operation uses the divergence theorem (Gauss’s theorem) to say that a volume integral of divergence equals a surface integral of flux across the surface enclosing that same volume… and then you simply chase the constants. All that I do to find the divergence differential expression is to take the full integral and remove the infinite sum so that I’m basically doing algebra on the infinitesimal pieces, then literally divide across by the volume element and cancel the appropriate differentials. There are three possible area integrals because the normal vector is in three possible directions, one each for Rho, Theta and Phi.

The structure becomes a derivative if the volume is in the denominator because volume has one greater dimension than any possible area, where the derivative is with respect to the dimension of volume that doesn’t cancel out when you divide against the areas. If a scaling variable used to convert theta or phi into a length is dependent on the dimension of the differential left in the denominator, it can’t pass out of the derivative and remains inside at completion. The form of the divergence operation on a random vector field appears in the last line above. The value produced by divergence is a scalar quantity with no direction which could be said to reflect the ‘poofiness’ of a vector field at any given point in the space where you’re working.
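Since the notebook page itself isn't reproduced here, this is a sketch of where that bookkeeping lands. I am assuming the common physics convention in which Theta is measured down from the z axis and Phi runs around it, which matches the longitude and latitude description above; the component names A_rho, A_theta and A_phi are my own labels for a generic vector field.

```latex
\begin{align*}
\text{edge lengths:}\quad & d\rho, \qquad \rho\,d\theta, \qquad \rho\sin\theta\,d\phi \\
dV &= \rho^2 \sin\theta \; d\rho\, d\theta\, d\phi \\
\nabla\cdot\vec{A} &=
  \frac{1}{\rho^2}\frac{\partial}{\partial\rho}\!\left(\rho^2 A_\rho\right)
  + \frac{1}{\rho\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\, A_\theta\right)
  + \frac{1}{\rho\sin\theta}\frac{\partial A_\phi}{\partial\phi}
\end{align*}
```

Notice that rho^2 and sin(theta) stay trapped inside the derivatives taken with respect to rho and theta, which is exactly the "can't pass out of the derivative" point made above.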

I then continued by defining a gradient.

5-12-16 central force 1

Gradient is basically an opposite operation from divergence. Divergence creates a scalar from a vector which represents the intensity of ‘divergence’ at some point in a smooth function defined across all of space. Gradient, on the other hand, creates a vector field out of a scalar function, where the vectors point in the dimensional direction where the function tends to be increasing.

This is kind of opaque. One way to think about this is to think of a hill poking out of a two dimensional plane. A scalar function defines the topography of the hill… it says simply that at some pair of coordinates in a plane, the geography has an altitude. The gradient operation would take that topography map and give you a vector field which has a vector at every location that points in the direction toward which the altitude is increasing at that location. Divergence then goes backward from this, after a fashion: it takes a vector map and converts it into a map which says ‘strength of change’ at every location. This last is not ‘altitude’ per se, but more like ‘rate at which altitude is changing’ at a given point.

The Laplace operator combines gradient with divergence as literally the divergence of a gradient, denoted as ‘double del,’ the upside-down triangle squared.

In the last line, I’ve simply taken the Laplace operator in spherical polar coordinates and dropped it into its rightful spot in Schrodinger’s equation as shown far above. Here, the wave function, called Psi, is a function defined in spherical polar space (its square gives the probability density), varying along the radius (Rho) and the angles Theta and Phi (the so-called ‘solid angle’). Welcome to greek word salad…
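As a stand-in for the missing image, here is a sketch of the gradient, the Laplace operator built from it, and the resulting Schrodinger equation. It uses the same assumed angle convention as before (Theta from the z axis, Phi around it), and the unit vectors with hats are my notation rather than something taken from the notebook.

```latex
\begin{align*}
\nabla f &= \hat{\rho}\,\frac{\partial f}{\partial\rho}
  + \hat{\theta}\,\frac{1}{\rho}\frac{\partial f}{\partial\theta}
  + \hat{\phi}\,\frac{1}{\rho\sin\theta}\frac{\partial f}{\partial\phi} \\
\nabla^2 f = \nabla\cdot\nabla f &=
  \frac{1}{\rho^2}\frac{\partial}{\partial\rho}\!\left(\rho^2\frac{\partial f}{\partial\rho}\right)
  + \frac{1}{\rho^2\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial f}{\partial\theta}\right)
  + \frac{1}{\rho^2\sin^2\theta}\frac{\partial^2 f}{\partial\phi^2} \\
E\,\psi &= -\frac{\hbar^2}{2m}\left[
  \frac{1}{\rho^2}\frac{\partial}{\partial\rho}\!\left(\rho^2\frac{\partial\psi}{\partial\rho}\right)
  + \frac{1}{\rho^2\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial\psi}{\partial\theta}\right)
  + \frac{1}{\rho^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}
  \right] - \frac{e^2}{\rho}\,\psi
\end{align*}
```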

What I’ve produced is an explicit form for Schrodinger’s equation with a coordinate set that is conducive to the problem. This differential equation is a multivariate second order partial differential equation. You have to solve this by separation of variables.

Having defined the hydrogen atom Schrodinger equation, I now switch to the simpler ‘radial only’ problem that I originally hinted at. Here’s how you cut out the angular parts:

5-12-16 radial schrodinger equation

You just recognize that the second and third differential terms are collectively the square of the total angular momentum and then use the relevant eigenvalue equation to remove it.
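Written out, the replacement looks roughly like this. This is a sketch in the standard form, assuming the wave function separates as R(rho) times a spherical harmonic Ylm; the exact arrangement on my notebook page may differ.

```latex
\begin{align*}
\hat{L}^2 &= -\hbar^2\left[
  \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,\frac{\partial}{\partial\theta}\right)
  + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right],
\qquad \hat{L}^2\, Y_{lm} = l(l+1)\hbar^2\, Y_{lm} \\
E\,R &= -\frac{\hbar^2}{2m}\,\frac{1}{\rho^2}\frac{d}{d\rho}\!\left(\rho^2\frac{dR}{d\rho}\right)
  + \frac{l(l+1)\hbar^2}{2m\rho^2}\,R - \frac{e^2}{\rho}\,R
\end{align*}
```

The angular derivatives are gone, and all that remains of them is the number l(l+1) sitting in a 1/rho^2 term that acts like a centrifugal barrier.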

The L^2 operator comes out of the kinetic energy contained in the electron going ‘around.’ For the sake of consistency, it’s worth noting that the Hamiltonian for the full hydrogen atom contains a term for the kinetic energy of the proton and that the variable Rho refers to the distance between the electron and proton… in its right form, the ‘m’ given above is actually the reduced mass of that system and not directly the mass of the electron, which gives us a system where the electron is actually orbiting the center of mass, not the proton.

Starting on this problem, it’s convenient to recognize that the Psi wave function is a product of a Ylm (angular wave function) with a Radial function. I started by dividing out the Ylm and losing it. Psi basically just becomes R.

5-13-16 radial equation 1

The first thing to do is take out the units. There is a lot of extra crap floating around in this differential equation that will obscure the structure of the problem. First, take the energy ‘E’ down into the denominator to consolidate the units, then make a substitution that hides the length unit by setting it to ‘one’. This makes Rho a multiple of ‘r’ involving energy. The ‘8’ wedged in here is crazily counter intuitive at this point, but makes the quantization work in the method I’ve chosen! I’ll point out the use when I reach it. At the last line, I substitute for Rho and make a bunch of cancellations. Also, in that last line, there’s an “= R” which fell off the side of the picture –I assure you it’s there, it just didn’t get photographed.

After you clean everything up and bring the R over from behind the equals sign, the differential equation is a little simpler…

5-13-16 radial equation 2

The ‘P’ and ‘Q’ are quick substitutions made so that I don’t have to work as hard doing all this math; they are important later, but they just need to be simple to use at the moment. I also make a substitution for R, by saying that R = U/r. This converts the problem from radial probability into probability per unit radius. The advantage is that it lets me break up the complicated differential expression at the beginning of the equation.

The next part is to analyze the ‘asymptotic behavior’ of the differential equation. This is simply to look at what terms become important as the radius variable grows very big or very small. In this case, if radius gets very big, certain terms become small before others. If I can consider the solution U to be a separable composition of parts that solve different elements of this equation, I can create a further simplification.

5-13-16 asymptotic correction

If you consider the situation where r is very very big, the two terms in this equation which are 1/r or 1/r^2 tend to shrink essentially to zero, meaning that they have no impact on the solution at big radii. This gives you a very simple differential equation at big radii, as written at right, which is solved by a simple exponential with either a positive or a negative exponent. I discard the positive exponent solution because I know that the wave function must decay to zero as r goes far away, while the positive exponential will tend to explode, becoming bigger the further you get from the proton. That situation would make no physical sense because we know the proton and electron to be attractive to one another, and solutions that have them favor being separated don’t match the boundaries of the problem. Differential equations are frequently like this: they have multiple solutions which fit, but only certain solutions can be correct for a given situation. Doing derivatives loses information, meaning that multiple functions can give the same derivative, and in going backward you have to cope with this loss of information. The modification I made allows me to write U as a portion that’s an unknown function of radius and a second portion that is a decaying exponential. Hidden here is a second route to the same solution of this problem: considering the asymptotic behavior at small radii. I did not utilize the second asymptotic condition.
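In symbols, the large-radius argument runs roughly as follows. This is a sketch only: k stands for whatever positive constant survives the unit scaling (its value depends on the substitution chosen earlier), the C's are arbitrary constants, and I am assuming A(r) as the name of the still-unknown part of U.

```latex
\begin{align*}
\frac{d^2U}{dr^2} &\approx k^2\,U \qquad (r \to \infty,\ \text{the } 1/r \text{ and } 1/r^2 \text{ terms dropped}) \\
U &\approx C_{-}\,e^{-kr} + C_{+}\,e^{+kr} \\
C_{+} &= 0 \qquad \text{(the growing exponential cannot go to zero far away)} \\
U(r) &= A(r)\,e^{-kr}
\end{align*}
```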

I just need now to find a way to work out the identity of the rest of this function. I substitute the U back in with its new exponentially augmented form…

5-13-16 Froebenius

With the new version of U, the differential equation rearranges to give a refined set of differentials. I then divide out the exponential so that I don’t have it cluttering things up. All this jiggering about has basically reduced the original differential equation to a skin and bones that still hasn’t quite come apart. The next technique that I apply is the Frobenius method. This technique is to guess that the differential equation can be solved by some infinite power series where the coefficients of each power of radius control how much a particular power shows up in the solution. It’s basically just saying “What if my solution is some polynomial expression Ar^2 -Br +C,” where I can include as many ‘r’s as I want. This can be very convenient because the calculus of polynomials is so easy. In the ‘sum,’ the variable n just identifies where you are in the series, whether at n=0, which just sets r to 1, or n=1000, which has a power of r^1000. In this particular case, I’ve learned that the n=0 term can actually be excluded because of boundary conditions since the probability per unit radius will need to go to zero at the origin (at the proton), and since the radius invariant term can’t do that, you need to leave it out… I didn’t think of that as I was originally working the problem, but it gets excluded anyway for a second reason that I will outline later.

The advantage of Frobenius may not be apparent right away, but it lets you reconstruct the differential equation in terms of the power series. I plug in the sum wherever the ‘A’ appears and work the derivatives. This relates different powers of r to different A coefficients. I also pull the 1/r and 1/r^2 into their respective sums to the same effect. Then, you rewrite two of the sums by advancing the coefficient indices and rewriting the labels, which allows all the powers of r to be the same power, which can be consolidated all under the same sum by omitting coefficients that are known to be zero. This has the effect of saying that the differential equation is now identically repeated in every term of the sum, letting you work with only one.

The result is a recurrence relation. For the power series to be a solution to the given differential equation, each coefficient is related to the one previous by a consistent expression. The existence of the recurrence relation allows you to construct a power series where you need only define one coefficient to immediately set all the rest. After all those turns and twists, this is a solution to the radial differential equation, but not in closed form.

Screwing around with all this math involved a ton of substitutions and a great deal of recasting the problem. That’s part of why solving the radial equation is challenging. Here is a collection of all the important substitutions made…

Collecting solution

As you can see, there is layer on layer on layer of substitution here. Further, you may not realize it yet, but something rather amazing happened with that number Q.

Quantize radial equation

If you set Q/4 = -n, the recurrence relation which generates the power series solution for the radial wave function cuts off the sequence of coefficients with a zero. This gives a choice for cutting off the power series after only a few terms instead of including the infinite number of possible powers, where you can choose how many terms are included! Suddenly, the sum drops into a closed form and reveals an infinite family of solutions that depend on the ‘n’ chosen as the cutoff. Further, Q was originally defined as a function of energy… if you substitute in that definition and solve for ‘E,’ you get an energy dependent on ‘n’. These are the allowed orbital energies for the hydrogen atom.
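To put numbers on those allowed energies, here is a small Python sketch. It assumes the standard result E_n = -m e^4 / (2 hbar^2 n^2) in Gaussian units, rewritten in SI as -m e^4 / (8 eps0^2 h^2 n^2) so that ordinary tabulated constants can be used. The constant values are rounded CODATA figures that I typed in by hand, the function name is my own, and the electron mass is used rather than the reduced mass.

```python
# Hydrogen energy levels from the quantized radial equation (SI form).
# Assumption: infinite proton mass, so the plain electron mass is used.

M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
H_PLANCK = 6.62607015e-34    # Planck constant, J*s


def energy_ev(n):
    """Return the energy of level n in electron volts (negative = bound)."""
    if n < 1:
        raise ValueError("the energy quantum number n must be 1 or larger")
    joules = -M_E * E_CHARGE**4 / (8.0 * EPS0**2 * H_PLANCK**2 * n**2)
    return joules / E_CHARGE  # convert joules to eV


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"n = {n}: E = {energy_ev(n):.4f} eV")
```

The ground state comes out near -13.6 eV, and every higher level is that value divided by n squared, which is the familiar ladder that crowds together toward zero energy.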

This is an example of Quantization!

Having just quantized the radial wave function of the hydrogen atom, you may want to sit back and smoke a cigarette (if you’re into that sort of thing).

It’s opaque and particular to this strategy, but the ‘8’ I chose to add way back in that first substitution that converts Rho into r came into play right here. As it turns out, the 4 which resulted from pulling a 2 out of the square root twice canceled another 2 showing up during a derivative done a few dozen lines later and had the effect of keeping a 2 from showing up with the ‘n’ on top of the recurrence relation… allowing the solutions to be successive integers in the power series instead of every other integer. This is something you cannot see ahead, but has a profound, Rube Goldbergian effect way down the line. I had to crash into the extra two while doing the problem to realize it might be needed.

At this point, I’ve looked at a few books to try to validate my method and I’ve found three different ways to approach this problem, all producing equivalent results. This is only one way.

The recurrence relation also gives a second very important outcome:

n to l relation

The energy quantum number must be bigger than the angular momentum quantum number. ‘n’ must always be bigger than ‘l’ by at least 1. And secondarily, and this is really important, the unprimed n must also always be bigger than ‘l.’ This gives:

n’ = n > l

This constrains which powers of r can be added in the series solution. You can’t just start blindly at the zero order power; ‘n’ must be bigger than ‘l’ so that it never equals ‘l’ in the denominator and the primed number is always bigger too. If ‘l’ and ‘n’ are ever equal, you get an undefined term. One might argue that maybe you can include negative values of n, but these will produce terms that go as 1/r, which diverge at the origin and blow up when the radius is small, even though we know from the boundary conditions that the probability must go to zero at the origin. There is therefore a small window of powers that can be included in the sum, going between n = l+1 and n = n’.

I spent some significant effort thinking about this point as I worked the radial problem this time; for whatever reason, it has always been hazy in my head which powers of the sum are allowed and how the energy and angular momentum quantum numbers constrained them. The radial problem can sometimes be an afterthought next to the intricacy of the angular momentum problem, but it is no less important.

For all of this, I’ve more or less just told you the ingredients needed to construct the radial wave functions. There is a big amount of back substitution and then you must work the recurrence relation while obeying the quantization conditions I’ve just detailed.

constructing solution

A general form for the radial wave functions appears at the lower right, fabricated from the back-substitutions. The powers of ‘r’ in the series solution must be replaced with the original form of ‘rho’ which now includes a constant involving mass, charge and Planck’s constant which I’ve dubbed the Bohr radius. The Bohr radius a0 is a relic of the old Bohr atom model that I started off talking about and it’s used as the scale length for the modern version of the atom. The wave function, as you can see, ends up being a polynomial in radius multiplied by an exponential, where the polynomial is further multiplied by a single 1/radius term and includes terms that are powers of radial distance between l+1, where l is the angular momentum quantum number, and n’, the energy quantum number.
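In case the image is hard to picture, the description above corresponds to something like the following. Treat it as a sketch up to overall normalization: I am writing n for the energy quantum number and k for the power index to keep the two apart, the coefficients A_k are whatever the recurrence relation hands you, and the factor of 2/(n a0) is my assumption about how the scaled radius converts back to rho.

```latex
\begin{align*}
R_{nl}(\rho) &\propto \frac{1}{\rho}\; e^{-\rho/(n a_0)}
  \sum_{k=l+1}^{n} A_k \left(\frac{2\rho}{n a_0}\right)^{k},
\qquad a_0 = \frac{\hbar^2}{m e^2} \\
n = 1,\ l = 0: \quad R_{10} &\propto e^{-\rho/a_0} \\
n = 2,\ l = 1: \quad R_{21} &\propto \rho\; e^{-\rho/(2 a_0)}
\end{align*}
```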

Here is how you construct a specific hydrogen atom orbital from all the gobbledygook written above. This is the simplest orbital, the S-orbital, where the energy quantum number is 1 and the angular momentum is 0. This uses the Y00 spherical harmonic, the simplest spherical harmonic, which more or less just says that the wave function does not vary across any angle, making it completely spherically symmetric.

Normalized S orbital

The ‘100’ attached in subscript to the Psi wave function is a physicist shorthand for representing the hydrogen atom wave functions: these subscripts are ‘nlm,’ the three quantum numbers that define the orbital, which are n=1, l=0 and m=0 in this case. All I’ve done to produce the final wave function is take my prescription from before and use it to construct one of an infinite series of possible solutions. I then perform the typical Quantum Mechanics trick of making it a probability distribution by normalizing it. The process of normalization is just to make certain that the value ‘under the curve’ contained by the square of the wave function, counted up across all of space in the integral, is 1. This way, you have a 100% chance of finding the particle somewhere in space as defined by the probability distribution of the wave function.
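For reference, here is the end product in LaTeX: the standard normalized ground state, together with the integral that confirms the total probability is 1. The spherical volume element supplies the factor of 4 pi rho^2 once the angles are integrated out.

```latex
\begin{align*}
Y_{00} &= \frac{1}{\sqrt{4\pi}}, \qquad
\psi_{100}(\rho) = \frac{1}{\sqrt{\pi a_0^3}}\; e^{-\rho/a_0} \\
\int |\psi_{100}|^2\, dV
  &= \int_0^\infty \frac{1}{\pi a_0^3}\, e^{-2\rho/a_0}\; 4\pi\rho^2\, d\rho
  = \frac{4}{a_0^3}\cdot\frac{2!}{(2/a_0)^3}
  = \frac{4}{a_0^3}\cdot\frac{a_0^3}{4} = 1
\end{align*}
```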

You can use the wave function to ask questions about the distribution of the electron in space around the proton –for instance, what’s the average orbital radius of the electron? You just look for the expectation value of the radius using the wave function probability distribution:

Average radius

For the hydrogen atom ground state, which is the lowest energy state for a 1 electron, 1 proton atom, the electron is distributed, on average, about 1 and a half Bohr radii from the nucleus. The Bohr radius is about 0.53 angstrom (1 angstrom is 1×10^-10 meters), which means that the electron is on average distributed about 0.79 angstroms from the nucleus.
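If you would rather check this numerically than trust my integrals, here is a minimal Python sketch using only the standard library. It works in units where the Bohr radius is 1, assumes the standard normalized ground state exp(-rho)/sqrt(pi), and uses a home-made Simpson's rule; the helper names and the cutoff radius of 40 Bohr radii are my own choices.

```python
import math

A0_ANGSTROM = 0.529177  # Bohr radius in angstroms (rounded tabulated value)
RHO_MAX = 40.0          # cutoff radius in Bohr radii; the tail beyond is negligible


def psi_100_squared(rho):
    """|psi_100|^2 with rho in units of the Bohr radius."""
    return math.exp(-2.0 * rho) / math.pi


def shell_probability(rho):
    """Probability per unit radius: |psi|^2 times the shell area 4*pi*rho^2."""
    return 4.0 * math.pi * rho**2 * psi_100_squared(rho)


def simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] using an even number n of slices."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


norm = simpson(shell_probability, 0.0, RHO_MAX)
mean_rho = simpson(lambda r: r * shell_probability(r), 0.0, RHO_MAX)
mean_rho_angstrom = mean_rho * A0_ANGSTROM

if __name__ == "__main__":
    print(f"total probability = {norm:.6f}")
    print(f"average radius = {mean_rho:.6f} Bohr radii")
    print(f"average radius = {mean_rho_angstrom:.4f} angstrom")
```

The normalization should come back as 1 and the average radius as 1.5 Bohr radii, matching the hand calculation.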

(special note 8-2-17: If you’ve read my recent post on parity symmetry, you may be wondering why this situation doesn’t break parity. Average position can never be reported as anything other than zero for a pure eigenstate–and yet I’ve reported a positionally related average value other than zero right here. The reason this doesn’t break parity symmetry is because the radial distance is only fundamentally defined over “half” of space to begin with, from a radius of zero to a radius of infinity and with no respect for a direction from the origin. In asking “What’s average radius?” I’m not asking “What’s the average position?” Another way to look at this is that the radius operator Rho is a parity symmetric operator since it doesn’t reverse under parity transformation and it can connect states that have the same parity, allowing radial expectation values to be non-zero.)

Right now, this is all very abstract and mathematical, so I’ll jump into the more concrete by including some pictures. Here is a 3D density plot of the wave function performed using Mathematica.

S-orbital density

Definitely anticlimactic and a little bit blah, but this is the ground state wave function. We know it doesn’t vary in any angle, so it has to be spherically symmetric. The axes are distance in units of Bohr’s radius. One thing I can do to make it a little more interesting is to take a knife to it and chop it in half.

 

This is just the same thing bisected. The legend at left just shows the intensity of the wave function as represented in color.

As you can see, this is a far cry from the atomic model depicted in cartoon far above.

For the moment, I’m going to hang up this particular blog post. This took quite a long time to construct. Some of the higher energy, larger angular momentum hydrogenic wave functions start looking somewhat crazy and more beautiful, but I really just had it in mind to show the math which produces them. I may produce another post containing a few of them as I have time to work them out and render images of them. If the savvy reader so desires, the prescriptions given here can generate any hydrogenic wave function you like… just refer back to my Ylm post where I talk some about the spherical harmonics, or refer directly to the Ylm tables on Wikipedia, which is a good, complete online source of them anyway.

Edit:

Because I couldn’t leave it well enough alone, I decided to do images of one more hydrogen atom wave function. This orbital is 210, the P-orbital. I won’t show the equation form of this, but I did calculate it by hand before turning it over to Mathematica. In Mathematica, I’m not showing directly the wave function this time because the density plot doesn’t make clear intuitive sense, but I’m putting up the probability densities (which is the wave function squared).

P-orbital probability density

Mr. Peanut is the P-orbital. Here, angular momentum lies somewhere in the x-y plane since the z axis angular momentum eigenvalue is zero. You can kind of think of it as a propeller where you don’t quite know which direction the axle is pointed.

Here’s a bisection of the same density map, along the long axis.

P-orbital probability density bisect

Edit 5-18-16

I keep finding interesting structures here. Since I was just sitting on all the necessary mathematical structures for hydrogen wave function 21-1 (no work needed, it was all in my notebook already), I simply plugged it into Mathematica to see what the density plot would produce. The first image, where the box size was a little small, was perhaps the most striking of what I’ve seen thus far…

orbital21-1 squared

I knew basically that I was going to find a donut, but it’s oddly beautiful seen with the outsides peeled off. Here’s more of 21-1…

 

The donut turned out to be way more interesting than I thought. In this case, the angular momentum is pointing down the Z-axis since the Z-axis eigenvalue is -1. This orbital shape is most similar qualitatively to the orbits depicted in the original Bohr atom model with an electron density that is known to be ‘circulating’ clockwise primarily within the donut. This particular state is almost the definition of a magnetic dipole.
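For anyone without Mathematica, here is a minimal Python sketch of the two n=2, l=1 probability densities pictured above, in units where the Bohr radius is 1. It assumes the standard normalized forms of the 210 and 21-1 wave functions, with Theta measured from the z axis; the function names are my own. Neither density depends on Phi, which is why both shapes are symmetric around the z axis.

```python
import math


def density_210(rho, theta):
    """|psi_210|^2: the 'peanut', largest along the z axis (theta = 0 or pi)."""
    radial = rho**2 * math.exp(-rho)
    return radial * math.cos(theta)**2 / (32.0 * math.pi)


def density_21m1(rho, theta):
    """|psi_21-1|^2: the 'donut', largest in the x-y plane (theta = pi/2)."""
    radial = rho**2 * math.exp(-rho)
    return radial * math.sin(theta)**2 / (64.0 * math.pi)


if __name__ == "__main__":
    # Sample both densities at a few radii along the z axis and in the x-y plane.
    for rho in (1.0, 2.0, 3.0, 4.0):
        on_axis = density_210(rho, 0.0)
        in_plane = density_21m1(rho, math.pi / 2.0)
        print(f"rho = {rho:.1f}: 210 on z axis = {on_axis:.5f}, "
              f"21-1 in x-y plane = {in_plane:.5f}")
```

Both densities share the same radial factor rho^2 exp(-rho), so each one peaks two Bohr radii out from the proton; only the angular factor decides whether you get the peanut or the donut.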