Magnets, how do they work? (part 1)

Subtitle: Basic derivation of Ampere’s Law from the Biot-Savart equation.

Know your meme.

It’s been a while since this became a thing, but I think it’s actually a really good question. If you stop to think about it, magnets are one of those things where the structure goes deep down and the pieces driving the phenomenon become quite confusing and mind-bending. Truly, the original meme exploded from an unlikely source who wanted to revel in things that seem magical without appreciating how mind-bending and thought-expanding the explanation to this seemingly earnest question actually is.

As I got on in this writing, I realized that the scope of the topic is bigger than can be tackled in a single post. What is presented here will only be the first part. The succeeding posts may end up being as mathematical as this one, or perhaps less so. Moreover, as I got to writing, I realized that I haven’t posted a good bit of math here in a while: what good is the mathematical poetry of physics if nobody sees it?

Magnets do not get less magical when you understand how they work: they get more compelling.


This image, taken from a website that sells quackery, highlights the intriguing properties of magnets. A solid object with apparently no moving parts has this manner of influencing the world around it. How can that not be magical? Lodestones have been magic forever and they do not get less magical with the explanation.

Truthfully, I’ve been thinking about the question of how they work for a couple days now. When I started out, I realized that I couldn’t just answer this out of hand, even though I would like to think that I’ve got a working understanding of magnetic fields. How the details fit together gets deep in a hurry. What makes a bar magnet like the one in the picture above special? You don’t put batteries in it. You don’t flick a switch. It just works.

For most every person, that pattern above is the depth of how it works. How does it work? Well, it has a magnetic field. When a piece of a certain kind of metal is in a magnetic field, it feels a force due to the magnet and this causes the magnet to pull on it or maybe to stick to it. If you have two magnets together, you can orient them in a certain way and they push each other apart.


In this picture from penguin labs, these magnets are exerting sufficient force on one another that many of them apparently defy gravity. Here, the rod simply keeps the magnets confined so that they can’t change orientations with respect to one another and they exert sufficient repulsive force to climb up the rod as if they have no weight.

It’s definitely cool, no denying.

But, is it better knowing how they work, or just blindly appreciating them because it’s too hard to fill in the blank?

Maybe we can answer that.

The central feature of how magnets work is quite effortlessly explained by the physics of electromagnetism. Or, maybe it’s better to say that the details are laboriously and completely explained. People rebel against how hard the details are to understand, but no true explanation is required to be easily explicable.

The forces which hold those little pieces of metal apart are relatively understandable.

Lorentz force

Here’s the Lorentz force law. It says that the force (F) on an object with a charge (q) is equal to the sum of the electric force on the object (qE) plus the magnetic force (qv×B). Magnets interact solely by the magnetic force, the second term.
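In case the equation image doesn’t come through, here is the Lorentz force law in its standard form:

```latex
\mathbf{F} = q\left(\mathbf{E} + \mathbf{v}\times\mathbf{B}\right)
```

The cross product in the second term is why the magnetic force pushes at right angles to the motion of the charge.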


In this picture from Wikipedia, if a charge (q) moving with speed (v) passes into a region containing this thing we call a “magnetic field,” it will tend to curve in its trajectory depending on whether the charge is negative or positive. We can ‘see’ this magnetic field thing in the image above with the bar magnet and iron filings. What is it, how is it produced?

The fundamental observation of magnetic fields is tied up into a phenomenological equation called the Biot-Savart law.


This equation is immediately intimidating. I’ve written it in all of its horrifying Jacksonian glory. You can read this equation like a sentence. It says that all the magnetic field (B) you can find at a location in space (r) is proportional to a sum of all the electric currents (J) at all possible locations where you can find any current (r’), and inversely proportional to the square of the distance between where you’re looking for the magnetic field and where all the electrical currents are –it may say ‘inverse cube’ in the equation, but it’s actually an inverse square since there’s a full power of length in the numerator. Yikes, what a sentence!

Additionally, the equation says that the direction of the magnetic field is at right angles both to the direction that the current is traveling and to the direction given by the line between where you’re looking for magnetic field and where the current is located. These directions are all wrapped up in the arrow scripts on every quantity in the equation and are determined by the cross product as denoted by the ‘x’. The difference between the two ‘r’ vectors in the numerator gives a pure direction between the location of a particular current element and where you’re looking for magnetic field. The ‘d’ at the end is the differential volume that confines the electric currents and simply means that you’re adding up locations in 3D space.

The scaling constants outside the integral sign are geometrical and control strength; the 4 and Pi relate to the dimensionality of the field source radiated out into a full solid angle (it covers a singularity in the field due to the location of the field source), and the ‘μ’ essentially tells how space broadcasts magnetic field… where the constant ‘μ’ is closely tied to the speed of light. This equation has the structure of a propagator: it takes an electric current located at r’ and propagates it into a field at r.
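Written out in symbols, the sentence above is the standard form of the Biot-Savart law:

```latex
\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int
\frac{\mathbf{J}(\mathbf{r}')\times(\mathbf{r}-\mathbf{r}')}
{\left|\mathbf{r}-\mathbf{r}'\right|^{3}}\, d^{3}r'
```

You can see the ‘inverse cube that is really an inverse square’ directly: one power of length in the numerator cancels one of the three downstairs.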

It may also be confusing to you that I’m calling current ‘J’ when nearly every basic physics class calls it ‘I’… well, get used to it. ‘J’ is the current density, a vector version of current that carries the direction of the flow at each point in space –a subtle variation of current.

I looked for some diagrams to help depict Biot-Savart’s components, but I wasn’t satisfied with what Google coughed up. Here’s a rendering of my own with all the important vectors labeled.

biotsavart diagram

Now, I showed the crazy Biot-Savart equation, but I can tell you right now that it is a pain in the ass to work with. Very few people wake up in the morning and say “Boy oh boy, Biot-Savart for me today!” For most physics students this equation comes with a note of dread. Directly using it to analytically calculate magnetic fields is not easy. That cross product and all the crazy vectors pointing in every which direction make this equation a monster. There are some basic features here which are common to many fields, particularly the inverse square, which you can find in the Newtonian gravity formula or in Coulomb’s law for electrostatics, and the field being proportional to some source: in this case an electric current, where gravity has mass and electrostatics has charge.

Magnetic field becomes extraordinary because of that flipping (God damned, effing…) cross product, which means that it points in counter-intuitive directions. With electrostatics and gravity, the field is usually going toward or away from the source, while with magnetism the field seems to go ‘around’ the source. Moreover, unlike electrostatics and gravity, the source isn’t exactly a something, like a charge or a mass; it’s dynamic… as in a change in state. Electric charges are present in a current, but if you have those charges sitting stationary, even though they are still present, they can’t produce a magnetic field. Moreover, if you neutralize the charge, a magnetic field can still be present if those now invisible charges are moving to produce a current: current flowing in a copper wire is electric charges moving along the wire, and this produces a magnetic field around the wire, but the presence of positive charges fixed to the metal atoms of the wire neutralizes the negative charges of the moving electrons, resulting in a state of otherwise net neutral charge. So, no electrostatic field, even though you have a magnetic field. It might surprise you to know that neutron stars have powerful magnetic fields, even though they are composed overwhelmingly of neutral neutrons. The requirement for moving charges to produce a magnetic field is not inconsistent with the requirement of a moving charge to feel force from a magnetic field. Admittedly, there’s more to it than just ‘currents,’ but I’ll get to that in another post.

With a little bit of algebraic shenanigans, Biot-Savart can be twisted around into a slightly more tractable form called Ampere’s Law, which is one of the four Maxwell’s equations that define electromagnetism. I had originally not intended to show this derivation, but I had a change of heart when I realized that I’d forgotten the details myself. So, I worked through them again just to see that I could. Keep in mind that this is really just a speed bump along the direction toward learning how magnets work.

For your viewing pleasure, the derivation of the Maxwell-Ampere law from the Biot-Savart equation.

In starting to set up for this, there are a couple fairly useful vector identities.

Useful identities 1

This trio contains several basic differential identities which can be very useful in this particular derivation. Here, the variables r are actually vectors in three dimensions. For those of you who don’t know these things, all it means is this:


These can be diagrammed like this:

vector example

This little diagram just treats the origin like the corner of a 3D box and each distance is a length along one of the three edges emanating from the corner.

I’ll try not to get too far afield with this quick vector tutorial, but it helps to understand that this is just a way to wrap up a 3D representation inside a simple symbol. The hatted symbols of x, y and z are all unit vectors that point in the relevant three dimensional directions, where the un-hatted symbols just mean a variable distance along x or y or z. The prime (r’) means that the coordinate is used to tell where the electric current is located, while the unprimed (r) means that this is the coordinate for the magnetic field. The upside down triangle is an operator called ‘del’… you may know it from my hydrogen wave function post. What I’m doing here is quite similar to what I did over there before. For the uninitiated, here are gradient, divergence and curl:


Gradient works on a scalar function to produce a vector, divergence works on a vector to produce a scalar function and curl works on a vector to produce a vector. I will assume that the reader can take derivatives and not go any further back than this. The operations on the right of the equal sign are wrapped up inside the symbols on the left.
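Written out in Cartesian components, the three operations just described are:

```latex
\nabla f = \frac{\partial f}{\partial x}\hat{x}
         + \frac{\partial f}{\partial y}\hat{y}
         + \frac{\partial f}{\partial z}\hat{z}
\\[6pt]
\nabla\cdot\mathbf{A} = \frac{\partial A_x}{\partial x}
                      + \frac{\partial A_y}{\partial y}
                      + \frac{\partial A_z}{\partial z}
\\[6pt]
\nabla\times\mathbf{A} =
\left(\frac{\partial A_z}{\partial y}-\frac{\partial A_y}{\partial z}\right)\hat{x}
+\left(\frac{\partial A_x}{\partial z}-\frac{\partial A_z}{\partial x}\right)\hat{y}
+\left(\frac{\partial A_y}{\partial x}-\frac{\partial A_x}{\partial y}\right)\hat{z}
```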

One final useful bit of notation here is the length operation. The length operation just finds the length of a vector and is denoted by flat vertical bars, like an absolute value. Everywhere I’ve used it, I’ve been applying it to a vector obtained by taking the difference between where two different vectors point:
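Written out, the length operation is:

```latex
\left|\mathbf{r}-\mathbf{r}'\right|
= \sqrt{(x-x')^{2} + (y-y')^{2} + (z-z')^{2}}
```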


As you can see, notation is all about compressing operations away until they are very compact. The equations I’ve used to this point all contain a great deal of math lying underneath what is written, but you can muddle through by the examples here.

Getting back to my identity trio:

Useful identities 1

The first identity here (I1) takes the object written on the left and produces a gradient from it… the thing in the denominator of that function is the length of the difference between those two vectors, which is simply a scalar number without a direction, as shown in the length operation written above.

The second identity (I2) here takes the divergence of the gradient and reveals that it’s the same thing as a Dirac delta (incredibly easy way to kill an integral!). I’ve not written the operation as divergence on a gradient, but instead wrapped it up in the ‘square’ on the del… you can know it’s a divergence of a gradient because the function inside the parenthesis is a scalar, meaning that the first operation has to be a gradient, which produces a vector, which automatically necessitates the second operation to be a divergence, since that only works on vectors to produce scalars.

The third identity (I3) shows that the gradient with respect to the unprimed vector coordinate is actually equal to a negative sign times the gradient with respect to the primed coordinate… which is a very easy way to switch between a derivative with respect to the first r and the same form of derivative with respect to the second r’.

To be clear, these identities are tailor-made to this problem (and similar electrodynamics problems) and you probably will never ever see them anywhere but the *cough cough* Jackson book. The first identity can be proven by working the gradient operation and taking derivatives. The second identity can be proven by using the vector divergence theorem in a spherical polar coordinate system and is the source of the 4*Pi that you see everywhere in electromagnetism. The third identity can also be proven by the same method as the first.
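Written out in the forms described (these are the standard Jackson identities):

```latex
\nabla\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}
= -\frac{\mathbf{r}-\mathbf{r}'}{\left|\mathbf{r}-\mathbf{r}'\right|^{3}}
\quad (\mathrm{I1})
\\[6pt]
\nabla^{2}\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}
= -4\pi\,\delta^{3}(\mathbf{r}-\mathbf{r}')
\quad (\mathrm{I2})
\\[6pt]
\nabla\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}
= -\nabla'\frac{1}{\left|\mathbf{r}-\mathbf{r}'\right|}
\quad (\mathrm{I3})
```

Note the 4π sitting in I2: that is the same 4π that shows up all over electromagnetism.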

There are two additional helpful vector identities that I used, which I produced in the process of working this derivation. I will create them here because, why not! If the math scares you, you’re on the wrong blog. To produce these identities, I used the component decomposition of the cross product and a useful Levi-Civita/Kronecker delta identity –I’m really bad at remembering vector identities, so I put a great deal of effort into learning how to construct them myself: my Levi-Civita is ghetto, but it works well enough. For those of you who don’t know the ol’ Levi-Civita symbol, it’s a pretty nice tool for constructing things in a component-wise fashion: εijk. To make this work, you just have to remember it as I just wrote it… if any two indices are equal, the symbol is zero; if they are all different, it is 1 or -1. If you take it as ijk, with the indices all different as I wrote, it equals 1 and becomes -1 if you swap two of the indices: ijk=1, jik=-1, jki=1, kji=-1 and so on and so forth. Here are the useful Levi-Civita identities as they relate to cross product:
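In symbols, the standard relations being referenced are:

```latex
(\mathbf{A}\times\mathbf{B})_{i} = \sum_{j,k}\varepsilon_{ijk}A_{j}B_{k},
\qquad
\sum_{i}\varepsilon_{ijk}\,\varepsilon_{ilm}
= \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}
```

The second relation, the contraction of two Levi-Civita symbols into Kronecker deltas, is what does all the heavy lifting below.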


Using these small tools, the first vector identity that I need is a curl of a curl. I derive it here:

vector id 1

Let’s see how this works. I’ve used colors to show the major substitutions and tried to draw arrows where they belong. If you follow the math, you’ll note that the Kronecker deltas have the intriguing property of trading out indices in these sums. A Kronecker delta works on a finite sum the same way a Dirac delta works on an integral, which is nothing more than an infinite sum. Also, the index convention says that if you see duplicated indices, but without a sum on that index, you associate a sum with that index… this is how I located the divergences in that last step. This identity is a soft stopping point for the double curl: I could have used the derivative product rule to expand it further, but that isn’t needed (if you want to see it get really complex, go ahead and try it! It’s do-able.) One will note that I have double del applied on a vector here… I said that it only applies on scalars above… in this form, it would only act on the scalar portion of each vector component, meaning that you would end up with a sum of three terms multiplied by unit vectors! Double del only ever acts on scalars, but you actually don’t need to know that in the derivation below.

This first vector identity I’ve produced I’ll call I4:

useful vector id 1
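For reference, the standard curl-of-a-curl identity, which is what I4 amounts to:

```latex
\nabla\times(\nabla\times\mathbf{A})
= \nabla(\nabla\cdot\mathbf{A}) - \nabla^{2}\mathbf{A}
```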

Here’s a second useful identity that I’ll need to develop:

useful vector id 2

This identity I’ll call I5:

vector id 2

*Pant Pant* I’ve collected all the identities I need to make this work. If you don’t immediately know something off the top of your head, you can develop the pieces you need. I will use I1, I2, I3, I4 and I5 together to derive the Maxwell-Ampere Law from Biot-Savart. Most of the following derivation comes from Jackson Electrodynamics, with a few small embellishments of my own.

first line amp dev

In this first line of the derivation, I’ve rewritten Biot-Savart with the constants outside the integral and everything variable inside. Inside the integral, I’ve split the meat so that the different vector and scalar elements are clear. In what follows, it’s very important to remember that unprimed del operators are in a different space from the primed del operators: a value (like J) that depends on the primed position variable is essentially a constant with respect to the unprimed operator and will render a zero in a derivative by the unprimed del. Moreover, the unprimed del can be moved into or out of the integral, which is with respect to the primed position coordinates. This observation is profoundly important to this derivation.

BS to amp 1

The usage of the first two identities here manages to extract the cross product from the midst of the function and puts it into a manipulable position where the del is unprimed while the integral is primed, letting me move it out of the integrand if I want.

BS to amp 2

This intermediate contains another very important magnetic quantity in the form of the vector potential (A) –“A” here not to be confused with the alphabetical placeholder I used while deriving my vector identities. I may come back to vector potential later, but this is simply an interesting stop-over for now. From here, we press on toward the Maxwell-Ampere law by acting in from the left with a curl onto the magnetic field…
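In symbols, the stop-over being described here is the standard magnetostatic vector potential:

```latex
\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi}\int
\frac{\mathbf{J}(\mathbf{r}')}{\left|\mathbf{r}-\mathbf{r}'\right|}\, d^{3}r',
\qquad
\mathbf{B} = \nabla\times\mathbf{A}
```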

BS to amp 3

The Dirac delta I end with in the final term allows me to collapse r’ into r at the expense of that last integral. At this point, I’ve actually produced the magnetostatic Ampere’s law if I feel like claiming that the current has no divergence, but I will talk about this later…

BS to amp 4

This substitution switches del from being unprimed to primed, putting it in the same terms as the current vector J. I use integration by parts next to switch which element of the first term the primed del is acting on.

BS to amp 5

Were I being really careful about how I depicted the integration by parts, there would be a unit vector dotted into the J in order to turn it into a scalar sum in that first term ahead of the integral… this is a little sloppy on my part, but nobody ever cares about that term anyway because it’s presupposed to vanish at the limits where it’s being evaluated. This is a physicist trick similar to pulling a rug over a mess on the floor –I’ve seen it performed in many contexts.

BS to amp 6

This substitution is not one of the mathematical identities I created above; this is purely physics. In this case, I’ve used conservation of charge to connect the divergence of the current vector to the change in charge density over time. If you don’t recognize the epic nature of this particular substitution, take my word for it… I’ve essentially inverted magnetostatics into electrodynamics, asserting that a ‘current’ is actually a form of moving charge.
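The conservation-of-charge statement being used here is the continuity equation, in its standard form:

```latex
\nabla'\cdot\mathbf{J}(\mathbf{r}',t)
= -\frac{\partial \rho(\mathbf{r}',t)}{\partial t}
```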

BS to amp 75

In this line, I’ve switched the order of the derivatives again. Nothing in the integral is dependent on time except the charge density, so almost everything can pass through the derivative with respect to time. On the other hand, only the distance is dependent on the unprimed r, meaning that the unprimed del can pass inward through everything in the opposite direction.

BS to amp 8

At this point something amazing has emerged from the math. Pardon the pun; I’m feeling punchy. The quantity I’ve highlighted blue is a form of Coulomb’s law! If that name doesn’t tickle you at the base of your spine, what you’re looking at is the electrostatic version of the Biot-Savart law, which makes electric fields from electric charges. This is one of the reasons I like this derivation and why I decided to go ahead and detail the whole thing. This shows explicitly a connection between magnetism and electrostatics where such connection was not previously clear.

BS to amp 9

And thus ends the derivation. In this casting, the curl of the magnetic field depends both on the electric field and on currents. If there is no time-varying electric field, that first term vanishes and you get the plain old magnetostatic Ampere’s law:
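In standard form, the end point of the derivation, with the magnetostatic version that results when the time-varying term is dropped:

```latex
\nabla\times\mathbf{B}
= \mu_0\mathbf{J} + \mu_0\varepsilon_0\frac{\partial\mathbf{E}}{\partial t}
\qquad\longrightarrow\qquad
\nabla\times\mathbf{B} = \mu_0\mathbf{J}
```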

Ampere's law

This says simply that the curl of the magnetic field is equal to the current density (scaled by the constant μ). There are some interesting qualities to this equation because of how the derivation leaves only a single positional dependence. As you can see, there is no separate position coordinate to describe the magnetic field independently from its source. And, really, it isn’t describing the magnetic field as ‘generated’ by the current, but rather saying that a local curling of the magnetic field is due to the presence of a current at that location… which is an interesting way to relate the two.

This relationship tends to cause magnetic lines to orbit around the current vector.


This image from hyperphysics sums up the whole situation –I realize I’ve been saying something similar since way up above, but this equation is the proof. If you have current passing along a wire, magnetic field will tend to wrap around the wire in a right-handed sense. For all intents and purposes, this is all Ampere’s law says, neglecting that you can manipulate the geometry of the situation to make the field do some interesting things. But, this is all.
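The wrap-around field can even be checked numerically. The sketch below grinds through the Biot-Savart integral for a long straight wire and compares against the B = μI/(2πd) result that Ampere’s law hands you for free; the wire length, grid resolution, and function name are my own choices for illustration, not anything from the post:

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, T*m/A

def straight_wire_field(current, distance, half_length=100.0, n=200_000):
    """Midpoint-rule integration of Biot-Savart for a straight wire on the
    z-axis, evaluated at a point `distance` meters off to the side.

    For this geometry, |dl x (r - r')| / |r - r'|^3 reduces to
    distance / (z^2 + distance^2)^(3/2), and the field wraps around the wire.
    """
    dz = 2.0 * half_length / n
    total = 0.0
    for i in range(n):
        z = -half_length + (i + 0.5) * dz
        total += distance / (z * z + distance * distance) ** 1.5 * dz
    return MU0 * current / (4.0 * math.pi) * total

I, d = 1.0, 0.05                           # 1 A of current, 5 cm from the wire
B_biot_savart = straight_wire_field(I, d)  # the hard way
B_ampere = MU0 * I / (2.0 * math.pi * d)   # the Ampere's-law answer
```

For a 100 m wire the two agree to better than a tenth of a percent, which is the point: the hard integral and the easy law describe the same field.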

Well, so what? I did a lot of math. What, if anything, have I gained from it? How does this help me along the path to understanding magnets?

Ampere’s law is useful for generating very simple magnetic field configurations that can be used in the Lorentz force law, ultimately showing a direct dynamical connection between moving currents and magnetic fields. I have it in mind to show a freshman-level example of how this is done in the next part of this series. Given the length of this post, I will do more math in a different post.

This is a big step in the direction of learning how magnets work, but it should leave you feeling a little unsatisfied. How exactly do the forces work? In physics, it is widely known that magnetic fields do no work, so why is it that bar magnets can drag each other across the counter? That sure looks like work to me! And if electric currents are necessary to drive magnets, why is it that bar magnets and horseshoe magnets don’t require batteries? Where are the electric currents that animate a bar magnet and how is it that they seem to be unlimited or unpowered? These questions remain to be addressed.

Until the next post…

Nuclear Toxins

A physicist from Lawrence Livermore Labs has been restoring old nuclear bomb detonation footage. This seems to me to be an incredibly valuable task because all of the original footage was shot on film, which is currently in the process of decaying and falling apart. There have been no open air nuclear bomb detonations on planet Earth since probably the 1960s, which is good… except that people are in the process of forgetting exactly how bad a nuclear weapon is. The effort of saving this footage makes it possible for people to know something about this world-changing technology that hadn’t previously been declassified. Nukes are sort of mythical to a body like me who wasn’t even born until about the time that testing went underground: to everybody younger than me, I suspect that nukes are an old-people thing, a less important weapon than computers. That Lawrence Livermore Labs has posted this footage to Youtube is an amazing public service, I think.

As I was reading an article on Gizmodo about this piece of news, I happened to wander into the comment threads to see what the echo chamber had to say about all this. I should know better. Admittedly, I actually didn’t post any comments castigating anyone, but there was a particular comment that got me thinking… and calculating.

Here is the comment:

Nuclear explosions produce radioactive substances that are rare in nature — like carbon-14, a radioactive form of the carbon atom that forms the chemical basis of all life on earth.

Once released into the atmosphere, carbon-14 enters the food chain and gets bound up in the cells of most living things. There’s still enough floating around for researchers to detect in the DNA of humans born in 2016. If you’re reading this, it’s inside you.

This is fear mongering. If you’ve never seen fear mongering before, this is what it looks like. The comment is intended to deliberately inspire fear not just in nuclear weapons, but in the prospect of radionuclides present in the environment. The last sentence is pure body terror. Dear godz, the radionuclides, they’re inside me and there’s no way to clean them out! I thought for a time about responding to this comment. I decided not to because there is enough truth here that anyone should probably stop and think about it.

For anyone curious, the wikipedia article on the subject has some nice details and seems thorough.

It is true that C-14 is fairly rare in nature. The natural abundance is 1 part per trillion of carbon. It is also true that the atmospheric test detonations of nuclear bombs created a spike in the C-14 present in the environment. And, while it’s true that C-14 is rare, it is actually not technically unnatural, since it is formed by cosmic rays impinging on the upper atmosphere. For the astute reader, C-14 produced by cosmic rays forms the basis of radiocarbon dating, since C-14 is present at a particular known, constant proportion in living things right up until you die and stop taking it up from the environment –a scientist can then determine the date when living matter died based on the radioactive decay curve for C-14.

Since it’s not unnatural, the real question here is whether the spike of radionuclides created by nuclear testing significantly increases the health hazard posed by excess C-14 above and beyond what it would normally be. You have it in your body anyway, is there greater hazard due to the extra amount released? This puzzle is actually a somewhat intriguing one to me because I worked for a time with radionuclides and it is kind of chilling all the protective equipment that you need to use and all the safety measures that are required. The risk is a non-trivial one.

But, what is the real risk? Does having a detectable amount of radionuclide in your body that can be ascribed to atomic air tests constitute an increased health threat?

To begin with, what is the health threat? For the particular case of C-14, one of a handful of radionuclides that can be incorporated into your normal body structures, the health threat would obviously come from the radioactivity of the atom. In this particular case, C-14 is a beta-emitter. This means that C-14 radiates electrons; specifically, one of the neutrons in the atom’s nucleus converts into a proton by giving off an electron and a neutrino, resulting in the carbon turning into nitrogen. The neutrino basically doesn’t interact with anything, but the radiated electron can travel with energies of up to 156 keV (or about 2.5×10^-14 Joules). This will do damage to the human body by two routes: either by direct collision of the radiated electron with the body, or by a structurally important carbon atom converting into a nitrogen atom during the decay process, if the C-14 was part of your body already. Obviously, if a carbon atom turns suddenly into nitrogen, that’s conducive to unintended organic chemistry occurring, since nitrogen can’t maintain the same number of valence interactions as carbon without taking on a charge. So, energy deposition by particle collision, or spontaneous chemistry, is the potential cause of the health threat.
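As a reaction, the decay just described reads:

```latex
{}^{14}\mathrm{C} \;\rightarrow\; {}^{14}\mathrm{N} + e^{-} + \bar{\nu}_{e}
```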

In normal terms, the carbon-nitrogen chemistry routes for damage are not accounted for in radiation damage health effects simply because of how radiation is usually encountered: you need a lot of radiation in order to have a health effect, and this is usually from an exogenous source, that is, provided by a radiation source that is outside the body rather than incorporated with it, like endogenous C-14. This would be radiation much like the UV radiation which causes a sunburn. Health effects due to radiation exposure are measured on a scale by a dose unit called a ‘rem.’ A rem expresses an amount of radiation energy deposited into body mass, where 1 rem is equal to 1.0×10^-5 Joules of radiation energy deposited into 1 gram of body mass. Here is a table giving the general scale of rem doses which cause health effects. People who work around radiation as part of their job are limited to a full-body yearly dose of 5 rem, while the general public is limited to 0.1 rem per year. Everybody is expected to have an environmental radiation dose exposure of about 0.3 rem per year and there’s an allowance of 0.05 rem per year for medical x-rays. It’s noteworthy that not all radiation doses are created equal and that the target body tissue matters; this is manifest by different radiation doses being allowed to occur to the eyes (15 rem) or the extremities, like the skin (50 rem). A sunburn would be like a dose of 100 to 600 rem to the skin.

What part of an organism must the damage affect in order to cause a health problem? Really, only one is truly significant, and that’s your DNA. Easy to guess. Pretty much everything else is replaceable to the extent that even a single cell dying from critical damage is totally expendable in the context of an organism built of trillions of cells. The problem of C-14 being located in your DNA directly is numerically a rather minor problem: DNA actually only accounts for about 3% of the dry mass of your cells, meaning that only about 3% of the C-14 incorporated into your body is directly incorporated into your DNA, so that most of the damage to your DNA is due to C-14 not directly incorporated in that molecule. This is not to say that chemistry doesn’t cause the damage, merely that most of the chemical damage is probably due to energy deposition in molecules around the DNA which then react with the DNA, say by generation of superoxides or similar paths. This may surprise you, but DNA damage isn’t always a complete all-or-nothing proposition either: to an extent, the cell has machinery which is able to repair damaged DNA… the bacterium Deinococcus radiodurans is able to repair its DNA so efficiently that it’s able to subsist indefinitely inside a nuclear reactor. Humans have some repair mechanisms as well.

Cells handling radiation damage in humans have about two levels of response. For minor damage, the cell repairs its DNA. If the DNA damage is too great to fix, a mechanism triggers in the cell to cause it to commit suicide. You can see the effect of this in a sunburn: critically radiation-damaged skin cells commit suicide en masse in the substratum of your skin, ultimately sacrificing the structural integrity of your skin, causing the external layer to slough off. This is why your skin peels due to a sunburn. If the damage is somewhere in between, matters are a little murkier… your immune system has a way of tracking down damaged cells and destroying them, but those screwed up cells sometimes slip through the cracks to cause serious disease. Inevitably cancer. Effects like these emerge for ~20 rem full body doses. People love to worry about superpowers and three-arm, three-eye type heritable mutations due to radiation exposure, but congenital mutations are a less frequent outcome simply because your gonads are such a small proportion of your body; you’re more likely to have other things screwed up first.

One important trick in all of this to notice is that to start having serious health effects that can be clearly ascribed to radiation damage, you must absorb a dose of greater than about 5 rem.

Now, what kind of a radiation dose do you acquire on a yearly basis from body-incorporated C-14 and how much did that dose change in people due to atmospheric nuclear testing?

I did my calculations on the supposition of a 70 kg person (which is 154 lbs). I also adjusted rem into a more easily used physical quantity of Joules/gram (1 rem = 1×10^-5 J/g, see above). One rem of yearly exposure for a 70 kg person works out to an absorbed dose of 0.7 J/year; an exposure sufficient to hit 5 rem is 3.5 J/year, while 20 rem is 14 J/year. Beta-electrons from C-14 maximally hit with 2.4×10^-14 J/strike (150 keV), with about 0.8×10^-14 J/hit on average (50 keV).
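
These conversions are simple enough to check mechanically. A minimal sketch (the constants are the ones quoted above; `rem_to_joules` is just a name I made up for illustration):

```python
# Check the unit conversions quoted above: rem -> Joules for a 70 kg person,
# and keV -> Joules for a C-14 beta electron.
J_PER_GRAM_PER_REM = 1e-5      # 1 rem deposits 1e-5 J per gram of tissue
J_PER_KEV = 1.602e-16          # 1 keV in Joules
body_mass_g = 70_000           # 70 kg person, in grams

def rem_to_joules(rem, mass_g=body_mass_g):
    """Total energy (J) a whole-body dose of `rem` deposits in `mass_g` grams."""
    return rem * J_PER_GRAM_PER_REM * mass_g

print(rem_to_joules(1))     # ~0.7 J
print(rem_to_joules(5))     # ~3.5 J
print(rem_to_joules(20))    # ~14 J
print(150 * J_PER_KEV)      # ~2.4e-14 J, maximum C-14 beta energy
print(50 * J_PER_KEV)       # ~0.8e-14 J, mean C-14 beta energy
```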

In the following part of the calculation, I use radioactive decay and half-life in order to determine the rate of energy transfer to the human body, on the assumption that all the beta-electron energy emitted by the decays is absorbed by the body. Radioactive decay is a purely probabilistic process where the likelihood of seeing an emitted electron is proportional to the size of the radioactive atom population. The differential equation is a simple one and looks like this:

$$\frac{dN}{dt} = -k\,N$$

This just means that the rate of decay (and therefore electron production rate) is proportional to the size of the decaying population where the k variable is a rate constant that can be determined from the half-life. The decay differential equation is solved by the following function:

$$N(t) = N_0\, e^{-kt}$$

This is just a simple exponential decay which takes an initial population of some number of objects and reduces it over time. You can solve for the decay constant by plugging the half-life into the time and simply asserting that you have 1/2 of your original quantity of objects at that time. The above exponential rearranges to find the decay constant:

$$\frac{N(\tau)}{N_0} = \frac{1}{2} = e^{-k\tau} \quad\Rightarrow\quad k = \frac{\ln 2}{\tau}$$

Here, Tau is the half-life in seconds (I could have used my time as years, but I’m pretty thoroughly trained to stick with SI units) and I’ve already substituted 1/2 for the population change. With k from half-life, I just need the population of radiation emitters present in the body in order to know the rate given in the first equation above… where I would simply multiply k by N.

To do this calculation, the half-life of C-14 is known to be 5730 years, which I then converted into seconds (ick; if I only care about years, next time I only calculate in years). This gives a decay constant of 3.836×10^-12 per second. In order to get the decay rate, I also need the population of C-14 emitters present in the human body. We know that C-14 has a natural prevalence of 1 per trillion and also, after a little google searching, that a 70 kg human body contains 16 kg of carbon, which gives me 1.6×10^-8 g of C-14. With C-14's mass of 14 g/mole and Avogadro's number, this gives about 6.88×10^14 C-14 atoms present in a 154 lb person. This population together with the rate constant gives me the decay rate by the first equation above, which is 2.639×10^3 decays per second. Energy per beta-electron absorbed times the decay rate gives the rate of energy deposited into the body per second, on the assumption that all beta-decay energy is absorbed by the target: 2.639×10^3 decays/sec × 2.4×10^-14 Joules/decay = 6.33×10^-11 J/s. Over the course of an entire year, the amount of energy works out to about 0.002 Joules/year.
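
The whole chain of arithmetic above can be reproduced in a few lines. This is a sketch of the calculation as described in the text (worst case: every decay at the 150 keV maximum, all energy absorbed), not anyone's official dosimetry code:

```python
import math

# Decay constant from the C-14 half-life
half_life_s = 5730 * 365.25 * 24 * 3600     # 5730 years in seconds
k = math.log(2) / half_life_s               # decay constant, per second

# Population of C-14 emitters in a 70 kg body
carbon_g = 16_000                           # grams of carbon in a 70 kg person
c14_g = carbon_g * 1e-12                    # 1-per-trillion natural prevalence
N = c14_g / 14 * 6.022e23                   # atoms, via 14 g/mol and Avogadro

# Decay rate and worst-case yearly energy deposition
decays_per_s = k * N
E_max = 2.4e-14                             # J per beta electron (150 keV max)
joules_per_year = decays_per_s * E_max * (365.25 * 24 * 3600)

print(f"{k:.3e} per second")                # ~3.84e-12
print(f"{N:.2e} C-14 atoms")                # ~6.9e14
print(f"{decays_per_s:.0f} decays/sec")     # ~2600
print(f"{joules_per_year:.4f} J/year")      # ~0.002
```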

This gets me to a place where I can start making comparisons. The exposure limit for any old member of the general public to ‘artificial’ radiation is 0.1 rem, or 0.07 J/year. The maximum… maximum… contribution due to endogenous C-14 is 35 times smaller than the allowed public exposure limit; using the mean beta energy instead of the maximum, endogenous C-14 gives more like 1/100th of the permitted artificial radiation dose.

But, I’ve actually fudged here. Note that I said above that humans normally get a yearly environmental radiation dose of about 0.3 rem (0.21 J/year)… meaning that endogenous C-14 only provides about 1/300th of your natural dose. Other radiation sources that you encounter on a daily basis provide radiation exposure that is 300 times stronger than C-14 directly incorporated into the structure of your body. And, keep in mind that this is way lower than the 5 rem where health effects due to radiation exposure begin to emerge.

How does C-14 produced by atmospheric nuclear testing figure into all of this?

The wikipedia article I cited above has a nice histogram of detected changes in the environmental C-14 levels due to atmospheric nuclear testing. At the time of such testing, C-14 prevalence in the environment spiked by about 2-fold and has decayed over the intervening years to less than 1.1-fold above the old baseline. The effect on your C-14 exposure specifically is to change it from about 1/300th of your natural dose to about 1/150th (roughly 0.7% of the natural dose at the peak), which then tapers to less than a tenth of a percent above natural prevalence in less than fifty years. Detectable, yes. Significant? No. Responsible for health effects…… not above the noise!

This is not to say that a nuclear war wouldn’t be bad. It would be very bad. But, don’t exaggerate environmental toxins. We have radionuclides present in our bodies no matter what and the ones put there by 1950s nuclear testing are only a negligible part, even at the time –what’s 100% next to 100.5%? A big nuclear war might be much worse than this, but this is basically a forgettable amount of radiation.

For anybody who is worried about environmental radiation, I draw your attention back to a really simple fact:


The woman depicted in the picture above has received a 100 to 600 rem dose of very (very very) soft X-rays by deliberately sitting out in front of a nuclear furnace. You can even see the nuclear shadow on her back left by her scant clothing. Do you think I’m kidding? UV light, which is lower energy than X-rays, but not by that much… about 3 eV versus maybe 500 eV… is ionizing radiation which is absorbed directly by skin DNA to produce real radiation damage, which your body treats indistinguishably from damage done by the particle radiation of radionuclides or by X-rays or gamma-rays. The dose which produced this effect is something like two to twelve times higher than the federally permitted dose that radiation workers are allowed to receive in their skin over the course of an entire year… and she did it to herself deliberately in a matter of hours!

Here’s a hint, don’t worry about the boogieman under the bed when what you just happily did to yourself over the weekend among friends is much much worse.

Calculating Molarity (mole/L)

As a preface to this post, I want to make doubly clear my stance on vaccines. There is no good scientific evidence to support the notion that vaccination is in any way an unsafe practice or that it is responsible for any manner of health problem above and beyond the diseases that vaccines protect against. Vaccination is the single most powerful health intervention created in the last 150 years of medicine. There is, in my opinion, some potential for this post to be used to damage the credibility of a person who I believe to be a necessary positive force in the healthcare scene, and I want to make it clear that this was not the intention of my writing here. Orac is a tireless advocate for science and for clear, skeptical thought in general, and I respect him quite deeply for the time he puts in and for the static he puts up with.

That said, I believe that science advocacy is a double edged sword: if you didn’t get it right, it can come back to bite you.

I love Respectful Insolence, but I’ve got to ding Orac for failing to calculate molarity correctly. He is profoundly educated, but I think he’s a surgeon and not a physicist. We all have our weak points! (Thank heaven above I’m not ever in the operating room with the knife!)

In this post, which he may now have edited for correctness (and it seems he has), he makes the following statement:

More importantly, look at the numbers of precipitates found per sample. It ranges from two to 1,821.

O.M.G.! 1,821 particles! Holy crap! That’s horrible! The antivaxers are right that vaccines are hopelessly contaminated!

No. They. Are. Not.

Look at it this way. This is what was found in 20 μl (that’s microliters) of liquid. That’s 0.00002 liters. That means, in a theoretical liter of the vaccine, the most that one would find is 91,050,000 (9.105 x 10^7) particles! Holy hell! That’s a lot. We should be scared, shouldn’t we? Well, no. Let’s go back to our homeopathy knowledge and look at Avogadro’s number. One mole of particles = 6.023 x 10^23. So divide 91,050,000 by Avogadro’s number, and you’ll get the molarity of a solution of 91,050,000 particles in a liter, as a 1 M solution would contain 6.023 x 10^23 particles. So what’s the concentration:

1.512 x 10^-16 M. That’s 0.15 femtomolar (fM) (or 150 attomolar), an incredibly low concentration. And that’s the highest amount the investigators found.

Anybody see the mistake? Let’s start here: Avogadro’s number is a scaling constant for a linear relationship and it has a unit! The units on this number are atoms(or molecules) per mole. It converts a number of atoms or molecules into a number of moles.

‘Moles’ is a convenient person-sized number that is standardized around ‘molecular weight,’ a weight scale which arbitrarily says that a single carbon-12 atom has a weight of ’12’ and results in atomic hydrogen having a weight of about ‘1.’ That’s atomic mass units (or AMU), which is usually very convenient for calculating relative weights of molecules by adding up all the AMU of their atomic constituents. To use molarity, we usually need a molecular weight in the form of Daltons, or grams/mole. Grams per mole says that it takes this many grams in mass of a substance for that substance to contain a single mole’s worth of molecules (or atoms), where it is then implicit that the number of molecules or atoms is Avogadro’s number.

‘Mole’ is extremely special. It refers to a collection of objects that are atomically identical! If you have a mole of a kind of protein, it means that you have 6.02 x 10^23 of this kind of identical object. If you make a comparison between two proteins, the same molar number of each with a different molecular weight is a different overall mass. Consider Insulin (5808 g/mole) compared to the 70S Ribosome (2,500,000 g/mole)… one mole of Insulin would weigh 5.8 kg while one mole of 70S Ribosome would weigh 2.5 metric tons!!! If they have roughly the typical density of proteins (~1.35 g/cm^3), what would be the volume of 1 mole of 70S Ribosome as compared to 1 mole of Insulin? It would be 430 times greater for the Ribosome: roughly 1,850 L for the 70S Ribosome while Insulin is about 4.3 L!
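
To make the comparison concrete, here is a quick sketch. The ~1.35 g/cm^3 protein density is a typical literature value I am assuming here; the exact volumes scale with whatever density you pick, but the ~430x ratio between the two proteins does not:

```python
# One mole of two very different "identical objects": same particle count,
# wildly different mass and volume. Density is an assumed typical value.
DENSITY = 1.35  # g/cm^3, assumed typical protein density

def mole_stats(mw_g_per_mol):
    """Mass (g) and volume (L) of one mole of a substance of given MW."""
    mass_g = mw_g_per_mol                   # one mole weighs the MW in grams
    volume_L = mass_g / DENSITY / 1000.0    # g -> cm^3 -> L
    return mass_g, volume_L

insulin_mass, insulin_vol = mole_stats(5808)        # Insulin, 5808 g/mol
ribosome_mass, ribosome_vol = mole_stats(2_500_000) # 70S Ribosome, 2.5e6 g/mol

print(insulin_mass / 1000, "kg;", round(insulin_vol, 1), "L")  # ~5.8 kg, ~4.3 L
print(ribosome_mass / 1e6, "t;", round(ribosome_vol), "L")     # ~2.5 t, ~1850 L
print(round(ribosome_vol / insulin_vol))                       # ~430x
```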

Notice something here: an object with a big molecular weight occupies a bigger volume than the same object of a smaller molecular weight… regardless of the fact that they are at the same molarity. Molarity as a number depends strongly on the molecular weight of the substance in question in order to mean anything at all. For the Ribosome, the same molar concentration as for Insulin means a solution containing a much larger amount of solute.

In the post in question on Respectful Insolence, Orac is talking about a paper which observes particulate matter derived from vaccine specimens in an SEM. It is clear from the authorship and publication of the paper that the intent is to find fault in vaccines based upon the contents of materials examined by this probing… from what little I know about the paper, it does not seem to be producing any information that is truly that informative. But, you can’t fault a paper on a point that may not actually be as flawed as an initial interpretation would imply. The paper reports the number of particles observed per 20 uL of a solvent. They find as many as 1,821 particles per 20 uL. We are not told for certain what these particles are composed of, except that the investigators weren’t sure, shot an overpowered EDS at everything, and reported even the spurious results. Orac scales this number up to 1 L to get 9.105 x 10^7 particles and then divides by Avogadro’s number to find what proportion this is of one mole of these particles… never mind that we don’t know how big the particles are in terms of molecular weight, or how dense in volume per mass. He declares it to be a tenth of a femtomole and runs on with how tiny the concentration is. As I initially wrote this, I focused on the gleeful way in which Orac does his deconstruction, in large part because it really isn’t a valid thing to laugh at when the deconstruction is not properly done.

Here is how someone of my background approaches the same series of observations. I can see from the micrograph in the blog post that the scale bar is something like 2 mm (2000 microns)… the objects in question are maybe tens to hundreds of microns in size. Let’s make a physicist supposition here and think about it: pulling this out of my ass, I’ll claim these are 1,821 approximately spherical identical particles of sodium chloride, each of 40 microns diameter. That gives a volume of 4/3*Pi*20^3 um^3, which is about 3.35 x 10^4 um^3 or 3.35 x 10^-14 m^3 per particle, and 6.1 x 10^-11 m^3 for the whole collection of particles. Now, density usually is given in terms of g/cm^3 or g/mL… there are 100 cm per meter and you must convert three times to cube it, so 6.1 x 10^-11 x 100^3 = 6.1 x 10^-5 cm^3. A cubic centimeter is a mL, so that’s about 0.06 uL of solid salt, roughly 0.3% of the original 20 uL sample volume! What molarity is this? The density of sodium chloride is 2.16 g/mL… which gives 0.13 mg. That’s 0.13 mg of salt dissolved in 20 uL. The molecular weight of sodium chloride is 58.44 g/mole, or 58.44 mg/mmole, which gives 2.3 x 10^-3 mmole. From this, 2.3 x 10^-3 mmole in 0.02 mL is 0.11 mmole/mL.

That’s 0.11 mole/L……. 0.11 M!!!!
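
Here is the same back-of-envelope estimate as a script, using only the stated assumptions (1,821 identical NaCl spheres of 40 micron diameter in a 20 uL sample); none of these numbers are measurements:

```python
import math

# Assumptions from the text: 1,821 spherical NaCl particles, 40 um diameter,
# all dissolved in the 20 uL sample.
n_particles = 1821
radius_cm = 20e-4                              # 20 um radius, in cm

v_particle = 4 / 3 * math.pi * radius_cm**3    # cm^3 per sphere
v_total = n_particles * v_particle             # cm^3 (1 cm^3 = 1 mL)

NACL_DENSITY = 2.16                            # g/cm^3
NACL_MW = 58.44                                # g/mol
mass_g = v_total * NACL_DENSITY
moles = mass_g / NACL_MW
molarity = moles / 20e-6                       # 20 uL = 2e-5 L

print(f"{v_total * 1000:.3f} uL of solid salt")  # ~0.06 uL
print(f"{mass_g * 1000:.2f} mg of NaCl")         # ~0.13 mg
print(f"{molarity:.2f} M")                       # ~0.11 M, nowhere near femtomolar
```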

Let’s pause for a second. Is that femtomolar?

Orac missed the science here! I initially wrote that he should be apologizing for it, but I’ve revised this so that my respect for his work is more apparent. The volume of these particles and their composition is everything. A single particle with a molecular weight in the gigadaltons or teradaltons range is suddenly a very substantial mass in low particle number. If these particles are as I specified and composed of simple salt, they are at a molarity that is abruptly appreciable. If we make these into tiny balls of Ricin, that’s unquestionably a fatally toxic quantity!

As with all things, dose makes the poison and there’s no Ricin in evidence, but this argument Orac has made about concentration is, in this particular case, catastrophically wrong. A femtomole of a big particle that can be dissolved could be a large dose!

I forgive him and I love his blog, but let this be a lesson… you don’t just divide by Avogadro’s number in order to get meaningful concentrations!

Hydrogen atom radial equation

In between the Sakurai problems, I decided to tackle a small problem I set for myself.

The Sakurai quantum mechanics book is directed at about the graduate student level, meaning that it explicitly overlooks problems that it deems too ‘undergraduate.’ When I started into the next problem in the chapter, which deals with the Wigner-Eckart theorem, I decided to direct myself at a ‘lower level’ problem that demands practice from time to time. I worked in early January on solving the angular component of the hydrogen atom by deriving the spherical harmonics, and much of my play time since has been devoted to angular and angular momentum type problems. So, I decided it would be worth switching up a little and solving the radial portion of the hydrogen atom electron central force problem.

One of my teachers once suggested that deriving the hydrogen atom was a task that any devoted physicist should play with every other year or so. Why not, I figured; the radial solution is actually a bit more mind boggling to me than the angular parts because it requires some substitutions that are not very intuitive.

The hydrogen atom problem is a classic problem mainly because it’s one of the last exactly solvable quantum mechanics problems you ever encounter. After the hydrogen atom, the water gets deeper and the field starts to focus on tools that give insight without actually giving exact answers. The only atomic system that is exactly solvable is the hydrogen atom… even helium, with just one more electron, demands perturbation in some way. It isn’t exactly crippling to the field because the solutions to all the other atoms are basically variations of the hydrogen atom and all, with some adjustment, have hydrogenic geometry or are superpositions of hydrogen-like functions that are only modified to the extent necessary to make the energy levels match. Solving the hydrogen atom ends up giving profound insight to the structure of the periodic table of the elements, even if it doesn’t actually solve for all the atoms.

As implied above, I decided to do a simplified version of this problem, focusing only on the radial component. The work I did on the angular momentum eigenstates was not in context of the hydrogen electron wave function, but can be inserted in a neat cassette to avoid much of the brute labor of the hydrogen atom problem. The only additional work needed is solving the radial equation.

A starting point here is understanding spherical geometry as mediated by spherical polar coordinates.

A hydrogen atom, as we all know from the hard work of a legion of physicists coming into the turn of the century, is a combination of a single proton with a single electron. The proton has one indivisible positive charge while the electron has one indivisible negative charge. These two charges attract each other and the proton, being a couple thousand times more massive, pulls the electron to it. The electron falls in until the kinetic energy it gains forces it to have enough momentum to be unlocalized to a certain extent, as required by quantum mechanical uncertainty. The system might then radiate photons as the electron sorts itself into a stable orbiting state. The resting combination of proton and electron has neutral charge with the electron ‘distributed’ around the proton in a sort of cloud as determined by its wave-like properties.

The first approximation of the hydrogen atom is a structure called the Bohr model, proposed by Niels Bohr in 1913. The Bohr model features classical orbits for the electron around the nucleus, much like the moon circles the Earth.


This image is a crude example of a Bohr atom. The Bohr atom is perhaps the most common image of atoms in popular culture, even if it isn’t correct. Note that the creators of this cartoon didn’t have the wherewithal to make a ‘right’ atom, giving the nucleus four plus charges and the shell three minus… this would be a positively charged ion of Beryllium. Further, the electrons are not stacked into a decent representation of the actual structure: cyclic orbitals would be P-orbitals or above, where Beryllium has only S-orbitals for its ground state, which possess either no orbital angular momentum, or angular momentum without any defined direction. But, it’s a popular cartoon. Hard to sweat the small stuff.

The Bohr model grew from the notion of the photon as a discrete particle, where Bohr postulated that the only allowed stable orbits for the electron circling the nucleus are those with integer quantities of angular momentum, as delivered by single photons and quantized by Planck’s constant. ‘Quantized’ is a word invoked to mean ‘discrete quantities’ and comes back to that pesky little feature Deepak Chopra always ignores: the first thing we ever knew about quantum mechanics was Planck’s constant –and freaking hell is Planck’s constant small! ‘Quantization’ is the act of parsing into discrete ‘quantized’ states and is the word root which loaned the physics field its name: Quantum Mechanics. ‘Quantum Mechanics’ means ‘the mechanics of quantization.’
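
For reference, the postulate and its consequences can be written compactly. These are the standard textbook results rather than anything taken from the post's own images (Gaussian units, with a_0 the Bohr radius):

```latex
% Bohr's quantization postulate: orbital angular momentum comes in
% integer units of hbar
L = m_e v r = n\hbar, \qquad n = 1, 2, 3, \ldots
% Combined with the classical circular-orbit force balance
% m_e v^2 / r = e^2 / r^2, this fixes the allowed radii and energies:
r_n = n^2 a_0, \qquad a_0 = \frac{\hbar^2}{m_e e^2}, \qquad
E_n = -\frac{e^2}{2 a_0}\,\frac{1}{n^2} \approx -\frac{13.6\ \text{eV}}{n^2}
```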

Quantum mechanics, as it has evolved, approaches problems like the hydrogen atom using descriptions of energy. In the classical sense, an electron orbiting a proton has some energy describing its kinetic motion, its kinetic energy, and some additional energy describing the interaction between the two masses, usually as a potential source of more kinetic energy, called a potential energy. If nothing interacts from the outside, the closed system has a non-varying total energy which is the sum of the kinetic and potential energies. Quantum mechanics evolved these ideas away from their original roots using a version of Hamiltonian formalism. Hamiltonian formalism, as it appears in quantum, is a way to merely sum up kinetic and potential energies as a function of position and momentum –this becomes complicated in Quantum because of the restriction that position and momentum cannot be simultaneously known to arbitrary precision. But, Schrodinger’s equation actually just boils down to a statement of kinetic energy plus potential energy.

Here is a quick demonstration of how to get from a statement of total energy to the Schrodinger equation:

$$E = \frac{p^2}{2m} - \frac{e^2}{\rho} \quad\therefore\quad \left(\frac{\hat{p}^2}{2m} - \frac{e^2}{\rho}\right)\psi = E\psi \quad\Rightarrow\quad \left(-\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{\rho}\right)\psi = E\psi$$

After ‘therefore,’ I’ve simply multiplied in from the right with a wave function to make this an operator equation. The first term on the left is kinetic energy in terms of momentum while the second term is the Gaussian CGS form of potential energy for the electrical central force problem (for Gaussian CGS, the constants of permittivity and permeability are swept under the rug by collecting them into the speed of light and usually a constant of light speed appears with magnetic fields… here, the charge is in statcoulombs, which take coulombs and wrap in a scaling constant of 4*Pi.) When you convert momentum into its position space representation, you get Schrodinger’s time independent equation for an electron under a central force potential. The potential, which depends on the positional expression of ‘radius,’ has a negative sign to make it an attractive force, much like gravity.

Now, the interaction between a proton and an electron is a central force interaction, which means that the radius term could actually be pointed in any direction. Radius would be some complicated combination of x, y and z. But, because the central force problem is spherically symmetric, if we could move out of Cartesian coordinates and into spherical polar, we get a huge simplification of the math. The inverted triangle that I wrote for the representation of momentum is a three dimensional operator called the Laplace operator, or ‘double del.’ Picking the form of del ends up casting the dimensional symmetry of the differential equation… as written above, it could be Cartesian or spherical polar or cylindrical, or anything else.

A small exercise I sometimes put myself through is defining the structure of del. The easiest way that I know to do this is to pull apart the divergence theorem of vector calculus in spherical polar geometry, which means defining a differential volume and differential surfaces.

5-12-16 central force 2

Well, that turned out a little neater than my usual meandering crud.

This little bit of math is defining the geometry of the coordinate variables in spherical polar coordinates. You can see the spherical polar coordinates in the Cartesian coordinate frame and they consist of a radial distance from the origin and two angles, Phi and Theta, that act at 90 degrees from each other. If you pick a constant radius in spherical polar space, you get a spherical surface where lines of constant Phi and Theta create longitude and latitude lines, respectively, making a globe! You can establish a right handed coordinate system in spherical polar space by picking a point and considering it to be locally Cartesian… the three dimensions at this point are labeled as shown, along the outward radius and in the directions in which each of the angles increases.

If you were to consider an infinitesimal volume of these perpendicular dimensions, at this locally Cartesian point, it would be a volume that ‘approaches’ cubic. But then, that’s the key to calculus: recognizing that 99.999999 effectively approaches 100. So then, this framework allows you to define the calculus occurring in spherical polar space. The integral performed along Theta, Phi and Rho would be adding up tiny cubical elements of volume welded together spherically, while the derivative would be with respect to each dimension of length as locally defined. The scaling values appear because I needed to convert differentials of angle into linear length in order to calculate volume, which can be accomplished by using the definition of the radian angle, which is arc length per radius –a curve is effectively linear when an arc becomes so tiny as to be negligible when considering the edges of an infinitesimal cube, like thinking about the curvature of the Earth affecting the flatness of the sidewalk outside your house.
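
The arc-length bookkeeping described above can be summarized in one line; this is the standard spherical polar volume element the construction arrives at:

```latex
% The three locally-Cartesian differential lengths: d\rho along the radius,
% and the two angles converted to arc length by the radian definition:
ds_\rho = d\rho, \qquad ds_\theta = \rho\, d\theta, \qquad ds_\phi = \rho \sin\theta\, d\phi
% so the infinitesimal "approaches cubic" volume is their product:
dV = \rho^2 \sin\theta\; d\rho\, d\theta\, d\phi
```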

The divergence operation uses Green’s formulas to say that a volume integral of divergence relates to a surface integral of flux wrapping across the surface of that same volume… and then you simply chase the constants. All that I do to find the divergence differential expression is to take the full integral and remove the infinite sum so that I’m basically doing algebra on the infinitesimal pieces, then literally divide across by the volume element and cancel the appropriate differentials. There are three possible area integrals because the normal vector is in three possible directions, one each for Rho, Theta and Phi.

The structure becomes a derivative if the volume is in the denominator because volume has one greater dimension than any possible area, where the derivative is with respect to the dimension of volume that doesn’t cancel out when you divide against the areas. If a scaling variable used to convert theta or phi into a length is dependent on the dimension of the differential left in the denominator, it can’t pass out of the derivative and remains inside at completion. The form of the divergence operation on a random vector field appears in the last line above. The value produced by divergence is a scalar quantity with no direction which could be said to reflect the ‘poofiness’ of a vector field at any given point in the space where you’re working.
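
For reference, the standard result this procedure produces for the divergence of a vector field in spherical polar coordinates is:

```latex
\nabla \cdot \vec{A} =
\frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left(\rho^2 A_\rho\right)
+ \frac{1}{\rho \sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\, A_\theta\right)
+ \frac{1}{\rho \sin\theta}\frac{\partial A_\phi}{\partial \phi}
```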

I then continued by defining a gradient.

5-12-16 central force 1

Gradient is basically an opposite operation from divergence. Divergence creates a scalar from a vector which represents the intensity of ‘divergence’ at some point in a smooth function defined across all of space. Gradient, on the other hand, creates a vector field out of a scalar function, where the vectors point in the dimensional direction where the function tends to be increasing.

This is kind of opaque. One way to think about this is to think of a hill poking out of a two dimensional plane. A scalar function defines the topography of the hill… it says simply that at some pair of coordinates in a plane, the geography has an altitude. The gradient operation would take that topography map and give you a vector field which has a vector at every location that points in the direction toward which the altitude is increasing at that location. Divergence then goes backward from this, after a fashion: it takes a vector map and converts it into a map which says ‘strength of change’ at every location. This last is not ‘altitude’ per se, but more like ‘rate at which altitude is changing’ at a given point.

The Laplace operator combines gradient with divergence as literally the divergence of a gradient, denoted as ‘double del,’ the upside-down triangle squared.
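
Applying that divergence to a gradient gives the spherical polar Laplacian, the standard form that gets dropped into Schrodinger's equation:

```latex
\nabla^2 \psi =
\frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left(\rho^2 \frac{\partial \psi}{\partial \rho}\right)
+ \frac{1}{\rho^2 \sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\, \frac{\partial \psi}{\partial \theta}\right)
+ \frac{1}{\rho^2 \sin^2\theta}\frac{\partial^2 \psi}{\partial \phi^2}
```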

In the last line, I’ve simply taken the Laplace operator in spherical polar coordinates and dropped it into its rightful spot in Schrodinger’s equation as shown far above. Here, the wave function, called Psi, is a density function defined in spherical polar space, varying along the radius (Rho) and the angles Theta and Phi (the so-called ‘solid angle’). Welcome to greek word salad…

What I’ve produced is an explicit form for Schrodinger’s equation with a coordinate set that is conducive to the problem. This differential equation is a multivariate second order partial differential equation. You have to solve this by separation of variables.

Having defined the hydrogen atom Schrodinger equation, I now switch to the more simple ‘radial only’ problem that I originally hinted at. Here’s how you cut out the angular parts:

$$\left[-\frac{\hbar^2}{2m}\,\frac{1}{\rho^2}\frac{\partial}{\partial\rho}\!\left(\rho^2\frac{\partial}{\partial\rho}\right) + \frac{\hbar^2\, l(l+1)}{2m\rho^2} - \frac{e^2}{\rho}\right]\psi = E\psi$$

You just recognize that the second and third differential terms are collectively the square of the total angular momentum and then use the relevant eigenvalue equation to remove it.

The L^2 operator comes out of the kinetic energy contained in the electron going ‘around.’ For the sake of consistency, it’s worth noting that the Hamiltonian for the full hydrogen atom contains a term for the kinetic energy of the proton and that the variable Rho refers to the distance between the electron and proton… in its right form, the ‘m’ given above is actually the reduced mass of that system and not directly the mass of the electron, which gives us a system where the electron is actually orbiting the center of mass, not the proton.

Starting on this problem, it’s convenient to recognize that the Psi wave function is a product of a Ylm (angular wave function) with a Radial function. I started by dividing out the Ylm and losing it. Psi basically just becomes R.

5-13-16 radial equation 1

The first thing to do is take out the units. There is a lot of extra crap floating around in this differential equation that will obscure the structure of the problem. First, take the energy ‘E’ down into the denominator to consolidate the units, then make a substitution that hides the length unit by setting it to ‘one’. This makes Rho a multiple of ‘r’ involving energy. The ‘8’ wedged in here is crazily counter intuitive at this point, but makes the quantization work in the method I’ve chosen! I’ll point out the use when I reach it. At the last line, I substitute for Rho and make a bunch of cancellations. Also, in that last line, there’s an “= R” which fell off the side of the picture –I assure you it’s there, it just didn’t get photographed.

After you clean everything up and bring the R over from behind the equals sign, the differential equation is a little simpler…

5-13-16 radial equation 2

The ‘P’ and ‘Q’ are quick substitutions made so that I don’t have to work as hard doing all this math; they are important later, but they just need to be simple to use at the moment. I also make a substitution for R, by saying that R = U/r. This converts the problem from radial probability into probability per unit radius. The advantage is that it lets me break up the complicated differential expression at the beginning of the equation.
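
For readers following along, here is one common convention consistent with the factor of 8 mentioned above (this is my assumption about the bookkeeping; the post's own worked images may differ in detail). For a bound state, E < 0, define a dimensionless radius r, and the equation for U takes a clean form where P and Q play exactly the roles described:

```latex
% Dimensionless radius hiding the length unit in the energy:
r = \frac{\sqrt{8 m |E|}}{\hbar}\,\rho
% under which the equation for U = rR collapses to
\frac{d^2 U}{dr^2} + \left[\frac{Q}{r} - \frac{1}{4} - \frac{P}{r^2}\right] U = 0,
\qquad Q = \frac{e^2}{\hbar}\sqrt{\frac{m}{2|E|}}, \quad P = l(l+1)
```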

The next part is to analyze the ‘asymptotic behavior’ of the differential equation. This is simply to look at what terms become important as the radius variable grows very big or very small. In this case, if radius gets very big, certain terms become small before others. If I can consider the solution U to be a separable composition of parts that solve different elements of this equation, I can create a further simplification.

5-13-16 asymptotic correction

If you consider the situation where r is very very big, the two terms in this equation which are 1/r or 1/r^2 tend to shrink essentially to zero, meaning that they have no impact on the solution at big radii. This gives you a very simple differential equation at big radii, as written at right, which is solved by a simple exponential with either positive or negative roots. I discard the positive root solution because I know that the wave equation must suppress to zero as r goes far away and because the positive exponential will tend to explode, becoming bigger the further you get from the proton –this situation would make no physical sense because we know the proton and electron to be attractive to one another and solutions that have them favor being separated don’t match the boundaries of the problem. Differential equations are frequently like this: they have multiple solutions which fit, but only certain solutions that can be correct for a given situation –doing derivatives loses information, meaning that multiple equations can give the same derivative and in going backward, you have to cope with this loss of information. The modification I made allows me to write U as a portion that’s an unknown function of radius and a second portion that fits as a negative exponent. Hidden here is a second route to the same solution of this problem… if I considered the asymptotic behavior at small radii. I did not utilize the second asymptotic condition.
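
Sketching that asymptotic argument in symbols (again assuming the factor-of-8 scaling, under which the surviving constant term is 1/4):

```latex
% As r \to \infty the 1/r and 1/r^2 terms die off, leaving
\frac{d^2 U}{dr^2} \approx \frac{U}{4}
\quad\Rightarrow\quad U \sim e^{\pm r/2}
% The growing root is discarded on physical grounds, so factor out the decay:
U(r) = f(r)\, e^{-r/2}
```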

I just need now to find a way to work out the identity of the rest of this function. I substitute the U back in with its new exponentially augmented form…

5-13-16 Frobenius

With the new version of U, the differential equation rearranges to give a refined set of differentials. I then divide out the exponential so that I don't have it cluttering things up. All this jiggering about has basically reduced the original differential equation to skin and bones, though it still hasn't quite come apart. The next technique that I apply is the Frobenius method. The technique is to guess that the differential equation can be solved by some infinite power series where the coefficients of each power of radius control how much a particular power shows up in the solution. It's basically just saying "What if my solution is some polynomial expression Ar^2 - Br + C," where I can include as many 'r's as I want. This can be very convenient because the calculus of polynomials is so easy. In the 'sum,' the variable n just identifies where you are in the series, whether at n=0, which just sets r to 1, or n=1000, which has a power of r^1000. In this particular case, I've learned that the n=0 term can actually be excluded because of boundary conditions: the probability per unit radius needs to go to zero at the origin (at the proton), and since the radius-invariant term can't do that, you have to leave it out… I didn't think of that as I was originally working the problem, but it gets excluded anyway for a second reason that I will outline later.

The advantage of Frobenius may not be apparent right away, but it lets you reconstruct the differential equation in terms of the power series. I plug in the sum wherever the 'A' appears and work the derivatives. This relates different powers of r to different A coefficients. I also pull the 1/r and 1/r^2 into their respective sums to the same effect. Then, you rewrite two of the sums by advancing the coefficient indices and rewriting the labels, which allows all the powers of r to be the same power, so everything can be consolidated under one sum by omitting coefficients that are known to be zero. This has the effect of saying that the differential equation is now identically repeated in every term of the sum, letting you work with only one.

The result is a recurrence relation. For the power series to be a solution to the given differential equation, each coefficient is related to the one previous by a consistent expression. The existence of the recurrence relation allows you to construct a power series where you need only define one coefficient to immediately set all the rest. After all those turns and twists, this is a solution to the radial differential equation, but not in closed form.
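To make the recurrence relation concrete, here is a little Python sketch. I'm using the form of the relation you'll find in standard textbook treatments (ratio c_{k+1}/c_k = 2(k + l + 1 − n) / ((k+1)(k + 2l + 2))); the substitutions in my notebook bookkeep the constants differently, so treat this as a sketch of the structure rather than a transcription of the math above:

```python
# Power-series coefficients for the hydrogen radial equation, using the
# standard-text form of the recurrence relation. Defining one coefficient
# (c0) sets all the rest -- exactly the property described above.
def radial_coefficients(n, l, c0=1.0):
    """Generate coefficients until the recurrence terminates with a zero.
    Assumes n > l, as the quantization condition requires."""
    coeffs = [c0]
    k = 0
    while True:
        num = 2.0 * (k + l + 1 - n)
        den = (k + 1) * (k + 2 * l + 2)
        nxt = coeffs[-1] * num / den
        if nxt == 0.0:      # the cutoff: series becomes a polynomial
            break
        coeffs.append(nxt)
        k += 1
    return coeffs

# For n=1, l=0 the series cuts off immediately: a single term.
print(radial_coefficients(1, 0))        # [1.0]
# For n=3, l=0 there are three terms before the cutoff.
print(len(radial_coefficients(3, 0)))   # 3
```

The cutoff behavior shown here is exactly the quantization discussed below: pick n, and the series terminates on its own.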

Screwing around with all this math involved a ton of substitutions and a great deal of recasting the problem. That’s part of why solving the radial equation is challenging. Here is a collection of all the important substitutions made…

Collecting solution

As you can see, there is layer on layer on layer of substitution here. Further, you may not realize it yet, but something rather amazing happened with that number Q.

Quantize radial equation

If you set Q/4 = -n, the recurrence relation which generates the power series solution for the radial wave function cuts off the sequence of coefficients with a zero. This lets you terminate the power series after only a few terms instead of including the infinite number of possible powers, and you get to choose how many terms are included! Suddenly, the sum drops into a closed form and reveals an infinite family of solutions that depend on the 'n' chosen as the cutoff. Further, Q was originally defined as a function of energy… if you substitute in that definition and solve for 'E,' you get an energy dependent on 'n.' These are the allowed orbital energies for the hydrogen atom.
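In conventional units, the energies that fall out of this are the familiar Rydberg series. A minimal Python sketch (the 13.6 eV constant is the standard measured Rydberg energy, plugged in rather than derived here):

```python
# Quantized hydrogen orbital energies: E_n = -13.6 eV / n^2,
# exactly the n-dependence that falls out of setting Q/4 = -n.
RYDBERG_EV = 13.605693   # hydrogen ionization energy, in eV

def orbital_energy(n):
    """Energy of the nth hydrogen orbital, in electron volts."""
    return -RYDBERG_EV / n**2

for n in range(1, 5):
    print(n, round(orbital_energy(n), 3))
```

The n=1 value is the ground state; the gaps between successive levels shrink as 1/n^2, which is why the hydrogen spectral lines crowd together toward the ionization limit.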

This is an example of Quantization!

Having just quantized the radial wave function of the hydrogen atom, you may want to sit back and smoke a cigarette (if you’re into that sort of thing).

It’s opaque and particular to this strategy, but the ‘8’ I chose to add way back in that first substitution that converts Rho into r came into play right here. As it turns out, the 4 which resulted from pulling a 2 out of the square root twice canceled another 2 showing up during a derivative done a few dozen lines later and had the effect of keeping a 2 from showing up with the ‘n’ on top of the recurrence relation… allowing the solutions to be successive integers in the power series instead of every other integer. This is something you cannot see ahead, but has a profound, Rube Goldbergian effect way down the line. I had to crash into the extra two while doing the problem to realize it might be needed.

At this point, I’ve looked at a few books to try to validate my method and I’ve found three different ways to approach this problem, all producing equivalent results. This is only one way.

The recurrence relation also gives a second very important outcome:

n to l relation

The energy quantum number must be bigger than the angular momentum quantum number: n' must always exceed 'l' by at least 1. And secondarily, and this is really important, the unprimed n (the series index) must also always be bigger than 'l.' This gives:

n' ≥ n > l

This constrains which powers can be added in the series solution. You can't just start blindly at the zero-order power; 'n' must be bigger than 'l' so that it never equals 'l' in the denominator, and the primed number is always bigger too. If 'l' and 'n' are ever equal, you get an undefined term. One might argue that maybe you could include negative values of n, but these would produce terms like 1/r, which blow up at the origin when the radius is small, even though we know from the boundary conditions that the probability must go to zero there. There is therefore a small window of powers that can be included in the sum, running from n = l+1 to n = n'.

I spent some significant effort thinking about this point as I worked the radial problem this time; for whatever reason, it has always been hazy in my head which powers of the sum are allowed and how the energy and angular momentum quantum numbers constrained them. The radial problem can sometimes be an afterthought next to the intricacy of the angular momentum problem, but it is no less important.

For all of this, I've more or less just told you the ingredients needed to construct the radial wave functions. There is a fair amount of back-substitution, and then you must work the recurrence relation while obeying the quantization conditions I've just detailed.

constructing solution

A general form for the radial wave functions appears at the lower right, fabricated from the back-substitutions. The powers of 'r' in the series solution must be replaced with the original form of 'rho,' which now includes a constant involving mass, charge and Planck's constant which I've dubbed the Bohr radius. The Bohr radius ao is a relic of the old Bohr atom model that I started off talking about and it's used as the scale length for the modern version of the atom. The wave function, as you can see, ends up being a polynomial in radius multiplied by an exponential, where the polynomial is further multiplied by a single 1/radius term and includes terms that are powers of radial distance between l+1, where l is the angular momentum quantum number, and n', the energy quantum number.

Here is how you construct a specific hydrogen atom orbital from all the gobbledygook written above. This is the simplest orbital, the S-orbital, where the energy quantum number is 1 and the angular momentum is 0. This uses the Y00 spherical harmonic, the simplest spherical harmonic, which more or less just says that the wave function does not vary across any angle, making it completely spherically symmetric.

Normalized S orbital

The ‘100’ attached in subscript to the Psi wave function is a physicist shorthand for representing the hydrogen atom wave functions: these subscripts are ‘nlm,’ the three quantum numbers that define the orbital, which are n=1, l=0 and m=0 in this case. All I’ve done to produce the final wave function is take my prescription from before and use it to construct one of an infinite series of possible solutions. I then perform the typical Quantum Mechanics trick of making it a probability distribution by normalizing it. The process of normalization is just to make certain that the value ‘under the curve’ contained by the square of the wave function, counted up across all of space in the integral, is 1. This way, you have a 100% chance of finding the particle somewhere in space as defined by the probability distribution of the wave function.

You can use the wave function to ask questions about the distribution of the electron in space around the proton –for instance, what’s the average orbital radius of the electron? You just look for the expectation value of the radius using the wave function probability distribution:

Average radius

For the hydrogen atom ground state, which is the lowest energy state for a 1-electron, 1-proton atom, the electron is distributed, on average, about one and a half Bohr radii from the nucleus. The Bohr radius is about 0.529 angstroms (1 angstrom = 1×10^-10 meters), which means that the electron is on average found about 0.79 angstroms from the nucleus.
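The same numerical counting recovers the average radius. A Python sketch, taking a0 ≈ 0.529 Å as the standard value (analytically the integral comes out to exactly 3a0/2):

```python
import math

a0 = 0.529177  # Bohr radius in angstroms

def mean_radius(steps=20000, r_max_in_a0=40.0):
    """Expectation value <r> for the ground state: integrate
    r * |psi_100|^2 * 4*pi*r^2 dr by the trapezoid rule."""
    dr = (r_max_in_a0 * a0) / steps
    norm = 1.0 / (math.pi * a0**3)     # |psi_100(0)|^2
    total = 0.0
    for i in range(steps + 1):
        r = i * dr
        w = 0.5 if i in (0, steps) else 1.0
        total += w * r * 4 * math.pi * r**2 * norm * math.exp(-2 * r / a0) * dr
    return total

print(mean_radius())  # ≈ 0.794 angstroms, i.e. 1.5 * a0
```

So the "average position" of the ground-state electron sits about three quarters of an angstrom out from the proton.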

Right now, this is all very abstract and mathematical, so I’ll jump into the more concrete by including some pictures. Here is a 3D density plot of the wave function performed using Mathematica.

S-orbital density

Definitely anticlimactic and a little bit blah, but this is the ground state wave function. We know it doesn't vary in any angle, so it has to be spherically symmetric. The axes are distance in units of the Bohr radius. One thing I can do to make it a little more interesting is to take a knife to it and chop it in half.

This is just the same thing bisected. The legend at left just shows the intensity of the wave function as represented in color.

As you can see, this is a far cry from the atomic model depicted in cartoon far above.

For the moment, I'm going to hang up this particular blog post. This took quite a long time to construct. Some of the higher energy, larger angular momentum hydrogenic wave functions start looking somewhat crazy and more beautiful, but I really just had it in mind to show the math which produces them. I may produce another post containing a few of them as I have time to work them out and render images of them. If the savvy reader so desires, the prescriptions given here can generate any hydrogenic wave function you like… just refer back to my Ylm post where I talk some about the spherical harmonics, or refer directly to the Ylm tables on Wikipedia, which is a good, complete online source of them anyway.


Because I couldn’t leave it well enough alone, I decided to do images of one more hydrogen atom wave function. This orbital is 210, the P-orbital. I won’t show the equation form of this, but I did calculate it by hand before turning it over to Mathematica. In Mathematica, I’m not showing directly the wave function this time because the density plot doesn’t make clear intuitive sense, but I’m putting up the probability densities (which is the wave function squared).

P-orbital probability density

Mr. Peanut is the P-orbital. Here, angular momentum lies somewhere in the x-y plane since the z axis angular momentum eigenstate is zero. You can kind of think of it as a propeller where you don’t quite know which direction the axle is pointed.

Here’s a bisection of the same density map, along the long axis.

P-orbital probability density bisect

Edit 5-18-16

I keep finding interesting structures here. Since I was just sitting on all the necessary mathematical structures for hydrogen wave function 21-1 (no work needed, it was all in my notebook already), I simply plugged it into mathematica to see what the density plot would produce. The first image, where the box size was a little small, was perhaps the most striking of what I’ve seen thus far…

orbital21-1 squared

I knew basically that I was going to find a donut, but it’s oddly beautiful seen with the outsides peeled off. Here’s more of 21-1…

The donut turned out to be way more interesting than I thought. In this case, the angular momentum is pointing down the Z-axis since the Z-axis eigenstate is -1. This orbital shape is most similar qualitatively to the orbits depicted in the original Bohr atom model with an electron density that is known to be ‘circulating’ clockwise primarily within the donut. This particular state is almost the definition of a magnetic dipole.

A Spherical Tensor Problem

Since last I wrote about it, my continued sojourn through Sakurai has brought me back to spherical tensors, a topic I didn’t well understand when last I saw it. The problem in question is Sakurai 3.21. We will get to this problem shortly…

I’ve been thinking about how best to include math on this blog. The fact of the matter is that it isn’t easy to do very fast. It looks awful if I photograph pages from my notebook, but it takes forever if I use a word processor to make it nice and neat and presentable. I’ve tried a stylus in OneNote before, but I don’t very much like the feeling compared to working on paper.

After my tirade the other day about the Smith siblings, I’ve been thinking again about everything I wanted this blog to be. It isn’t hard to find superficial level explanations of most of physics, but I also don’t want this to read like a textbook. If Willow Smith hosts ‘underground quantum mechanics teachings,’ I actually honestly envisioned this effort on my part as a sort of underground teaching –regardless of the nonexistent audience. What better way to put it. I didn’t want to put in pure math, at least not quite; I wanted to present here what happens in my head while I’m working with the math. How exactly do you do that?

Here’s an image of the mythological notebook where all my practicing and playing takes place:

4-16-16 Notebook image

I’ve never been neat and pretty while working with problems, but all that scratching doesn’t look like scratching to me while I’m working with it. It’s almost indescribable. I could shovel metaphors on top of it or take pictures of beautiful things and call that other thing ‘what I see.’ But there isn’t anything like it. If you’ve spent time on it yourself, maybe you know. It’s addictive. It’s conceptual tourism in the purest form, standing on the edge of the Grand Canyon looking out, then climbing down inside, feeling the crags of stone on my fingertips as I pass down toward where the river flows. It’s tourism in a way, going to a place that isn’t a place, not necessarily pushing back the frontiers since people have been there before, but climbing to the top of a mountain that nobody ever just visits in daily life. You can’t simply read it and you don’t just walk there.

The pages pictured above are of my efforts to derive a formula from Schwinger’s harmonic oscillator representation to produce the rotation matrices for any value of angular momentum. Writing the words will mean nothing to practically anybody who reads this. But what do I do to make it genuine? How do you create a travelogue for a landscape of mathematical ideas?

For the moment, at least, I hope you will forgive me. I’m going to use images of my notebook in all its messy glory.

Where we started in this post was mentioning Spherical Tensors. I hit this topic again while considering Sakurai problem 3.21. ‘Tensor’ is admittedly a very cool word. In Brandon Sanderson’s “Steelheart,” Tensors are a special tool that lets people use magic power to scissor through solid material.

For all the coolness of the word, what are Tensors really?

In the most general sense, a tensor is a sort of container. Here is a very simple tensor:

Ai
This construct holds things. Computer programmers call them Arrays sometimes, but here it’s just a very simple container. The subscript ‘i’ could stand for anything. If you make ‘i’ be 1,2 or 3, this tensor can contain three things. I could make it be a vector in 3 dimensions, describing something as simple as position.

In the problem I’m going to present, you have to think twice about what ‘tensor’ means in order to drag out the idea of a ‘spherical’ tensor.

Here is Sakurai 3.21 as written in my notebook:

Sakurai 3.21

Omitting the |j,m> ket at the bottom, Sakurai 3.21 is innocuous enough. You're just asked to evaluate two sums between parts a.) and b.). No problem, right? Just count some stuff and you're done! Trick is, what the hell are you trying to count?

Contrary to my using the symbol ‘Ai’ above to sneak in the meaning of ‘love,’ the dj here do not play in dance clubs, even if they are spinning like a turntable! These ‘d’s are symbols for a rotation operation which can transform a state as if rotating it by an angle (here angle β). Each ‘d’ transforms a state with a particular z-axis angular momentum, labeled by ‘m’, to a second state with a different label, where the angle between the two states is a rotation of β around the y-axis. Get all that? You’ve got a spinning object and you want to alter the axis of the spin by an angle β. Literally you’re spinning a spin! That’s a headache, I know.

Within quantum mechanics, you can know only certain things about the rotation of an object, but not really know others. This is captured in the label ‘j’. ‘j’ describes the total angular momentum contained in an object; literally how much it’s spinning. This is distinct from ‘m’ which describes the rotation around a particular axis. Together, ‘m’ and ‘j’ encapsulate all of the knowable rotational qualities of our quantum mechanical object, where you can know it’s rotating a certain amount and that some of that rotation is around a particular axis. The rest of the rotation is in some unknowable combination not along the axis of choice. This whole set of statements is good for both an object spinning and for an object revolving around another object, like a planet in orbit.

The weird trick that quantum mechanics plays is that only a certain number of rotational states are allowed for a particular state of total angular momentum; the more total angular momentum you have, the larger the library of rotational states you can select from. In the sum in the problem, you're including all the possible states of z-axis angular momentum allowed by the particular total angular momentum. Simultaneous rotation around the x and y-axes is knowable only to an extent depending on the magnitude of rotation about the z-axis (so says the Heisenberg Uncertainty Principle, in this case, though the problem doesn't require that…).

Here is an example of how you ‘rotate a state’ in quantum mechanics. I expect that only readers familiar with the math will truly be able to follow, but it’s a straightforward application of an operator to carry out an operation at a symbolic level:

Rotating a state 4-20-16

All this shows is that a rotation operator ‘R’ works on one state to produce another. By the end of the derivation, operator R has been converted into a ‘dj’ like what I mentioned above. Each dj is a function of m and m” in a set of elements which can be written as a 2-dimensional matrix… dj is literally mapping the probability amplitude at m onto m”, which can be considered how you route one element of a 2-dimensional matrix into another based upon the operation of rotating the state. In this case, the example starts out without a representation, but ultimately shifts over to representing in a space of ‘j’ and ‘m.’ The final state can be regarded as a superposition of all the states in the set, as defined by the sum. In all of this, dj can be regarded as a tensor with three indices, j, m and m”, making it a 3-dimensional entity which  contains a variable number of elements depending on each level of j: dj is only the face-plate of that tensor, coughing up whatever is stored at the element indexed by a particular j, m and m”.

In problem 3.21, what you’re counting up is a series of objects that transform other objects as paired with whatever z-axis angular momentum they represent within the total angular momentum contained by the system. This collection of objects is closed, meaning that you can only transform among the objects in the set. If there were no weighting factor in the sum, the sum of these squared objects actually goes to ‘1’… the ‘d’ symbols become probability amplitudes when they’re squared and, for a closed set, you must have 100% probability of staying within that set. The headache in evaluating this sum, then, is dealing with the weighting factor, which is different for each element in the sum, particularly for whatever state they are ultimately supposed to rotate to.
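You can see the closure property with actual numbers. Here's a Python sketch using the j = 1 Wigner small-d matrix in its standard closed form (rows and columns ordered m = +1, 0, −1; the matrix itself is the textbook convention, not something derived in this post):

```python
import math

def d1(beta):
    """Wigner small-d matrix d^1_{m'm}(beta) for j = 1,
    rows/columns ordered m = +1, 0, -1 (standard convention)."""
    c, s = math.cos(beta), math.sin(beta)
    r2 = math.sqrt(2.0)
    return [
        [(1 + c) / 2, -s / r2, (1 - c) / 2],
        [s / r2,       c,      -s / r2    ],
        [(1 - c) / 2,  s / r2, (1 + c) / 2],
    ]

# Closure: the set of m states only transforms among itself, so for any
# beta, sum_m |d^1_{m'm}|^2 = 1 for each row m'.
beta = 0.7
for row in d1(beta):
    print(round(sum(x * x for x in row), 12))  # 1.0 each time
```

Squared, each row of 'd's really does behave as a set of probabilities summing to 100%, which is exactly why the unweighted sum is trivial and only the weighting factor makes the problem hard.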

My initial idea looking at this problem was that if I can calculate each ‘d,’ then I can just work the sum directly. Just square each ‘d’ and multiply it by the weighting factor and voila! There was no thought in my head about spherical tensors, despite the overwhelming weight of that hint following part b.)

Naively, this approach could work. You just need some way of calculating a generalized ‘d.’ This can be done using Schwinger’s simple harmonic oscillator model. All you need to do is rotate the double harmonic oscillator and then pick out the factor that appears in place of ‘d’ in the appropriate sum –an example of which can be seen in the rotation transformation above. Not hard, right?

A month ago, I would have agreed with you. I had spent only a little bit of time learning how the Schwinger model works and I thought, "Well, solve 'd' using the Schwinger method and then boom, we're golden." It didn't seem too bad, except that days eventually converted themselves into weeks before I had a good enough understanding of the method to be able to crank out a 'd.' You can see one of my pages of work on this near the top of this post… there were factorials and sums everywhere. By the time I had it completely figured out (which I really don't regret, by the way) I had actually pretty much forgotten why I went to all that trouble in the first place. My thesis here is that, yes, you can solve for each and every 'd' you may ever want using Schwinger's method. On the other hand, when I came back to look at Sakurai 3.21, I realized that if I were to try to horsewhip a version of 'd' from the Schwinger method into that sum, I was probably never going to solve the problem. The formula to derive each 'd' is itself a big sum with a large number of working parts, the square of which would turn into a _really_ large number of moving parts. I know I'm not a computer and trying to go that way is begging for trouble.

It was a bit of a letdown when I realized that I was on the wrong track. As a lesson, that happens to everyone: almost nobody gets it first shot. This should be an abstract lesson to many people: what you think is a truth isn’t always a truth, or necessarily the simplest path to a truth. I still expect that if you were a horrific glutton for punishment, you could work the problem the way I started out trying, but you would get old in the attempt.

I spent some introspective time reading Chapter 3 of Sakurai, looking at simple methods for obtaining the necessary ‘d’ matrix elements. Most of these can’t be used in the context of problem 3.21 because they are too specific. With half-integer j or j of 1, you can directly calculate rotation matrices, except that this is not a general solution. I had a feeling that you could suck the weighting factor of ‘m’ back into the square of the ‘d’ and use an eigenvalue equation to change the ‘m’ into the Jz operator, but I wasn’t completely sure what to do with it if I did. About a week ago, I started to look a bit more closely at the section outlining operator transformations using spherical tensor formalism. I had a feeling I could make something work in these new ideas, especially following that heavy-handed hint in part b.)

The spherical tensor formalism is very much like the Heisenberg picture; it enables one to rotate an operator using the same sorts of machineries that one might use to rotate a state. This, it turns out, is the necessary logical leap required by the problem. To be honest, I didn’t actually understand this while I was reading the math and trying to work through it. I only really understood very recently. Rotating the state is not the same as rotating operators. The math posted above is the rotation of a state.

As it turns out, with an operator written in a cartesian form, different parts will rotate differently from one another; you can’t just apply one rotation to the whole thing and expect the same operator back.

This becomes challenging because the angular momentum operators are usually written in a cartesian form and because operator transformations in quantum mechanics are usually handled as unitary transformations. Constructing a unitary transformation requires careful analysis of what can rotate and remain intact.

Here is a derivation which shows rotation converted into a unitary operation:

Rotation as a unitary transform 4-21-16

In this case, the rotation matrix ‘d’ has been replaced by a more general form. The script ‘D’ is generally used to represent a transformation involving all three Euler angles, whereas the original ‘d’ was a rotation only around the y-axis. In principle, this transformation can work for any reorientation. In this derivation, you start with a spherical harmonic and show, if you create a representation of something else with that spherical harmonic, that you can rotate that other object within the Ylm. In this derivation, the object being rotated is just a vector used to indicate direction, called ‘n’. The spherical harmonics have this incredible quality in that they are ready-made to describe spherical, angle-space objects and that they rotate naturally without distortion… if you want to rotate anything, writing it as an object which transforms like a spherical harmonic is definitely the best way to go.

In the last line of that derivation, the spherical harmonic containing the direction vector has been replaced with a construct labeled simply as ‘T’. T is a spherical tensor. This object contains whatever you put into it and resides in the description space of the spherical harmonics. It rotates like a spherical harmonic.

The last line of algebra contains another ramification that I think is interesting. In this math, for this particular case, the unitary transform of D*Object*D reduces to a simple linear transform D*Object.

This brings me roughly full circle: I’m back at spherical tensors.

A spherical tensor is a multi-dimensional object which sits in a space which uses the spherical harmonics as a descriptive basis set. Each index of the spherical tensor transforms like the Ylm that resides at that index location. In some ways, this looks very like a state function in spherical harmonic space, but it’s different since the object being represented is an operator native to that space rather than a state function. Operators and state functions must be treated differently in quantum mechanics because they are different. A state function is a nascent form of a probability distribution while an operator is an entity that can be used to manipulate that distribution in eigenvalue equations.

This may seem a non-sequitur, but I’ve just introduced you to a form of trans-dimensional travel. I’ve just shown you the gap for moving between a space involving the dimensions of length, width and depth into a space which replaces those descriptive commodities with angles. A being living in spherical harmonic space is a being constructed directly out of turns and rotations, containing nothing that we can directly witness as a physical volume. You will never find something so patently crazy in the best science fiction! Quantum mechanics is replete with real expressions for moving from one space to another.

The next great challenge of Sakurai 3.21 is learning how to convert a cartesian operator construct into a spherical one. You can put whatever you want into a spherical tensor, but this means figuring out how to transfer the meaning of the cartesian expression into the spherical expression. As far as I currently understand it, the operator can’t be directly applied while residing within the spherical tensor form –I screwed this problem up a number of times before I understood that. To make the problem work, you have to convert from cartesian objects into the spherical object, perform the rotation, then convert backward into the cartesian object in order to come up with the final expression. The spherical tensor forms of the operators end up being linear combinations of the cartesian forms.

Here is the template for using spherical harmonics to guide conversion of cartesian operators into spherical tensor components:

Conversion to spherical tensor 4-21-16

In this case, I'm converting the angular momentum operators into a spherical tensor. This requires only the rank 1 spherical harmonics. The spherical tensor of rank one is a three-dimensional object with indices 1, 0 and -1, which relate to the cartesian components of the angular momentum vector Jz, Jx and Jy as shown. For position, cosine = z/radius, and the x and y conversions follow from that, given the relations above. Angular momentum needs no spatial component because of normalization in length, so z-axis angular momentum just converts directly into cosine.

As you can see, all the tensor does here is store things. In this case, the geometry of conversion between the spaces stores these things in such a way that they can be rotated with no effort.
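Here is a sketch of that conversion for j = 1, written out in Python with explicit matrices (ħ = 1). The sign convention J_{±1} = ∓(Jx ± i·Jy)/√2 with J_0 = Jz is the one I believe is standard, but conventions vary between texts, so treat it as an assumption:

```python
import math

# The j = 1 matrix representations of the cartesian angular momentum
# operators (hbar = 1), in the basis m = +1, 0, -1:
s = 1 / math.sqrt(2)
Jx = [[0, s, 0], [s, 0, s], [0, s, 0]]
Jy = [[0, -1j * s, 0], [1j * s, 0, -1j * s], [0, 1j * s, 0]]
Jz = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

def combine(a, b, ca, cb):
    """Linear combination ca*A + cb*B of two 3x3 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(3)] for i in range(3)]

# Rank-1 spherical tensor components: linear combinations of the
# cartesian operators, as described in the text above.
J_plus1 = combine(Jx, Jy, -s, -1j * s)   # -(Jx + i*Jy)/sqrt(2)
J_minus1 = combine(Jx, Jy, s, -1j * s)   #  (Jx - i*Jy)/sqrt(2)

# Up to the 1/sqrt(2) normalization and a sign, these are exactly the
# raising and lowering ladder operators:
assert abs(J_plus1[0][1] + 1) < 1e-9 and abs(J_plus1[1][2] + 1) < 1e-9
assert abs(J_minus1[1][0] - 1) < 1e-9 and abs(J_minus1[2][1] - 1) < 1e-9
print("spherical components built from the cartesian matrices")
```

Nothing mysterious is stored in the tensor: each index just holds one of these linear combinations, packaged so that the whole object rotates like a spherical harmonic.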

Since I’ve slogged through the grist of the ideas needed to solve Sakurai 3.21, I can turn now to how I solved it. For all the rotation stuff that I’ve been talking about, there is one important, very easy technique for rotating spherical harmonics which is relevant to this particular problem. If you are rotating an m=0 state, of which there is only one in every rank of total angular momentum, the dj element is a spherical harmonic. No crazy Schwinger formulas, just bang, use the spherical harmonic. Further, both sections of problem 3.21 involve converting m into Jz and Jz converts to the m=0 element of the spherical tensor with nothing but a normalization (to see this, look at the conversion rules that I included above). This means that the unitary transform of Jz can be mediated either by rotating from any state into the m=0 state, or rotating m=0 toward any state, which lets the dj be a spherical harmonic in either direction.

Now, since part a.) is easy, here’s the solution to problem 3.21 part b.)

Sakurai 3.21 b1

I apologize here that the clarity of the images is not the best; the website downgraded the resolution. I included a restatement of problem 3.21 part b.) in the first line here and then began by expanding the absolute value and pulling the eigenvalue of m back into the expression so that I could recast it as operator Jz using an eigenvalue equation to give me Jz^2. Jz must then be manipulated to produce the spherical tensor, the process expanded below.

Sakurai 3.21 b2

Where I say “three meaningful terms,” I’m looking ahead to an outcome further along in the problem in order to avoid writing 6 extra terms from the multiplication that I don’t ultimately need. I do write my math exhaustively, but in this particular case, I know that any term that isn’t J0*J0, J1*J-1 or J-1*J1 will cancel out after the J+ and J- ladder operators have had their way. For anyone versed, J1 is directly the ladder operator J+ and J-1 is J-. If the m value doesn’t end up back where it started, with J+J- or J-J+ combinations, when you take the resulting expectation value, anything like <m|m+1> is zero. Knowing this a page in advance, I simply omitted writing all that math. I then worked out the two unique coefficients that show up in the sum of only three elements…

Sakurai 3.21 b3

In the middle of this last page, I converted the operators Jx and Jy into a combination of J^2 and Jz. The ladder operators composed of Jx and Jy served to strain out two-thirds of the mathematical extra, and I more or less omitted writing all of that from the middle of the second page. Once you’re back in the Cartesian form, after the rotation has been made (which occurs once the sum has been expanded), there is no need to stay in terms of Jx and Jy, because the system can’t be simultaneously expressed as eigenfunctions of Jx, Jy and Jz: you can have simultaneous eigenfunctions of only the total angular momentum and one axis, typically chosen to be the z-axis. By converting to J^2 and Jz only, I get the option to use eigenvalues instead of operators, which is almost always where you want to end up in a quantum problem. This is why I started writing |m⟩ as |j,m⟩… most of the time in this problem I only care about tracking the m values, but I understand from the very beginning that I have a j value hiding in there that I can use when I choose.

One thing that eases your burden considerably in this problem is understanding how j compartmentalizes m values. As I mentioned before, each rank of j contains a small collection of m-value eigenfunctions which transform only amongst themselves. Even though the problem asks for a solution general to every j, by using transformations of the angular momentum operator, which is a rank-1 operator, I needed only the j=1 spherical harmonics to represent it. This lets me work in a small space while remaining general across all values of j. It is also part of what makes the Schwinger approach to this problem so unwieldy: by trying to represent d^j for every j, I basically swelled the number of terms I was working with to infinity. You can work with situations like this, but it just gets too big too quickly in this case – I’m just not that smart.

It’s also possible to work while omitting the normalization coefficients needed in the spherical harmonics, but do this with caution: it can be hard to tell which part of a coefficient is dedicated to flattening multiplicity and which is canceling out of the solid angle. In cases where terms are getting mixed, I hold onto the normalization so that I know down the line whether or not all my 2s and -1s are going to turn out. I always screw things like this up, so I do my best to give myself whatever tools I can for figuring out where I’ve messed up the arithmetic. I found an answer to this problem online which leaves Cartesian indices on the transformations throughout the problem and completely omits the normalization… technically, this sort of solution is wrong and bypasses the mechanics. You can’t transform a Cartesian tensor like a spherical tensor; skipping the proper indices misses the math. How the guy hammered out the right answer from doing it so poorly makes no sense to me.

This problem took a considerable amount of work and thought. It may not show in the writing, but I had been thinking about it for weeks. One great difference between doing this for class and doing it on my own is that there is no time limit on completing it except the admission of defeat. I never made that admission, and I gradually became clearer and clearer on what to do. I had been thinking about it on and off so hard that it was costing me sleep and leaving me foggy-headed on other daily tasks. It takes real work. Eventually, there came a morning in the shower where I just saw it: clear as day, the solution unfolded for me. I do my best thinking in the morning while taking my shower. Under some circumstances, the stress of this process can be soul-breaking. It can also be profoundly illuminating. Seeing through it can be addictive… but you must not give up when the going gets tough.

Schwinging the Pendulum

Masses on springs get a lot of use in physics; you see them early in the first year of introductory classical mechanics with Hooke’s law, and they come back over and over again after that. Physicists are fond of saying that basically everything in reality reduces to a mass on a spring if you squint at it the right way. I chose the tortured title for this post thinking about how a pendulum bob can be described as a mass on a spring at small angles of deflection, and how Schwinger’s method, which is central to the quantum mechanics problem I’m covering, is almost like a pendulum swinging in an ellipse at a small angle.

If you haven’t guessed, this post is back to Sakurai problem 3.19, finally – of which I’ve spoken previously. What dalliance in quantum mechanics would be complete without spending time on Schwinger’s angular momentum method? Julian Schwinger should be familiar to anyone with a background in 20th century physics, since he won the Nobel Prize at the same time as Richard Feynman. His Wikipedia entry shows how his approaches contrasted with those of Feynman, but he was certainly no less brilliant.

As a quick refresher, the problem is asking about two of Schwinger’s operators, K+ and K-. “What do these two operators do?”

The quantum mechanical version of a mass on a spring is the ‘quantum simple harmonic oscillator.’ This system differs from the basic ‘mass on a spring’ model in that you really can’t think of it as something moving ‘back and forth’ the way a pendulum bob does. In the quantum version, it is most accurate to say that the bob is distributed along the range of its swing, and that the more energy it contains, the more likely it is to be found at the extreme positions of the swing, where the spring is highly compressed or highly extended. The swinging of the pendulum bob can be broken down into a spectrum of energy eigenstates, and the motion described as some combination of these states. Eigenstates are, of course, the bread and butter of quantum mechanics and correspond to stationary probability waves which are not overall that different from vibrations in a guitar string – even though it would be very, very wrong to draw too close an analogy here in absence of the math. A probability wave is literally existential, not ‘vibrational’ like a sound wave.

An important structure in this version of the quantum mass-on-a-spring is the existence of the so-called ‘creation’ and ‘annihilation’ operators (a† and a), which are central to Schwinger’s method. These operators work together as a pair where one undoes the action of the other. Together they let the skilled technician transit the energy eigenstate spectrum of the mass-on-a-spring, the creation operator stepping from one state to the next higher-energy state and the annihilation operator doing the opposite, stepping down in energy between successive states. They work rather like moving your finger along the frets of a guitar string, the annihilator moving toward the tuning pegs and the creator moving down the neck toward the body of the instrument. If you get too close to the tuning pegs, the annihilator can actually cause you to fall off the end of the instrument: applied to the ground state, it returns zero. Not kidding, really; that’s part of why it earned the name ‘annihilator.’ The creation operator, on the other hand, can get arbitrarily close to the body of the instrument and still find notes, provided you continue to have some way of plucking the string.
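The fret-walking analogy can be made concrete in a few lines of code. This is a minimal sketch of my own (the function names and the basis truncation are illustrative choices, not anything from Sakurai), applying the standard rules a†|n⟩ = √(n+1)|n+1⟩ and a|n⟩ = √n|n−1⟩ to states written as amplitude lists:

```python
from math import sqrt

N_MAX = 5  # truncate the infinite ladder of states for illustration

def a_dagger(state):
    """Creation operator on a truncated oscillator basis:
    a†|n> = sqrt(n + 1) |n + 1>.
    A 'state' is a list of amplitudes over |0>, |1>, ..., |N_MAX - 1>."""
    out = [0.0] * N_MAX
    for n in range(N_MAX - 1):
        out[n + 1] = sqrt(n + 1) * state[n]
    return out

def a(state):
    """Annihilation operator: a|n> = sqrt(n) |n - 1>."""
    out = [0.0] * N_MAX
    for n in range(1, N_MAX):
        out[n - 1] = sqrt(n) * state[n]
    return out

ground = [1.0, 0.0, 0.0, 0.0, 0.0]  # the ground state |0>

# Annihilating the ground state "falls off the end of the instrument":
print(a(ground))         # the zero vector: a|0> = 0
# The creator steps up one fret: a†|0> = |1>
print(a_dagger(ground))
```

Running it shows the annihilator turning |0⟩ into the zero vector while the creator climbs the ladder one state at a time.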

Now then, Schwinger’s method takes this collection of ideas and turns it on its head to produce what can only be described as a stroke of genius. This idea follows from the basic observation that if you have an object moving at a constant speed around a circular track, that you can parameterize it using two mass-spring systems at right angles to each other in the plane of the track and have a complete description of the circular motion. Literally, if you’re moving in a circle in the x-y plane, the equation describing the x-position is a harmonic oscillator, as is the one describing the y-position. Schwinger’s brilliance was simply to say, “So, why don’t we do this in quantum mechanics?”
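The claim that each coordinate of uniform circular motion is itself a harmonic oscillator is easy to verify numerically. This sketch (entirely my own illustration; the finite-difference helper is just a sanity-check device) confirms that x(t) = cos(ωt) and y(t) = sin(ωt) each satisfy f″ = −ω²f:

```python
import math

def second_derivative(f, t, h=1e-4):
    """Central-difference second derivative; good enough for a sanity check."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

omega = 2.0
x = lambda t: math.cos(omega * t)  # x-coordinate of uniform circular motion
y = lambda t: math.sin(omega * t)  # y-coordinate of the same motion

# Each coordinate, taken alone, obeys the mass-on-a-spring equation
# f'' = -omega**2 * f at every instant:
for t in (0.3, 1.1, 2.7):
    assert abs(second_derivative(x, t) + omega**2 * x(t)) < 1e-4
    assert abs(second_derivative(y, t) + omega**2 * y(t)) < 1e-4
print("each coordinate of circular motion is a simple harmonic oscillator")
```

Two perpendicular springs, one circle: that is the classical seed of Schwinger’s construction.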

Schwinger’s angular momentum method applies to quantum mechanical rotation: the spinning top. As a physical parameter, angular momentum tends to describe what might be thought of as the ‘strength of a rotation.’ Having more angular momentum tends to correspond to greater speed in rotation, but in quantum mechanics, it also tends to strongly influence how ‘spinning’ objects are distributed within whatever volumes they occupy by giving them distinct ‘orientations.’

Coming back to the Sakurai problem, which I’ve been orbiting at quite a distance, the operators K+ and K- manipulate a state of rotation. K+ is a pair of creation operators and K- a pair of annihilation operators. Mathematically, K+ increases the two-oscillator state by one unit of total angular momentum, while K- does the opposite. If you write K+ out as its two creation operators (a+†a-†), you can see it directly in the description of the general two-oscillator eigenstate:
Schwinger rotation 1

Here, n+ and n- are just the two numbers needed to find the address of any one eigenstate among an infinite number, where one mass-on-a-spring is labeled ‘+’ while the one perpendicular to it is labeled ‘-‘. The ket (state) on the right, |0,0>, is just the ground state, and some number of applications of a† elevates you to any eigenstate in the spectrum. The equation, as written, simply tells you where to put your finger on the guitar string(s) to produce any note you want; the Schwinger method is actually where to put your finger on two strings in order to produce a particular kind of rotation in two dimensions. Do you see K+ in this equation? If n+ and n- are equal, K+ is just the thing right in the middle!
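For readers who can’t make out the image, the general two-oscillator eigenstate in Schwinger’s construction has the standard form (my transcription):

```latex
|n_{+},\, n_{-}\rangle \;=\;
\frac{\left(a_{+}^{\dagger}\right)^{n_{+}}\left(a_{-}^{\dagger}\right)^{n_{-}}}
     {\sqrt{n_{+}!\;n_{-}!}}\;|0,0\rangle
```

The product a+†a-† sitting in the numerator is exactly K+.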

So, if you read my previous post (and survived far enough to read this), you’ll know that the Sakurai 3.19 problem was asking about matrix elements. Since ‘operators’ in quantum mechanics take states and convert them into other states, the structure of an operator can be captured in a matrix where each element tells how one eigenstate is referenced to another during whatever transformation that operator is supposed to mediate. You can write K+ and K- in such a way that you can tell how any one state is converted into any other state by the action of the operator.

This will almost certainly lose readers, but if you don’t actually like physics, you probably won’t like this blog anyway. As I worked problem 3.19, here are the forms I found for the matrix elements of the K+ and K- operators acting on the space of all harmonic oscillator eigenstates.



Solving the matrix element problem is actually quite simple and, as I worked the problem, I delayed this step until I had slogged exhaustively through the Schwinger method and was certain I knew what was up. To get the answer listed here, you take the forms for K+ and K- as presented in the previous blog post, act them on a particular ket |n’+,n’->, then sandwich with a particular bra <n+,n-|. This is like looking for an expectation value, but for only one element out of an entire matrix. The primed value of each ‘n’ is understood to be a different number from the unprimed form.

Talk of bra and ket will sound weird to anyone who has never encountered the Dirac ‘bracket’ notation, which denotes states as ‘bra’-‘ket’, where the ‘ket’ is a representation-free form of a particular quantum mechanical eigenstate and the ‘bra’ is its conjugate transpose. A matrix element is just a ‘bra’ sandwiched with a ‘ket’, and after the kets and bras are all gone, what’s left are Kronecker deltas that describe where a particular element is located in a matrix, since you only get non-zero elements where the indices of the deltas are equal.

This form can be a very handy alternative to the two-dimensional lists that every linear algebra student has learned to hate… in this case, the matrices are infinite in both dimensions and you could never actually write them out. With the delta notation, you only need to say which matrix elements are non-zero, reducing a matrix which can’t fit on this Earth to a single-line expression. What’s written above are products over two ket-spaces, where each factor acts on the series of eigenstates of one of the two harmonic oscillators that Schwinger used in his description of angular momentum.
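Since the images with my written forms lost resolution, here is a sketch of what the standard √(n+1) normalization gives for these matrix elements; the function names and layout are my own illustration, not the notation from the scan. The `if` conditions play the role of the Kronecker deltas:

```python
from math import sqrt

def K_plus_element(n_plus, n_minus, np_plus, np_minus):
    """<n+, n-| a+† a-† |n'+, n'->, using the usual sqrt(n + 1)
    normalization of the creation operator. Nonzero only when both
    oscillator quantum numbers step up by exactly one."""
    if n_plus == np_plus + 1 and n_minus == np_minus + 1:
        return sqrt((np_plus + 1) * (np_minus + 1))
    return 0.0

def K_minus_element(n_plus, n_minus, np_plus, np_minus):
    """<n+, n-| a+ a- |n'+, n'->: both quantum numbers step down by one."""
    if n_plus == np_plus - 1 and n_minus == np_minus - 1:
        return sqrt(np_plus * np_minus)
    return 0.0

# K+ takes |2, 3> only to |3, 4>, with amplitude sqrt(3 * 4):
print(K_plus_element(3, 4, 2, 3))   # sqrt(12)
# Any other bra gives zero -- that's the Kronecker deltas at work:
print(K_plus_element(3, 3, 2, 3))   # 0.0
```

Two delta conditions, one square-root coefficient: the entire infinite matrix in a single line, exactly as described above.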

It certainly may not look it and it definitely doesn’t sound like it, but this does all come down to rotational quantum mechanics.

Post script:

As an apology to any reader who may stray here, I’m still deciding exactly what the voice of this blog will be. My initial vision was to be as broadly friendly as possible, but I’ve ping-ponged back and forth on this. I expect that there will be articles involving the far more crowd-friendly subject of ‘physics and/or science in popular culture,’ but I also had a desire to create a space where I could store my practicing. A formal truth about education in general is that you don’t keep what you make no effort to keep, which means that if you don’t actively practice, things you care about can disappear out of your life forever… particularly in a subject as hard as physics. I think we live in a world where it seems like everybody wants to believe they have a high level understanding of everything without having actually invested sufficient effort to attain even a basic level understanding of anything. One thing I want here is to serve as an example of what it takes to maintain skill and maybe make people think twice about what’s needed to be good at an intellectual pursuit. I have a desire to one day read an article quoting someone like Jaden Smith or Terrence Howard where they say “I was in this class at school learning how to do physics and it was so cool; I finally understood this and I wish everybody did!” but I fear that this will never happen. I have notebooks that are literally crammed with my physics practicing and I wish I could open them to the world and convince people to stop being afraid of playing at math. In some ways, it’s like working a crossword puzzle and I think anybody who has invested deeply probably can do as well as me. I’m no genius, I’m just a walking testament to hard work and due diligence.

Angular momentum by harmonic oscillator

I’ve been thinking about Sakurai problem 3.19 for the past few days.

The problem reads:

19.) What is the physical significance of the operators

K+ ≡ a+†a-†        and       K- ≡ a+a-

in Schwinger’s Scheme for angular momentum? Give nonvanishing matrix elements of K±

I’m not sure I completely understand this problem yet. The ‘a’ operators are the creation (daggered) and annihilation (undaggered) operators for the simple harmonic oscillator. Applying these operators to simple harmonic oscillator eigenstates increases or decreases, respectively, the energy quantum number of the state, letting you move up or down the energy spectrum. In Schwinger’s scheme, two harmonic oscillators are put together to create an angular momentum eigenstate (thus proving that you really can create everything out of harmonic oscillators). These oscillators are represented by the ‘+’ and ‘-‘ subscripts on the ‘a’ operators. K+ and K- are almost but not quite the same as the ladder operators, which are defined in Schwinger’s scheme as a+†a- and a-†a+ – definitely subtly different. In Schwinger’s scheme, the two harmonic oscillators are independent eigenspaces that are coupled together, and applying a+†a- increases the ‘+’ state while decreasing the ‘-‘ state. Eventually, the ‘-‘ state hits its ground state and a further application gives zero. Going the other way with the second operator does the same thing, decreasing the ‘+’ state while increasing the ‘-‘ state until you hit the bottom of the ‘+’ spectrum and annihilate the combination with a zero. The effect is like the ladder operators walking across the ladder, either up or down in ‘m’ values, until you hit the end and kill the last state.
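For reference, the standard dictionary of Schwinger’s scheme (with N± = a±†a± the number operators) is:

```latex
J_{+} = \hbar\, a_{+}^{\dagger} a_{-}, \qquad
J_{-} = \hbar\, a_{-}^{\dagger} a_{+}, \qquad
J_{z} = \tfrac{\hbar}{2}\left(N_{+} - N_{-}\right),
\qquad
j = \tfrac{1}{2}\left(n_{+} + n_{-}\right), \quad
m = \tfrac{1}{2}\left(n_{+} - n_{-}\right)
```

The ladder operators trade one quantum between the two oscillators, which is why they shift m while leaving j alone.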

My conclusion about the K± operators is that they essentially increment or decrement the total angular momentum (the j-value) of the coupled state, which is actually really kind of wicked. I haven’t worked the math yet, but I think this is cool. This problem is also interesting to me because it gives a basic hint of the way QFT deals with the photon formalism: photons are loaded into eigenstates using operators like these.
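That conclusion can be checked directly against the identifications j = (n+ + n-)/2 and m = (n+ - n-)/2: K+ adds one quantum to each oscillator, so the sum goes up by two while the difference is untouched:

```latex
K_{+}\,|n_{+},\, n_{-}\rangle \;\propto\; |n_{+}+1,\; n_{-}+1\rangle
\quad\Longrightarrow\quad
j \;\to\; j+1, \qquad m \;\to\; m
```

K- runs the same logic downward, lowering j by one until one of the oscillators runs out of quanta.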

I’ve decided that Julian Schwinger was a very smart guy.