If you’ve read anything else in this blog, you’ll know I write frequently about my playing around with Quantum Mechanics. As a digression away from a natural system that is all about probabilities, an interesting little toy problem I decided to tackle is figuring out how the “win” probabilities are determined in the lottery game Powerball.
Powerball is actually quite intriguing to me. They have a website here which details by level all the winners across the whole country who have won a Powerball prize in any given drawing. You may have looked at this chart at some point while trying to figure out if your ticket won something useful. A part of what intrigues me about this chart is that it tells you in a given drawing exactly how much money was spent on Powerball and roughly how many people bought tickets. How does it tell you this? Because probability is an incredibly reliable gauge of behavior with big sample sizes. And, Powerball quite willingly lays all the numbers out for you to do their bookkeeping for them by telling you exactly how many people won… particularly at the high-probability-to-win levels which push into the regime of Gaussian statistics. For big samples, like millions of people buying Powerball tickets, where N=big, the errors on average values become relatively insignificant: the absolute error only grows as sqrt(N), so the relative error shrinks as 1/sqrt(N). And, the probabilities reveal what those average values are.
The game is doubly intriguing to me because of the psychological component that drives it. As the pot becomes big, people’s willingness to play becomes big even though the probabilities never change. It suddenly leaps into the national consciousness every time the size of the pot becomes big and people play more aggressively as if they had a greater chance of winning said money. It is true that somebody ultimately walks away with the big pot, but what’s the likelihood that somebody is you?
But, as a starter, what are the probabilities that you win anything when you buy a ticket? To understand this, it helps to know how the game is set up.
As everybody knows, powerball is one of these games where they draw a bunch of little balls printed with numbers out of a machine with a spinning basket and you, as the player, simply match the numbers on your ticket to the numbers on the balls. If your ticket matches all the numbers, you win big! And, as an incentive to make people feel like they’re getting something out of playing, the powerball company awards various combinations of matching numbers and adds in multipliers which increase the size of the award if you do get any sort of match. You might only match a number or two, but they reward you a couple bucks for your effort. If you really want, you can pick the numbers yourself, but most people simply grab random numbers spat out of a computer… not like I’m telling you anything you don’t already know at this point.
One of the interesting qualities of the game is that the probabilities of prizes are very easy to adjust. The whole apparatus stays the same; they just add or subtract balls from the basket. In powerball, as currently run, there are two baskets: the first basket contains 69 balls while the second contains 26. Five balls are drawn from the first basket while only one, the Powerball, is drawn from the second. There is actually an entire record available of how the game has been run in the past, how many balls were in either the first or second baskets and when balls were added or subtracted from each. As the game has crossed state lines and the number of players has grown, the number of balls has also steadily swelled. I think the choice in numbering has been pretty careful to make the smallest prize attainably easy to get while pushing the chances for the grand prize to grow enticingly larger and larger. Prizes are mainly regulated by the presence of the Powerball: if your ticket manages to match the Powerball and nothing else, you win a small prize, no matter what. Prizes get bigger as a larger number of the other five balls are matched on your ticket.
The probabilities at a low level work almost exactly as you would expect: if there are 26 balls in the powerball basket, at any given drawing, you have 1 chance in 26 of matching the powerball. This means that you have 1 chance in 26 of winning some prize as determined by the presence of the powerball. There are also prizes for matching three or more balls drawn from the main basket even without the powerball, which tends to push the probability of winning anything to a slightly higher frequency than 1 in 26.
For the number savvy this begins to reveal the economics of powerball: an assured win by these means requires you to spend, on average, $52. That’s 26 tickets at $2 each, where you are likely to have one that matches the powerball. Note, the prize for matching that number is $4. $52 spent to win back only $4 is a net loss of $48. But, this 26 ticket buy-in is actually hiding the fact that you have a small chance of matching some sequence of other numbers and obtaining a bigger prize… and it would certainly not be an economic loss if you matched the powerball and then the 5 other balls, yielding you a profit in the hundreds of millions of dollars (and this is usually what people tell themselves as they spend $2 for each number).
The probability to win the matched powerball prize only, that is to match just the powerball number, is actually somewhat worse than 1 in 26. The probability is attenuated by the requirement that you hit no matches on any other of the five possible numbers drawn.
Finding the actual probability is as follows: (1/26)*(64/69)*(63/68)*(62/67)*(61/66)*(60/65). If you multiply that out and invert it, you get 1 hit in 38.32 tries. The first number is, of course, the chances of hitting the powerball, while the other five are the chance of hitting numbers that aren’t picked… most of these probabilities are naturally quite close to 1, so you are likely to hit them, but they are probabilities that count toward hitting the powerball only.
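If you would rather let a computer do the multiplying, the product above can be sketched in a few lines of Python. The function name and its default arguments are my own choices for illustration; the only inputs are the 69, 5, and 26 from the game as described. Exact fractions are used so no rounding creeps in until the final print.

```python
from fractions import Fraction

def powerball_only_probability(main_balls=69, drawn=5, power_balls=26):
    """Chance of matching the Powerball while missing every main-basket ball."""
    p = Fraction(1, power_balls)  # 1/26: hit the Powerball
    for i in range(drawn):
        # 64/69, 63/68, ...: each of the five main numbers misses the drawn balls
        p *= Fraction(main_balls - drawn - i, main_balls - i)
    return p

p = powerball_only_probability()
print(f"1 hit in {float(1 / p):.2f} tries")
```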
This number may not be that interesting to you, but lots of people play the game and that means that the count of people hitting just the powerball is close to Gaussian. This is useful to a physicist because it reveals something about the structure of the Powerball playing audience on any given week: that site I gave tells you how many people won with only the powerball, meaning that by multiplying that number by 38.32, you know how many tickets were purchased prior to the drawing in question. For example, as of the August 12 2017 drawing, 1,176,672 numbers won the powerball-only prize, meaning that very nearly 38.32*1,176,672 numbers were purchased: ~45,090,071 numbers +/- about 42,000, including error. The error comes from the counting fluctuation in the number of winners, roughly sqrt(1,176,672) ≈ 1,085, scaled up by the same factor of 38.32 (notice that the error here is well below 1%).
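Here is a rough sketch of that estimate in Python. The winner count and the 38.32 are the figures quoted above. The error bar is an assumption on my part: I treat the winner count as a Poisson-like counting measurement, take its square root as the spread, and scale that up to tickets, rather than taking the square root of the ticket total. Under that assumption the uncertainty comes out in the tens of thousands of numbers, still far below 1%.

```python
import math

winners = 1_176_672     # powerball-only winners reported for the 8/12/2017 drawing
tries_per_win = 38.32   # 1 / (probability of the powerball-only prize)

tickets = winners * tries_per_win
# Assumption: the winner count fluctuates by about sqrt(winners) (Poisson-like),
# and that spread scales up to the ticket estimate by the same factor.
sigma = math.sqrt(winners) * tries_per_win

print(f"numbers sold ~ {tickets:,.0f} +/- {sigma:,.0f}")
print(f"relative error ~ {sigma / tickets:.2%}")
```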
How many people are playing? If people mostly purchase maybe two or three numbers, around 15-22 million people played. Of course, I’m not accounting for the slavering masses who went whole hog and dropped $20 on numbers; if everybody did this, 4.5 million people played… truly, I can’t really know people’s purchasing habits for certain, but I can with certainty say that only a couple tens of millions of people played.
The number there reveals quite clearly the economics of the game for the period between the 8/12 drawing and the one a couple days prior: $90 million was spent on tickets! This is really quite easy arithmetic since it’s just $2 times the number of ticket numbers sold. If you look at the total prize pay-out, also on that page I provided, $19.4 million was won. This means that the Powerball company kept ~$70 million made over about three days, of which some got dumped into the grand prize and some went to whatever overhead they keep (I hear at least some of that extra is supposed to go into public works and maybe some also ends up in the Godfather’s pocket). Lucrative business.
If you look at the prize payouts for the game, most of the lower level prizes pay off between $4 and $7. You can’t get a prize as large as $100 until you match at least 4 balls. Note, here, that the probability of matching 4 balls (including the powerball) is about 1 in 14,494. This means that to assure yourself a prize of $100, you have to spend ~$29,000. You might argue that in 14,494 tickets, you’ll win a couple smaller prizes ($4 prizes are 1 in 38, 1 in 91, and $7 prizes are 1 in 700 and 1 in 580) and maybe break even. Here’s the calculation for how much you’ll likely make for that buy-in: $4*(14,494*(1/38 + 1/91)) + $7*(14,494*(1/700 + 1/580))… I’ve rounded the probabilities a bit… =$2482.65. For $29,000 spent to assure a single $100 win, you can expect to win about $2,500 from lesser winnings, for a total loss of about $26,400. Notice, $4 back on $52 spent is about 8%, while $2,600 back on $29,000 spent is about 9%… the payoff does not improve at attainable levels! Granted, there’s a chance at a couple hundred million, but the probability of the bigger prize is still pretty well against you.
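The same bookkeeping as a short Python sketch, using the rounded odds quoted above. The variable names are mine, and the result is an expected (average) figure, not a guarantee.

```python
tickets = 14_494
# (prize in dollars, "1 in N" odds), rounded the same way as in the text
small_prizes = [(4, 38), (4, 91), (7, 700), (7, 580)]

expected = sum(prize * tickets / odds for prize, odds in small_prizes)
cost = 2 * tickets  # $2 per number

print(f"spent ${cost:,}; expected small-prize winnings ${expected:,.2f}")
print(f"return including the $100 prize: {(expected + 100) / cost:.1%}")
```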
Suppose you are a big spender and you managed to rake up $29,000 in cash to dump into tickets, how likely is it that you will win just the $1 million prize? That’s five matched balls excluding the powerball. The probability is 1 in 11,688,053. By pushing the numbers, your odds of this prize have become 14,500/11,688,053, or about 1 chance in 800. Your odds are substantially improved here, but 1 in 800 is still not a wonderful bet despite the fact that you assured yourself a fourth tier prize of $100! The grand prize is still a much harder bet with odds running at about 1 in 20,000, despite the amount you just dropped on it. Do you just happen to have $30,000 burning a hole in your pocket? Lucky you! Lots of people live on that salary for a year.
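A quick sketch of how those odds scale with the size of the buy-in. The helper name is hypothetical, and the simple division assumes every number you hold is different and that each one is an equally good shot at the prize, which is only an approximation for the $1 million tier.

```python
import math

def one_in(odds_per_ticket, tickets):
    """Approximate '1 in N' odds of a hit when holding `tickets` different numbers."""
    return odds_per_ticket / tickets

tickets = 14_494                        # roughly $29,000 worth of numbers
jackpot_odds = math.comb(69, 5) * 26    # every possible 5-ball + Powerball combination

print(f"$1 million prize: about 1 in {one_in(11_688_053, tickets):,.0f}")
print(f"grand prize:      about 1 in {one_in(jackpot_odds, tickets):,.0f}")
```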
Most of this is simple arithmetic and I’ve been bandying about probabilities gleaned from the Powerball website. If you’re as curious about it as me, you might be wondering exactly how all those probabilities were calculated. I gave an example above of the mechanical calculation of the lowest level probability, but I also went and figured out a pair of formulae that calculate any of the powerball prize probabilities. It reminded me a bit of stat mech…
I’ve colored the main equations and annotated the parts to make them a little clearer. The final relation just shows how you can see the number of tries needed in order to hit one success, given a probability as calculated with the other two equations. The first equation differs from the second in that it refers to probabilities where you have matched numbers without managing to match the powerball, while the second is the complement, where you match numbers having hit the powerball. Between these two equations, you can calculate all the probabilities for the powerball prizes. Since probabilities were always hard for me, I’ll try to explain the parts of these equations. If you’re not familiar with the factorial operation, this is what is denoted by the exclamation point “!” and it denotes a product string counting up from one to the number of the factorial… for example 5! means 1x2x3x4x5. The special case 0! should be read as 1. The first part, in blue, is the probability relating to either hitting or missing the powerball, where K = 26, the number of balls in the powerball basket. The second part (purple) is the multiplicity and tells you how many ways that you can draw a certain number of matches (Y) to fill a number of open slots (X), while drawing a number of mismatches (Z) in the process, where X=Y+Z. In powerball, you draw five balls, so X=5 and Y is the number of matches (anywhere from 0 to 5), while Z is the number of misses. Multiplicity shows up in stat mech and is intimately related to entropy. The totals drawn (green) is perhaps mislabeled… here I’m referring to the number of possible choices in the main basket, N=69, and the number of those that will not be drawn M = N – X, or 64. I should probably have called it “Main basket balls” or something. The last two parts determine the probabilities related to the given number of hits (Y) (orange) and the given number of misses (Z) (red) and I have applied the product operator to spiffy up the notation.
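In case the colored figure does not come through, the pieces described above can be assembled as follows. This is a reconstruction from the description (blue, purple, orange, and red parts in that order, with the green N and M appearing inside the products), so the exact layout and index names are an assumption rather than a copy of the original.

```latex
\begin{aligned}
P_{\text{no PB}}(Y) &= \frac{K-1}{K}\;\frac{X!}{Y!\,Z!}\;
  \prod_{n=0}^{Y-1}\frac{X-n}{N-n}\;
  \prod_{m=0}^{Z-1}\frac{M-m}{N-Y-m} \\[4pt]
P_{\text{PB}}(Y) &= \frac{1}{K}\;\frac{X!}{Y!\,Z!}\;
  \prod_{n=0}^{Y-1}\frac{X-n}{N-n}\;
  \prod_{m=0}^{Z-1}\frac{M-m}{N-Y-m} \\[4pt]
\text{tries per win} &= \frac{1}{P}
\end{aligned}
```

Here K = 26, N = 69, X = 5, M = N − X = 64, and Z = X − Y. As a check, setting Y = 0 in the second line gives (1/26)(64/69)(63/68)(62/67)(61/66)(60/65), the powerball-only product worked out earlier.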
The product operator is another iterated operation, much like the summation operator, and means that you repeatedly multiply successive values, much like a factorial, but where the value you are multiplying is produced from a particular range and given a set form. In these, the small script m and n start at zero (my bad, this should be under the Pi) and iterate up to the number up top (Y – 1 or Z – 1), so they always stay just less than Y or Z and never equal them. At the extreme cases of either all hits or all misses, the relevant product operator (either Miss or Hit respectively) must be set equal to one in order to not count it.
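For anyone who wants to play with the numbers, here is a minimal Python sketch of the two formulas as I have described them: a Powerball factor, a multiplicity, a product over hits, and a product over misses. The function and variable names are my own. The loops simply do nothing when there are no hits or no misses, which is the "set equal to one" rule for the extreme cases.

```python
from fractions import Fraction
from math import factorial

K, N, X = 26, 69, 5   # Powerball basket, main basket, main balls drawn
M = N - X             # main-basket balls that are not drawn

def prize_probability(hits, powerball):
    """Probability of matching `hits` main balls, with or without the Powerball."""
    misses = X - hits
    p = Fraction(1, K) if powerball else Fraction(K - 1, K)          # Powerball part
    p *= Fraction(factorial(X), factorial(hits) * factorial(misses))  # multiplicity
    for n in range(hits):      # hit product; empty (equal to 1) when hits == 0
        p *= Fraction(X - n, N - n)
    for m in range(misses):    # miss product; empty (equal to 1) when misses == 0
        p *= Fraction(M - m, N - hits - m)
    return p

for hits in range(X, -1, -1):
    for pb in (True, False):
        tries = float(1 / prize_probability(hits, pb))
        print(hits, "+PB" if pb else "   ", f"1 in {tries:,.2f}")
```

The rows with no prize attached (zero, one, or two main balls and no Powerball) are printed too, which makes it easy to confirm that all twelve probabilities add up to one.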
This is one of those rare situations where the American public does a probability experiment with the values all well recorded where it’s possible to see the outcomes. How hard is it to win the grand prize? Well, the odds are one in 292 million. Consider that the population of the United States is 323 million. That means that if everybody in the United States bought one powerball number, about one person would win.
Thanks to the power of the media, everybody has the opportunity to know that somebody won. Or not. That this person exists, nobody wants to doubt, but consider that the odds of winning are so scant that you not only won’t win, but you pretty likely will never meet anyone who did. Sort of surreal… everything is above board, you would think, but the rarity is so rare that there’s no assurance that it ever actually happens. You can suppose that maybe it does happen because people do win those dinky $4 prizes, but maybe this is just a red herring and nobody really actually wins! Those winner testimonials could be from actors!
Yeah, I’m not much of a conspiracy theorist, but it is true that a founding tenet of the idea of a ‘limit’ in math is that 99.99999% is effectively 100%. Going to the limit where the discrepancy is so small as to be infinitesimal is what calculus is all about. It is fair to say that it very nearly never happens! Everybody wants to be the one who beats the odds, which is why Powerball tickets are sold, but the extraordinarily vast majority never will win anything useful… I say “useful” because winning $4 or $7 is always a net loss. You have to win one of the top three prizes for it to be anywhere near worth anything, which you likely never will.
One final fairly interesting feature of the probability is that you can make some rough predictions about how frequently the grand prize is won based on how frequently the first prize is won. First prize is matching all five of the balls, but not the powerball. This frequency is about once per 12 million numbers, which is about 25 times more likely than all 5 plus the Powerball. In the report on winnings, a typical frequency is about 2 to 3 winners per drawing. About 1 time in 26 a person with all five manages to get the powerball too, so, with two drawings per week and about 2.5 first prize winners per drawing, that’s five winners per week… which implies that the grand prize should be won at a frequency of about once every five to six weeks, every month and a half or so. The average here will have a very large standard deviation because the number of winners is small, meaning that the error is an appreciable portion of the measurement, which is why there is a great deal of variation in the period between times when the grand prize is won. The incidence becomes much more Poissonian and stochastic, which allows some prizes to get quite big compared to others and causes their values to disperse across a fairly broad range. Uncertainty tends to dominate, making the game a bit more exciting.
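The back-of-the-envelope estimate above, written out as a small Python sketch. All three inputs are the rough figures from the paragraph, not measured values, and the variable names are mine.

```python
match5_per_drawing = 2.5   # typical first prize winners per drawing (rough guess)
drawings_per_week = 2
match5_per_jackpot = 26    # round figure: about 1 in 26 also hits the powerball

match5_per_week = match5_per_drawing * drawings_per_week
weeks_between_jackpots = match5_per_jackpot / match5_per_week

print(f"about one grand prize every {weeks_between_jackpots:.1f} weeks")
```

If wins arrived at a steady random rate like this, the gap between them would scatter by about as much as its own average, which fits the large variation described above. Ticket sales are not steady, of course, so treat this as a rough guide only.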
While the grand prize is small, the number of people winning the first prize in a given week is small (maybe none or one), but this number grows in proportion to the size of the grand prize (maybe 5 or 6 or as high as 9). When the prize grows large enough to catch the public consciousness, the likelihood that somebody will win goes up simply because more people are playing it and this can be witnessed in the fluctuating frequency of the wins of lower level prizes. It breathes around the pulse of maybe 200 million dollars, lubbing at 40 million (maybe 0 to 1 person winning the first prize) and dubbing at 250 million (with 5 people or more winning the first prize).
Quite a story is told if you’re boring and as easily amused as me.
In my opinion, if you do feel inclined to play the game, be aware that when I say you probably won’t win, I mean that the numbers are so strongly against you that you do not appreciably improve your odds by throwing down $100 or even $1,000. The little $4 wins do happen, but they never pay and $1,000 spent will likely not get you more than $100 in total of winnings. It might as well be a voluntary tax. Cherish the dream your $2 buys, but do not stake your well-being on it. There’s nothing wrong with dreaming as long as you understand where to wake up.
There was a grand prize winner last night (Wednesday 8-23-17). The outcomes are almost completely as should be expected: the winner is in Massachusetts… the majority of the country’s population is located in states on either the east or west coast, so this is unsurprising. There were 40 match 5 winners, so you would anticipate at least one to be a grand prize winner, which is exactly what happened (1 in 26 difference between 5 with powerball and 5 without). There were about 5.9 million powerball-only winners, so 38.32*5.9 is 226 million total powerball numbers sold in the run-up to last night’s drawing… with grand prize odds of 1 in 292 million, this is approaching parity. This means that more than $452 million was spent since Saturday on powerball lottery numbers (calculation excludes the extra dollar spent on multipliers). About five times as many ticket numbers were sold for this drawing as when I made my original analysis a week ago. With that many tickets sold, there was a better than even chance of a winner last night: 226 million random numbers against odds of 1 in 292 million works out to roughly a 54% chance that at least one of them hits. This is not to say there shouldn’t have been a winner before this (probability is a fickle mistress), but the numbers are such that it was close to a coin flip whether the prize would grow bigger. The last time the powerball was won was on 6-10-17, about two months and thirteen days ago… you can know that this is an unusually large jackpot because this period is longer than the usual period between wins (I had generously estimated 6 weeks based on the guess of 2 match 5 winners per drawing, but I think this might actually be a bit too high).
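One way to put a number on how likely a winner was last night is sketched below. It takes the 226 million estimate from above and assumes, as a simplification, that every number sold was picked independently and at random, so that the count of grand prize winners is roughly Poisson. Under that assumption the chance of at least one winner comes out a little better than even.

```python
import math

numbers_sold = 38.32 * 5.9e6           # ~226 million, estimated from powerball-only winners
jackpot_odds = math.comb(69, 5) * 26   # possible 5-ball + Powerball combinations

expected_winners = numbers_sold / jackpot_odds
# Assumption: independent random picks, so the winner count is about Poisson
# and P(no winner) = exp(-expected_winners).
p_at_least_one = 1 - math.exp(-expected_winners)

print(f"expected grand prize winners: {expected_winners:.2f}")
print(f"chance of at least one:       {p_at_least_one:.0%}")
```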
There was only one grand prize winning number out of 226 million tickets sold (not counting all the drawings that failed to yield a grand prize winner prior to this). Think on that for a moment.