Everyone wants to get rich, especially if it only costs them $1. Fortunately, many local state governments host lotteries, allowing their constituents to donate cash into the budget in hopes of winning a multi-million dollar prize. In practice, most lottery drawings consist of a series of balls drawn randomly from a chamber which should guarantee a fair opportunity for everyone to win. Theories of rigged lotteries and fraud, however, run rampant across the Internet.1 The system is accused of not holding live drawings, publishing winning numbers prior to drawing them, permitting the tweaking of data archives to avoid payouts, intentionally modifying balls, or using balls with painted numbers whose natural weight affects their likelihood of appearance.
Rather than debunk any of these theories of lottery fraud or rigging, this article reveals the trends and patterns of winning lottery numbers for public scrutiny using basic data analysis. It uses the results of the MegaMillions lottery and consists of the following analyses:
- distribution of winning numbers over time
- behavioral stratification of numbers based on numerical position
- relationship between mutually winning numbers
- common differences between winning numbers
- winning number frequency
While such scrutiny has the potential to yield useful results, such as identifying the existence or lack of "better numbers" to play, it is presented so as to appeal to those interested in number patterns.
MegaMillions History
Recently, MegaMillions drew the largest jackpot ever recorded at $370 million, exceeding the previous record held by PowerBall.2 The prizes were not always so large, nor did the participants span the United States. Beginning life in 1996, MegaMillions originally existed under a different name: "The Big Game." For two years this lottery was drawn weekly on Fridays until 1998, when a Tuesday drawing was added. Over the past eleven years, the number of participating states has doubled from only six to twelve. Although there are minor interstate variations regarding how jackpots are paid to winners, the basic game play remains the same.3
A single dollar in MegaMillions purchases a 1 in 175,711,536 chance of landing the jackpot. A player may opt for a "QuickPick" set of numbers generated automatically by a computer or may select their own numbers. Since 2005, MegaMillions has allowed players to choose five numbers between 1 and 56 plus a sixth number, the MegaBall, between 1 and 46. This, however, was not always the selection pool. When the "Big Game" was conceived, players were given a pool of numbers 1 through 50 to choose for their first five balls and numbers 1 through 25 for their sixth. Beginning in 1999, players were offered the numbers 1 through 50 for the five regular balls and 1 through 36 for the sixth. When the game became MegaMillions in 2002, players selected numbers between 1 and 52 for both the five regular balls and the MegaBall.4
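The quoted jackpot odds follow from the pool sizes: the number of ways to choose five regular balls, multiplied by the number of possible MegaBalls. A minimal Python sketch of that arithmetic (the function name `jackpot_odds` and the version labels are illustrative, not part of the original analysis):

```python
from math import comb

def jackpot_odds(pool: int, mega_pool: int, picks: int = 5) -> int:
    """Count distinct tickets: C(pool, picks) regular-ball sets times mega_pool MegaBalls."""
    return comb(pool, picks) * mega_pool

# Pool sizes for the four versions of the game described in the text.
VERSIONS = {
    "Big Game (1996)": (50, 25),
    "Big Game (1999)": (50, 36),
    "MegaMillions (2002)": (52, 52),
    "MegaMillions (2005)": (56, 46),
}

for name, (pool, mega_pool) in VERSIONS.items():
    print(f"{name}: 1 in {jackpot_odds(pool, mega_pool):,}")
```

For the current 56/46 pool this reproduces the 1 in 175,711,536 figure; the odds for the earlier versions are derived the same way and are not quoted in the article.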
Gathering Data
As a first step, it was necessary to obtain a collection of MegaMillions lottery numbers. Fortunately, the New Jersey Lottery website has an archive of all winning numbers since September 6, 1996.5 As an added bonus, the archive exists both in HTML format for a pretty web presentation and as a delimited file, which is conducive to importing into a database. For the purposes of this analysis, the winning lottery numbers were imported into Microsoft SQL Server Express for processing queries. Subsequent graphs were then created with Microsoft Excel to visualize the trends and behavior.6
The delimited file of winning lottery numbers contains the results of 1078 drawings and provides the following fields:
- Year – formatted as YYYY
- Month – formatted as MM
- Day – formatted as DD
- Day of Week – formatted as Tuesday or Friday
- Ball 1, 2, 3, 4, 5 – as an integer
- MegaBall – as an integer
- Prize Payout – when present, formatted as a decimal value
- Date – formatted as YYYYMMDD
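The article loaded this file into SQL Server; as an alternative, the same records can be read with a few lines of Python. The sketch below assumes a comma delimiter and the field order listed above, neither of which is confirmed by the source, and the two sample rows are made up for illustration rather than taken from real results. The names `Drawing` and `parse_drawings` are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

@dataclass
class Drawing:
    drawn_on: date
    day_of_week: str
    balls: Tuple[int, ...]   # five regular balls, stored in ascending order
    mega_ball: int
    payout: Optional[float]  # None when the payout field is empty

def parse_drawings(text: str, delimiter: str = ",") -> List[Drawing]:
    """Parse rows laid out as: year, month, day, weekday, ball1..ball5, megaball, payout, date."""
    drawings = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        row = [field.strip() for field in row]
        payout = float(row[10]) if len(row) > 10 and row[10] else None
        drawings.append(Drawing(
            drawn_on=date(int(row[0]), int(row[1]), int(row[2])),
            day_of_week=row[3],
            balls=tuple(int(n) for n in row[4:9]),
            mega_ball=int(row[9]),
            payout=payout,
        ))
    return drawings

# Made-up sample rows (assumed layout), not actual winning numbers.
SAMPLE = """2007,09,04,Tuesday,3,12,25,41,50,7,,20070904
2007,09,07,Friday,5,9,22,36,49,12,12000000.00,20070907
"""

for drawing in parse_drawings(SAMPLE):
    print(drawing)
```

If the real file uses a different delimiter, passing it as `delimiter` should be the only change needed.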
Distribution of Winning Numbers Over Time
The first trend analyzed was whether or not the numbers occur with an even distribution. Balls 1, 2, 3, 4 and 5 were consolidated into a single list to analyze their overall frequency of occurrence. Each separate version of the lottery – two editions of BigGame and two editions of MegaMillions – was analyzed independently to identify any outlying activity. Subsequently, a similar grouping was performed to determine the distribution of the MegaBall number. The following charts detail the number of times each number was selected over the entire span of MegaMillions drawings.
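The consolidation described here amounts to pooling the five regular balls of every drawing and counting how often each number appears. A rough Python sketch follows; the function names are illustrative, and the chi-square statistic is an addition for gauging evenness, not something the article computed. Because the five balls in a drawing are drawn without replacement, they are not independent, so the statistic should be read as a rough gauge only.

```python
from collections import Counter

def ball_frequencies(drawings, pool_size):
    """Count appearances of each number 1..pool_size across all drawings."""
    counts = Counter(n for balls in drawings for n in balls)
    return {n: counts.get(n, 0) for n in range(1, pool_size + 1)}

def chi_square_statistic(freqs):
    """Sum of (observed - expected)^2 / expected against a flat distribution."""
    total = sum(freqs.values())
    expected = total / len(freqs)
    return sum((obs - expected) ** 2 / expected for obs in freqs.values())

# Toy data: two drawings from a hypothetical pool of six numbers.
toy = [(1, 2, 3, 4, 5), (1, 2, 3, 4, 6)]
freqs = ball_frequencies(toy, pool_size=6)
print(freqs)
print(round(chi_square_statistic(freqs), 4))
```

Each version of the game would be counted separately with its own `pool_size`, mirroring the per-version split described above.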
Behavioral Stratification of Numbers Based on Numerical Position
After looking at the behavior of the numbers in aggregate, the occurrence of numbers respective to their position was analyzed. Unfortunately, the lottery does not store the numbers in the order they were drawn. Rather, the data file saves the winning lottery numbers in ascending order.7 As such, positional analysis focused on how the numbers are stratified within their given position.
It is important to recognize that the four variations of the lottery’s number pool have an impact on the ratio of occurrence for each number. As such, the data was broken into four sets titled (uncreatively) version 1, version 2, version 3 and version 4. Winning numbers per position were counted to determine the numbers that win most frequently within each set. Then, an aggregate winning percentage was assigned by combining the win ratio of each set multiplied by a time factor to obtain the overall likelihood of a number to win. The time factor represents the percentage share of drawings per version, which equates to 15.95%, 32.37%, 30.05% and 21.61%, respective to MegaMillions versions one (original) through four (current).
Each of the six graphs represents the top fifteen numbers per position:
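The weighting step can be written as a short calculation: multiply each version's win ratio by that version's share of drawings and add the products. A minimal sketch, assuming the aggregate is this plain weighted sum; the function name and the example win ratios are hypothetical, while the time factors are the ones quoted above.

```python
# Share of all drawings held under versions 1 through 4, as quoted in the text.
TIME_FACTORS = [0.1595, 0.3237, 0.3005, 0.2161]

def aggregate_win_pct(win_ratios, time_factors=TIME_FACTORS):
    """Weighted sum of per-version win ratios for one number in one position."""
    if len(win_ratios) != len(time_factors):
        raise ValueError("need one win ratio per version")
    return sum(ratio * weight for ratio, weight in zip(win_ratios, time_factors))

# Hypothetical win ratios for a single number across the four versions.
example = [0.02, 0.03, 0.025, 0.04]
print(f"{aggregate_win_pct(example):.4%}")
```

A number that did not exist in an earlier pool (53 through 56 before 2005, for instance) would simply carry a win ratio of zero for those versions.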
- Green bars represent the current version of MegaMillions where players choose from numbers 1 through 56 and a MegaBall number of 1 through 46.
- Blue bars represent the weighted aggregation of a number’s winning percentage from all MegaMillions drawing variations since 1996.
- The red line represents a five-variable polynomial trend line fitted to the winning percentages of the current MegaMillions drawing pool.
Relationship Between Mutually Winning Numbers
Additionally, an analysis was performed to determine which numbers "win together." After all, a player does not need to pick all six numbers in order to win money from MegaMillions. Therefore, all possible combinations of balls 1, 2, 3, 4 and 5 were formed to analyze the occurrence of ball relationships.
Doubles
There are ten combinations of ball pairs: [1 2], [1 3], [1 4], [1 5], [2 3], [2 4], [2 5], [3 4], [3 5], & [4 5]. Over the lottery’s lifetime, the 1078 drawings produced 10,780 pairs (ten per drawing), of which 1503 are unique. In the current version of MegaMillions, there are 2330 pairs, of which 1202 are unique. Pairs repeat quite frequently: 1426 pairs have occurred more than once, accounting for 10,703 appearances over the lottery’s lifetime, compared with 692 repeating pairs that account for 1820 appearances since the fourth version of MegaMillions began.
The graph at right shows that in the current version of MegaMillions, pairs of numbers win repeatedly quite often. Over the lottery’s lifetime, particular pairs have won very regularly.
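Counting which numbers "win together" can be sketched with `itertools.combinations`, which yields every subset of a given size from each drawing. The function names below are illustrative, and the summary assumes the figures in this section count total subsets drawn, distinct subsets, and subsets seen more than once. The same code handles pairs, triples, quadruples and quintuples by changing `size`.

```python
from collections import Counter
from itertools import combinations

def subset_counts(drawings, size):
    """Count how often each `size`-number subset of the five regular balls was drawn."""
    counts = Counter()
    for balls in drawings:
        counts.update(combinations(sorted(balls), size))
    return counts

def summarize(counts):
    """Return (total subsets, unique subsets, subsets seen 2+ times, their total appearances)."""
    repeated = {combo: n for combo, n in counts.items() if n > 1}
    return sum(counts.values()), len(counts), len(repeated), sum(repeated.values())

# Toy data: two drawings that share the pair (1, 2).
toy = [(1, 2, 3, 4, 5), (1, 2, 6, 7, 8)]
pairs = subset_counts(toy, 2)
print(summarize(pairs))
print(pairs.most_common(1))
```

Each drawing contributes ten pairs, ten triples, five quadruples and one quintuple, which is where the totals of 10,780 and 5390 for 1078 drawings come from.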
Triples
There are ten combinations of ball triples: [1 2 3], [1 2 4], [1 2 5], [1 3 4], [1 3 5], [1 4 5], [2 3 4], [2 3 5], [2 4 5], & [3 4 5]. Over the lottery’s lifetime, the drawings produced 10,780 triples, of which 8675 are unique. Within the current version of MegaMillions, there are 2330 triples, of which 2245 are unique. Over the lifetime of the lottery, 1789 triples have occurred more than once, accounting for 3894 appearances. Five triples have occurred five times and two have occurred six times. In the past year, however, only twenty-three sets of three balls have appeared twice.
The graph to the right shows that there have been many repeat winning combinations of three numbers in the lifetime of the lottery, although it is a less frequent phenomenon in the current version.
Quadruples
There are five combinations of ball quadruples: [1 2 3 4], [1 2 3 5], [1 2 4 5], [1 3 4 5], & [2 3 4 5]. Over the lottery’s lifetime, the drawings produced 5390 quadruplets, of which 5328 are unique. Since June 22, 2005, there have been 1165 quadruplets, all of them unique. Sixty-two sets of four numbers have appeared twice since MegaMillions began. No set of four numbers has won more than once in the current version of MegaMillions.
Quintuples
There is only one combination of ball quintuples: [1 2 3 4 5]. Only one set of five numbers has ever repeated twice in the history of MegaMillions: (11, 14, 18, 33, 48).
Common Differences Between Winning Numbers
A natural extension of analyzing number groups was to identify the trends by which numbers differ from one another. For example, while probability gives the numbers 20, 21, 22, 23 and 24 the same chance of appearing as any other combination, is it likely? The numbers for the most recent version of MegaMillions were scrutinized to determine if there is a common difference between each ball.
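One way to look for a common difference is to take the gaps between neighboring balls once a drawing is sorted and tally them across drawings. The sketch below assumes that "difference" means the gap between adjacent sorted balls, which the article does not spell out; the function names are illustrative.

```python
from collections import Counter

def adjacent_differences(balls):
    """Gaps between neighboring balls after sorting, e.g. (20, 21, 22, 23, 24) -> [1, 1, 1, 1]."""
    ordered = sorted(balls)
    return [b - a for a, b in zip(ordered, ordered[1:])]

def difference_histogram(drawings):
    """Tally how often each gap size occurs across all drawings."""
    return Counter(gap for balls in drawings for gap in adjacent_differences(balls))

# The consecutive run from the text, plus the repeated quintuple the article mentions.
print(adjacent_differences((20, 21, 22, 23, 24)))
print(adjacent_differences((11, 14, 18, 33, 48)))
print(difference_histogram([(20, 21, 22, 23, 24), (11, 14, 18, 33, 48)]))
```

Keeping the gaps per position (first to second ball, second to third, and so on) instead of pooling them would be a small change: index the histogram by the gap's position in the list.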
Winning Number Frequency
Analyzing the distribution of numbers over time only provided half the picture in terms of any given number’s propensity towards winning. Another aspect to consider was the temporal frequency by which a number wins. For example, a number may have won on thirty occasions, but maybe they were all two years ago. To study this behavior, the time delta between each number’s appearances was cataloged to establish statistics for all numbers and for each number across the lifetime of the fourth version of MegaMillions. Then, the analysis was repeated using only the most recent six months of data to identify which numbers win consistently often and which are just a current flash in the pan.
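The time-delta idea can be sketched by collecting the dates on which a number appeared and measuring the days between consecutive appearances. This is an illustrative Python version with hypothetical function names and made-up drawings, not the queries used for the article.

```python
from datetime import date
from statistics import mean

def appearance_gaps(drawings, number):
    """Days between consecutive appearances of `number`; drawings are (date, balls) pairs."""
    dates = sorted(drawn_on for drawn_on, balls in drawings if number in balls)
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]

def gap_summary(drawings, number, as_of):
    """Appearance count, mean gap in days, and days since the last appearance as of `as_of`."""
    dates = sorted(drawn_on for drawn_on, balls in drawings if number in balls)
    gaps = appearance_gaps(drawings, number)
    return {
        "appearances": len(dates),
        "mean_gap_days": mean(gaps) if gaps else None,
        "days_since_last": (as_of - dates[-1]).days if dates else None,
    }

# Made-up drawings for illustration only.
toy = [
    (date(2007, 9, 4), (3, 12, 25, 41, 50)),
    (date(2007, 9, 7), (5, 9, 22, 36, 49)),
    (date(2007, 9, 18), (3, 9, 17, 30, 44)),
]
print(appearance_gaps(toy, 3))
print(gap_summary(toy, 9, as_of=date(2007, 10, 1)))
```

Restricting `drawings` to the most recent six months before calling `gap_summary` gives the second pass described above.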
Conclusions
Interesting as these trends may be, they will not assist in making the odds of winning the MegaMillions lottery any better if the system is truly fair and random. However, in the event there is some peculiar factor skewing the ball selection such that any of these trends continue, a player stands a mildly better chance of winning a partial prize through the selection of weighted numbers.
1 "The Lottery is Rigged." Uncoveror. Accessed October 2007 from http://www.uncoveror.com/lottery.htm.
2 Roland, Neil. "Mega Millions Lottery Jackpot Now Record $370 Million." Bloomberg. Accessed October 2007 from http://www.bloomberg.com/apps/news?pid=20601103&sid=afUgc0t0u3hg&refer=us.
3 "How to Play: Play the game." MegaMillions.com. Accessed October 2007 from http://megamillions.com/howtoplay/play_game.aspgame.asp.
4 "About Us: Game History." MegaMillions.com. Accessed October 2007 from http://megamillions.com/aboutus/game_history.asphistory.asp.
5 New Jersey Lottery. Accessed September 2007 from http://www.state.nj.us/lottery/data/big.dat.
6 Microsoft SQL Server Express. Accessed September 2007 from http://www.microsoft.com/sql/editions/express/default.mspx.
7 Ultimately, order does not matter with lottery numbers.
This article was edited after publication by the author on 17 Mar 2009.
Where's the analysis? by Anonymous
You’ve collected and summarized a bunch of data here, but it would be much more useful (and straightforward) to run some inferential procedures (Monte Carlo procedures would be particularly easy to implement in this case) to see if these results were compatible with the hypothesis of a fair game. Give it a try!
Winning numbers vs Prizes by Anonymous
Hi, first of all congratulations on your deep analysis.
I made something similar on the Spanish Lotto many years ago but using a different approach.
I started by assuming the game was fair and the lottery wasn’t rigged. Then I compared every winning combination (and sub-combinations of 3, 4 and 5 winning numbers) with the expected number of winners for the total bets on each game each week.
Using this method I confirmed a theory many people (myself included) have: that not all the players play their numbers "randomly" but that they have some "favourite numbers" and others that people just don’t like. Combos like 1-2-3-4-5 are played a lot, as are date combinations, while numbers from 31 up seem to be played less, leading to too-many-winners or no-winner scenarios in each case.
You can detect this by checking how many winners are in each prize category on each game, comparing with the expected statistical results (not only for the big prize but for the smaller ones also).
On the Spanish Lotto (a basic 6/49 lotto, with 1 in 14 million odds) sometimes you have no big prize winners even when there are 30 or 40 million bets (and there "should be" at least 1 or 2); sometimes you may have 10 or 20 winners with only 10-15 million bets (for an «easy» combo like 1-2-4-8-9-24 or 7-14-21-28-41-42).
People can select their own numbers or let the «machine» select them at the shop; those numbers are supposed to be random and don’t seem to influence this (even if they are numerous).
If you read Spanish or can find a good translator you can find my (totally amateur) work here
http://www.microsiervos.com/archivo/azar/loto-un-sistema.html
— Alvy
Any predictions? by Anonymous
I have read the article and would like to ask if you think that the results can be used to predict any trends in future outcomes of the lottery?
If not, what test would be necessary to prove that the balls are not evenly distributed?
Um... by Anonymous
Have you considered doing a REAL STATISTICAL TEST instead of making graphs? I doubt any of this data is statistically significant.
Assuming a completely fair game ... by Anonymous
Assuming a completely fair game, could a player still benefit from playing numbers that are played less frequently by other players? For example, you could play more numbers higher than 31, which are not dates, so there are fewer other players to share a prize with, assuming other players play 1 to 31 more frequently. Not only would the winning numbers be looked at but also the number of winners for a given combination or number.
Choose most commonly winning numbers or least by Anonymous
Turns out I did this exact same analysis (but I didn’t think to write it up, nor that anyone would be interested in reading it).
At the time I did it (well over a year ago), 32 was the most common winning number and 3 was the most common mega ball.
The problem then, and the problem now, is that there haven’t been enough drawings to see whether the histogram is truly flat (truly random) or not.
I would suggest that the drawing is in fact random, and that choosing the least common numbers will be better than choosing more common numbers. The reason for this is that eventually we would expect the histogram to be flat; in time the occurrence of infrequently drawn numbers should increase so as to flatten out the histogram.
However, each drawing is random. Before we see another 35 (the least common number when I did it), we could see 20 more 32s.
missing "no winner" numbers drawn? by Anonymous
I read the article and the first thing I noticed is that he didn’t account for the number of times that there wasn’t any mega millions winner…you know, how many weeks went by without a jackpot winner…where are those "no winner" numbers in his analysis?
Mixing errors by Anonymous
Has anybody seen this – it’s about using mixing errors in lotto picks:
http://use4.com/Prove-it.html
Thumbs52 by Anonymous
In what area did you include the "luck" factor? The statistical analysis appears complete; except… in all areas of gathered factual information there tends to be a human factor skewing the results in one fashion or another. Might the human factor in your analysis be "luck"? After all, it appears that some people are simply luckier than others.
Location of winners by Anonymous
Does anyone have data showing the distribution of tickets vs winners? I remember, a while back, that it was skewed towards Georgia for some reason…
What about if they change their set of balls? by Anonymous
Some lottery corporations change the ball set over time.
Great article! by Anonymous
This is great work, thanks for posting this. I’ve done some similar ongoing analysis, and created a website http://www.lottoroller.com/megamillions for both MegaMillions and PowerBall.
Vegasgamer by Anonymous
I can claim myself as a Powerball winner and it was all luck. My winnings amounted to 3500 dollars and it was a bonus drawing ticket and had nothing to do with picking the right numbers. That being said, I lived in Las Vegas for seven years and played a lot of video keno which is just like the lottery. I won some jackpots but lost more than I won. I won on single screen keno and I won on multiple screen keno which is like playing up to twenty lottery number combinations on one game. I probably played well over 100,000 games of keno or more and I tried every trick in the book. What I found is that there are two ways to win without a lot of luck and that is 1.) to overwhelm the game with number combinations. Or 2.) Do like the 80 year old man who recently won powerball and play the same numbers every week for 17 years (which is just what he did). The first option is expensive and the second option relies on the possibility (not probability) that if you stay on the numbers long enough, they will hit (as is the case in Keno). Take your pick.
Additional information by Anonymous
I recently performed some simple statistical testing on the data set corresponding to the current rules with 56 draw and 46 bonus balls. The data appears to exhibit acceptable variance, randomness and normality based on Chi Squared, runs test for randomness, and Anderson-Darling test for normality. The upshot is that no matter how you pick your numbers, you have a 1 in 175 million chance of winning. Best of luck.
Analyzing my state's lottery... by Anonymous
Interesting line of thought regarding your data, matthew. I’ve also been down this road, but have since abandoned such trend/frequency analyses. However, i found a theory online specifically addressing the true randomness of lotteries when a mechanical means is used to generate the results, e.g. balls bouncing about in a glass box via tumblers, air, etc. What this theory, which i call “the mechanical selection of randomization” (since i don’t remember what the author called it) did was confirm i was seeing patterns in something allegedly random. simply stated, the theory hypothesizes that if a mechanical process is used to select something randomly, and there are no variables introduced to alter the natural mechanical process, unintended PATTERNS inherently occur over time due to the repetitive nature of the mechanical process. i believe the theory is sound and lotto commissioners took note of it, also. if you’re on this thread, i assume it’s because you’re also fascinated by the alleged randomness of a lottery, or you see peculiarities in lottery results just by reading the actual winning numbers in chronological order. nevertheless, as i’ve experienced through the type of research mr. vea shares, seeing a pattern and predicting one in a 6/42-based lottery is an entirely different matter. for those live lotto draws, wise commissioners could easily thwart folks like us who love to crunch and analyze data simply by changing the order balls are dropped, rearranging the ball setup, increasing the air pressure for applicable systems, letting the balls tumble about longer, etc. “weighted balls”, imo, raises the specter of impropriety and corruption. then there are those backroom lotto draws that my state either does or used to. for example…if folks like us knew every lotto combination purchased for a particular play date, we could then determine which combinations weren’t purchased.
if the draw isn’t live, a lottery commissioner, with the aforementioned knowledge could conceivably control WHEN and WHAT number combinations won…assuming all possible winning combinations weren’t actually purchased. anyway, if someone can predict the next lottery number “pattern” by examining trends and frequencies, that person is a g’damn genius! (or one lucky sob.)
“unintended patterns due to the repetitive nature of a mechanical process undermines true randomness”. it is this thought that motivated me to no longer consider looking at how often a number appears nor even when, but to focus more on GROUPS. it is my theory, piggy-backing on that mechanical selection of randomness theory, that if true randomness is undermined by a mechanical selection process, over time those patterns that develop can be identified through GROUPS. hint: in my theory you must think beyond how many balls are actually drawn to comprise a winning combination. forgive me folks, but i love opining about lotto!
Only one set of five numbers has ever repeated twice in the history of MegaMillions: (11, 14, 18, 33, 48). by Anonymous
Repeated numbers: "Only one set of five numbers has ever repeated twice in the history of MegaMillions: (11, 14, 18, 33, 48)." Please tell me I’m wrong, but I don’t think that any numbers have repeated themselves like that, AT LEAST NOT AS OF THIS COMMENT. PLEASE INFORM ME IF I’M WRONG. lottery pirate
No Sequential Order by Anonymous
Too bad the lottery does not publish the numbers in the order they were drawn. That would make a measure of central tendencies more accurate to analyze. New York has a FOIL (Freedom of Information Law) which allows someone to request lottery records at $0.25 per page, but then you would have to enter the data in by hand.
Sucks huh?
Maybe the reason they don’t publish it is so it’s harder to make forecasts.
mega millions by mr-brent
have you studied roulette numbers into lottery it comes alot to win 4 numbers 5 numbers in grandseries and tiers if you look at the numbers are standed tier is 33 16 24 5 10 23 8 30 11 36 13 27
Analysis... by Anonymous
Great work! So with all this info, how much have you actually won? Must be a lot, because only someone with A LOT of time on their hands (like a lottery winner) would come up with something like this.
Plinko Rule by Anonymous
Great work here on the data analysis. One thought to bear in mind, all the lotto balls are not dropped, thus you may not have a full frequency analysis. This may sound a little strange, but to really get the answers you are looking for you need a test. No self respecting engineer would do this on data alone, you need the physical test to accelerate and refine the data. I saw an exhibit recently that comes to mind, it was at a children’s science fair. It was a giant plinko board about 10 feet tall by 15 feet wide. It released balls at the center point at the top from a tumbler. It was designed to explain statistics and distribution theory. It worked wonderfully as you would get a nice distribution curve, almost perfect, visually impressive. My thought would be to build such a device and drop the lotto balls “numbered” and run the test 10,000 times. Log the data and then give me your comparative results. Two tests would even be better. try dropping all 50 balls and test the repeatable distribution of number sequences. The problem with most folks is we’re lazy and won’t take the time to engage such a project. Plywood, nails, ping-pong balls….away ya go…. Take your own data…..
Sincerely,
Test Engineer