2 Nerd-Its

RE: Where's the analysis?

Comment by VnutZ on 01 November 2007

I don't suppose you could run that one last time - just restrict the data set to everything newer than June 25, 2005 (I think). That would be version 4, the current game, of MegaMillions.

I'll be the first to admit - statistics was the first math class I actually bombed ... only got a B- and that was years ago. So there are undoubtedly things I did in this analysis that would make a statistician cringe. I'm mostly hoping that aggregating all the different tables will result in something mildly more meaningful than straight random.

6 Nerd-Its
RE: Where's the analysis? by Anonymous :: NR0

Hmm, there are of course two possible problems if actual balls are used. The normal balls could be unequal (the test says 'nope': p-value 0.5131 for v4). And the special balls could be unequal (p-value 0.8722).

Ok, the p-values are not significant, but... If actual balls are being used, there will be really tiny differences between balls. So one could currently select:

7 53 5 25 46 - 42

But if prizes are shared between winners, one should not select those. Just select some random numbers in the middle...

On the other hand, it could also be that a ball that gets selected is slightly damaged, after which it becomes less likely to be selected. But I expect that a ball that is selected gets smaller and becomes even more likely to be selected next time. I did not test for those types of effects yet. :)
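One way such a carry-over effect could be checked is sketched below. This is only an illustration, not something run on the real data: the `draws` matrix is simulated stand-in data (a hypothetical name), and it would need to be replaced by the real normal balls, one row per drawing in date order. The idea is to count how many balls each drawing shares with the previous one and compare that with what shuffled drawing orders give.

```r
# Hypothetical stand-in data: 250 simulated drawings of 5 balls out of 56.
# Replace `draws` with the real normal balls, one row per drawing, in date order.
set.seed(1)
draws <- t(replicate(250, sample(56, 5)))

# Mean number of balls a drawing shares with the drawing before it.
mean_repeats <- function(d) {
  mean(sapply(2:nrow(d), function(i) length(intersect(d[i, ], d[i - 1, ]))))
}
observed <- mean_repeats(draws)

# With fair, independent drawings each of the 5 balls reappears with
# probability 5/56, so about 25/56 repeats are expected per drawing.
expected <- 5 * 5 / 56

# Permutation check: shuffling the drawing order breaks any carry-over effect.
perm <- replicate(500, mean_repeats(draws[sample(nrow(draws)), , drop = FALSE]))
p_value <- mean(abs(perm - expected) >= abs(observed - expected))
```

A small `p_value` would hint that repeats between consecutive drawings are more (or less) common than chance alone explains. With simulated data, as here, no such effect should show up.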

Next to that, there could be interactions between balls, so that some get selected together (or not together) more often. I did not test for those either. How often the balls are replaced also has an influence. Perhaps time for some physical tests? :)
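A first look at such interactions could be a table of how often each pair of balls is drawn together. The sketch below again uses simulated stand-in data; `draws` and `co` are hypothetical names, and the real normal balls would have to be substituted. With only a few hundred drawings each pair is expected fewer than two times, so the counts will be noisy and a chi-square approximation on them would probably be unreliable.

```r
# Hypothetical stand-in data: 250 simulated drawings of 5 balls out of 56.
set.seed(2)
draws <- t(replicate(250, sample(56, 5)))

# co[a, b] counts how often balls a < b appeared in the same drawing.
co <- matrix(0, 56, 56)
for (i in seq_len(nrow(draws))) {
  idx <- t(combn(sort(draws[i, ]), 2))  # the 10 pairs in this drawing, a < b
  co[idx] <- co[idx] + 1
}

# Each drawing contributes 10 of the choose(56, 2) = 1540 possible pairs.
expected_per_pair <- nrow(draws) * 10 / choose(56, 2)

# Pairs that were drawn together most often.
top <- which(co == max(co), arr.ind = TRUE)
```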

Code (with a small fix):

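# Read the drawings ("%"-separated; short rows are padded with NA).
big=read.table("big.dat",sep="%",fill=TRUE)
# Columns 1-3 hold year, month and day; join them into a Date.
big$date=as.Date(apply(big[,1:3],1,paste,collapse="-"))
# Game version by drawing date (version 4 starts on 2005-06-22).
big$type=ifelse(big$date>=as.Date("1999-01-13"),ifelse(big$date>=as.Date("2002-03-15"),ifelse(big$date>=as.Date("2005-06-22"),4,3),2),1)
v4=big[big$type==4,]
# Chi-square goodness-of-fit tests against a uniform distribution:
# 56 normal balls (columns 5-9) and 46 special balls (column 10).
# factor(...,levels=) keeps balls that were never drawn as zero counts.
chisq.test(table(factor(unlist(v4[,5:9]),levels=1:56)),p=rep(1/56,56))
chisq.test(table(factor(v4[,10],levels=1:46)),p=rep(1/46,46))
# Balls ordered from most to least frequently drawn.
z=table(unlist(v4[,5:9]))
z[order(z,decreasing=TRUE)]
z=table(v4[,10])
z[order(z,decreasing=TRUE)]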