A separate email concerning this post suggested that an easier way to determine the sample size needed for significance is to feed progressively larger numbers into Fisher's Exact Test until the p-value is low enough. I did this using the online calculator mentioned previously and was able to confirm Zinj's linear regression results. The first three tables show proportionally increased sample sizes that still do not indicate a significant association. The last table shows sample sizes 37 times those used by MythBusters and a 2-Tail p-value (which is most appropriate given the data set) very close to 0.05.
------------------------------------------
TABLE = [ 40 , 120 , 100 , 240 ]
Left : p-value = 0.17952061079853096
Right : p-value = 0.8714850889314807
2-Tail : p-value = 0.33731352248492574
------------------------------------------
TABLE = [ 80 , 240 , 200 , 480 ]
Left : p-value = 0.08412887957942149
Right : p-value = 0.9370868768109766
2-Tail : p-value = 0.1522036361266696
------------------------------------------
TABLE = [ 120 , 360 , 300 , 720 ]
Left : p-value = 0.04267499690899746
Right : p-value = 0.9675062534995688
2-Tail : p-value = 0.08429297979286202
------------------------------------------
TABLE = [ 148 , 444 , 370 , 888 ]
Left : p-value = 0.027144911945084876
Right : p-value = 0.9791738747189928
2-Tail : p-value = 0.05205262573928032
------------------------------------------
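For readers who would rather script the scaling-up procedure than retype tables into a calculator, here is a minimal sketch in R. It assumes the table entries are ordered as non-seeded yawners, non-seeded non-yawners, seeded yawners, seeded non-yawners, and it relies on base R's fisher.test. The helper name two_tail_p is made up for illustration, and R's two-sided p-value may be defined slightly differently from the online calculator's, so small discrepancies from the figures above are possible.

```r
# MythBusters counts, assumed order:
# non-seeded yawn, non-seeded none, seeded yawn, seeded none
base_counts <- c(4, 12, 10, 24)

# Hypothetical helper: two-sided Fisher p-value when every cell is multiplied by k
two_tail_p <- function(k) {
  tbl <- matrix(k * base_counts, nrow = 2, byrow = TRUE)
  fisher.test(tbl, alternative = "two.sided")$p.value
}

# The same multipliers as the four tables above
multipliers <- c(10, 20, 30, 37)
p_values <- sapply(multipliers, two_tail_p)

print(data.frame(multiplier = multipliers,
                 total_n = 50 * multipliers,
                 p_two_tail = p_values))
```

Because the proportions stay fixed while the counts grow, the p-value should shrink as the multiplier increases, which is the pattern the tables show.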
No, you miss the point. The method used in the article was absolutely not sufficient to show anything other than the degree of linear correlation. As stated at the beginning of this post, you admit you still do not understand how R^2 and p-values (or your relevant test statistic) are associated, and how they are different from each other.
It is perfectly possible to obtain samples with extremely low R-squareds (like 0.00ish) that still easily pass linear F-tests. The reason is that, despite a low degree of correlation, an extremely large sample size can shrink the relevant variances. R-squared does not take sample size into account at all (look at your formulae, the (n-1)'s cancel out).
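To make that concrete, here is a small sketch in R. It assumes a simple linear regression with a single predictor, where the overall F statistic can be written as F = R^2 / (1 - R^2) * (n - 2) with 1 and n - 2 degrees of freedom. The function name f_test_p_value and the example value of R-squared = 0.01 are chosen purely for illustration.

```r
# p-value of the overall F-test for a one-predictor linear regression,
# computed only from R-squared and the sample size n
f_test_p_value <- function(r_squared, n) {
  f_stat <- (r_squared / (1 - r_squared)) * (n - 2)
  pf(f_stat, df1 = 1, df2 = n - 2, lower.tail = FALSE)
}

# Same tiny R-squared, very different sample sizes
p_small_n <- f_test_p_value(0.01, n = 50)
p_large_n <- f_test_p_value(0.01, n = 5000)

print(c(n_50 = p_small_n, n_5000 = p_large_n))
```

The same R-squared is nowhere near significant at n = 50 but is overwhelmingly significant at n = 5000, which is why R-squared alone cannot stand in for a significance test.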
The only thing behind your support is that you *could* use R to determine a p-value, because Chi-squared (or Fisher's) and R are so related. This would require modifying the values by the sample size, at which point you could determine a "critical-R" for a specific alpha value (and specific sample size). Note that this is not usually how statisticians think of things, but it is perfectly valid. However, this is not at all what you did. Your idea that R > .10 shows significance is flat out wrong. The mere fact that you (you personally that is, it is theoretically doable with a bit of arithmetic) cannot provide an alpha to this R threshold of yours pretty much proves it.
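The "bit of arithmetic" can be sketched in R as follows. This uses the standard t-test for a Pearson correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom, as one reasonable way to tie r to a p-value. It is an assumption that this is the relationship the commenter has in mind, and the function names critical_r and alpha_for_r are hypothetical.

```r
# Smallest |r| that is significant at level alpha (two-sided) for sample size n
critical_r <- function(n, alpha = 0.05) {
  t_crit <- qt(1 - alpha / 2, df = n - 2)
  t_crit / sqrt(t_crit^2 + n - 2)
}

# Two-sided p-value implied by an observed correlation r at sample size n
alpha_for_r <- function(r, n) {
  t_stat <- abs(r) * sqrt(n - 2) / sqrt(1 - r^2)
  2 * pt(t_stat, df = n - 2, lower.tail = FALSE)
}

# Critical r at the MythBusters sample size, and at a much larger one
print(c(n_50 = critical_r(50), n_5000 = critical_r(5000)))

# The alpha that an r = 0.10 threshold would correspond to at n = 50
print(alpha_for_r(0.10, 50))
```

Under these assumptions a fixed threshold such as r > 0.10 corresponds to very different alpha levels depending on n, so it cannot serve as a significance cutoff on its own.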
So basically, the only reason the article agrees with the correct answer is via dumb luck. The statistics cited are totally incorrect, which seems to be the point you're totally missing.

More (and perhaps more appropriate) statistical analysis
I received a number of emails concerning the statistical method I used (Pearson's correlation coefficient), which provided some insight but does not sufficiently address the issue of causation in the results. Personally, I don't understand how two variables can so obviously lack correlation and still leave a chance that causation is involved, but with the aim of statistical appropriateness, I have included a number of alternative statistical methods below.
Association Test
An association test such as Fisher's Exact Test is appropriate. This method is specifically for determining any non-random association between two categorical (discrete) variables - which is exactly what we have in this instance. Its use, then, removes any issues there may have been in the Pearson analysis having to do with the data set not being continuous.
For those interested, the calculations are described in the link above. The results are easy to come by, however, using online tools such as this calculator at Matforsk.com. Inserting the MythBusters' data results in the following:
This corresponds to there being 4 non-seeded subjects who yawned, 10 seeded who yawned, 12 non-seeded who didn't yawn, and 24 seeded who didn't yawn. The resulting p-values are all well above the commonly accepted limit of 0.05 for significance.
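The same test can be run without the online calculator. The sketch below uses base R's fisher.test on the counts just described. Mapping the calculator's "Left" and "Right" p-values to R's "less" and "greater" alternatives is an assumption on my part, and the variable names are illustrative only.

```r
# 2x2 table of the MythBusters counts: rows are groups, columns are outcomes
yawn_table <- matrix(
  c(4, 12,
    10, 24),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("noSeed", "withSeed"),
                  outcome = c("yawn", "none"))
)

# One-sided and two-sided Fisher's Exact Test p-values
p_left <- fisher.test(yawn_table, alternative = "less")$p.value
p_right <- fisher.test(yawn_table, alternative = "greater")$p.value
p_two_tail <- fisher.test(yawn_table, alternative = "two.sided")$p.value

print(c(left = p_left, right = p_right, two_tail = p_two_tail))
```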
Confidence Interval for the Difference in Rates
This method was recommended via email by Max Kuhn, a "Ph.D. statistician who works in industry." Max provided a very thorough and helpful analysis of the data, which I've included below:
p1 <- 10/34  # yawn rate with seed
p2 <- 4/16   # yawn rate without seed
n1 <- 34
n2 <- 16
q1 <- 1 - p1
q2 <- 1 - p2

# Difference in yawn rates
p1 - p2
# [1] 0.04411765

# Standard error of the difference
sqrt((p1*q1/n1) + (p2*q2/n2))
# [1] 0.1335103

# One-sided 95% lower confidence bound on the difference
p1 - p2 + (qnorm(0.05) * sqrt((p1*q1/n1) + (p2*q2/n2)))
# [1] -0.1754872

# Statistic for the bootstrap: difference in yawn rates within a resampled data set
testStat <- function(index, data) {
  bootSample <- data[index,]
  p1 <- mean(bootSample[bootSample$group == "withSeed", "outcome"] == "yawn")
  p2 <- mean(bootSample[bootSample$group == "noSeed", "outcome"] == "yawn")
  p1 - p2
}

# Rebuild the 50 individual observations from the counts
noSeed <- rep(c("yawn", "none"), times = c(4, 12))
withSeed <- rep(c("yawn", "none"), times = c(10, 24))
mythBusters <- data.frame(
  outcome = factor(c(noSeed, withSeed), levels = c("yawn", "none")),
  group = factor(rep(c("noSeed", "withSeed"), times = c(16, 34))))

# Observed difference on the full data set
testStat(1:50, mythBusters)

# Bootstrap-t confidence bound from the bootstrap package
library(bootstrap)
set.seed(1)
results <- boott(1:50, theta = testStat, nboott = 5000, data = mythBusters, perc = 0.05)
Linear Regression to Show Sample Size Needed for Significance
I received yet another very friendly and helpful email from Zinj Boisei, who pointed out I was too hasty in dismissing the use of an increased sample size. By using a more appropriate analysis, linear regression in this case, Zinj confirmed there was little significance at the sample size of 50 - and even went on to find out how large a sample size of the same makeup would need to be for the results to be significant:
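Zinj's own output is not reproduced here. As a rough stand-in, and not necessarily the method Zinj used, the sketch below regresses a 0/1 yawn indicator on a 0/1 seed indicator with R's lm, multiplies the original counts by a factor k, and searches for the smallest k at which the slope's p-value drops below 0.05. The helper name slope_p_value, the 0/1 coding, and the search range of 1 to 50 are all assumptions made for illustration.

```r
# Hypothetical helper: p-value of the regression slope when the
# MythBusters counts are all multiplied by k
slope_p_value <- function(k) {
  # Cell counts: non-seeded yawn, non-seeded none, seeded yawn, seeded none
  counts <- k * c(4, 12, 10, 24)
  seeded <- rep(c(0, 0, 1, 1), times = counts)
  yawned <- rep(c(1, 0, 1, 0), times = counts)
  fit <- lm(yawned ~ seeded)
  summary(fit)$coefficients["seeded", "Pr(>|t|)"]
}

# Try multipliers 1 through 50 and find the first significant one
multipliers <- 1:50
p_values <- sapply(multipliers, slope_p_value)
smallest_k <- multipliers[which(p_values < 0.05)[1]]

print(c(p_at_original_size = p_values[1],
        smallest_multiplier = smallest_k,
        total_n = 50 * smallest_k))
```

The proportions are held fixed, so the estimated slope does not change with k; only its standard error shrinks as the sample grows.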
Conclusion Addendum
While the statistical method used in the article was sufficient to show the yawn seed was responsible for a negligible amount of the variance, methods such as association tests, confidence interval analysis and linear regression provide more appropriate insight into the causation involved. In this case, all tests lend credence to the original conclusion: the results of the MythBusters' yawn experiment did not support their conclusion.