Friday, February 27, 2015

Null Hypothesis Significance Testing Procedure--Banned!

The psychology journal Basic and Applied Social Psychology has banned the use of the null hypothesis significance testing procedure (NHSTP).

I am not proficient in scientific method or statistics. However, the article at the link seems to criticize the use of NHSTP largely on the grounds that there is widespread confusion about what NHSTP actually means. One would think that the solution to this problem is to better educate people about the actual meaning of NHSTP, rather than to ban its use in a journal. However, presumably the grounds for banning NHSTP are in part the misplaced degree of importance it is given on account of those common misinterpretations. Or something like that.
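
As I understand it, the confusion in question is mostly about what a p-value is. A short simulation makes the actual meaning concrete: when the null hypothesis is true, p-values are uniformly distributed, so about 5% of null studies come out "significant" purely by chance. Here is a quick Python sketch (my numbers are illustrative, not taken from the article):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n = 10000, 30

# Every simulated "study" samples from a null population (true mean = 0).
pvals = np.array([stats.ttest_1samp(rng.normal(0, 1, n), 0).pvalue
                  for _ in range(n_studies)])

print(f"fraction with p < 0.05: {np.mean(pvals < 0.05):.3f}")  # ~0.05

So p < 0.05 means "data this extreme would be rare if the null were true"; it is not the probability that the null hypothesis is true, which is the usual misreading.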

Perhaps BASP's decision to ban the use of NHSTP is at least partly symbolic: a way to signal its commitment to more rigorous standards for assessing the importance of research findings than the typically lazy or otherwise problematic uses of the old stand-by, NHSTP. An outright ban will also get people's attention more effectively than merely not requiring NHSTP, which is what the journal had previously done.

On a related note, this video contains an informative critique of the use of p-values and a defense of the use of confidence intervals as an alternative criterion for assessing the importance of research findings (if that's the right way to put it):

[embedded video]
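
In the same spirit, here is a small Python sketch (made-up data) computing both summaries for one sample: the p-value is a bare measure of evidence against zero, while the 95% confidence interval also conveys the size and precision of the effect.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(0.4, 1.0, 50)      # hypothetical sample, true effect 0.4

res = stats.ttest_1samp(data, 0)
lo, hi = stats.t.interval(0.95, df=len(data) - 1,
                          loc=np.mean(data), scale=stats.sem(data))

print(f"p-value: {res.pvalue:.4f}")
print(f"95% CI for the mean: ({lo:.3f}, {hi:.3f})")
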
2 comments:

Delta said...

Point: Hypothesis tests (NHSTP) and confidence intervals are in some sense just complementary ways of expressing the same thing. Casella & Berger, "Statistical Inference", p. 407: "There it is, perhaps, more easily seen that both tests and intervals ask the same question, but from a slightly different perspective... The correspondence between acceptance regions of tests and confidence sets holds in general."
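
To see the correspondence numerically, here is a quick Python check (the data and the grid of mu0 values are illustrative): a level-0.05 two-sided t-test rejects a hypothesized mean mu0 exactly when the 95% confidence interval excludes mu0.

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.normal(0.3, 1.0, 40)

# 95% CI for the mean, built from the t distribution.
lo, hi = stats.t.interval(0.95, df=len(data) - 1,
                          loc=np.mean(data), scale=stats.sem(data))

for mu0 in np.linspace(-0.5, 1.0, 16):
    reject = stats.ttest_1samp(data, mu0).pvalue < 0.05
    outside = not (lo <= mu0 <= hi)
    assert reject == outside  # the two criteria agree for every mu0

print("test rejections and CI exclusions agree across the whole grid")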

In the top link, to the extent that BASP has "banned the use of NHSTP and related statistical procedures", they have also had to ban confidence intervals: "Regarding confidence intervals, the problem is that, for example, a 95% confidence interval does not indicate that the parameter of interest has a 95% probability of being within the interval..." (again from your top link, quoting the BASP journal).
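
What a 95% interval does guarantee is long-run coverage: across repeated samples, about 95% of the intervals computed this way contain the fixed true parameter; any single interval simply does or does not. A simulation (made-up numbers) shows the distinction:

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
true_mean, n, trials = 10.0, 25, 10000
covered = 0
for _ in range(trials):
    sample = rng.normal(true_mean, 2.0, n)
    lo, hi = stats.t.interval(0.95, df=n - 1,
                              loc=sample.mean(), scale=stats.sem(sample))
    covered += (lo <= true_mean <= hi)  # this interval covers, or it doesn't

print(f"long-run coverage: {covered / trials:.3f}")  # ~0.95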

Plus they've given a "lukewarm... case-by-case" acceptance of Bayesian procedures, which are customarily held out as the only alternative. So it looks to my eye like BASP has just entirely ruled out any inferential methods at all -- any way to make generalizations from data to the larger world.
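
For contrast, a minimal Bayesian sketch (a conjugate normal model with the data variance assumed known, purely to keep it short) does deliver the statement the journal says confidence intervals cannot make: a 95% probability interval for the parameter, given the data.

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sigma = 1.0                          # data sd, assumed known here
data = rng.normal(0.5, sigma, 30)

# Prior on the mean: Normal(0, 10^2); conjugate update for a normal mean.
prior_mu, prior_var = 0.0, 100.0
n = len(data)
post_var = 1.0 / (1.0 / prior_var + n / sigma**2)
post_mu = post_var * (prior_mu / prior_var + data.sum() / sigma**2)

lo, hi = stats.norm.interval(0.95, loc=post_mu, scale=np.sqrt(post_var))
print(f"95% credible interval for the mean: ({lo:.3f}, {hi:.3f})")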

My next suggestion would be that you look up John Ioannidis' essay "Why Most Published Research Findings Are False" (2005), his theory of what a "null field" would look like, and his suggestions for improving the situation.
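
The arithmetic at the heart of that essay is easy to sketch: the chance that a "significant" finding reflects a real effect (the positive predictive value) depends on the prior odds of a true effect, the power, and alpha, not on the p-value alone. Illustrative numbers:

def ppv(prior, power=0.8, alpha=0.05):
    """P(effect is real | p < alpha), ignoring bias and multiple testing."""
    return (power * prior) / (power * prior + alpha * (1 - prior))

for prior in (0.5, 0.1, 0.01):
    print(f"prior {prior:>4}: PPV = {ppv(prior):.2f}")
# prior 0.5 -> ~0.94; prior 0.1 -> ~0.64; prior 0.01 -> ~0.14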

Delta said...

And perhaps even more on topic: Valen Johnson, in "Revised standards for statistical evidence" (2013), compares classical significance tests to Bayesian tests and suggests that the p-value threshold for significance should more properly be 0.005 or 0.001 (i.e., ten times as strict, or more). I tend to agree with this, particularly in the context of many researchers vying for publication for career advancement purposes.
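
Reusing the positive-predictive-value arithmetic from my previous comment (the prior and power are illustrative, and I am holding power fixed even though a stricter alpha would cost some power at a fixed sample size), the effect of Johnson's thresholds is easy to see:

def ppv(prior, power, alpha):
    return (power * prior) / (power * prior + alpha * (1 - prior))

prior, power = 0.1, 0.8
for alpha in (0.05, 0.005, 0.001):
    print(f"alpha {alpha:>5}: PPV = {ppv(prior, power, alpha):.2f}")
# alpha 0.05 -> ~0.64; 0.005 -> ~0.95; 0.001 -> ~0.99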