Recommended readings

Here is a growing list of some of my favourite articles on data analysis, mostly ones I recommend to students or when reviewing or editing papers.

There is no free lunch in inference

Meehl, P. E. (1997). The Problem is Epistemology, Not Statistics: Replace Significance Tests by Confidence Intervals and Quantify Accuracy of Risky Numerical Predictions. In L. L. Harlow, S. A. Mulaik, & J. H. Steiger (Eds.), What If There Were No Significance Tests? Psychology Press.
Rouder, J. N., Morey, R. D., Verhagen, J., Province, J. M., & Wagenmakers, E.-J. (2016). Is There a Free Lunch in Inference? Topics in Cognitive Science, 8(3), 520–547.
Blume, J. D., Greevy, R. A., Welty, V. F., Smith, J. R., & Dupont, W. D. (2019). An Introduction to Second-Generation p-Values. The American Statistician, 73(sup1), 157–167.
Riesthuis P. Simulation-Based Power Analyses for the Smallest Effect Size of Interest: A Confidence-Interval Approach for Minimum-Effect and Equivalence Testing. Advances in Methods and Practices in Psychological Science. 2024;7(2).
Kruschke JK. Rejecting or Accepting Parameter Values in Bayesian Estimation. Advances in Methods and Practices in Psychological Science. 2018;1(2):270-280.

Review & position papers

Tukey, J.W. (1969) Analyzing Data – Sanctification or Detective Work? American Psychologist, 24, 83-91. [inspiring thoughts from a true visionary, hinting at what would become data science]
Shmueli, G. (2010). To Explain or to Predict? Statistical Science, 25(3), 289–310. doi:10.1214/10-STS330. https://projecteuclid.org/euclid.ss/1294167961
Fiedler, K. (2011). Voodoo Correlations Are Everywhere—Not Only in Neuroscience. Perspectives on Psychological Science, 6(2), 163–171. https://doi.org/10.1177/1745691611400237
Button, K.S., Ioannidis, J.P., Mokrysz, C., Nosek, B.A., Flint, J., Robinson, E.S. & Munafò, M.R. (2013) Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365-376.
Gelman, A., & Carlin, J. (2014). Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science, 9(6), 641–651. https://doi.org/10.1177/1745691614551642
Forstmeier, W., Wagenmakers, E.J. & Parker, T.H. (2016) Detecting and avoiding likely false-positive findings – a practical guide. Biol Rev Camb Philos Soc. [best summary of statistical, experimental and experimenter issues leading to false positives]

Understanding P values

Greenland, S., Senn, S.J., Rothman, K.J. et al. (2016) Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol 31: 337. https://doi.org/10.1007/s10654-016-0149-3
Ronald L. Wasserstein & Nicole A. Lazar (2016) The ASA’s Statement on p-Values: Context, Process, and Purpose, The American Statistician, 70:2, 129-133, DOI: 10.1080/00031305.2016.1154108
Wagenmakers, E.J. (2007) A practical solution to the pervasive problems of p values. Psychonomic bulletin & review, 14, 779-804. [one of the best summaries of problems with p values]

A world without mindless dichotomies (p<0.05)

Meehl, P.E. (1997) The Problem Is Epistemology, Not Statistics: Replace Significance Tests by Confidence Intervals and Quantify Accuracy of Risky Numerical Predictions. In: Harlow, L., Mulaik, S.A. and Steiger, J.H., Eds., What If There Were No Significance Tests? Erlbaum, Mahwah, NJ, 393-425.
Kruschke, J.K. & Liddell, T.M. (2018) The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective. Psychon Bull Rev 25: 178. https://doi.org/10.3758/s13423-016-1221-4
Gelman, A. (2018). The Failure of Null Hypothesis Significance Testing When Studying Incremental Changes, and What to Do About It. Personality and Social Psychology Bulletin, 44(1), 16–23. https://doi.org/10.1177/0146167217729162
Blakeley B. McShane, David Gal, Andrew Gelman, Christian Robert, Jennifer L. Tackett (2018) Abandon Statistical Significance. arXiv [there is no need for arbitrary thresholds, let’s embrace uncertainty!]
Valentin Amrhein, David Trafimow & Sander Greenland (2018) Inferential statistics as descriptive statistics: there is no replication crisis if we don’t expect replication. PeerJ Preprints [scientific life without zombie thresholds – refreshing]

Solutions to common issues

Jaeger, T.F. (2008) Categorical Data Analysis: Away from ANOVAs (transformation or not) and towards Logit Mixed Models. J Mem Lang, 59, 434-446.
MacCallum, R.C., Zhang, S., Preacher, K.J. & Rucker, D.D. (2002) On the practice of dichotomization of quantitative variables. Psychological Methods, 7, 19-40.
Sassenhagen, J. & Alday, P.M. (2016) A common misapplication of statistical inference: nuisance control with null-hypothesis significance tests. Brain and Language

Beyond power calculations: planning for precision

Gelman, A. & Carlin, J. (2014) Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspect Psychol Sci, 9, 641-651.
Maxwell, S.E., Kelley, K. & Rausch, J.R. (2008) Sample size planning for statistical power and accuracy in parameter estimation. Annu Rev Psychol, 59, 537-563.
Peters, G.-J.Y. & Crutzen, R. (2017) Knowing exactly how effective an intervention, treatment, or manipulation is and ensuring that a study replicates: accuracy in parameter estimation as a partial solution to the replication crisis. PsyArXiv. doi:10.31234/osf.io/cjsk2.
Rothman, K.J. & Greenland, S. (2018) Planning Study Size Based on Precision Rather Than Power. Epidemiology, 29, 599-603.

Data visualisation

Allen, E.A., Erhardt, E.B. & Calhoun, V.D. (2012) Data visualization in the neurosciences: overcoming the curse of dimensionality. Neuron, 74, 603-608.
Anscombe, F.J. (1973) Graphs in Statistical Analysis. Am Stat, 27, 17-21.
Weissgerber, T.L., Milic, N.M., Winham, S.J. & Garovic, V.D. (2015) Beyond bar and line graphs: time for a new data presentation paradigm. PLoS Biol, 13, e1002128.
Weissgerber, T.L., Garovic, V.D., Winham, S.J., Milic, N.M. & Prager, E.M. (2016) Transparent reporting for reproducible science. J Neurosci Res, 94, 859-864.
Wilcox, R.R. (2006) Graphical methods for assessing effect size: Some alternatives to Cohen’s d. Journal of Experimental Education, 74, 353-367.

Data description

DeCarlo, L.T. (1997) On the meaning and use of kurtosis. Psychol. Meth., 2, 292-307.

Interactions

Nieuwenhuis, S., Forstmann, B.U. & Wagenmakers, E.J. (2011) Erroneous analyses of interactions in neuroscience: a problem of significance. Nat Neurosci, 14, 1105-1107.
Gelman, A. & Stern, H. (2006) The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant. The American Statistician, 60, 328–331.
Loftus, G.R. On interpretation of interactions. Memory & Cognition 6, 312–319 (1978).
Wagenmakers EJ, Krypotos AM, Criss AH, Iverson G. On the interpretation of removable interactions: a survey of the field 33 years after Loftus. Mem Cognit. 2012 Feb;40(2):145-60.
Rohrer JM, Arslan RC. Precise Answers to Vague Questions: Issues With Interactions. Advances in Methods and Practices in Psychological Science. 2021;4(2).
Kellen, D., Davis-Stober, C. P., Dunn, J. C., & Kalish, M. L. (2021). The Problem of Coordination and the Pursuit of Structural Constraints in Psychology. Perspectives on Psychological Science, 16(4), 767-778.
Sommet N, Weissman DL, Cheutin N, Elliot AJ. How Many Participants Do I Need to Test an Interaction? Conducting an Appropriate Power Analysis and Achieving Sufficient Power to Detect an Interaction. Advances in Methods and Practices in Psychological Science. 2023;6(3).
For simulations to determine the (large) sample sizes required to compare correlation coefficients, see this blog post and this article.

Robust estimation

Hubert, M., Rousseeuw, P. J., & Van Aelst, S. (2008). High-breakdown robust multivariate methods. Statistical Science, 92-119. [includes robust alternatives to the Mahalanobis distance]
Wilcox, R.R. & Keselman, H.J. (2003) Modern Robust Data Analysis Methods: Measures of Central Tendency. Psychological Methods, 8, 254-274. [introduction to robust estimation – in particular how to deal with skewness and outliers]

1 thought on “Recommended readings”

Denis B. 2019-09-12 at 7:53 am

Hello,
A very interesting blog, which I discovered following the presentation "A simple cure to the p < 0.05 disease" on the "Ouvrir la science" page "Repenser la robustesse et la fiabilité en recherche : les chercheurs face à la crise de la reproductibilité" (Rethinking robustness and reliability in research: researchers facing the reproducibility crisis).

I am surprised to see no reference to the work of Edward Tufte on data visualisation. It may be, however, that these references focus on neuroscience, in which case I understand!

Denis

PS: there is no need to publish this post; it is simply a way of starting a conversation 😉


basic statistics

simple steps to improve statistical analyses in neuroscience & psychology
