Here is a growing list of some of my favourite articles on data analysis; mostly articles I recommend to students or when reviewing or editing papers.
There is no free lunch in inference
- Meehl, P. E. (1997). The Problem is Epistemology, Not Statistics: Replace Significance Tests by Confidence Intervals and Quantify Accuracy of Risky Numerical Predictions. In L. L. Harlow, S. A. Mulaik, & J. H. Steiger (Eds.), What If There Were No Significance Tests? Psychology Press.
- Rouder, J. N., Morey, R. D., Verhagen, J., Province, J. M., & Wagenmakers, E.-J. (2016). Is There a Free Lunch in Inference? Topics in Cognitive Science, 8(3), 520–547.
- Blume, J. D., Greevy, R. A., Welty, V. F., Smith, J. R., & Dupont, W. D. (2019). An Introduction to Second-Generation p-Values. The American Statistician, 73(sup1), 157–167.
- Riesthuis P. Simulation-Based Power Analyses for the Smallest Effect Size of Interest: A Confidence-Interval Approach for Minimum-Effect and Equivalence Testing. Advances in Methods and Practices in Psychological Science. 2024;7(2).
- Kruschke JK. Rejecting or Accepting Parameter Values in Bayesian Estimation. Advances in Methods and Practices in Psychological Science. 2018;1(2):270-280.
Review & position papers
- Tukey, J.W. (1969) Analyzing Data – Sanctification or Detective Work? American Psychologist, 24, 83-91. [inspiring thoughts from a true visionary, hinting at what would become data science]
- Shmueli, Galit. To Explain or to Predict? Statist. Sci. 25 (2010), no. 3, 289–310. doi:10.1214/10-STS330. https://projecteuclid.org/euclid.ss/1294167961
- Fiedler, K. (2011). Voodoo Correlations Are Everywhere—Not Only in Neuroscience. Perspectives on Psychological Science, 6(2), 163–171. https://doi.org/10.1177/1745691611400237
- Button, K.S., Ioannidis, J.P., Mokrysz, C., Nosek, B.A., Flint, J., Robinson, E.S. & Munafo, M.R. (2013) Power failure: why small sample size undermines the reliability of neuroscience. Nature reviews. Neuroscience, 14, 365-376.
- Gelman, A., & Carlin, J. (2014). Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science, 9(6), 641–651. https://doi.org/10.1177/1745691614551642
- Forstmeier, W., Wagenmakers, E.J. & Parker, T.H. (2016) Detecting and avoiding likely false-positive findings – a practical guide. Biol Rev Camb Philos Soc. [best summary of statistical, experimental and experimenter issues leading to false positives]
Understanding P values
- Greenland, S., Senn, S.J., Rothman, K.J. et al. (2016) Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol 31: 337. https://doi.org/10.1007/s10654-016-0149-3
- Ronald L. Wasserstein & Nicole A. Lazar (2016) The ASA’s Statement on p-Values: Context, Process, and Purpose, The American Statistician, 70:2, 129-133, DOI: 10.1080/00031305.2016.1154108
- Wagenmakers, E.J. (2007) A practical solution to the pervasive problems of p values. Psychonomic bulletin & review, 14, 779-804. [one of the best summaries of problems with p values]
A world without mindless dichotomies (p<0.05)
- Meehl, P.E. (1997) The Problem Is Epistemology, Not Statistics: Replace Significance Tests by Confidence Intervals and Quantify Accuracy of Risky Numerical Predictions. In: Harlow, L., Mulaik, S.A. and Steiger, J.H., Eds., What If There Were No Significance Tests? Erlbaum, Mahwah, NJ, 393-425.
- Kruschke, J.K. & Liddell, T.M. (2018) The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective. Psychon Bull Rev 25: 178. https://doi.org/10.3758/s13423-016-1221-4
- Gelman, A. (2018). The Failure of Null Hypothesis Significance Testing When Studying Incremental Changes, and What to Do About It. Personality and Social Psychology Bulletin, 44(1), 16–23. https://doi.org/10.1177/0146167217729162
- Blakeley B. McShane, David Gal, Andrew Gelman, Christian Robert, Jennifer L. Tackett (2018) Abandon Statistical Significance. arXiv [there is no need for arbitrary thresholds, let’s embrace uncertainty!]
- Valentin Amrhein, David Trafimow & Sander Greenland (2018) Inferential statistics as descriptive statistics: there is no replication crisis if we don’t expect replication. PeerJ Preprints [scientific life without zombie thresholds – refreshing]
Solutions to common issues
- Jaeger, T.F. (2008) Categorical Data Analysis: Away from ANOVAs (transformation or not) and towards Logit Mixed Models. J Mem Lang, 59, 434-446.
- MacCallum, R.C., Zhang, S., Preacher, K.J. & Rucker, D.D. (2002) On the practice of dichotomization of quantitative variables. Psychological Methods, 7, 19-40.
- Sassenhagen, J. & Alday, P.M. (2016) A common misapplication of statistical inference: nuisance control with null-hypothesis significance tests. Brain and Language
Beyond power calculations: planning for precision
- Gelman, A. & Carlin, J. (2014) Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspect Psychol Sci, 9, 641-651.
- Maxwell, S.E., Kelley, K. & Rausch, J.R. (2008) Sample size planning for statistical power and accuracy in parameter estimation. Annu Rev Psychol, 59, 537-563.
- Peters, G.-J.Y. & Crutzen, R. (2017) Knowing exactly how effective an intervention, treatment, or manipulation is and ensuring that a study replicates: accuracy in parameter estimation as a partial solution to the replication crisis. PsyArXiv. doi:10.31234/osf.io/cjsk2.
- Rothman, K.J. & Greenland, S. (2018) Planning Study Size Based on Precision Rather Than Power. Epidemiology, 29, 599-603.
Data visualisation
- Allen, E.A., Erhardt, E.B. & Calhoun, V.D. (2012) Data visualization in the neurosciences: overcoming the curse of dimensionality. Neuron, 74, 603-608.
- Anscombe, F.J. (1973) Graphs in Statistical Analysis. Am Stat, 27, 17-21.
- Weissgerber, T.L., Milic, N.M., Winham, S.J. & Garovic, V.D. (2015) Beyond bar and line graphs: time for a new data presentation paradigm. PLoS Biol, 13, e1002128.
- Weissgerber, T.L., Garovic, V.D., Winham, S.J., Milic, N.M. & Prager, E.M. (2016) Transparent reporting for reproducible science. J Neurosci Res, 94, 859-864.
- Wilcox, R.R. (2006) Graphical methods for assessing effect size: Some alternatives to Cohen’s d. Journal of Experimental Education, 74, 353-367.
Data description
- DeCarlo, L.T. (1997) On the meaning and use of kurtosis. Psychol. Meth., 2, 292-307.
Interactions
- Nieuwenhuis, S., Forstmann, B.U. & Wagenmakers, E.J. (2011) Erroneous analyses of interactions in neuroscience: a problem of significance. Nat Neurosci, 14, 1105-1107.
- Gelman, A. & Stern, H. (2006) The Difference Between “Significant” and “Not Significant” is not Itself Statistically Significant. The American Statistician, 60, 328–331.
- Loftus, G.R. On interpretation of interactions. Memory & Cognition 6, 312–319 (1978).
- Wagenmakers EJ, Krypotos AM, Criss AH, Iverson G. On the interpretation of removable interactions: a survey of the field 33 years after Loftus. Mem Cognit. 2012 Feb;40(2):145-60.
- Rohrer JM, Arslan RC. Precise Answers to Vague Questions: Issues With Interactions. Advances in Methods and Practices in Psychological Science. 2021;4(2).
- Kellen, D., Davis-Stober, C. P., Dunn, J. C., & Kalish, M. L. (2021). The Problem of Coordination and the Pursuit of Structural Constraints in Psychology. Perspectives on Psychological Science, 16(4), 767-778.
- Sommet N, Weissman DL, Cheutin N, Elliot AJ. How Many Participants Do I Need to Test an Interaction? Conducting an Appropriate Power Analysis and Achieving Sufficient Power to Detect an Interaction. Advances in Methods and Practices in Psychological Science. 2023;6(3).
- For simulations to determine the (large) sample sizes required to compare correlation coefficients, see this blog post and this article.
Robust estimation
- Hubert, M., Rousseeuw, P. J., & Van Aelst, S. (2008). High-breakdown robust multivariate methods. Statistical Science, 92-119. [includes robust alternatives to the Mahalanobis distance]
- Wilcox, R.R. & Keselman, H.J. (2003) Modern Robust Data Analysis Methods: Measures of Central Tendency. Psychological Methods, 8, 254-274. [introduction to robust estimation – in particular how to deal with skewness and outliers]
Hello,
A very interesting blog, which I discovered through the presentation "A simple cure to the p < 0.05 disease", on the "ouvrir la science" page "Repenser la robustesse et la fiabilité en recherche : les chercheurs face à la crise de la reproductibilité."
I am surprised to see no reference to Edward Tufte's work on data visualisation. It may be, though, that these references focus on neuroscience, in which case I understand!
Denis
PS: there is no need to publish this comment; it is simply a way to start a conversation 😉