{"id":6569,"date":"2022-04-08T07:00:46","date_gmt":"2022-04-08T11:00:46","guid":{"rendered":"https:\/\/datacolada.org\/?p=6569"},"modified":"2022-04-09T08:48:29","modified_gmt":"2022-04-09T12:48:29","slug":"100-groundhog-2-0-further-addressing-the-threat-r-poses-to-reproducible-research","status":"publish","type":"post","link":"https:\/\/datacolada.org\/100","title":{"rendered":"[100] Groundhog 2.0: Further addressing the threat R poses to reproducible research"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">About a year ago I wrote <a href=\"https:\/\/datacolada.org\/95\" target=\"_blank\" rel=\"noopener\">Colada[95]<\/a>, a post on the threat R poses to reproducible research. The core issue is the 'packages'. When using R, you can run <code>library(some_package)<\/code> and R can all of a sudden scrape a website, cluster standard errors, maybe even help you levitate. The problem is that packages get updated often, and on occasion in 'backwards incompatible' ways, making your existing code obsolete. The code that works today may not work tomorrow.\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">In that post I introduced a solution to this problem with R packages: a new R package. 
It's called groundhog.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">With groundhog, the only thing you need to change to make your R code reproducible is:<\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\">Instead of: <code>library(pkg)<\/code><\/span><br \/>\n<span style=\"font-family: helvetica, arial, sans-serif;\">Do this: \u00a0 \u00a0\u00a0 <code>groundhog.library(pkg, date)<\/code><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">Now every time you run that code, you load the version of the package that was available on that <code>date<\/code>, regardless of when you run it. That's really <span style=\"text-decoration: underline;\">all<\/span> you need to do to dramatically improve the reproducibility of your R Code.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">This post is an update on three fronts:<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">1) Share new evidence on iRreproducibility (geRit?)<br \/>\n2) Announce groundhog 2.0 (key new feature: it works with GitHub packages)<br \/>\n3) A 'show me the money' moment.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">If you are familiar with <code>renv<\/code> and wonder how it compares to groundhog, <a href=\"http:\/\/groundhogr.com\/renv\" target=\"_blank\" rel=\"noopener\">check this out (.htm)<\/a><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong>1) New evidence that R is a threat to research reproducibility<br \/>\n<\/strong>Over the past year I have learned a few things that made me <em>more<\/em> concerned about R's reproducibility (more concerned than when I was motivated to spend months of my life developing a 
fricking R package to address that concern). I will mention three things.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong><em>Thing 1. Most posted R scripts apparently don't even run<br \/>\n<\/em><\/strong>A paper published a few weeks ago in <em>Nature: Scientific Data (.<a href=\"https:\/\/www.nature.com\/articles\/s41597-022-01143-6\" target=\"_blank\" rel=\"noopener\">htm<\/a>)<\/em> attempted to automatically re-execute 2335 R scripts posted as supporting materials for published papers. After cleaning the scripts (installing necessary packages and fixing paths to local files), only 44% of scripts ran without generating errors. So, <em>most<\/em> scripts did not run. Moreover, 21% of <em>all<\/em> failures were attributed to packages not loading. This is an underestimate of the problem caused by packages, because a package may load successfully but produce different results, or produce an error elsewhere (see \"show me the money\" example at the end of this post).<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">The script cleaning was done in an automated way, and this is important. It seems likely that some of the scripts that did not run could have run if a person had read and edited the code before executing it. 
But there is one key finding that is free of this ambiguity: the <span style=\"text-decoration: underline;\">same<\/span> scripts run in some but not other versions of R.\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><a href=\"https:\/\/datacolada.org\/wp-content\/uploads\/R-reproducibility-rates.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-6575\" src=\"https:\/\/datacolada.org\/wp-content\/uploads\/R-reproducibility-rates.png\" alt=\"\" width=\"1886\" height=\"1174\" srcset=\"https:\/\/datacolada.org\/wp-content\/uploads\/R-reproducibility-rates.png 1886w, https:\/\/datacolada.org\/wp-content\/uploads\/R-reproducibility-rates-300x187.png 300w, https:\/\/datacolada.org\/wp-content\/uploads\/R-reproducibility-rates-1024x637.png 1024w, https:\/\/datacolada.org\/wp-content\/uploads\/R-reproducibility-rates-768x478.png 768w, https:\/\/datacolada.org\/wp-content\/uploads\/R-reproducibility-rates-1536x956.png 1536w, https:\/\/datacolada.org\/wp-content\/uploads\/R-reproducibility-rates-850x529.png 850w\" sizes=\"auto, (max-width: 1886px) 100vw, 1886px\" \/><\/a><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong>Fig 1.<\/strong> Uri made this figure based on numbers reported in Figure 11 (.<a href=\"https:\/\/datacolada.org\/wp-content\/uploads\/fig11.png\" target=\"_blank\" rel=\"noopener\">png<\/a>) of Trisovic et al. (2022).<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">R-3.2 may feel like ancient history, but when Trump was running for president, R was running 3.2. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">Groundhog also helps with this source of irreproducibility. 
Specifically, when the date entered in <code>groundhog.library()<\/code> does not match the R version being used, groundhog gives a warning and suggests either a different date that matches the R version in use, or the version of R to use with the date entered.<br \/>\n<\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\">Running multiple versions of R on the same computer is trivial on Windows and quite easy on a Mac. Groundhog's website provides <a href=\"http:\/\/groundhogr.com\/many\" target=\"_blank\" rel=\"noopener\">step-by-step<\/a> instructions.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong><em>Thing 2. Come to think of it: this is a big pain in the Rs for books<br \/>\n<\/em><\/strong>I recently noticed that books about R, or that rely on R, often include printed-out lists of the version of each package they used (see figure below). This highlights that I am not the only one worried about the issue of R stability. And just think about it. You may work for years on a book on causal inference, say, or interpreting interactions, and the examples you include in your book, if they involve R code, may stop working shortly after the book is finished (possibly even before it is published). This seems tolerable for a book <em>about<\/em> R, say \"R for Data Science\" (as planned obsolescence is not a terrible business model), but it seems less tolerable for a book on anything else that happens to use R to give concrete examples. A book on causal inference is not going to be revised every 6 months just to make sure all the R examples still run. 
In fact, it may <em>never<\/em> get revised.\u00a0 <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">Attempting to fix a book's reproducibility problem by printing a list of used package versions is both impractical and dangerous (see footnote for why it is impractical and dangerous [<a href=\"#footnote_0_6569\" id=\"identifier_0_6569\" class=\"footnote-link footnote-identifier-link\" title=\"Impractical because the kind of person who needs an R book will probably struggle understanding what that&#039;s all about or how to handle the list. Dangerous because if a package does change, and does break an example in the book, simply installing the old version will, OK, maybe fix this book, but probably break other scripts written based on the newer version of the package (e.g., examples in newer books).\">1<\/a>]). An alternative is to write books relying on groundhog.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">In fact, textbooks provide a textbook example of groundhog's simplicity:<\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><a href=\"https:\/\/datacolada.org\/wp-content\/uploads\/ppt2-book-2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-6581 size-full\" style=\"border: 1px solid #000000;\" src=\"https:\/\/datacolada.org\/wp-content\/uploads\/ppt2-book-2.png\" alt=\"\" width=\"1432\" height=\"802\" srcset=\"https:\/\/datacolada.org\/wp-content\/uploads\/ppt2-book-2.png 1432w, https:\/\/datacolada.org\/wp-content\/uploads\/ppt2-book-2-300x168.png 300w, https:\/\/datacolada.org\/wp-content\/uploads\/ppt2-book-2-1024x573.png 1024w, https:\/\/datacolada.org\/wp-content\/uploads\/ppt2-book-2-768x430.png 768w, https:\/\/datacolada.org\/wp-content\/uploads\/ppt2-book-2-850x476.png 850w\" sizes=\"auto, (max-width: 1432px) 100vw, 1432px\" \/><\/a><\/span><\/p>\n<p><span style=\"font-family: 
helvetica, arial, sans-serif;\">That figure on the left is actually a small portion of the full list. Check. It. Out:<\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><a href=\"https:\/\/datacolada.org\/wp-content\/uploads\/Long-list.png\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-6582 aligncenter\" src=\"https:\/\/datacolada.org\/wp-content\/uploads\/Long-list.png\" alt=\"\" width=\"267\" height=\"971\" \/><\/a><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong><em>Thing 3. Personal experience with abandoned packages<\/em><em><br \/>\n<\/em><\/strong>While working on a research project I needed to revisit my own code from just a few months earlier. But I could not install a package that I <em>needed<\/em> because it was no longer available on CRAN (it had been archived because one of its dependencies was no longer being maintained). My own code would not have run on my own machine. But I had groundhog, and so it did.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong>2) New features since groundhog 1.0.0<br \/>\n<\/strong>The current version of groundhog on CRAN is v1.5.0. 
It has some neat features introduced since the original release v1.0.0, including the possibility of loading\/installing a set of packages in a single call:<em>\u00a0<\/em><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><code>pkgs &lt;- c('metafor','pwr', 'jsonlite')<br \/>\n<\/code><code>groundhog.library(pkgs,'2022-03-01')<\/code><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">And having version-control for groundhog itself<em>\u00a0<\/em><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><code>meta.groundhog('2022-01-01')\u00a0<\/code> <span style=\"color: #008000;\">#load the version of groundhog available that date.<\/span>\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">A bigger innovation is that the <em><u>next<\/u><\/em> release of groundhog, v2.0.0, will work not just with CRAN but also with git repositories (GitHub and GitLab).<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">Like this:<br \/>\n<code>groundhog.library('crsh\/papaja' , '2022-03-01')<\/code><\/span><br \/>\n<span style=\"font-family: helvetica, arial, sans-serif;\"><code>groundhog.library('gitlab::jimhester\/covr', '2022-03-01')<\/code><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">Git packages arguably need version control even more than CRAN packages do; see footnote [<a href=\"#footnote_1_6569\" id=\"identifier_1_6569\" class=\"footnote-link footnote-identifier-link\" title=\"Git packages need version control even more than CRAN packages for &nbsp;two opposing reasons. 
First, some git packages get edited very frequently, some daily, but their version numbers do not get updated; thus the same package-version is actually a different package on Monday vs Tuesday. With groundhog packages are identified by date, so that&#039;s fixed. Second, and on the other hand, some packages are updated very infrequently. Which means they can become unusable when the CRAN packages they depend on get updated but the git package does not. Loading git packages with groundhog, you will always load the dependencies that package actually needs and relied on when you first wrote your script. Having said that, it is not possible to version control git packages as reliably as CRAN packages. See why here: (.htm\">2<\/a>]<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">Groundhog 2.0.0 should be on CRAN by May 2022.<br \/>\nIn the meantime, you can use the almost there version: v1.9.9.9999 available on GitHub:<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><code>remotes::install_github('CredibilityLab\/groundhog')<\/code><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong>3) Show me the money<br \/>\n<\/strong>I thought it would be interesting to try and see if groundhog could 'rescue' an R script that the <em>Nature: Scientific Data<\/em> paper reported as failing to reproduce.\u00a0<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">I downloaded one R script flagged as non-reproducible, supporting a paper published in <em>Political Analysis<\/em> (.<a href=\"https:\/\/dataverse.harvard.edu\/file.xhtml?persistentId=doi:10.7910\/DVN\/SVAONZ\/F2GJTE&amp;version=1.0\">htm<\/a><a href=\"https:\/\/dataverse.harvard.edu\/file.xhtml?persistentId=doi:10.7910\/DVN\/SVAONZ\/F2GJTE&amp;version=1.0\">)<\/a> [<a href=\"#footnote_2_6569\" 
id=\"identifier_2_6569\" class=\"footnote-link footnote-identifier-link\" title=\"I had two attempts prior to this one. Both actually ran as-is, so I did not reproduce their irreproducibility; in these cases there was no room for groundhog to improve anything. &nbsp;The third script I tried did produce an error. That&#039;s the one discussed above.\">3<\/a>]. The script has just 9 active lines of code; it is supposed to estimate a regression with fixed effects using the package '<code>bife<\/code>' (after loading the data with '<code>foreign<\/code>'). <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">After running that script in R-4.1.3 (current version as of this writing), however, all I got was this ugly thing:<\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><a href=\"https:\/\/datacolada.org\/wp-content\/uploads\/error-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-6601 size-full\" style=\"border: 1px solid #000000;\" src=\"https:\/\/datacolada.org\/wp-content\/uploads\/error-1.png\" alt=\"\" width=\"884\" height=\"89\" srcset=\"https:\/\/datacolada.org\/wp-content\/uploads\/error-1.png 884w, https:\/\/datacolada.org\/wp-content\/uploads\/error-1-300x30.png 300w, https:\/\/datacolada.org\/wp-content\/uploads\/error-1-768x77.png 768w, https:\/\/datacolada.org\/wp-content\/uploads\/error-1-850x86.png 850w\" sizes=\"auto, (max-width: 884px) 100vw, 884px\" \/><\/a><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">The R script had been uploaded to the <em>Dataverse<\/em> in January 2019, so I thought to try late 2018 as a reference date. Back then the current version of R was R-3.5. 
So I started up R-3.5 in RStudio (again, it is easy to have multiple versions of R on the same computer; <a href=\"http:\/\/groundhogr.com\/many\" target=\"_blank\" rel=\"noopener\">see how<\/a>).<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">I then did <code>install.packages(c('foreign','bife'))<\/code>, loaded both packages with <code>library()<\/code> and ran the 9 lines again. And&#8230;<br \/>\n&#8230;no luck. Got the same error message. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">So I used <code>groundhog.library()<\/code> instead. And&#8230;<br \/>\n&#8230;that <em>did<\/em> work. <\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">Specifically, I installed &amp; loaded the packages like this:<\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><a href=\"https:\/\/datacolada.org\/wp-content\/uploads\/error2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-6602 size-full\" style=\"border: 1px solid #000000;\" src=\"https:\/\/datacolada.org\/wp-content\/uploads\/error2.png\" alt=\"\" width=\"401\" height=\"68\" srcset=\"https:\/\/datacolada.org\/wp-content\/uploads\/error2.png 401w, https:\/\/datacolada.org\/wp-content\/uploads\/error2-300x51.png 300w\" sizes=\"auto, (max-width: 401px) 100vw, 401px\" \/><\/a><\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\">And the 9 lines of code produced this beautiful, oddly self-affirming, regression table:<\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><a href=\"https:\/\/datacolada.org\/wp-content\/uploads\/error-3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-6603\" style=\"border: 1px solid #000000;\" src=\"https:\/\/datacolada.org\/wp-content\/uploads\/error-3.png\" alt=\"\" width=\"433\" height=\"214\" 
srcset=\"https:\/\/datacolada.org\/wp-content\/uploads\/error-3.png 619w, https:\/\/datacolada.org\/wp-content\/uploads\/error-3-300x148.png 300w\" sizes=\"auto, (max-width: 433px) 100vw, 433px\" \/><\/a><\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\">If curious, this is the <span style=\"font-size: 12pt;\"><a href=\"https:\/\/datacolada.org\/appendix\/100\/colada%20100%20reproducible%20with%20groundhog%20only.R\" target=\"_blank\" rel=\"noopener\">R Code<\/a><\/span> for the above example.<br \/>\nSee footnote for why the code ran with <code>groundhog.library()<\/code> but not with <code>library()<\/code> [<a href=\"#footnote_3_6569\" id=\"identifier_3_6569\" class=\"footnote-link footnote-identifier-link\" title=\"When you run install.packages() in R, you get whatever version of the package happens to be the most recently available on CRAN (for the version of R that you are running). For the package &#039;bife&#039;, CRAN&#039;s most recent version of bife, for R-3.5.3, is bife_0.7, which was released nearly a year after the Political Analysis paper was published; back then the current version was bife_0.5. So, running install.packages() in 2022 in R-3.5.3, I get bife_0.7, but I needed version bife_0.5, the version loaded by groundhog\">4<\/a>].<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong>Conclusions<br \/>\n<\/strong>A big lesson from Bill Murray's conundrum in the movie Groundhog Day is that if you want different results, you need to try something different. We know by now that relying on R alone for package management (<code>install.packages()<\/code> + <code>library()<\/code>) will lead researchers to share R scripts that do not work. So if we want the R scripts we share to work, we need to try something different. Groundhog is something different. 
Nothing is perfect, but groundhog delivers a huge increase in reproducibility with absolutely minimal effort. <\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><code>groundhog.library(pkg, date)<\/code><\/span><\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><a href=\"https:\/\/datacolada.org\/wp-content\/uploads\/1993-GROUNDHOG-DAY-010.jpg\"><img decoding=\"async\" class=\"size-full wp-image-6647 aligncenter\" src=\"https:\/\/datacolada.org\/wp-content\/uploads\/1993-GROUNDHOG-DAY-010.jpg\" alt=\"\" width=\"250\" srcset=\"https:\/\/datacolada.org\/wp-content\/uploads\/1993-GROUNDHOG-DAY-010.jpg 460w, https:\/\/datacolada.org\/wp-content\/uploads\/1993-GROUNDHOG-DAY-010-300x180.jpg 300w\" sizes=\"(max-width: 460px) 100vw, 460px\" \/><\/a><\/span><\/p>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center;\"><span style=\"font-family: helvetica, arial, sans-serif;\"><strong>Learn more about groundhog:\u00a0<\/strong><\/span><br \/>\n<span style=\"font-family: helvetica, arial, sans-serif;\"><a href=\"https:\/\/datacolada.org\/95\">Colada[95] on Groundhog<\/a>\u00a0 \u00a0|\u00a0 \u00a0 <a href=\"http:\/\/groundhogr.com\">groundhogr.com<\/a>\u00a0 \u00a0 |\u00a0 \u00a0 <a href=\"http:\/\/github.com\/CredibilityLab\/groundhog\">GitHub<\/a><\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-376\" src=\"https:\/\/datacolada.org\/wp-content\/uploads\/2014\/02\/Wide-logo-300x145.jpg\" alt=\"Wide logo\" width=\"78\" height=\"38\" srcset=\"https:\/\/datacolada.org\/wp-content\/uploads\/2014\/02\/Wide-logo-300x145.jpg 300w, https:\/\/datacolada.org\/wp-content\/uploads\/2014\/02\/Wide-logo.jpg 320w\" sizes=\"auto, (max-width: 78px) 100vw, 78px\" \/><\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"color: #0000ff; font-family: helvetica, arial, sans-serif;\"><strong>Author feedback<br \/>\n<\/strong>I shared an early draft of 
this post with Ana Trisovic (.<a href=\"https:\/\/anatrisovic.com\/\" target=\"_blank\" rel=\"noopener\">htm<\/a>), 1st author of the <em>Nature: Scientific Data<\/em> paper and she provided useful feedback and clarifications. The post did change substantially since she read it.\u00a0<\/span><\/p>\n<hr \/>\n<p><span style=\"font-family: helvetica, arial, sans-serif;\"><strong>Footnotes.<\/strong><\/span><\/p>\n<ol class=\"footnotes\">\n<li id=\"footnote_0_6569\" class=\"footnote\">Impractical because the kind of person who needs an R book will probably struggle understanding what that's all about or how to handle the list. Dangerous because if a package does change, and does break an example in the book, simply installing the old version will, OK, maybe fix this book, but probably break other scripts written based on the newer version of the package (e.g., examples in newer books). [<a href=\"#identifier_0_6569\" class=\"footnote-link footnote-back-link\">&#8617;<\/a>]<\/li>\n<li id=\"footnote_1_6569\" class=\"footnote\">Git packages need version control even more than CRAN packages for \u00a0two opposing reasons. First, some git packages get edited very frequently, some daily, but their version numbers do not get updated; thus the same package-version is actually a different package on Monday vs Tuesday. With groundhog packages are identified by date, so that's fixed. Second, and on the other hand, some packages are updated very infrequently. Which means they can become unusable when the CRAN packages they depend on get updated but the git package does not. Loading git packages with groundhog, you will always load the dependencies that package actually needs and relied on when you first wrote your script. Having said that, it is not possible to version control git packages as reliably as CRAN packages. 
See why here: (.<a href=\"https:\/\/groundhogr.com\/github\/\" target=\"_blank\" rel=\"noopener\">htm<\/a>) [<a href=\"#identifier_1_6569\" class=\"footnote-link footnote-back-link\">&#8617;<\/a>]<\/li>\n<li id=\"footnote_2_6569\" class=\"footnote\">I had two attempts prior to this one. Both actually ran as-is, so I did not reproduce their irreproducibility; in these cases there was no room for groundhog to improve anything. \u00a0The third script I tried did produce an error. That's the one discussed above. [<a href=\"#identifier_2_6569\" class=\"footnote-link footnote-back-link\">&#8617;<\/a>]<\/li>\n<li id=\"footnote_3_6569\" class=\"footnote\">When you run <code>install.packages()<\/code> in R, you get whatever version of the package happens to be the most recently available on CRAN (for the version of R that you are running). For the package '<code>bife<\/code>', CRAN's most recent version of bife, for R-3.5.3, is <code>bife_0.7<\/code>, which was released nearly a year after the <em>Political Analysis<\/em> paper was published; back then the current version was <code>bife_0.5<\/code>. So, running <code>install.packages()<\/code> in 2022 in R-3.5.3, I get <code>bife_0.7<\/code>, but I needed version <code>bife_0.5<\/code>, the version loaded by <code>groundhog<\/code>. [<a href=\"#identifier_3_6569\" class=\"footnote-link footnote-back-link\">&#8617;<\/a>]<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>About a year ago I wrote Colada[95], a post on the threat R poses to reproducible research. The core issue is the 'packages'. When using R, you can run library(some_package) and R can all of a sudden scrape a website, cluster standard errors, maybe even help you levitate. 
The problem is that packages get updated&#8230;<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"_wp_rev_ctl_limit":""},"categories":[84,88,87],"tags":[],"class_list":["post-6569","post","type-post","status-publish","format-standard","hentry","category-credibility-lab","category-r","category-reproducibility"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/posts\/6569","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/comments?post=6569"}],"version-history":[{"count":5,"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/posts\/6569\/revisions"}],"predecessor-version":[{"id":6698,"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/posts\/6569\/revisions\/6698"}],"wp:attachment":[{"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/media?parent=6569"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/categories?post=6569"}
,{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/datacolada.org\/wp-json\/wp\/v2\/tags?post=6569"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}