Releases: gabors-data-analysis/da_case_studies

v. 0.9.0. "Frank Exchange of Views"

14 Aug 14:52

Changelog: version 0.9.0. (2025-08-14)

Evolution of code by language since v0.8.3

Changelog summary -- how the content of the repository evolved for each programming language (Python, R and Stata) between the previous release v0.8.3 (25 Nov 2022) and the latest release v0.9.0.

Overall, the transition to seaborn and pyfixest drove most of the Python‑side evolution, while the R side adopted fixest/marginaleffects. Stata materials remained largely stable, reflecting a focus on modernizing the Python and R components for reproducibility and ease of use. No Julia yet.

Python

Python notebooks underwent substantial refactoring and feature additions:

  • Seaborn as the plotting backbone: All chapters migrated from the old plotnine library to seaborn. This included developing a custom da_theme, adding functions for time‑series plots (tsplots), and standardizing default figure sizes. The change eliminated the dependency on plotnine and simplified the plotting pipeline.
  • Regression engine upgrade: Regression examples were moved from statsmodels to the pyfixest package. Code was refactored to accommodate the new API, and formulas were updated to match the textbook notation. Throughout the migration several minor bug fixes were applied, and later updates bumped pyfixest to version 0.30.2.
  • New model‑interpretation tools: A LIME explainer was introduced to help interpret machine‑learning models. Other helper functions were added to improve variable importance and spline calculations.
  • Environment and dependency clean‑up: The Python environment was modernized with new conda/macOS/Windows YAML files, support for Python 3.12, and removal of obsolete packages such as plotnine and shap. prophet, lime and other dependencies were updated.
  • Testing and reproducibility: Scripts were added to automate environment creation and test notebooks. OSF paths/links were integrated, and default data‑loading paths were standardized. Numerous notebooks were tidied up, including fixes to bar‑plot axes and cleaning of gender/earnings data.
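
As an illustration of the notebook-testing idea above, here is a minimal sketch in Python. It is not the repository's actual script: the function names `find_notebooks` and `nbconvert_command` and the default timeout are assumptions made for this example. The sketch only builds the `jupyter nbconvert --execute` command for each notebook and leaves running it to the caller.

```python
import tempfile
from pathlib import Path


def find_notebooks(root):
    """Return all .ipynb files under root, skipping Jupyter checkpoint copies."""
    return sorted(
        p for p in Path(root).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )


def nbconvert_command(notebook, timeout=600):
    """Build (but do not run) a command that executes one notebook top to bottom.

    The timeout (seconds per cell) is an illustrative value, not a project setting.
    """
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        f"--ExecutePreprocessor.timeout={timeout}",
        # Write the executed copy to a temp folder so the source tree stays clean.
        "--output-dir", tempfile.gettempdir(),
        str(notebook),
    ]


if __name__ == "__main__":
    # Print the commands for every notebook below the current folder.
    for nb in find_notebooks("."):
        print(" ".join(nbconvert_command(nb)))
```

Each command can then be passed to `subprocess.run(cmd, check=True)`, so that a notebook raising an error makes the test run fail.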

R

Changes in R code were more targeted but still significant:

  • Adoption of fixest and marginaleffects: Examples previously using base R lm() were rewritten to use feols() from the fixest package. The marginaleffects package was added for computing marginal effects, and formulas were aligned with the notation in the textbook. Minor adjustments were made to ensure compatibility with matchit and dplyr syntax.
  • SHAP support (experimental): An early experiment added SHAP value calculations for R models; although later the focus shifted back to Python, the code remains available for reference.
  • General maintenance: A few bug fixes and readability improvements were made across R scripts, but no major structural changes occurred.

Stata

Stata materials saw minimal changes during this release cycle:

  • Code stability: Most .do files remained unchanged. A small number of scripts were updated to improve labels or path handling; for example, the football‑manager‑success chapter received a minor update to correct a plot option.
  • Consistency with new data paths: Where necessary, OSF links and standardized data paths were incorporated to ensure that Stata examples work seamlessly across operating systems.
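
A standardized, OS-independent data path can be sketched in Python as below. This is an assumption-laden illustration rather than the repository's own helper: the function name `data_path`, the environment variable `DA_DATA_DIR`, and the default folder name `da_data_repo` are all hypothetical.

```python
import os
from pathlib import Path


def data_path(*parts, env_var="DA_DATA_DIR", default="da_data_repo"):
    """Build a path below a single data root, independent of the operating system.

    The root is read from the (hypothetical) environment variable `env_var`;
    if it is unset, a folder named `default` in the user's home directory is used.
    """
    base = Path(os.environ.get(env_var, Path.home() / default))
    return base.joinpath(*parts)


if __name__ == "__main__":
    # Dataset and file names here are placeholders, not actual repository paths.
    print(data_path("some-dataset", "clean", "some-file.csv"))
```

Because `pathlib` inserts the correct separator itself, the same call works on Windows, macOS and Linux without hard-coded slashes.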

v0.8.3-pre-release "Ethics Gradient"

25 Nov 10:21

Pre-release

Date: 25 November, 2022

Minor changes, refactoring

Python

  • plot fixes
  • code quality improvements, refactoring
  • edit some ML codes
  • add SHAP to RF codes

R

  • edits re fixest + marginaleffects

Stata

  • no change

v0.8.2-pre-release "Very Little Gravitas Indeed"

21 Mar 14:51
43e1939

Date: 2022-03-21

Several minor upgrades and bug fixes. Some of the changes are valuable, though, so updating may be worthwhile.

R

There are substantive changes to the R version.

Python

Mostly minor changes

  • correct ROC curve graph
  • change random forest from skranger to sklearn, because skranger did not run properly on Windows
  • minor edits to pipenv in reaction to changes in numpy

Stata

Just cosmetic changes

v0.8.1-pre-release "Sweet and Full of Grace"

22 Oct 10:18

This is a minor update relative to v0.8.0, consisting mostly of bug fixes.

Important changes

  • ch06B new short case study now separate
  • R: modelsummary introduced for summary stats, small edits, error fixes.
  • R: renv cleaned up.
  • Python: minor changes

v0.8.0-pre-release "What Are The Civilian Applications?"

15 Jul 13:08

v0.8.0-pre-release "What Are The Civilian Applications?" is a heavily revised update, now including most scripts in Python.

Key novelties

  • There are a few missing bits and bugs, but the code should mostly work.
  • Environments added for R, Python.
  • Tested on Windows and Mac.
  • Additional bug fixes and edits will follow; the next update is expected in late summer or early fall.

Language specific issues

  • R -- All code ready. Used for graphs in textbook. An environment with all necessary libraries is now available via renv.
  • Stata -- All code ready. Because Stata lacks native machine learning capabilities, there is no code for chapters 15, 16 and 17, and chapter 18 has some limitations. This should change once we test the new Python link-ups, which is planned only for version v1.1.0, expected late 2021.
  • Python -- Almost all code is ready. There are moderate differences from the book and R in ch14-ch17, with some additional checks planned. An environment with all necessary libraries is now available via pipenv.

v0.7.2-pre-release "Limiting Factor"

06 May 13:26
442938a

Pre-release
  • Small changes to ch18 Stata and R.
  • Python improved ch01-ch11
  • Python pip install requirements.txt added.
  • Half-baked python code moved to dev branch till ready

v0.7.1-pre-release "Little Rascal"

09 Mar 07:41

Pre-release

Minor bug corrections, edits since v0.7.0; mostly chapters 19-24 + new python for ch02.

v0.7.0-pre-release "Clear Air Turbulence"

08 Jan 14:22

Second pre-release. R, Stata mostly ready and checked, with some minor edits from v0.6.0.
Python is now available for many but not all case studies; handle with care.

v0.6.0-pre-release "Nervous Energy"

21 Sep 07:00

Pre-release

This release marks the status of the repository when it was switched to public.
R, Stata mostly ready and checked. Python ch01-10 ready, ch11-12 close.