datapinions/evldata

evldata

This repository contains code to build a dataset of eviction rates, demographics, and income. Its purpose is to demonstrate impact charts, as implemented in the impactchart repository. The code that builds impact charts from the data generated here can be found in the evlcharts repository.

Dependencies

The key binary requirements for this project are Python 3.11 or higher and GNU Make. We test with GNU Make version 4.4, but other versions may work.

The Python requirements for the project are listed in requirements.txt and should all be installable in your virtual environment via

pip install -r ./requirements.txt

During development, we use Poetry to manage dependencies, but we try to keep requirements.txt up to date at all times.

Generating the Data Set

The work done by this project is all coordinated through a Makefile.

The entire dataset can be built with the single command

gmake

If you want to remove the data set and all intermediate files,

gmake clean

will do that for you.

The Makefile automates the following steps:

  1. Download data on eviction rates from the Eviction Lab at Princeton University. Specifically, it uses what the Eviction Lab calls proprietary data within their eviction-lab-data-downloads repository.
  2. Download demographic and income data from the U.S. Census for the years covered by the eviction data.
  3. Join the data together at the census tract level. The final result has one row per census tract per year. On that row, it has all relevant fields from the two downloaded data sets for that tract and year. In cases where data for a given tract for a given year is not present in both downloaded data sets, that tract and year combination does not appear in the final dataset.
  4. Compute inflation-adjusted median renter household income in constant 2018 dollars.
  5. Compute fractional values for each of the census demographic fields. For example, the census field B25003B_003E is a count of the renter-occupied households in a tract whose householder identifies as Black or African American alone. We add a field frac_B25003B_003E that expresses this count as a fraction of the total number of renter households. This new field is always between 0.0 and 1.0.
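
Steps 3 through 5 above can be illustrated with a minimal, standard-library-only sketch. This is not the repository's actual code. The row keys `tract`, `year`, `filings`, `median_income`, and `income_2018`, the helper names, and the use of B25003_003E as the total renter household count are all assumptions made for illustration, and the price index values are made-up placeholders rather than real inflation figures.

```python
# Hypothetical sketch of steps 3 through 5; not the repository's actual code.
# All key names except the census variable IDs are illustrative assumptions.


def inner_join(evictions, census):
    """Join two lists of row dicts on (tract, year), keeping only shared keys."""
    census_by_key = {(row["tract"], row["year"]): row for row in census}
    joined = []
    for row in evictions:
        key = (row["tract"], row["year"])
        # A tract-year missing from either input is dropped from the result.
        if key in census_by_key:
            joined.append({**census_by_key[key], **row})
    return joined


def to_constant_dollars(amount, year, price_index, base_year=2018):
    """Scale a nominal amount from `year` into constant base-year dollars."""
    return amount * price_index[base_year] / price_index[year]


def add_fraction(row, count_field, total_field):
    """Add frac_<count_field> = count / total; None when the total is zero."""
    total = row[total_field]
    row[f"frac_{count_field}"] = row[count_field] / total if total else None
    return row


evictions = [
    {"tract": "T1", "year": 2010, "filings": 12},
    {"tract": "T2", "year": 2010, "filings": 3},  # no census match: dropped
]
census = [
    {"tract": "T1", "year": 2010, "B25003_003E": 200,
     "B25003B_003E": 50, "median_income": 30000},
    {"tract": "T3", "year": 2010, "B25003_003E": 100,
     "B25003B_003E": 10, "median_income": 40000},  # no eviction match: dropped
]
placeholder_index = {2010: 80.0, 2018: 100.0}  # made-up values, not real data

rows = inner_join(evictions, census)
for row in rows:
    add_fraction(row, "B25003B_003E", "B25003_003E")
    row["income_2018"] = to_constant_dollars(
        row["median_income"], row["year"], placeholder_index
    )
```

Only tract T1 survives the join, because it is the only tract present in both inputs for 2010. The fraction is left as None when a tract has no renter households, which is one possible way to avoid a division by zero; the repository may handle that case differently.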

Once the Makefile has run successfully, the final dataset will be in the file data/evl_census.csv. From there, you can use it with the code in evlcharts or in any further analysis of your own.
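
As a quick sanity check on the output, the sketch below scans a CSV for `frac_` columns and reports any value outside the range 0.0 to 1.0. It is a hypothetical helper, not part of this repository. It runs on a small in-memory sample whose `tract` and `year` column names are assumptions; to check the real file, you would pass an open handle to data/evl_census.csv instead.

```python
import csv
import io


def out_of_range_fractions(csv_file):
    """Return (row_number, column, value) for frac_ values outside [0.0, 1.0]."""
    problems = []
    for row_number, row in enumerate(csv.DictReader(csv_file), start=1):
        for column, value in row.items():
            # Skip non-fraction columns and empty or missing cells.
            if column is None or not column.startswith("frac_") or not value:
                continue
            number = float(value)
            if not 0.0 <= number <= 1.0:
                problems.append((row_number, column, number))
    return problems


# In-memory stand-in for the real file; the second data row is deliberately bad.
sample = io.StringIO(
    "tract,year,frac_B25003B_003E\n"
    "T1,2010,0.25\n"
    "T2,2010,1.5\n"
)
problems = out_of_range_fractions(sample)

# With the real output, one would instead write something like:
#     with open("data/evl_census.csv", newline="") as f:
#         problems = out_of_range_fractions(f)
```

An empty list means every fraction in the file is within range.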

Data Sources

The Eviction Lab

The data we download from the Eviction Lab is a file called tract_proprietary_valid_2000_2018.csv. The file, along with a corresponding codebook, can be found at https://data-downloads.evictionlab.org/#data-for-analysis/.

The Eviction Lab's preferred citation for this data is

U.S. Census

The groups of variables we use are B25119 for median household income for renters and B25003 and B25003A through B25003I for the population of renters overall and renters of different racial and ethnic groups.

For total population, not just renters, we use B03002.
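
For reference, the full set of group names described above can be enumerated programmatically. The helper below is a hypothetical sketch and does not appear in this repository; it only expands the range B25003A through B25003I and adds the other three groups named in this section.

```python
import string


def census_groups():
    """List the census variable groups named in this section."""
    # Suffixes A through I cover the nine race and ethnicity iterations.
    tenure_by_group = ["B25003" + suffix for suffix in string.ascii_uppercase[:9]]
    return ["B25119", "B25003", *tenure_by_group, "B03002"]


groups = census_groups()
```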
