tianalongjam/loans_analysis

Loan Analysis

Overview

This project analyzes loan applications in Wisconsin for the year 2020, using the Home Mortgage Disclosure Act (HMDA) dataset. The goal is to explore lending patterns, identify potential discrimination, and practice Object-Oriented Programming (OOP) and data structures in Python.

Specifically, this project focuses on:

  • Creating custom classes for Applicant, Loan, and Bank to efficiently handle loan data.
  • Implementing a Binary Search Tree (BST) for efficient loan lookups.
  • Performing statistical analysis on interest rates, applicant demographics, and loan characteristics.
  • Benchmarking BST performance versus naive approaches.

Project Structure

├── loan.py      # Classes for Applicant, Loan, and Bank
├── search.py    # Node and BST classes
├── mp3.ipynb    # Notebook with analysis and questions
├── banks.json   # Bank metadata
├── wi.zip       # HMDA loan data (CSV inside)
└── README.md    # Project documentation

Classes

Applicant

Represents a loan applicant or co-applicant.
Attributes:

  • age (str): Age or age range of the applicant.
  • race (set): Set of racial identities for the applicant.

Methods:

  • lower_age(): Returns the lower bound of the age range as an integer.
  • __repr__(): String representation of the applicant.
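As a rough illustration of how these pieces might fit together, here is a minimal sketch of an Applicant-like class. It assumes ages arrive as HMDA-style strings such as "25-34", "<25", or ">74", and that race values are already decoded into labels; the constructor signature and the choice to treat "<25" as 25 are assumptions, not taken from loan.py.

```python
class Applicant:
    def __init__(self, age, race):
        self.age = age          # assumed format: "25-34", "<25", ">74"
        self.race = set(race)   # assumed: already-decoded race labels

    def lower_age(self):
        # Drop comparison signs, then keep the part before any hyphen.
        cleaned = self.age.replace("<", "").replace(">", "")
        return int(cleaned.split("-")[0])

    def __repr__(self):
        # Sort the set so the representation is stable between runs.
        return f"Applicant({self.age!r}, {sorted(self.race)!r})"
```

Sorting the race set inside `__repr__` is a design choice made here so that printed output is deterministic; the real class may format it differently.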

Loan

Represents a single loan application.
Attributes:

  • loan_amount (float)
  • property_value (float)
  • interest_rate (float)
  • applicants (list of Applicant objects)

Methods:

  • yearly_amounts(yearly_payment): Generator that yields the outstanding loan amount year by year, given a fixed yearly payment.
  • __str__() / __repr__(): Human-readable representation.
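The generator could look something like the sketch below. The simplified constructor is hypothetical (the real class probably builds itself from a CSV row), and the order of operations (yield the balance, then add interest, then subtract the payment) is an assumption rather than something this README states.

```python
class Loan:
    def __init__(self, loan_amount, property_value, interest_rate, applicants=()):
        # Hypothetical constructor: takes already-parsed numeric values.
        self.loan_amount = float(loan_amount)
        self.property_value = float(property_value)
        self.interest_rate = float(interest_rate)   # percent per year
        self.applicants = list(applicants)

    def yearly_amounts(self, yearly_payment):
        # Assumed schedule: report the balance, add a year of interest,
        # then subtract the payment. Never terminates if the payment
        # does not exceed the yearly interest.
        amount = self.loan_amount
        while amount > 0:
            yield amount
            amount += amount * self.interest_rate / 100
            amount -= yearly_payment


loan = Loan(1000, 1500, 10)
balances = list(loan.yearly_amounts(600))
```

Because it is a generator, a caller can stop early (for example with `itertools.islice`) instead of materializing the whole schedule.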

Bank

Represents a bank and its loans.
Attributes:

  • bank (str): Bank name
  • lei (str): Legal Entity Identifier
  • loan_list (list): All Loan objects for this bank

Special Methods:

  • __len__() → Returns number of loans
  • __getitem__(index) → Enables indexing: bank[0]
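A minimal sketch of how the two special methods can delegate to `loan_list` is shown below. The constructor here simply receives the loans; how the real Bank loads them from banks.json and wi.zip is not shown in this README, so that part is left out, and the LEI string is a placeholder.

```python
class Bank:
    def __init__(self, name, lei, loans):
        # Hypothetical constructor: loans are passed in rather than loaded.
        self.bank = name
        self.lei = lei
        self.loan_list = list(loans)

    def __len__(self):
        # Supports len(bank)
        return len(self.loan_list)

    def __getitem__(self, index):
        # Supports bank[0], bank[-1], slices, and plain for-loops
        return self.loan_list[index]


bank = Bank("Example Bank", "PLACEHOLDER-LEI", ["loan A", "loan B"])
first = bank[0]
count = len(bank)
```

Defining `__getitem__` with integer indexes also makes the object iterable, so `for loan in bank:` works without a separate `__iter__`.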

Node and BST (search.py)

Custom Binary Search Tree for storing loans keyed by interest rate.

Node Attributes:

  • key (float)
  • values (list of Loan objects)
  • left, right (Node)

BST Methods:

  • add(key, val): Adds a loan to the tree.
  • __getitem__(key): Returns number of loans with the specified key.
  • height(): Returns tree height.
  • count_leaves(): Counts the number of leaf nodes.
  • find_top_n(n): Returns the N highest interest rates (keys) stored in the tree.
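The sketch below shows one way the Node and BST interface listed above could be implemented. It is not the code in search.py: the height convention (number of nodes on the longest root-to-leaf path, 0 for an empty tree), returning 0 for an absent key, and returning keys in descending order from `find_top_n` are all assumptions.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.values = []      # every loan that shares this key
        self.left = None
        self.right = None


class BST:
    def __init__(self):
        self.root = None

    def add(self, key, val):
        if self.root is None:
            self.root = Node(key)
        curr = self.root
        while True:
            if key < curr.key:
                if curr.left is None:
                    curr.left = Node(key)
                curr = curr.left
            elif key > curr.key:
                if curr.right is None:
                    curr.right = Node(key)
                curr = curr.right
            else:
                curr.values.append(val)
                return

    def __getitem__(self, key):
        # Number of values stored under key (0 if the key is absent).
        curr = self.root
        while curr is not None:
            if key < curr.key:
                curr = curr.left
            elif key > curr.key:
                curr = curr.right
            else:
                return len(curr.values)
        return 0

    def height(self):
        def walk(node):
            if node is None:
                return 0
            return 1 + max(walk(node.left), walk(node.right))
        return walk(self.root)

    def count_leaves(self):
        def walk(node):
            if node is None:
                return 0
            if node.left is None and node.right is None:
                return 1
            return walk(node.left) + walk(node.right)
        return walk(self.root)

    def find_top_n(self, n):
        # Reverse in-order traversal visits keys from largest to smallest.
        top = []

        def walk(node):
            if node is None or len(top) >= n:
                return
            walk(node.right)
            if len(top) < n:
                top.append(node.key)
            walk(node.left)

        walk(self.root)
        return top


tree = BST()
for rate, loan in [(3.0, "a"), (2.5, "b"), (4.0, "c"), (3.0, "d"), (-1, "e")]:
    tree.add(rate, loan)
```

Storing a list of values per node keeps duplicate interest rates in a single node, so the tree has one node per distinct rate. The recursive helpers are simple but can hit Python's recursion limit on a very unbalanced tree.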

Analysis Highlights

  1. Average Interest Rate
    Calculated per bank, ignoring missing values.

  2. Applicants per Loan
    Average number of applicants (applicant + co-applicant).

  3. Age Distribution
    Frequency of applicants in each age bracket.

  4. BST Analysis

    • Count missing interest rates without looping through all loans.
    • Compute tree height and number of leaves.
    • Efficient lookup for specific interest rates.
  5. Performance Benchmarking

    • Time to add first 15,000 loans to BST.
    • Compare lookup time using BST vs naive iteration.
  6. Racial Identity Distribution

    • Bar chart showing the number of racial identities per applicant.
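Items 1 to 3 above come down to a few aggregate computations. The sketch below uses plain dictionaries as hypothetical stand-ins for Loan objects and assumes a missing interest rate is represented as None; the notebook may use a different sentinel and different field names.

```python
from collections import Counter
from statistics import mean

# Hypothetical records standing in for Loan objects; None marks a missing rate.
loans = [
    {"interest_rate": 3.0, "ages": ["25-34", "25-34"]},
    {"interest_rate": None, "ages": ["35-44"]},
    {"interest_rate": 4.0, "ages": ["25-34"]},
]

# 1. Average interest rate, ignoring missing values
rates = [loan["interest_rate"] for loan in loans if loan["interest_rate"] is not None]
avg_rate = mean(rates)

# 2. Average number of applicants per loan (applicant plus co-applicant)
avg_applicants = mean(len(loan["ages"]) for loan in loans)

# 3. Frequency of applicants in each age bracket
age_counts = Counter(age for loan in loans for age in loan["ages"])
```

A `Counter` can be passed more or less directly to a matplotlib bar chart via its keys and values, which is one way the distribution plots could be produced.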

How to Run

  1. Install required packages:
    pip install matplotlib pandas
  2. Run the notebook for analysis:
    jupyter notebook mp3.ipynb
