iffataz/extreme-weather-australia

AIDR Disaster Event Scraper

A lightweight scaffold that scrapes the AIDR disaster events API, normalizes the results, and persists them to a database, all orchestrated from a single CLI.

Quick start

  1. Create a virtual environment and install dependencies:
python -m venv .venv
# Windows PowerShell
.\.venv\Scripts\Activate.ps1
# macOS/Linux
source .venv/bin/activate
pip install -r requirements.txt
  2. Make the src package importable and load environment settings:
# Windows PowerShell
$env:PYTHONPATH = "src"
copy .env.example .env
# macOS/Linux
export PYTHONPATH=src
cp .env.example .env
  3. Run the scraper pipeline (fetch -> normalize -> store):
python -m aidr_scraper.main scrape --start-year 2005 --end-year 2025
  4. Refresh analytics/materialized views and show category counts:
python -m aidr_scraper.main refresh-views
python -m aidr_scraper.main analytics
  5. Preview a few rows as CSV (defaults to stdout, or pass --output to save):
python -m aidr_scraper.main sample-csv --limit 5
python -m aidr_scraper.main sample-csv --limit 10 --output data/sample.csv

Environment variables

  • DATABASE_URL (optional): SQLAlchemy URL. Defaults to sqlite:///data/aidr.db.
  • AIDR_API_URL (optional): Override the AIDR resource search endpoint.
  • AIDR_TIMEOUT (optional): Request timeout in seconds (default 30).
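
As a rough illustration of how these variables and their documented defaults might be read, here is a minimal sketch using only the standard library. The Settings class and load_settings function are hypothetical names, not the package's actual API, and since the built-in endpoint is not stated here, an unset AIDR_API_URL is represented as None.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; field names are illustrative."""
    database_url: str
    api_url: Optional[str]  # None means "use the package's built-in endpoint"
    timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three documented variables, applying the documented defaults."""
    return Settings(
        database_url=env.get("DATABASE_URL", "sqlite:///data/aidr.db"),
        api_url=env.get("AIDR_API_URL") or None,
        timeout=float(env.get("AIDR_TIMEOUT", "30")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dict instead of mutating os.environ.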

Project layout

  • src/aidr_scraper/ - package with fetch, normalize, storage, transform, and CLI orchestration.
  • migrations/ - SQL DDL scripts to bootstrap the database schema.
  • web_scraper.py - reference script the scaffold was based on.

Notes

  • The CLI uses Typer for ergonomics and python-dotenv to load .env automatically.
  • BeautifulSoup is used to safely strip any HTML fragments in summaries returned by the API.
  • The storage layer uses SQLAlchemy with an idempotent upsert to deduplicate events.
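
The normalize and store steps described above can be sketched roughly as follows. To stay dependency-free, this sketch swaps in the standard library's html.parser for BeautifulSoup and sqlite3 for SQLAlchemy, so it shows the idea rather than the project's actual code. The events table, its columns, and the function names are assumptions made for illustration.

```python
import sqlite3
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(fragment):
    """Drop tags, decode entities, and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return " ".join("".join(parser.parts).split())


def upsert_event(conn, event_id, title, summary_html):
    """Insert an event, or update it in place if event_id already exists.

    Re-running with the same event_id never creates a duplicate row,
    which is what makes the operation idempotent.
    """
    conn.execute(
        """
        INSERT INTO events (event_id, title, summary)
        VALUES (?, ?, ?)
        ON CONFLICT(event_id) DO UPDATE SET
            title = excluded.title,
            summary = excluded.summary
        """,
        (event_id, title, strip_html(summary_html)),
    )


# Hypothetical schema and sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT PRIMARY KEY, title TEXT, summary TEXT)"
)
upsert_event(conn, "evt-1", "Storm", "<p>Severe <b>storm</b></p>")
upsert_event(
    conn, "evt-1", "Storm (updated)", "<p>Severe <b>storm</b> &amp; flooding</p>"
)
rows = conn.execute("SELECT event_id, title, summary FROM events").fetchall()
print(rows)
```

The second call hits the primary-key conflict and updates the existing row, so the table still holds a single event afterwards.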
