FishSave helps governments understand how overfishing in their country affects seafood price inflation. It forecasts overfishing and inflation for the next 5 and 10 years, scores risk by country, and recommends policy implementations from a database—tailored by risk level—with reasoning, example actions, and optional audio explanations. Officials can accept or reject recommendations; when rejected, the next policy from the database is shown on the dashboard.
- Forecast next 5 and 10 years of overfishing rate (OFR) and seafood price inflation per country using historical catch data and time-series models.
- Score risk so governments can see how significantly their overfishing activity impacts seafood price inflation (risk score in [0, 1]).
- Recommend policy implementations from a policy database, filtered/tailored by risk level (Low / Medium / High).
- Policy popup: For each recommendation, a popup shows:
- What the policy means — reasoning and explanation.
- Plan — example measurable actions to implement the policy.
- Explain: describes how the listed policy and planned actions reduce overfishing, and by how much they reduce overfishing’s impact on seafood price inflation (text + optional audio via TTS).
- Accept or reject: Users can accept a recommendation or reject it. On reject, the app fetches the next policy from the database and displays it on the dashboard; they can accept when they find the best option to slow seafood price inflation caused by overfishing.
- Primary: Worldwide Fishing Catch Statistics 1950–2018 (Kaggle)
Columns: Country, Year, Species, Catch (tons).
The pipeline uses the following definitions and models.
For country \(c\) and year \(t\), catch relative to the initial (first non-zero) catch:

\[ \text{OFR}_{c,t} = \frac{\text{catch}_{c,t}}{\text{catch}_{c,\text{initial}}} \times 100 \]
- Used for: Measuring pressure on fish stocks; base for OFR change and proxy price index.
\[ \text{OFR\_change}_{c,t} = \frac{\text{OFR}_{c,t} - \text{OFR}_{c,t-1}}{\text{OFR}_{c,t-1}} \times 100 \]
- Used for: Time-series inputs and risk components (slope/sum over forecast windows).
Iterative update with sensitivity \(\beta\) (e.g. 0.5):

\[ \text{PriceIndex}_t = \text{PriceIndex}_{t-1} \times \left(1 + \beta \times \frac{\text{OFR\_change}_t}{100}\right) \]
- Used for: Proxy for seafood price level; regression target in the downstream ML model; basis for inflation %.
\[ \text{Inflation\_pct}_t = \frac{\text{PriceIndex}_t - \text{PriceIndex}_{t-1}}{\text{PriceIndex}_{t-1}} \times 100 \]
- Used for: Forecast outputs (next 5/10 years) and as a component of risk score.
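Taken together, the chain OFR → OFR_change → PriceIndex → Inflation_pct can be sketched in pandas. This is a minimal illustration of the formulas above for a single country's frame with `Year` and `Catch` columns; the function name and the base index of 100 are assumptions, not the exact code in feature_agent.py:

```python
import pandas as pd

def derive_features(df: pd.DataFrame, beta: float = 0.5) -> pd.DataFrame:
    """Apply the OFR / PriceIndex / inflation formulas to one country's catch series."""
    df = df.sort_values("Year").reset_index(drop=True)
    initial = df.loc[df["Catch"] > 0, "Catch"].iloc[0]    # first non-zero catch
    df["OFR"] = df["Catch"] / initial * 100
    df["OFR_change"] = df["OFR"].pct_change() * 100       # year-over-year % change

    price = [100.0]                                       # assumed base index
    for chg in df["OFR_change"].iloc[1:]:
        price.append(price[-1] * (1 + beta * chg / 100))  # iterative update
    df["PriceIndex"] = price
    df["Inflation_pct"] = df["PriceIndex"].pct_change() * 100
    return df

demo = pd.DataFrame({"Year": [2015, 2016, 2017], "Catch": [200.0, 220.0, 231.0]})
out = derive_features(demo)
# OFR = 100, 110, 115.5; with beta = 0.5 the PriceIndex ends at 107.625
```

The first year's OFR_change and Inflation_pct are NaN by construction, since both are year-over-year changes.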
From forecast tables (e.g. forecast_next5years.csv, forecast_next10years.csv) with columns Country, Year, OFR_change, Inflation_pct:
- Per country, compute over the forecast window:
  - `slope_OFR_change` — linear-regression slope of OFR_change vs. year
  - `slope_Inflation_pct` — slope of Inflation_pct vs. year
  - `sum_OFR_change` — sum of OFR_change
  - `sum_Inflation_pct` — sum of Inflation_pct
- Normalize each of these four components to [0, 1] across countries (min–max). A constant series maps to 0.5.
- Risk score = mean of the four normalized components:

  \[ \text{risk\_score} = \frac{1}{4}\big(\tilde{s}_{\text{OFR}} + \tilde{s}_{\text{inf}} + \tilde{S}_{\text{OFR}} + \tilde{S}_{\text{inf}}\big) \]

  where \(\tilde{s}\) are normalized slopes and \(\tilde{S}\) are normalized sums. This is done separately for the 5-year and 10-year windows → `risk_score_5y`, `risk_score_10y`.
- Used for: Ranking countries by “inflation due to overfishing” risk; tailoring policy recommendations by risk level.
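As a sketch, the three steps above (per-country slopes and sums, min–max normalization with the constant-series rule, and the four-component mean) can be written as follows. This is illustrative, not a copy of compute_risk_score.py:

```python
import numpy as np
import pandas as pd

def risk_scores(forecast: pd.DataFrame) -> pd.Series:
    """Risk score per country from a table with Country, Year, OFR_change, Inflation_pct."""
    def summarize(g: pd.DataFrame) -> pd.Series:
        return pd.Series({
            "slope_OFR_change": np.polyfit(g["Year"], g["OFR_change"], 1)[0],
            "slope_Inflation_pct": np.polyfit(g["Year"], g["Inflation_pct"], 1)[0],
            "sum_OFR_change": g["OFR_change"].sum(),
            "sum_Inflation_pct": g["Inflation_pct"].sum(),
        })

    comps = forecast.groupby("Country")[["Year", "OFR_change", "Inflation_pct"]].apply(summarize)

    def minmax(s: pd.Series) -> pd.Series:
        rng = s.max() - s.min()
        if rng == 0:                           # constant series -> 0.5
            return pd.Series(0.5, index=s.index)
        return (s - s.min()) / rng

    return comps.apply(minmax).mean(axis=1)    # mean of the 4 normalized components

demo = pd.DataFrame({
    "Country": ["A"] * 3 + ["B"] * 3,
    "Year": [2019, 2020, 2021] * 2,
    "OFR_change": [1.0, 2.0, 3.0, 3.0, 2.0, 1.0],
    "Inflation_pct": [0.5, 1.0, 1.5, 1.5, 1.0, 0.5],
})
scores = risk_scores(demo)   # A (rising trend) -> 0.75, B (falling trend) -> 0.25
```

In the demo both countries have the same sums, so those components normalize to 0.5 each, and the ranking is driven entirely by the slopes.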
- Catch series: ARIMA models (orders chosen from a small set, e.g. (0,1,0), (0,1,1), (1,1,1), …) are fit on historical catch, and the best order is selected by validation MAE. The forecast horizon runs from the last data year through the next 5 or 10 years.
- From forecast catch: Same formulas as above to get OFR → OFR_change → PriceIndex → Inflation_pct for each future year.
- Outputs: `forecast_next5years.csv`, `forecast_next10years.csv` (Country, Year, OFR_change, Inflation_pct).
- Purpose: Predict PriceIndex from a fixed 10-year window of catch (or derived features).
- Inputs: `train_features_updated.csv`, `val_features_updated.csv`, `test_features_updated.csv` (10-year windows: train 1991–2000, val 2001–2010, test 2016–2025).
- Model: `RandomForestRegressor` (e.g. 100 trees).
- Target: PriceIndex.
- Outputs: `model.joblib`, `eval_report.json` (MAE/RMSE).
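A minimal sketch of this step with scikit-learn, using synthetic stand-ins for the 10-year feature windows (the real features and hyperparameters live in train_model.py):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.random((200, 10))            # stand-in: one row = a 10-year feature window
y = 100.0 + 50.0 * X.mean(axis=1)    # stand-in PriceIndex target

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:150], y[:150])          # train split
mae = mean_absolute_error(y[150:], model.predict(X[150:]))  # held-out MAE

# Persist as in the pipeline: joblib.dump(model, "model.joblib"),
# then write metrics (MAE/RMSE) to eval_report.json.
```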
```
.
├── index.html # Frontend: dashboard, country/horizon selection, policy popup, Explain, Accept/Reject
├── 404.html
├── firebase.json # Firebase Hosting (public: .)
├── .firebaserc # Firebase project (overfishing-2d415)
├── package.json # npm (Firebase CLI)
├── requirements.txt # Python backend deps
├── Dockerfile # Cloud Run TTS API
├── .env # API keys (not committed): COHERE, ELEVENLABS, etc.
│
├── Data & features
├── FishStats2018.csv # Raw catch data (Kaggle)
├── data_agent.py # Load/aggregate catch data
├── feature_agent.py # OFR, OFR_change, PriceIndex → country_features.csv
├── country_features.csv
├── split_train_val_test.py # Train/val/test splits
├── train_features.csv, val_features.csv, test_features.csv
├── update_pipeline.py # Updated feature CSVs
├── train_features_updated.csv, val_features_updated.csv, test_features_updated.csv
│
├── Forecasting
├── forecast_agent.py
├── forecast_ofr_2019_2025.py
├── forecast_next_5_10_years.py # ARIMA → forecast_next5years.csv, forecast_next10years.csv
├── forecast_next5years.csv, forecast_next10years.csv
├── enrich_forecast_predictions.py
├── evaluate_forecast_model.py
│
├── Risk & policy
├── compute_risk_score.py # risk_score_5y, risk_score_10y from slopes/sums + min-max norm
├── risk_score.csv
├── train_risk_classifier.py # Risk category (Low/Medium/High)
├── risk_score_with_category.csv
├── policy_chunks.py # Policy DB chunks
├── ingest_policy_chroma.py # ChromaDB policy embeddings
├── query_policy_chroma.py
├── recommend_by_risk.py # Risk-based policy recommendations (Chroma + Gemini)
├── explain_why_reduces_overfishing.py
├── explain_policy_impact_for_elevenlabs.py # Cohere explanation for TTS
│
├── ML model (PriceIndex)
├── train_model.py # RandomForestRegressor on 10-year windows → model.joblib
├── model.joblib
├── eval_report.json
│
├── TTS API (Explain audio)
├── policy_tts_api.py # FastAPI: /api/policy-explanation-audio, /api/risk-explanation-audio
│
├── DB / optional
├── neon_schema.sql, neon_pgvector.sql
├── load_country_risk_to_neon.py, load_forecast_metrics_to_neon.py
├── embed_and_store.py, query_embeddings.py
├── scrape_policy_sources.py
└── DEPLOY.md # Deployment notes
```
(Config and generated CSVs like forecast_*.csv, risk_score*.csv, *_updated.csv live in project root; node_modules/ and Firebase cache are omitted.)
- Open `index.html` in a browser, or serve the directory with any static server.
- For the live app: https://overfishing-2d415.web.app (Firebase Hosting).
- Env: create `.env` with `GEMINI_API_KEY`, `OPENAI_API_KEY`, `COHERE_API_KEY`, `ELEVENLABS_API_KEY`, and optionally `ELEVENLABS_VOICE_ID`.
- Install: `pip install -r requirements.txt`
- Run: `uvicorn policy_tts_api:app --reload --port 8010`
- Endpoints:
  - `POST /api/policy-explanation-audio` — policy explanation audio
  - `POST /api/risk-explanation-audio` — risk explanation audio
  - `GET /health`, `GET /` — health and service info
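For example, a client request to the local service might look like this; the payload field names below are assumptions, so check the request models in policy_tts_api.py for the real schema:

```python
import json

API_BASE = "http://localhost:8010"   # local uvicorn from the Run step above
endpoint = f"{API_BASE}/api/policy-explanation-audio"

# Hypothetical payload; field names are illustrative, not the confirmed schema.
payload = {
    "country": "Norway",
    "policy": "Seasonal fishing quotas",
    "risk_level": "High",
}

# Send with e.g. requests.post(endpoint, json=payload); the response carries
# the generated explanation audio.
body = json.dumps(payload)
```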
- Features: `python feature_agent.py` (and the split/update pipeline as needed).
- Forecasts: `python forecast_next_5_10_years.py` → `forecast_next5years.csv`, `forecast_next10years.csv`.
- Risk: `python compute_risk_score.py` → `risk_score.csv`.
- (Optional) Risk category: `train_risk_classifier.py` → `risk_score_with_category.csv`.
- (Optional) PriceIndex model: `python train_model.py` → `model.joblib`, `eval_report.json`.
- Frontend (live): https://overfishing-2d415.web.app
- Public GitHub (hackathon): `index.html` uses the placeholder `__FIREBASE_WEB_API_KEY__`. CI injects the real key — see HACKATHON_PUBLIC_REPO.md.
- Local / manual deploy: add `FIREBASE_WEB_API_KEY=...` to `.env`, then: `npm run firebase:inject` → `npx firebase deploy --only hosting` (or `firebase serve`).
- Before commit: `npm run firebase:restore-placeholder`
- Firebase Web API key security: FIREBASE_SECURITY_SETUP.md (HTTP referrers, rotate, etc.).
- TTS API: deploy `policy_tts_api` to Google Cloud Run (e.g. `gcloud run deploy fishsave-tts-api --source . --region us-central1 --allow-unauthenticated`). Set the env vars `ELEVENLABS_API_KEY` and optionally `ELEVENLABS_VOICE_ID` in the Cloud Run service.
- Point the frontend's `API_BASE` in `index.html` to your Cloud Run URL.
- Dataset: Worldwide Fishing Catch Statistics 1950–2018 (Kaggle).
- Built for sustainability and government decision support; suitable for submission and reuse with attribution.