MarketPulse UK is a Damm x Engineering Hub hackathon project that forecasts UK beer sales for May–December 2026, compares the outlook against the 2025 target baseline, explains deviations by brand/channel/month, and generates ranked commercial recommendations with GBP impact estimates.
This repository intentionally does not store raw Damm workbooks. The private files stay local and are loaded from DAMM_RAW_DATA_DIR, which defaults to ../Repte internacional.
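A minimal sketch of how the raw-data location could be resolved, assuming the package simply reads the DAMM_RAW_DATA_DIR environment variable. The function name `resolve_raw_dir` is hypothetical and not taken from the package.

```python
import os
from pathlib import Path

# Default quoted in this README; the helper itself is illustrative only.
DEFAULT_RAW_DIR = "../Repte internacional"


def resolve_raw_dir(env=None):
    """Return the raw-data directory from DAMM_RAW_DATA_DIR, or the default."""
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the documented default.
    return Path(env.get("DAMM_RAW_DATA_DIR") or DEFAULT_RAW_DIR)
```

Passing a plain dict instead of `os.environ` makes the lookup easy to exercise in tests without touching the real environment.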
- Build a real working demo, not a static mockup.
- Forecast monthly UK sales from history, target baseline, and weekly promotion signals.
- Compare forecast against budget or target estimate.
- Explain deviations by month, channel, brand, promotion, and external signals.
- Recommend actions that help close the gap with estimated GBP impact.
- Document sources, assumptions, limitations, and run instructions.
marketpulse prepare --raw-dir "../Repte internacional" # 1. Clean & anonymize sales + promos
marketpulse enrich # 2. Build external signals table
marketpulse train # 3. Validate model on Jan-Apr 2026
marketpulse forecast --year 2026 --start-month 5 --end-month 12 # 4. Forecast May-Dec 2026
marketpulse recommend # 5. Rank commercial recommendations
marketpulse impact # 6. Add GBP impact & scenarios

Or run everything at once:
marketpulse build-all --raw-dir "../Repte internacional"

The enrichment step builds a monthly context table from seven sources:
| Signal | Source | Status |
|---|---|---|
| UK bank holidays | GOV.UK bank-holidays.json API | Live (cached) |
| Holiday sensitivity | Weighted by type — Summer Bank=0.9, Easter=0.6, Christmas=0.1 | Static |
| Temperature & precipitation | Open-Meteo Archive API — London proxy | Live / climatological fallback |
| CPI inflation pressure | ONS CPI D7BT (12-month rate) — 2023–2024 published actuals, 2025–2026 BoE-aligned projection | Real ONS data |
| Google Trends — "beer" UK | pytrends (GB geo) | Live / seasonal fallback |
| Sports uplift — England | Six Nations 2023–2026, Euro 2024, World Cup 2026 (expected value with Win/Draw/Loss probabilities) | Static catalogue |
| Festival uplift — UK music | Glastonbury, Reading, Leeds, Creamfields, Download, Latitude, TRNSMT, Boomtown, Isle of Wight (2023–2026) | Static catalogue |
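The holiday sensitivity row can be illustrated with a short sketch. It assumes the payload shape published by the GOV.UK bank-holidays.json endpoint (a division key holding an `events` list with `title` and `date`). The three weights come from the table above. The keyword matching, the treatment of Good Friday as an Easter holiday, the zero weight for unlisted holiday types, and the monthly summation are all assumptions, not the project's confirmed logic.

```python
from collections import defaultdict
from datetime import date

# Weights quoted in the signals table; the keyword rules are an assumption.
TYPE_WEIGHTS = [
    ("summer", 0.9),
    ("easter", 0.6),
    ("good friday", 0.6),  # assumed to count as an Easter holiday
    ("christmas", 0.1),
]


def holiday_weight(title, default=0.0):
    """Return the sensitivity weight for a bank-holiday title."""
    lowered = title.lower()
    for keyword, weight in TYPE_WEIGHTS:
        if keyword in lowered:
            return weight
    # Holiday types without a documented weight get the default (assumption).
    return default


def monthly_holiday_sensitivity(payload, division="england-and-wales"):
    """Sum holiday weights per (year, month) for one GOV.UK division."""
    totals = defaultdict(float)
    for event in payload[division]["events"]:
        day = date.fromisoformat(event["date"])
        totals[(day.year, day.month)] += holiday_weight(event["title"])
    return dict(totals)


# Hand-written sample in the GOV.UK payload shape (not fetched from the API).
sample = {
    "england-and-wales": {
        "division": "england-and-wales",
        "events": [
            {"title": "Good Friday", "date": "2026-04-03"},
            {"title": "Easter Monday", "date": "2026-04-06"},
            {"title": "Summer bank holiday", "date": "2026-08-31"},
            {"title": "Christmas Day", "date": "2026-12-25"},
        ],
    }
}
sensitivity = monthly_holiday_sensitivity(sample)
```

With this sample, April accumulates two Easter-type holidays while August and December each carry a single weighted event.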
Festival uplift is channel-weighted: OFF TRADE ×1.2 vs ON TRADE ×0.7, reflecting pre-festival purchasing behaviour vs on-premise consumption. Sports uplift peaks in June–July 2026 due to the World Cup.
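As a rough illustration of the two uplift mechanics described here, the sketch below applies the channel multipliers to a base festival uplift and computes a probability-weighted sports uplift. The multipliers (1.2 and 0.7) come from this README. The base uplift, the match probabilities, the per-outcome uplifts, and both function names are hypothetical values chosen only to show the arithmetic.

```python
# Channel multipliers quoted in this README; all other numbers are hypothetical.
FESTIVAL_CHANNEL_WEIGHT = {"OFF TRADE": 1.2, "ON TRADE": 0.7}


def festival_uplift(base_uplift, channel):
    """Scale a month's base festival uplift by the channel multiplier."""
    # Channels without a documented multiplier are left unscaled (assumption).
    return base_uplift * FESTIVAL_CHANNEL_WEIGHT.get(channel, 1.0)


def expected_sports_uplift(probabilities, uplifts):
    """Expected uplift over match outcomes: sum of p(outcome) * uplift(outcome)."""
    total_probability = sum(probabilities.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * uplifts[outcome] for outcome, p in probabilities.items())


# Hypothetical inputs for a single month.
off_trade = festival_uplift(0.05, "OFF TRADE")
on_trade = festival_uplift(0.05, "ON TRADE")
sports = expected_sports_uplift(
    {"win": 0.5, "draw": 0.25, "loss": 0.25},
    {"win": 0.08, "draw": 0.04, "loss": 0.0},
)
```

Using an expected value keeps the sports signal a single number per month while still reflecting uncertainty about results.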
- UK rows are selected through the customer dimension (Cod. Pais == GB / United Kingdom labels).
- Forecast history uses positive-HL demand rows only. Returns and credits are excluded from training.
- The default target is same segment and month in 2025 (prior-year baseline).
- Gross price per HL: total gross sales ÷ total Hl at segment level. Source values in k£ are scaled ×1000 to GBP.
- The forecast is iterative — each predicted month feeds the next month's lag features.
- Promotion uplifts are estimated from observed Jan–Apr 2026 actuals vs prior-year baseline (OFF TRADE only).
- The dashboard hides segments with target = 0 to keep charts focused on meaningful comparisons.
- Validation is back-tested on Jan–Apr 2026 actuals. Monthly WAPE (17%) aggregates the 4 monthly totals — the level at which commercial decisions are made. Segment WAPE (33.9%) covers 204 subchannel × brand rows and is disclosed in the model-validation panel.
- The forecast is 52% above the 2025 target baseline, driven by three factors: a stronger Jan–Apr 2026 run-rate vs prior year, the World Cup 2026 expected uplift (probability-weighted), and the UK summer festival season. December is the only month below target.
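Two of the points in this list lend themselves to a small sketch: the iterative forecast loop and the difference between monthly and segment WAPE. The code below is illustrative only. The stand-in model (mean of the last two months) and all numbers are hypothetical, and WAPE is taken here as the sum of absolute errors divided by the sum of absolute actuals, which is an assumption about the exact definition used.

```python
def wape(actuals, forecasts):
    """Weighted absolute percentage error: sum|a - f| / sum|a|."""
    total = sum(abs(a) for a in actuals)
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / total


def iterative_forecast(history, steps, predict_next):
    """Roll forward one month at a time, feeding each prediction back in."""
    series = list(history)
    for _ in range(steps):
        # The model only sees the series so far, including earlier predictions.
        series.append(predict_next(series))
    return series[len(history):]


def mean_of_last_two(series):
    """Hypothetical stand-in for the real model."""
    return (series[-1] + series[-2]) / 2


predictions = iterative_forecast([100.0, 120.0], steps=3, predict_next=mean_of_last_two)

# Hypothetical back-test: two months, two segments per month.
segment_actuals = [60.0, 40.0, 50.0, 50.0]
segment_forecasts = [70.0, 30.0, 45.0, 65.0]
segment_wape = wape(segment_actuals, segment_forecasts)

# Aggregate to monthly totals before scoring.
monthly_actuals = [60.0 + 40.0, 50.0 + 50.0]
monthly_forecasts = [70.0 + 30.0, 45.0 + 65.0]
monthly_wape = wape(monthly_actuals, monthly_forecasts)
```

In this toy data the segment errors partly cancel within each month, so the monthly WAPE comes out lower than the segment WAPE. The same effect is consistent with the gap between the two figures reported above.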
app/ Streamlit demo entrypoint
src/marketpulse/ Python package — pipeline, model, recommendations, API, CLI
├── data_pipeline.py Sales & promotions ingestion and anonymization
├── enrichment.py External signals (holidays, weather, sports, festivals, trends)
├── forecasting.py Iterative forecast model with explainable drivers
├── recommendations.py Ranked gap-closing actions
├── impact.py GBP impact estimation and scenario analysis
└── web_api.py FastAPI backend for the frontend demo
frontend/ TypeScript + Vite demo dashboard
tests/ Unit and smoke tests
docs/ Data contract, report, demo script, and source notes
data/sample/ Tiny anonymized fixtures only
data/processed/ Local generated outputs — ignored by Git
data/cache/ Local API cache files — ignored by Git
python -m venv .venv
source .venv/bin/activate
python -m pip install -e ".[dev]"
cp .env.example .env

Edit .env if your raw files are not in ../Repte internacional.
Expected private files (not committed):
- UK DATA.xlsx
- Damm Trade Plan - promotions.xlsx
Polished TypeScript + FastAPI demo:
make web

Open http://localhost:5173 in a browser; the API runs at http://localhost:8000.
The Commercial Copilot works locally by default. If HF_TOKEN is set, it calls Hugging Face Inference with HF_MODEL=meta-llama/Llama-3.3-70B-Instruct. Only compact anonymized segment context is ever sent — no raw customer or retailer data leaves the machine.
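The switch between local and hosted mode might look roughly like the sketch below. It assumes the decision depends only on whether HF_TOKEN is set and that HF_MODEL can override the model name quoted above. The function names, the returned dictionary shape, and the allow-listed context fields are hypothetical and do not describe the project's actual code.

```python
import os

# Model name quoted in this README; used when HF_MODEL is not set.
DEFAULT_HF_MODEL = "meta-llama/Llama-3.3-70B-Instruct"

# Hypothetical allow-list of anonymized, segment-level fields.
ALLOWED_CONTEXT_FIELDS = ("brand", "channel", "month", "forecast_hl", "target_hl")


def copilot_config(env=None):
    """Choose local mode unless an HF_TOKEN is present."""
    env = os.environ if env is None else env
    if not env.get("HF_TOKEN"):
        return {"mode": "local", "model": None}
    return {"mode": "huggingface", "model": env.get("HF_MODEL", DEFAULT_HF_MODEL)}


def compact_context(segment_row):
    """Keep only allow-listed fields so no customer-level data is forwarded."""
    return {key: segment_row[key] for key in ALLOWED_CONTEXT_FIELDS if key in segment_row}
```

An allow-list is used rather than a block-list so that any new column is excluded from outbound requests by default.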
Click the "Take a tour" button in the hero band to launch a 12-step interactive walkthrough designed for a first-time commercial user. The tour covers:
- Hero KPI band — total forecast, gap vs target, revenue, peak month
- Four key commercial KPIs
- Forecast narrative card — why the forecast is above the 2025 target
- Monthly volume chart vs target
- Channel breakdown table
- Scenario simulator — six ranked promotion levers
- Model validation panel — monthly WAPE and segment WAPE
- Deviation Drivers view — largest risk pockets by brand/channel
- Scenario Simulator view — interactive promo calendar
- Action Plan view — 20 ranked recovery actions
- Commercial Copilot — natural language Q&A
- Filters panel — brand/channel/month scope
The tour uses a spotlight overlay with a dark backdrop and renders a floating card adjacent to each highlighted element. It handles window resize and scroll automatically.
Streamlit fallback:
make streamlit

The raw Damm datasets are confidential hackathon assets. Do not commit raw Excel files, generated customer-level exports, API cache files, or local secrets. The .gitignore enforces this by excluding workbook formats and generated data/ output folders. Retailer names are anonymized automatically during marketpulse prepare (Retailer A/B/C/D/E).
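For illustration, a retailer alias mapping of the kind described above could be built as follows. This is a sketch under assumptions: letters are assigned in order of first appearance, the function name is hypothetical, and the retailer names in the example are invented placeholders rather than real data.

```python
from string import ascii_uppercase


def build_retailer_aliases(names):
    """Map each distinct retailer name to 'Retailer A', 'Retailer B', ..."""
    aliases = {}
    for name in names:
        if name in aliases:
            continue
        if len(aliases) >= len(ascii_uppercase):
            raise ValueError("more than 26 retailers; extend the alias scheme")
        # Letters are handed out in order of first appearance (assumption).
        aliases[name] = f"Retailer {ascii_uppercase[len(aliases)]}"
    return aliases


# Invented placeholder names, not real retailers.
aliases = build_retailer_aliases(["Example Mart", "Sample Stores", "Example Mart"])
```

The mapping itself identifies retailers, so it should stay in local, Git-ignored storage alongside the raw workbooks.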