Inspiration

Financial markets react heavily to the tone and wording of Federal Reserve communication, especially during FOMC press conferences. We wanted to explore whether the sentiment inside Jerome Powell’s speeches contains predictive signals for next-day S&P 500 movement.

What it does

This project predicts next-day S&P 500 return direction (up or down) using the language inside Jerome Powell’s FOMC press conference transcripts. It converts each speech into sentiment scores and keyword features, then uses machine learning models to forecast how the market will react the following trading day. In short: it turns central bank communication into a quant trading signal.

How we built it

1. Data Collection

  • Gathered full FOMC press conference transcripts from a Kaggle dataset
  • Pulled next-day S&P 500 returns from Yahoo Finance to align speech dates with market reactions
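
The date alignment could look roughly like the sketch below. It assumes "next-day return" means the close-to-close return from the speech date's session to the following trading session; that definition, the function name next_day_return, and the sample prices are illustrative assumptions, not details taken from the project.

```python
from bisect import bisect_right
from datetime import date


def next_day_return(closes, speech_date):
    """Close-to-close return from the speech date to the next trading day.

    closes: dict mapping datetime.date -> closing price (trading days only).
    Returns None if the speech date is not a trading day or if no later
    session exists in the data.
    """
    if speech_date not in closes:
        return None
    days = sorted(closes)
    i = bisect_right(days, speech_date)  # index of the first later session
    if i >= len(days):
        return None
    return closes[days[i]] / closes[speech_date] - 1.0


# Made-up prices for illustration only (not real S&P 500 data).
closes = {
    date(2024, 1, 4): 100.0,  # Thursday
    date(2024, 1, 5): 102.0,  # Friday
    date(2024, 1, 8): 101.0,  # Monday: the next session after the weekend
}
friday_return = next_day_return(closes, date(2024, 1, 5))
```

Looking up the next available session, instead of adding one calendar day, keeps weekends and market holidays from producing missing rows.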

2. NLP Feature Extraction

  • Used FinBERT for finance-specific sentiment (positive / negative / neutral)
  • Applied RAKE to extract key phrases and themes from each Powell speech
  • Engineered additional features such as speech length and sentiment ratios
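
As a rough illustration of the engineered features, the sketch below turns per-sentence sentiment labels into ratios and adds a speech-length feature. It assumes each sentence has already been labeled positive, negative, or neutral (for example by FinBERT); the function name and the feature names are hypothetical, and the project's actual feature set may differ.

```python
from collections import Counter


def sentiment_features(sentence_labels, transcript_text):
    """Build simple per-speech features from sentence-level sentiment labels.

    sentence_labels: list of "positive" / "negative" / "neutral" strings,
    one per sentence (assumed to come from a sentiment model).
    transcript_text: the full speech text, used only for its length.
    """
    counts = Counter(sentence_labels)
    n = len(sentence_labels)
    pos, neg, neu = counts["positive"], counts["negative"], counts["neutral"]
    return {
        "word_count": len(transcript_text.split()),
        "pos_ratio": pos / n if n else 0.0,
        "neg_ratio": neg / n if n else 0.0,
        "neu_ratio": neu / n if n else 0.0,
        # Net tone: share of positive sentences minus share of negative ones.
        "net_sentiment": (pos - neg) / n if n else 0.0,
    }


features = sentiment_features(
    ["positive", "negative", "neutral", "neutral"],
    "Inflation has eased but remains elevated",
)
```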

3. Modeling

  • Created labels for next-day S&P 500 direction (1 = up / 0 = down)
  • Trained ML models (Random Forest) using both NLP features and engineered features
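
A minimal sketch of the labeling step is shown below, together with a date-ordered train/test split. The split is an assumption added for illustration (the writeup does not say how the data was divided), and treating a flat day as "down" is also a choice made here, not one stated by the project.

```python
def make_labels(returns):
    """Map next-day returns to direction labels: 1 = up, 0 = down.

    A return of exactly zero is counted as down in this sketch.
    """
    return [1 if r > 0 else 0 for r in returns]


def chronological_split(rows, train_fraction=0.8):
    """Split date-ordered rows so every test row is later than every train row."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Made-up returns, ordered from oldest to newest speech.
returns = [0.012, -0.004, 0.0, 0.007, -0.015]
labels = make_labels(returns)
train_labels, test_labels = chronological_split(labels, train_fraction=0.8)
```

Splitting by time instead of shuffling avoids training on speeches that come after the ones being predicted. The resulting feature rows and labels could then be passed to a classifier such as scikit-learn's RandomForestClassifier.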

4. Evaluation

  • Measured accuracy, precision, recall, and ROC-AUC
  • Compared sentiment-only features against combined NLP and macro features
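
For reference, the four metrics can be computed by hand as in the sketch below. This is a plain-Python illustration with made-up labels and scores, not the project's evaluation code; the 0.5 decision threshold is an assumption.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = up, 0 = down)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_auc(y_true, scores):
    """Chance that a random up day is scored above a random down day.

    Ties between an up-day score and a down-day score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities of an up day.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

ROC-AUC is computed from the scores, not the thresholded predictions, so it is unaffected by where the up/down cutoff is placed.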

Challenges we ran into

Financial markets are extremely noisy, meaning most movements are driven by macro events, risk sentiment, or randomness rather than one single speech. Extracting a clean signal from Powell’s language (while avoiding overfitting) is difficult.

Accomplishments that we're proud of

  • Demonstrated that central bank language contains measurable signals, even in a highly noisy financial environment
  • Successfully extracted financial sentiment using FinBERT and combined it with keyword and semantic features from RAKE
  • Gained experience working with real financial data and showed we could build a working pipeline despite the complexity and noise of financial prediction

What's next for When Words Move Market

  • Incorporating more macro variables: after seeing how noisy and unpredictable financial markets are, we expect more macro context to help separate real signals from randomness
  • Building a regime detection layer to understand whether Powell’s speeches matter more during high-volatility periods or tightening cycles
  • Exploring more advanced NLP models
