ArdiaD/OptimizeAttention

2025-06-12

Overview

This README describes the replication repository (code and data) used to generate the results in Ardia & Bluteau (2025), Optimal Text-Based Time-Series Indices, conditionally accepted at the International Journal of Forecasting. The latest version of the paper, optimize_attention.pdf, is available in the replication repository.

Contact Information

  • David Ardia – CIRANO & GERAD & HEC Montréal, Canada
  • Keven Bluteau – University of Sherbrooke, Canada

Data Availability and Provenance Statements

Textual data used in this project are proprietary and cannot be shared publicly due to licensing restrictions. Access to these data is granted exclusively to the Editor of the International Journal of Forecasting for review purposes. For all other users, we provide pseudo-data.

Computational Requirements

The computations are very demanding. On a modern computer, it takes about one day to generate an illustrative setup and several days to produce the complete set of results.

You must use a Windows machine with R version 4.2.3, RStudio, Rtools, and at least 64 GB of RAM. Compatibility with this specific R version is critical; we recommend using rig to manage R versions.

We use the R package renv to install the exact versions of the packages used. If installation with renv fails, you can run 99_run_install_packages.R to install dependencies manually.

See the file session_info.txt in the repository for the full session details that generated the results.
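
Because a full run takes days, it can help to confirm the requirements above before starting. The following is a minimal sketch using base R only; check_environment() is a hypothetical helper and not part of the repository. It does not check available RAM, since base R has no portable way to query it.

```r
# Hypothetical pre-flight check (not a repository function).
# Compares the running session against the requirements stated above.
check_environment <- function(required_r = "4.2.3") {
  checks <- c(
    r_version = getRversion() == required_r,           # exact R version match
    windows   = .Platform$OS.type == "windows",        # Windows machine
    renv      = requireNamespace("renv", quietly = TRUE) # renv is installed
  )
  for (name in names(checks)) {
    status <- if (checks[[name]]) "OK" else "MISMATCH"
    message(sprintf("%-10s %s", name, status))
  }
  invisible(checks)
}

checks <- check_environment()
```

Any MISMATCH line points to a requirement worth fixing before running the scripts, for example by switching R versions with rig.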

Instructions to Replicators

  1. Clone the repository to your computer.
  2. Open the R project optimize_attention.Rproj.
  3. Run:
    renv::restore()
    and confirm with “y”.
  4. If needed, install any failed packages using their specific versions from CRAN archives as per session_info.txt; see 99_run_install_packages.R below.
  5. Run 00_run_all.R to generate all results, or run individual scripts as described below.
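
Steps 3 and 4 can also be scripted. The sketch below assumes that renv::restore() raises an error when restoration fails and that the working directory is the project root; restore_environment() is a hypothetical helper, not a repository function.

```r
# Hypothetical helper (not part of the repository): try renv first,
# then fall back to the manual installer script if restoration fails.
restore_environment <- function(fallback = "99_run_install_packages.R") {
  ok <- tryCatch({
    # prompt = FALSE skips the interactive confirmation ("y").
    renv::restore(prompt = FALSE)
    TRUE
  }, error = function(e) {
    message("renv::restore() failed: ", conditionMessage(e))
    FALSE
  })
  if (!ok && file.exists(fallback)) {
    message("Running fallback installer: ", fallback)
    source(fallback)
  }
  invisible(ok)
}

# Usage, from the project root:
# restore_environment()
```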

Description of Programs/Code

| Script | Description |
|---|---|
| 00_run_all.R | Master wrapper to run all scripts sequentially |
| 01_replicate_epu.R | Generates results of Section 4.2 |
| 02_plot_epu.R | Generates Figure 2 |
| 03_forecast_inflation.R | Generates forecasting results (Section 5.3) |
| 04_nowcast_inflation.R | Generates nowcasting results (Section 5.3) |
| 05_measure_performance.R | Generates Table 1 and Table 2 |
| 06_analyze_topics.R | Generates Figure 4 |
| 07_plot_sentiment.R | Generates Figure 3 |
| 99_run_install_packages.R | Fallback installer if renv fails |
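
To run only some of the scripts, they can be sourced in order with a small wrapper. This is a sketch, not the contents of 00_run_all.R: run_scripts() is a hypothetical helper, and it assumes each script can be sourced from the project root into the global environment, as when run interactively.

```r
# Hypothetical helper (not part of the repository): source the selected
# scripts in the given order and record the elapsed seconds for each.
run_scripts <- function(scripts, dir = ".") {
  elapsed <- numeric(0)
  for (script in scripts) {
    path <- file.path(dir, script)
    if (!file.exists(path)) stop("Script not found: ", path)
    message("Running ", script)
    timing <- system.time(source(path))
    elapsed[script] <- timing[["elapsed"]]
  }
  elapsed
}

# Example: only the EPU replication and its figure.
# run_scripts(c("01_replicate_epu.R", "02_plot_epu.R"))
```

The returned named vector gives a rough per-script runtime, which is useful for planning given the multi-day cost of the full pipeline.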

Folder Structure

| Folder | Content |
|---|---|
| data/ | Contains the various datasets |
| figures/ | Populated by figures generated by the scripts |
| functions/ | R functions used by the scripts |
| output/ | Outputs generated by the scripts |
| renv/ | Metadata for renv setup |
| tables/ | Tables generated by the scripts |

The data/ folder includes several precomputed .rda files:

  • dfm_filtered_resolved_unigram_ManualvocFilt_sentiment_accronym.rda: Original document-feature matrix (DFM), provided exclusively to the Editor of the International Journal of Forecasting for review purposes. This dataset is not available in the public repository.
  • dfm_filtered_resolved_unigram_ManualvocFilt_sentiment_accronym_pseudo.rda: Pseudo-data DFM, available in the public repository.
  • wv_keywords_fintext.rda: Pretrained FinText word vectors filtered to include only the keywords used in this project.
  • T5YIEM.csv: 5-Year Breakeven Inflation Rate, downloaded from Federal Reserve Economic Data (FRED).
  • CPILFESL.csv: Core CPI for Urban Consumers (Excluding Food and Energy), also from FRED.
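
The files listed above can be inspected with base R. The helpers below are a hypothetical sketch, not repository functions. read_fred() assumes the usual FRED CSV layout, with dates in the first column and values in the second, and missing values possibly coded as ".". The object names stored inside the .rda files are not documented here, so load_rda() returns whatever it finds as a named list.

```r
# Hypothetical helpers for inspecting the files in data/.

# Load an .rda file into a private environment and return its objects
# as a named list, so nothing in the global environment is overwritten.
load_rda <- function(path) {
  env <- new.env()
  object_names <- load(path, envir = env)
  mget(object_names, envir = env)
}

# Read a FRED CSV download.
# Assumption: first column = date, second column = series value.
read_fred <- function(path) {
  series <- read.csv(path, stringsAsFactors = FALSE)
  names(series)[1:2] <- c("date", "value")
  series$date <- as.Date(series$date)
  # Non-numeric placeholders (such as ".") become NA.
  series$value <- suppressWarnings(as.numeric(series$value))
  series
}

# Usage, from the project root:
# pseudo <- load_rda("data/dfm_filtered_resolved_unigram_ManualvocFilt_sentiment_accronym_pseudo.rda")
# names(pseudo)
# breakeven <- read_fred("data/T5YIEM.csv")
```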

Key Files

  • renv.lock – Used by renv to reproduce the R environment
  • session_info.txt – Records full session details of the system used

References

Ardia, D., & Bluteau, K. (2025). Optimal Text-Based Time-Series Indices, International Journal of Forecasting (conditionally accepted).
Available at SSRN: https://dx.doi.org/10.2139/ssrn.4830848

Acknowledgements

We appreciate your interest in our work and encourage you to reach out if you have any questions regarding the replication process.
