Experimental code of the article "Design-based Bandits Under Network Interference: Trade-off Between Regret and Statistical Inference" (NeurIPS 2025)

TheoryMagic/Design-based-Bandits

This repository contains the experimental code for the NeurIPS 2025 paper:

"Design-based Bandits Under Network Interference: Trade-off Between Regret and Statistical Inference".

Overview

The repository provides the following experimental scripts:

  • Experiments in the main body of the paper:

    • exp_101units_adv.py: Simulates a 101-unit star network for multi-armed bandit (MAB) experiments under network interference.
  • Experiments in Appendix E (Additional Experimental Results):

    • exp_singleunit_adv.py: Instance 1
    • exp_6unit_adv_222.py: Instance 2
    • exp_10unit_adv_145.py: Instance 3
    • exp_10unit_adv_1333.py: Instance 4

All scripts evaluate the performance of three exploration-exploitation strategies:

  • Uniform: Fully exploratory baseline that samples arms uniformly at random.
  • Standard EXP3: Traditional EXP3 method for regret minimization.
  • EXP3-N-CS: Our proposed method balancing regret minimization and statistical inference.
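
For readers unfamiliar with the Standard EXP3 baseline, the following is a minimal sketch of the textbook algorithm, not the code in this repository. The function names, the mixing parameter gamma, and the reward function in the test are illustrative assumptions; rewards are assumed to lie in [0, 1].

```python
import math
import random


def exp3_probabilities(weights, gamma):
    """Mix the normalized weights with the uniform distribution."""
    k = len(weights)
    total = sum(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def exp3_update(weights, probs, arm, reward, gamma):
    """Exponential update using an importance-weighted reward estimate."""
    k = len(weights)
    estimate = reward / probs[arm]  # unbiased estimate of the pulled arm's reward
    new_weights = list(weights)
    new_weights[arm] *= math.exp(gamma * estimate / k)
    # Rescale by the largest weight to avoid overflow on long horizons;
    # the sampling probabilities are unchanged by this rescaling.
    largest = max(new_weights)
    return [w / largest for w in new_weights]


def run_exp3(reward_fn, k, horizon, gamma, seed=0):
    """Run EXP3 for `horizon` rounds and return the pull count of each arm."""
    rng = random.Random(seed)
    weights = [1.0] * k
    pulls = [0] * k
    for t in range(horizon):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(k), weights=probs)[0]
        reward = reward_fn(t, arm)  # hypothetical reward oracle, values in [0, 1]
        weights = exp3_update(weights, probs, arm, reward, gamma)
        pulls[arm] += 1
    return pulls
```

Setting gamma to 1 in this sketch recovers the Uniform strategy, since the sampling distribution then ignores the weights entirely.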

Running the Experiments

Execute the following commands to run the experiments:

# Example:

# Run the 101-unit star network experiment
python exp_101units_adv.py

# Run the single-unit, 5-arm multi-armed bandit experiment (Instance 1)
python exp_singleunit_adv.py

Experimental Setup

Take exp_101units_adv.py as an example:

  • Simulates a 101-unit star network with 1 central node and 5 outer clusters.
  • Each cluster follows a shared action assignment.
  • Implements exposure mapping to define unit responses based on neighboring actions.
  • Compares Uniform, Standard EXP3, and EXP3-N-CS under an adversarial reward schedule.
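
The setup above can be sketched in a few lines. This is an illustration under stated assumptions, not the script's actual code: it assumes the 100 outer units split evenly into 5 clusters of 20, that every outer unit links only to the central node, that actions are binary, and that a unit's exposure is its own action together with the share of treated neighbors. All function names are hypothetical.

```python
def build_star_adjacency(n_clusters=5, cluster_size=20):
    """Adjacency matrix of a star: unit 0 is the center, all others are leaves."""
    n = 1 + n_clusters * cluster_size
    adj = [[0] * n for _ in range(n)]
    for unit in range(1, n):
        adj[0][unit] = 1
        adj[unit][0] = 1
    return adj


def cluster_of(unit, cluster_size=20):
    """Cluster index of an outer unit, or None for the center (unit 0)."""
    if unit == 0:
        return None
    return (unit - 1) // cluster_size


def expand_actions(center_action, cluster_actions, cluster_size=20):
    """Turn one action per cluster into one action per unit (shared within a cluster)."""
    actions = [center_action]
    for action in cluster_actions:
        actions.extend([action] * cluster_size)
    return actions


def exposure(adj, actions, unit):
    """Assumed exposure mapping: (own action, share of treated neighbors)."""
    neighbors = [j for j, linked in enumerate(adj[unit]) if linked]
    treated_share = sum(actions[j] for j in neighbors) / len(neighbors)
    return actions[unit], treated_share
```

Under these assumptions an outer unit's exposure depends only on its own action and the center's action, while the center's exposure depends on the actions of all five clusters.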

Results and Interpretation

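  • Cumulative Regret: Reward lost so far relative to the benchmark. Lower values mean the strategy gives up less reward while it explores.
  • CS Width: Width of the confidence sequence (CS) for a treatment-effect contrast. Smaller widths indicate more precise statistical inference.
  • ATE Estimation Error: Gap between the estimated and the true average treatment effect (ATE). Smaller error means more accurate inference.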

EXP3-N-CS trades off regret minimization against statistical inference: it achieves better inference accuracy than Standard EXP3 while maintaining competitive regret.
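
As a rough guide to how these three metrics can be computed from a logged run, here is a minimal sketch. It assumes regret is measured against the best fixed arm in hindsight and that the true ATE between two arms is the average per-round reward difference; the scripts may define these quantities differently, and every name below is hypothetical.

```python
def cumulative_regret(reward_table, chosen_arms):
    """Running regret against the best fixed arm in hindsight.

    reward_table[t][a] is the reward arm a would have earned at round t.
    """
    k = len(reward_table[0])
    best_arm = max(range(k), key=lambda a: sum(row[a] for row in reward_table))
    regret, running = [], 0.0
    for row, arm in zip(reward_table, chosen_arms):
        running += row[best_arm] - row[arm]
        regret.append(running)
    return regret


def cs_width(lower, upper):
    """Width of a confidence sequence at each round, given its two bounds."""
    return [u - l for l, u in zip(lower, upper)]


def ate_error(estimate, reward_table, arm_i, arm_j):
    """Absolute gap between an ATE estimate and the average reward difference."""
    diffs = [row[arm_i] - row[arm_j] for row in reward_table]
    true_ate = sum(diffs) / len(diffs)
    return abs(estimate - true_ate)
```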

Customizing the Experiments

  • Modify the network topology in exp_101units_adv.py by adjusting the adjacency matrix in section (A).
  • Adjust the horizon T and the number of replicates N_exp in each script to test different settings.
  • Change the exploration parameter delta_t in EXP3-N-CS to observe different trade-offs.
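
To build intuition for the last item, the sketch below shows one common way an exploration parameter enters an EXP3-style sampler: the sampling distribution is mixed with the uniform distribution at rate delta_t, which guarantees every arm a minimum probability. Whether the scripts use exactly this form, and the polynomial decay schedule shown here, are assumptions made for illustration only; check the definition of delta_t in the script before editing it.

```python
def exploration_rate(t, alpha=0.25, scale=1.0):
    """Hypothetical schedule: delta_t = min(1, scale * t^(-alpha)) for t >= 1."""
    return min(1.0, scale * t ** (-alpha))


def mix_with_uniform(probs, delta):
    """Mix a sampling distribution with uniform so each arm gets at least delta / k."""
    k = len(probs)
    return [(1.0 - delta) * p + delta / k for p in probs]
```

In this sketch a slower decay (smaller alpha) keeps more probability on every arm, which helps inference at the cost of regret, while a faster decay does the opposite.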

Output Plots

After running the scripts, you will see:

  1. Cumulative Regret comparison across methods.
  2. CS Width evolution for the most challenging inference pairs.
  3. ATE Estimation Error distribution.

License

This code is for academic and research purposes.
