Connecting Jensen–Shannon and Kullback–Leibler Divergences: A New Bound for Representation Learning

Reuben Dorent, Polina Golland, William Wells III

Official repository of the submission "Connecting Jensen–Shannon and Kullback–Leibler Divergences: A New Bound for Representation Learning" at NeurIPS 2025.

JSD-LB is a novel lower bound on mutual information, expressed as a function of a Jensen–Shannon-based information measure, that is differentiable and has low variance. Our results provide new theoretical justification and strong empirical evidence for using discriminative learning in MI-based representation learning.
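Schematically, and using the notation of the figures below, the bound takes the following form; the exact definition of the implicit function $\Xi$ is given in the paper, and the definition of the JSD-based information $\mathrm{I}_{\mathrm{JS}}$ as the Jensen–Shannon divergence between the joint and the product of marginals is an assumption made here for illustration:

```latex
I(X;Y) \;\ge\; \Xi\!\left(\mathrm{I}_{\mathrm{JS}}(X;Y)\right),
\qquad
\mathrm{I}_{\mathrm{JS}}(X;Y) \;=\; \mathrm{JS}\!\left(p_{XY}\,\big\|\,p_X\, p_Y\right)
```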


📈 Important results from our paper

Our bound is tight!

Mutual information and its JSD-based lower bound $\Xi({\rm I_{\rm JS}})$ for a parameterized family of discrete joint distributions with known MI and JSD, varying in dependence strength ($\alpha$) and number of categories ($k$). Across settings, the MI values lie close to the lower bound, empirically demonstrating the tightness of our JSD-based estimate.
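For intuition, here is a minimal sketch of one way such a family could be parameterized and its MI and JSD computed in closed form from the joint table. The construction below, a mixture of a diagonal joint with an independent uniform joint, is an assumption for illustration and not necessarily the family used in the paper:

```python
import numpy as np

def make_joint(alpha: float, k: int) -> np.ndarray:
    """Convex mixture of a perfectly dependent joint (uniform on the
    diagonal) and an independent uniform joint over k x k categories."""
    dependent = np.eye(k) / k              # p(x, y) = 1/k if x == y, else 0
    independent = np.ones((k, k)) / k**2   # p(x, y) = 1/k^2 everywhere
    return alpha * dependent + (1.0 - alpha) * independent

def kl(p: np.ndarray, q: np.ndarray) -> float:
    """KL divergence between two probability tables (natural log)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def mutual_information(p_xy: np.ndarray) -> float:
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    return kl(p_xy, p_x @ p_y)

def js_information(p_xy: np.ndarray) -> float:
    """JS divergence between the joint and the product of its marginals."""
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    q = p_x @ p_y
    m = 0.5 * (p_xy + q)
    return 0.5 * kl(p_xy, m) + 0.5 * kl(q, m)

for alpha in (0.1, 0.5, 0.9):
    p = make_joint(alpha, k=8)
    print(f"alpha={alpha}: MI={mutual_information(p):.4f}, I_JS={js_information(p):.4f}")
```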

Availability of an approximation of the implicit function $\Xi$

Low-variance and low-bias Variational Lower Bound (VLB) estimate of MI

The figures below compare the performance of well-known VLBs with that of ours.

Gaussian and Cubic

Half-cube, Asinh, Uniform and Student


💻 How to run the code

The file main.py runs all the experiments. Three scenarios are accepted by the argument parser:

  • "staircase": the target MI follows a staircase shape;
  • "uniform": MI of uniform random variables;
  • "student": MI of the multivariate Student distribution.

There are four possible modes in staircase: Gaussian, Cubic, Asinh, and Half-cube.

Three architecture types are implemented: "joint", "deranged", and "separable"; they can be selected by modifying the variable architectures_list.

To test the various MI estimators, set the field "divergences" in the dictionary proc_params to "MINE", "NWJ", "SMILE", or "CPC" for the related works, to "KL", "HD", "GAN", or "SL" for the f-DIME variants, and to "JSD-LB" for ours.
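As an illustration, a minimal sketch of what these settings might look like; the variable names come from the text above, but the exact structure of proc_params in main.py is assumed:

```python
# Illustrative only: names taken from the description above; the exact
# structure used in main.py may differ.
architectures_list = ["joint", "separable", "deranged"]

proc_params = {
    "divergences": ["MINE", "NWJ", "SMILE", "CPC", "KL", "HD", "GAN", "SL", "JSD-LB"],
    # ... other processing parameters expected by main.py
}
```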

You can run main.py by setting the arguments "scenario" and "staircase_mode":

```bash
python main.py --scenario staircase student uniform --staircase_mode gaussian cubic asinh half-cube
```
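For reference, the command above is consistent with an argument parser along the following lines; this is a sketch of an assumed setup, not necessarily the exact parser in main.py:

```python
import argparse

parser = argparse.ArgumentParser(description="MI estimation experiments")
parser.add_argument("--scenario", nargs="+",
                    choices=["staircase", "uniform", "student"],
                    help="One or more scenarios to run")
parser.add_argument("--staircase_mode", nargs="+",
                    choices=["gaussian", "cubic", "asinh", "half-cube"],
                    help="Data transformations for the staircase scenario")
args = parser.parse_args()
```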


🤓 General description

The code comprises implementations of various existing mutual information (MI) estimators (e.g., MINE, NWJ, InfoNCE, SMILE, NJEE, $f$-DIME), which are compared with our proposed variational lower bound estimator of MI: JSD-LB.
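For context, critic-based VLB estimators of this family typically turn critic scores on joint and product-of-marginals samples into a scalar bound. The snippet below shows the standard NWJ bound as one example; it is a generic illustration of that family, not the JSD-LB estimator or code from this repository:

```python
import torch

def nwj_lower_bound(scores_joint: torch.Tensor,
                    scores_marginal: torch.Tensor) -> torch.Tensor:
    """Standard NWJ variational lower bound on MI:
    I(X;Y) >= E_p(x,y)[f(x,y)] - E_p(x)p(y)[exp(f(x,y) - 1)].

    scores_joint:    critic outputs f(x, y) on samples from the joint
    scores_marginal: critic outputs f(x, y) on samples from the product of marginals
    """
    return scores_joint.mean() - torch.exp(scores_marginal - 1.0).mean()
```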

📋 Acknowledgments

The implementation is based on

which was in turn based on / inspired by:


Citation

If you find our bound useful for your work, please cite our paper:

@article{dorent2025jsdlb,
  title={{Connecting Jensen–Shannon and Kullback–Leibler Divergences: A New Bound for Representation Learning}},
  author={Dorent, Reuben and Golland, Polina and Wells III, William},
  journal={Advances in Neural Information Processing Systems},
  year={2025}
}
