{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T21:09:56Z","timestamp":1777496996495,"version":"3.51.4"},"reference-count":160,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T00:00:00Z","timestamp":1645401600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>The free energy principle, and its corollary active inference, constitute a bio-inspired theory that assumes biological agents act to remain in a restricted set of preferred states of the world, i.e., they minimize their free energy. Under this principle, biological agents learn a generative model of the world and plan actions in the future that will maintain the agent in an homeostatic state that satisfies its preferences. This framework lends itself to being realized in silico, as it comprehends important aspects that make it computationally affordable, such as variational inference and amortized planning. In this work, we investigate the tool of deep learning to design and realize artificial agents based on active inference, presenting a deep-learning oriented presentation of the free energy principle, surveying works that are relevant in both machine learning and active inference areas, and discussing the design choices that are involved in the implementation process. 
This manuscript probes newer perspectives for the active inference framework, grounding its theoretical aspects in more pragmatic concerns, and offering a practical guide for newcomers to active inference as well as a starting point for deep learning practitioners who would like to investigate implementations of the free energy principle.<\/jats:p>","DOI":"10.3390\/e24020301","type":"journal-article","created":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T20:24:21Z","timestamp":1645475061000},"page":"301","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":38,"title":["The Free Energy Principle for Perception and Action: A Deep Learning Perspective"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3319-5986","authenticated-orcid":false,"given":"Pietro","family":"Mazzaglia","sequence":"first","affiliation":[{"name":"IDLab, Ghent University, 9052 Gent, Belgium"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2731-7262","authenticated-orcid":false,"given":"Tim","family":"Verbelen","sequence":"additional","affiliation":[{"name":"IDLab, Ghent University, 9052 Gent, Belgium"}]},{"given":"Ozan","family":"\u00c7atal","sequence":"additional","affiliation":[{"name":"IDLab, Ghent University, 9052 Gent, Belgium"}]},{"given":"Bart","family":"Dhoedt","sequence":"additional","affiliation":[{"name":"IDLab, Ghent University, 9052 Gent, Belgium"}]}],"member":"1968","published-online":{"date-parts":[[2022,2,21]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1007\/s11229-007-9237-y","article-title":"Free-energy and the brain","volume":"159","author":"Friston","year":"2007","journal-title":"Synthese"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"862","DOI":"10.1016\/j.neubiorev.2016.06.022","article-title":"Active inference and learning","volume":"68","author":"Friston","year":"2016","journal-title":"Neurosci. Biobehav. 
Rev."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"61","DOI":"10.3389\/fnhum.2018.00061","article-title":"Computational Neuropsychology and Bayesian Inference","volume":"12","author":"Parr","year":"2018","journal-title":"Front. Hum. Neurosci."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"30","DOI":"10.3389\/fncom.2020.00030","article-title":"An Investigation of the Free Energy Principle for Emotion Recognition","volume":"14","author":"Demekas","year":"2020","journal-title":"Front. Comput. Neurosci."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"549187","DOI":"10.3389\/fpsyg.2020.549187","article-title":"Variational Free Energy and Economics Optimizing with Biases and Bounded Rationality","volume":"11","author":"Henriksen","year":"2020","journal-title":"Front. Psychol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"20170685","DOI":"10.1098\/rsif.2017.0685","article-title":"A variational approach to niche construction","volume":"15","author":"Constant","year":"2018","journal-title":"J. R. Soc. Interface"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.jtbi.2018.07.002","article-title":"Free-energy minimization in joint agent-environment systems: A niche construction perspective","volume":"455","author":"Bruineberg","year":"2018","journal-title":"J. Theor. Biol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1007\/s00422-014-0620-8","article-title":"Active inference, eye movements and oculomotor delays","volume":"108","author":"Perrinet","year":"2014","journal-title":"Biol. 
Cybern."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1016\/j.neuropsychologia.2018.01.041","article-title":"Active inference and the anatomy of oculomotion","volume":"111","author":"Parr","year":"2018","journal-title":"Neuropsychologia"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"218","DOI":"10.3389\/fpsyg.2011.00218","article-title":"Active Inference, Attention, and Motor Preparation","volume":"2","author":"Brown","year":"2011","journal-title":"Front. Psychol."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"14678","DOI":"10.1038\/s41598-017-15249-0","article-title":"Working memory, attention, and salience in active inference","volume":"7","author":"Parr","year":"2017","journal-title":"Sci. Rep."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"56","DOI":"10.3389\/fncom.2016.00056","article-title":"Scene Construction, Visual Foraging, and Active Inference","volume":"10","author":"Mirza","year":"2016","journal-title":"Front. Comput. Neurosci."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"81","DOI":"10.3389\/frai.2020.509354","article-title":"Deep Active Inference and Scene Construction","volume":"3","author":"Heins","year":"2020","journal-title":"Front. Artif. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Biehl, M., Pollock, F.A., and Kanai, R. (2021). A Technical Critique of Some Parts of the Free Energy Principle. Entropy, 23.","DOI":"10.3390\/e23030293"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Friston, K.J., Da Costa, L., and Parr, T. (2021). Some Interesting Observations on the Free Energy Principle. Entropy, 23.","DOI":"10.3390\/e23081076"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"20130475","DOI":"10.1098\/rsif.2013.0475","article-title":"Life as we know it","volume":"10","author":"Friston","year":"2013","journal-title":"J. R. Soc. 
Interface"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"20170792","DOI":"10.1098\/rsif.2017.0792","article-title":"The Markov blankets of life: Autonomy, active inference and the free energy principle","volume":"15","author":"Kirchhoff","year":"2018","journal-title":"J. R. Soc. Interface"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"20200503","DOI":"10.1098\/rsif.2020.0503","article-title":"Future climates: Markov blankets and active inference in the biosphere","volume":"17","author":"Rubin","year":"2020","journal-title":"J. R. Soc. Interface"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Maturana, H.R., Varela, F.J., and Maturana, H.R. (1980). Autopoiesis and Cognition: The Realization of the Living, D. Reidel Pub. Co.","DOI":"10.1007\/978-94-009-8947-4"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2519","DOI":"10.1007\/s11229-016-1100-6","article-title":"Autopoiesis, free energy, and the life\u2013mind continuity thesis","volume":"195","author":"Kirchhoff","year":"2018","journal-title":"Synthese"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1080\/01621459.2017.1285773","article-title":"Variational Inference: A Review for Statisticians","volume":"112","author":"Blei","year":"2017","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_22","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1038\/nrn1406","article-title":"Dopamine, learning and motivation","volume":"5","author":"Wise","year":"2004","journal-title":"Nat. Rev. Neurosci."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"15647","DOI":"10.1073\/pnas.1014269108","article-title":"Understanding dopamine and reinforcement learning: The dopamine reward prediction error hypothesis","volume":"108","author":"Glimcher","year":"2011","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"103535","DOI":"10.1016\/j.artint.2021.103535","article-title":"Reward is enough","volume":"299","author":"Silver","year":"2021","journal-title":"Artif. Intell."},{"key":"ref_26","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. arXiv."},{"key":"ref_27","unstructured":"Badia, A.P., Piot, B., Kapturowski, S., Sprechmann, P., Vitvitskyi, A., Guo, D., and Blundell, C. (2020). Agent57: Outperforming the Atari Human Benchmark. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1038\/s41586-019-1724-z","article-title":"Grandmaster level in StarCraft II using multi-agent reinforcement learning","volume":"575","author":"Vinyals","year":"2019","journal-title":"Nature"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1038\/s41586-020-03051-4","article-title":"Mastering Atari, Go, chess and shogi by planning with a learned model","volume":"588","author":"Schrittwieser","year":"2020","journal-title":"Nature"},{"key":"ref_30","unstructured":"Akkaya, I., Andrychowicz, M., Chociej, M., Litwin, M., McGrew, B., Petron, A., Paino, A., Plappert, M., Powell, G., and Ribas, R. (2019). Solving Rubik\u2019s Cube with a Robot Hand. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1007\/s00422-018-0785-7","article-title":"Deep active inference","volume":"112","year":"2018","journal-title":"Biol. Cybern."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"\u00c7atal, O., Verbelen, T., Nauta, J., De Boom, C., and Dhoedt, B. (2020, January 4\u20138). Learning Perception and Planning with Deep Active Inference. 
Proceedings of the ICASSP 2020\u20142020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain.","DOI":"10.1109\/ICASSP40776.2020.9054364"},{"key":"ref_33","first-page":"11662","article-title":"Deep active inference agents using Monte-Carlo methods","volume":"Volume 33","author":"Larochelle","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1016\/j.jmp.2017.09.004","article-title":"The free energy principle for action and perception: A mathematical review","volume":"81","author":"Buckley","year":"2017","journal-title":"J. Math. Psychol."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"102447","DOI":"10.1016\/j.jmp.2020.102447","article-title":"Active inference on discrete state-spaces: A synthesis","volume":"99","author":"Parr","year":"2020","journal-title":"J. Math. Psychol."},{"key":"ref_36","unstructured":"Lanillos, P., Meo, C., Pezzato, C., Meera, A.A., Baioumy, M., Ohata, W., Tschantz, A., Millidge, B., Wisse, M., and Buckley, C.L. (2021). Active Inference in Robotics and Artificial Agents: Survey and Challenges. arXiv."},{"key":"ref_37","first-page":"516","article-title":"Amortized inference in probabilistic reasoning","volume":"36","author":"Gershman","year":"2014","journal-title":"Proc. Annu. Meet. Cogn. Sci. Soc."},{"key":"ref_38","unstructured":"Razavi, A., van den Oord, A., and Vinyals, O. (2019). Generating Diverse High-Fidelity Images with VQ-VAE-2. arXiv."},{"key":"ref_39","unstructured":"Karras, T., Aittala, M., Laine, S., H\u00e4rk\u00f6nen, E., Hellsten, J., Lehtinen, J., and Aila, T. (2021). Alias-Free Generative Adversarial Networks. arXiv."},{"key":"ref_40","unstructured":"Vahdat, A., and Kautz, J. (2021). NVAE: A Deep Hierarchical Variational Autoencoder. arXiv."},{"key":"ref_41","unstructured":"Zilly, J.G., Srivastava, R.K., Koutn\u00edk, J., and Schmidhuber, J. (2017). 
Recurrent Highway Networks. arXiv."},{"key":"ref_42","unstructured":"Melis, G., Ko\u010disk\u00fd, T., and Blunsom, P. (2020). Mogrifier LSTM. arXiv."},{"key":"ref_43","unstructured":"Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., and Askell, A. (2020). Language Models Are Few-Shot Learners. arXiv."},{"key":"ref_44","first-page":"802","article-title":"Convolutional LSTM network: A machine learning approach for precipitation nowcasting","volume":"28","author":"Xingjian","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_45","unstructured":"Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_46","unstructured":"Denton, E., and Fergus, R. (2018). Stochastic Video Generation with a Learned Prior. arXiv."},{"key":"ref_47","unstructured":"Lotter, W., Kreiman, G., and Cox, D. (2017). Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning. arXiv."},{"key":"ref_48","unstructured":"Buesing, L., Weber, T., Racaniere, S., Eslami, S.M.A., Rezende, D., Reichert, D.P., Viola, F., Besse, F., Gregor, K., and Hassabis, D. (2018). Learning and Querying Fast Generative Models for Reinforcement Learning. arXiv."},{"key":"ref_49","first-page":"2555","article-title":"Learning Latent Dynamics for Planning from Pixels","volume":"Volume 97","author":"Chaudhuri","year":"2019","journal-title":"Proceedings of the 36th International Conference on Machine Learning"},{"key":"ref_50","unstructured":"Ha, D., and Schmidhuber, J. (2018). Recurrent World Models Facilitate Policy Evolution. arXiv."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Mazzaglia, P., Catal, O., Verbelen, T., and Dhoedt, B. (2021). 
Self-Supervised Exploration via Latent Bayesian Surprise. arXiv.","DOI":"10.1609\/aaai.v36i7.20743"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Pathak, D., Agrawal, P., Efros, A.A., and Darrell, T. (2017). Curiosity-driven Exploration by Self-supervised Prediction. arXiv.","DOI":"10.1109\/CVPRW.2017.70"},{"key":"ref_53","unstructured":"Houthooft, R., Chen, X., Duan, Y., Schulman, J., De Turck, F., and Abbeel, P. (2016, January 5\u201310). VIME: Variational Information Maximizing Exploration. Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS\u201916, Barcelona, Spain."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"\u00c7atal, O., Leroux, S., De Boom, C., Verbelen, T., and Dhoedt, B. (January, January 24). Anomaly Detection for Autonomous Guided Vehicles using Bayesian Surprise. Proceedings of the 2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA.","DOI":"10.1109\/IROS45743.2020.9341386"},{"key":"ref_55","unstructured":"Hansen, N. (2016). The CMA Evolution Strategy: A Tutorial. arXiv."},{"key":"ref_56","unstructured":"Hubert, T., Schrittwieser, J., Antonoglou, I., Barekatain, M., Schmitt, S., and Silver, D. (2021). Learning and Planning in Complex Action Spaces. arXiv."},{"key":"ref_57","unstructured":"Von Helmholtz, H. (1867). Handbuch der Physiologischen Optik: Mit 213 in den Text Eingedruckten Holzschnitten und 11 Tafeln, Wentworth Press."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1016\/j.tics.2009.04.005","article-title":"The free-energy principle: A rough guide to the brain?","volume":"13","author":"Friston","year":"2009","journal-title":"Trends Cogn. 
Sci."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1177\/1059712319862774","article-title":"A tale of two densities: Active inference is enactive inference","volume":"28","author":"Ramstead","year":"2020","journal-title":"Adapt. Behav."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1162\/NETN_a_00018","article-title":"The graphical brain: Belief propagation and active inference","volume":"1","author":"Friston","year":"2017","journal-title":"Netw. Neurosci."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Friston, K.J., Daunizeau, J., and Kiebel, S.J. (2009). Reinforcement Learning or Active Inference?. PLoS ONE, 4.","DOI":"10.1371\/journal.pone.0006421"},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"2100","DOI":"10.3390\/e14112100","article-title":"A Free Energy Principle for Biological Systems","volume":"14","author":"Karl","year":"2012","journal-title":"Entropy"},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"e41703","DOI":"10.7554\/eLife.41703","article-title":"Computational mechanisms of curiosity and goal-directed exploration","volume":"8","author":"Schwartenbeck","year":"2019","journal-title":"eLife"},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"2633","DOI":"10.1162\/neco_a_00999","article-title":"Active Inference, Curiosity and Insight","volume":"29","author":"Friston","year":"2017","journal-title":"Neural Comput."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1080\/17588928.2015.1020053","article-title":"Active inference and epistemic value","volume":"6","author":"Friston","year":"2015","journal-title":"Cogn. Neurosci."},{"key":"ref_66","unstructured":"Hafner, D., Lillicrap, T., Norouzi, M., and Ba, J. (2021). Mastering Atari with Discrete World Models. arXiv."},{"key":"ref_67","unstructured":"Hafner, D., Lillicrap, T.P., Ba, J., and Norouzi, M. (May, January 26). 
Dream to Control: Learning Behaviors by Latent Imagination. Proceedings of the ICLR Conference, Addis Abeba, Ethiopia."},{"key":"ref_68","unstructured":"\u00c7atal, O., Nauta, J., Verbelen, T., Simoens, P., and Dhoedt, B. (2019). Bayesian policy selection using active inference. arXiv."},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1016\/j.neunet.2021.05.010","article-title":"Robot navigation as hierarchical active inference","volume":"142","author":"Verbelen","year":"2021","journal-title":"Neural Netw."},{"key":"ref_70","unstructured":"Kingma, D.P., and Welling, M. (2014). Auto-Encoding Variational Bayes. arXiv."},{"key":"ref_71","unstructured":"Rezende, D.J., Mohamed, S., and Wierstra, D. (2014, January 21\u201326). Stochastic Backpropagation and Approximate Inference in Deep Generative Models. Proceedings of the 31st International Conference on Machine Learning (ICML), Beijing, China."},{"key":"ref_72","unstructured":"Tishby, N., Pereira, F.C., and Bialek, W. (2000). The information bottleneck method. arXiv."},{"key":"ref_73","unstructured":"Alemi, A.A., Fischer, I., Dillon, J.V., and Murphy, K. (2019). Deep Variational Information Bottleneck. arXiv."},{"key":"ref_74","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1162\/neco_a_01351","article-title":"Sophisticated Inference","volume":"33","author":"Friston","year":"2021","journal-title":"Neural Comput."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1016\/0893-6080(91)90009-T","article-title":"Approximation capabilities of multilayer feedforward networks","volume":"4","author":"Hornik","year":"1991","journal-title":"Neural Netw."},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Heiden, E., Millard, D., Coumans, E., Sheng, Y., and Sukhatme, G.S. (2021). NeuralSim: Augmenting Differentiable Simulators with Neural Networks. 
arXiv.","DOI":"10.1109\/ICRA48506.2021.9560935"},{"key":"ref_77","unstructured":"Freeman, C.D., Frey, E., Raichuk, A., Girgin, S., Mordatch, I., and Bachem, O. (2021). Brax\u2014A Differentiable Physics Engine for Large Scale Rigid Body Simulation. arXiv."},{"key":"ref_78","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1007\/BF02055574","article-title":"A survey of algorithmic methods for partially observed Markov decision processes","volume":"28","author":"Lovejoy","year":"1991","journal-title":"Ann. Oper. Res."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1613\/jair.1496","article-title":"Finding Approximate POMDP solutions Through Belief Compression","volume":"23","author":"Roy","year":"2005","journal-title":"J. Artif. Intell. Res."},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Kurniawati, H., Hsu, D., and Lee, W.S. (2008). Sarsop: Efficient point-based pomdp planning by approximating optimally reachable belief spaces. Robotics: Science and Systems, Citeseer.","DOI":"10.15607\/RSS.2008.IV.009"},{"key":"ref_81","unstructured":"Heess, N., Hunt, J.J., Lillicrap, T.P., and Silver, D. (2015). Memory-based control with recurrent neural networks. arXiv."},{"key":"ref_82","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_83","unstructured":"Bengio, Y., L\u00e9onard, N., and Courville, A. (2013). Estimating or Propagating Gradients through Stochastic Neurons for Conditional Computation. arXiv."},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Glynn, P.W. (1987, January 14\u201316). Likelilood ratio gradient estimation: An overview. 
Proceedings of the 19th Conference on Winter Simulation, Atlanta, GA, USA.","DOI":"10.1145\/318371.318612"},{"key":"ref_85","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1007\/BF00992696","article-title":"Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning","volume":"8","author":"Williams","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_86","first-page":"14","article-title":"Active Vision for Robot Manipulators Using the Free Energy Principle","volume":"15","author":"Verbelen","year":"2021","journal-title":"Front. Neurorobot."},{"key":"ref_87","unstructured":"Lee, A.X., Nagabandi, A., Abbeel, P., and Levine, S. (2020). Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model. arXiv."},{"key":"ref_88","unstructured":"Igl, M., Zintgraf, L., Le, T.A., Wood, F., and Whiteson, S. (2018). Deep Variational Reinforcement Learning for POMDPs. arXiv."},{"key":"ref_89","unstructured":"Rolfe, J.T. (2016). Discrete variational autoencoders. arXiv."},{"key":"ref_90","unstructured":"Ozair, S., Li, Y., Razavi, A., Antonoglou, I., van den Oord, A., and Vinyals, O. (2021). Vector Quantized Models for Planning. arXiv."},{"key":"ref_91","unstructured":"Sajid, N., Tigas, P., Zakharov, A., Fountas, Z., and Friston, K. (2021). Exploration and preference satisfaction trade-off in reward-free learning. arXiv."},{"key":"ref_92","doi-asserted-by":"crossref","unstructured":"Serban, I.V., Ororbia, A.G., Pineau, J., and Courville, A. (2017, January 7\u201311). Piecewise Latent Variables for Neural Variational Text Processing. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark.","DOI":"10.18653\/v1\/D17-1043"},{"key":"ref_93","unstructured":"Rezende, D.J., and Mohamed, S. (2016). Variational Inference with Normalizing Flows. arXiv."},{"key":"ref_94","unstructured":"Salimans, T., Kingma, D.P., and Welling, M. (2015). 
Markov Chain Monte Carlo and Variational Inference: Bridging the Gap. arXiv."},{"key":"ref_95","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"LeCun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_96","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_97","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1037\/h0042519","article-title":"The perceptron: A probabilistic model for information storage and organization in the brain","volume":"65","author":"Rosenblatt","year":"1958","journal-title":"Psychol. Rev."},{"key":"ref_98","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/TNN.2008.2005605","article-title":"The Graph Neural Network Model","volume":"20","author":"Scarselli","year":"2009","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_99","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_100","unstructured":"Chung, J., Gulcehre, C., Cho, K., and Bengio, Y. (2014, January 12\u201313). Empirical evaluation of gated recurrent neural networks on sequence modeling. Proceedings of the NIPS 2014 Workshop on Deep Learning, Montreal, QC, Canada."},{"key":"ref_101","unstructured":"Toth, P., Rezende, D.J., Jaegle, A., Racani\u00e8re, S., Botev, A., and Higgins, I. (2020). Hamiltonian Generative Networks. arXiv."},{"key":"ref_102","doi-asserted-by":"crossref","unstructured":"Sancaktar, C., van Gerven, M.A.J., and Lanillos, P. (2020, January 26\u201330). End-to-End Pixel-Based Deep Active Inference for Body Perception and Action. 
Proceedings of the 2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), Valparaiso, Chile.","DOI":"10.1109\/ICDL-EpiRob48136.2020.9278105"},{"key":"ref_103","unstructured":"Ghosh, P., Sajjadi, M.S.M., Vergari, A., Black, M., and Sch\u00f6lkopf, B. (2020). From Variational to Deterministic Autoencoders. arXiv."},{"key":"ref_104","doi-asserted-by":"crossref","first-page":"598","DOI":"10.3389\/fnhum.2013.00598","article-title":"The anatomy of choice: Active inference and agency","volume":"7","author":"Friston","year":"2013","journal-title":"Front. Hum. Neurosci."},{"key":"ref_105","doi-asserted-by":"crossref","first-page":"39","DOI":"10.3389\/fnint.2018.00039","article-title":"Precision and False Perceptual Inference","volume":"12","author":"Parr","year":"2018","journal-title":"Front. Integr. Neurosci."},{"key":"ref_106","doi-asserted-by":"crossref","first-page":"20170376","DOI":"10.1098\/rsif.2017.0376","article-title":"Uncertainty, epistemics and active inference","volume":"14","author":"Parr","year":"2017","journal-title":"J. R. Soc. Interface"},{"key":"ref_107","unstructured":"Higgins, I., Matthey, L., Pal, A., Burgess, C.P., Glorot, X., Botvinick, M.M., Mohamed, S., and Lerchner, A. (2017, January 24\u201326). Beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Proceedings of the ICLR Conference, Toulon, France."},{"key":"ref_108","unstructured":"Razavi, A., van den Oord, A., Poole, B., and Vinyals, O. (2019). Preventing Posterior Collapse with delta-VAEs. arXiv."},{"key":"ref_109","unstructured":"Blundell, C., Cornebise, J., Kavukcuoglu, K., and Wierstra, D. (2015). Weight Uncertainty in Neural Networks. arXiv."},{"key":"ref_110","unstructured":"Gal, Y., and Ghahramani, Z. (2016). Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. arXiv."},{"key":"ref_111","unstructured":"Lakshminarayanan, B., Pritzel, A., and Blundell, C. (2017). 
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles. arXiv."},{"key":"ref_112","unstructured":"Pathak, D., Gandhi, D., and Gupta, A. (2019). Self-Supervised Exploration via Disagreement. arXiv."},{"key":"ref_113","unstructured":"Sekar, R., Rybkin, O., Daniilidis, K., Abbeel, P., Hafner, D., and Pathak, D. (2020, January 12\u201318). Planning to Explore via Self-Supervised World Models. Proceedings of the ICML Conference, Virtual Conference."},{"key":"ref_114","doi-asserted-by":"crossref","unstructured":"Tschantz, A., Millidge, B., Seth, A.K., and Buckley, C.L. (2020). Reinforcement Learning through Active Inference. arXiv.","DOI":"10.1109\/IJCNN48605.2020.9207382"},{"key":"ref_115","unstructured":"Van den Oord, A., Li, Y., and Vinyals, O. (2019). Representation Learning with Contrastive Predictive Coding. arXiv."},{"key":"ref_116","unstructured":"Caron, M., Misra, I., Mairal, J., Goyal, P., Bojanowski, P., and Joulin, A. (2021). Unsupervised Learning of Visual Features by Contrasting Cluster Assignments. arXiv."},{"key":"ref_117","unstructured":"Grill, J.B., Strub, F., Altch\u00e9, F., Tallec, C., Richemond, P.H., Buchatskaya, E., Doersch, C., Pires, B.A., Guo, Z.D., and Azar, M.G. (2020). Bootstrap your own latent: A new approach to self-supervised Learning. arXiv."},{"key":"ref_118","doi-asserted-by":"crossref","unstructured":"Chen, X., and He, K. (2020). Exploring Simple Siamese Representation Learning. arXiv.","DOI":"10.1109\/CVPR46437.2021.01549"},{"key":"ref_119","unstructured":"Zbontar, J., Jing, L., Misra, I., LeCun, Y., and Deny, S. (2021). Barlow Twins: Self-Supervised Learning via Redundancy Reduction. arXiv."},{"key":"ref_120","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1038\/s42256-020-00265-z","article-title":"Concept whitening for interpretable image recognition","volume":"2","author":"Chen","year":"2020","journal-title":"Nat. Mach. 
Intell."},{"key":"ref_121","unstructured":"Schwarzer, M., Anand, A., Goel, R., Hjelm, R.D., Courville, A., and Bachman, P. (2021). Data-Efficient Reinforcement Learning with Self-Predictive Representations. arXiv."},{"key":"ref_122","unstructured":"Ma, X., Chen, S., Hsu, D., and Lee, W.S. (2020, January 16\u201318). Contrastive Variational Model-Based Reinforcement Learning for Complex Observations. Proceedings of the 4th Conference on Robot Learning, Virtual Conference."},{"key":"ref_123","unstructured":"Mazzaglia, P., Verbelen, T., and Dhoedt, B. (2021, January 6\u201314). Contrastive Active Inference. Proceedings of the Advances in Neural Information Processing Systems, Virtual Conference."},{"key":"ref_124","first-page":"103","article-title":"Learning Generative State Space Models for Active Inference","volume":"14","author":"Wauthier","year":"2020","journal-title":"Front. Comput. Neurosci."},{"key":"ref_125","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1016\/j.neubiorev.2017.04.009","article-title":"Deep temporal models and active inference","volume":"77","author":"Friston","year":"2017","journal-title":"Neurosci. Biobehav. Rev."},{"key":"ref_126","doi-asserted-by":"crossref","unstructured":"Millidge, B. (2019). Deep Active Inference as Variational Policy Gradients. arXiv.","DOI":"10.1016\/j.jmp.2020.102348"},{"key":"ref_127","unstructured":"Saxena, V., Ba, J., and Hafner, D. (2021). Clockwork Variational Autoencoders. arXiv."},{"key":"ref_128","doi-asserted-by":"crossref","unstructured":"Wu, B., Nair, S., Martin-Martin, R., Fei-Fei, L., and Finn, C. (2021). Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction. arXiv.","DOI":"10.1109\/CVPR46437.2021.00235"},{"key":"ref_129","doi-asserted-by":"crossref","unstructured":"Tschantz, A., Baltieri, M., Seth, A.K., and Buckley, C.L. (2020, January 19\u201324). Scaling Active Inference. 
Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9207382"},{"key":"ref_130","unstructured":"Kaiser, L., Babaeizadeh, M., Milos, P., Osinski, B., Campbell, R.H., Czechowski, K., Erhan, D., Finn, C., Kozakowski, P., and Levine, S. (2020). Model-Based Reinforcement Learning for Atari. arXiv."},{"key":"ref_131","unstructured":"Srinivas, A., Laskin, M., and Abbeel, P. (2020). CURL: Contrastive Unsupervised Representations for Reinforcement Learning. arXiv."},{"key":"ref_132","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1016\/j.tics.2018.01.009","article-title":"Hierarchical Active Inference: A Theory of Motivated Control","volume":"22","author":"Pezzulo","year":"2018","journal-title":"Trends Cogn. Sci."},{"key":"ref_133","unstructured":"Zakharov, A., Guo, Q., and Fountas, Z. (2021). Variational Predictive Routing with Nested Subjective Timescales. arXiv."},{"key":"ref_134","doi-asserted-by":"crossref","unstructured":"Verbelen, T., Lanillos, P., Buckley, C.L., and De Boom, C. (2020). Sleep: Model Reduction in Deep Active Inference. Active Inference, Springer International Publishing.","DOI":"10.1007\/978-3-030-64919-7"},{"key":"ref_135","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.pneurobio.2015.09.001","article-title":"Active Inference, homeostatic regulation and adaptive behavioural control","volume":"134","author":"Pezzulo","year":"2015","journal-title":"Prog. Neurobiol."},{"key":"ref_136","doi-asserted-by":"crossref","unstructured":"Millidge, B., Tschantz, A., and Buckley, C.L. (2020). Whence the Expected Free Energy?. arXiv.","DOI":"10.1162\/neco_a_01354"},{"key":"ref_137","unstructured":"Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). Hindsight Experience Replay. 
Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_138","unstructured":"Warde-Farley, D., de Wiele, T.V., Kulkarni, T.D., Ionescu, C., Hansen, S., and Mnih, V. (2019, January 6\u20139). Unsupervised Control through Non-Parametric Discriminative Rewards. Proceedings of the 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA."},{"key":"ref_139","unstructured":"Mendonca, R., Rybkin, O., Daniilidis, K., Hafner, D., and Pathak, D. (2021). Discovering and Achieving Goals via World Models. arXiv."},{"key":"ref_140","unstructured":"Lee, L., Eysenbach, B., Parisotto, E., Xing, E., Levine, S., and Salakhutdinov, R. (2020). Efficient Exploration via State Marginal Matching. arXiv."},{"key":"ref_141","unstructured":"Levine, S. (2018). Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. arXiv."},{"key":"ref_142","doi-asserted-by":"crossref","unstructured":"Millidge, B., Tschantz, A., Seth, A.K., and Buckley, C.L. (2020). On the Relationship between Active Inference and Control as Inference. arXiv.","DOI":"10.1109\/IJCNN48605.2020.9207382"},{"key":"ref_143","doi-asserted-by":"crossref","first-page":"674","DOI":"10.1162\/neco_a_01357","article-title":"Active Inference: Demystified and Compared","volume":"33","author":"Sajid","year":"2021","journal-title":"Neural Comput."},{"key":"ref_144","unstructured":"Clark, J., and Amodei, D. (2016). Faulty Reward Functions in the Wild, OpenAI."},{"key":"ref_145","unstructured":"Ziebart, B.D., Maas, A.L., Bagnell, J.A., and Dey, A.K. (2008, January 13\u201317). Maximum entropy inverse reinforcement learning. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, Chicago, IL, USA."},{"key":"ref_146","doi-asserted-by":"crossref","unstructured":"Abbeel, P., and Ng, A.Y. (2004, January 4\u20138). Apprenticeship Learning via Inverse Reinforcement Learning. 
Proceedings of the Twenty-First International Conference on Machine Learning, ICML\u201904, Banff, AB, Canada.","DOI":"10.1145\/1015330.1015430"},{"key":"ref_147","unstructured":"Shyam, P., Ja\u015bkowski, W., and Gomez, F. (2019). Model-Based Active Exploration. arXiv."},{"key":"ref_148","unstructured":"Achiam, J., and Sastry, S. (2017). Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning. arXiv."},{"key":"ref_149","unstructured":"Burda, Y., Edwards, H., Pathak, D., Storkey, A.J., Darrell, T., and Efros, A.A. (2019, January 6\u20139). Large-Scale Study of Curiosity-Driven Learning. Proceedings of the 7th International Conference on Learning Representations, ICLR, New Orleans, LA, USA."},{"key":"ref_150","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. arXiv."},{"key":"ref_151","first-page":"1861","article-title":"Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor","volume":"Volume 80","author":"Dy","year":"2018","journal-title":"Proceedings of the 35th International Conference on Machine Learning"},{"key":"ref_152","unstructured":"Eysenbach, B., and Levine, S. (2021). Maximum Entropy RL (Provably) Solves Some Robust RL Problems. arXiv."},{"key":"ref_153","unstructured":"Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., and Graepel, T. (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. arXiv."},{"key":"ref_154","unstructured":"Maisto, D., Gregoretti, F., Friston, K., and Pezzulo, G. (2021). Active Tree Search in Large POMDPs. arXiv."},{"key":"ref_155","unstructured":"Clavera, I., Fu, V., and Abbeel, P. (2020). Model-Augmented Actor-Critic: Backpropagating through Paths. 
arXiv."},{"key":"ref_156","first-page":"4045","article-title":"Time Limits in Reinforcement Learning","volume":"Volume 80","author":"Dy","year":"2018","journal-title":"Proceedings of the 35th International Conference on Machine Learning"},{"key":"ref_157","doi-asserted-by":"crossref","unstructured":"Mhaskar, H., Liao, Q., and Poggio, T. (2017, January 4\u20139). When and Why Are Deep Networks Better than Shallow Ones?. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, AAAI\u201917, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.10913"},{"key":"ref_158","unstructured":"Novak, R., Bahri, Y., Abolafia, D.A., Pennington, J., and Sohl-Dickstein, J. (2018). Sensitivity and Generalization in Neural Networks: An Empirical Study. arXiv."},{"key":"ref_159","doi-asserted-by":"crossref","unstructured":"Colbrook, M.J., Antun, V., and Hansen, A.C. (2021). Can stable and accurate neural networks be computed?\u2014On the barriers of deep learning and Smale\u2019s 18th problem. arXiv.","DOI":"10.1073\/pnas.2107151119"},{"key":"ref_160","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1038\/s42256-018-0002-3","article-title":"Learnability can be undecidable","volume":"1","author":"Moran","year":"2019","journal-title":"Nat. Mach. 
Intell."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/2\/301\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:23:48Z","timestamp":1760135028000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/2\/301"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,21]]},"references-count":160,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2022,2]]}},"alternative-id":["e24020301"],"URL":"https:\/\/doi.org\/10.3390\/e24020301","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,21]]}}}