Implementation Techniques for Solving POMDPs in Personal Assistant Agents

Varakantham, Pradeep; Maheswaran, Rajiv; Tambe, Milind

doi:10.1007/11678823_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3862)

Included in the following conference series:

International Workshop on Programming Multi-Agent Systems

Abstract

Agents or agent teams deployed to assist humans often face the challenges of monitoring the state of key processes in their environment (including the state of their human users themselves) and making periodic decisions based on such monitoring. POMDPs appear well suited to enable agents to address these challenges, given the uncertain environment and cost of actions, but optimal policy generation for POMDPs is computationally expensive. This paper introduces two key implementation techniques (one exact and one approximate) to speed up POMDP policy generation. Both exploit the notion of progress, or dynamics, in personal assistant domains and the density of policy vectors. Policy computation is restricted to the belief space polytope that remains reachable given the progress structure of a domain. The first technique applies Lagrangian methods to compute a bounded belief space support in polynomial time, and the second approximates policy vectors within the bounded belief polytope. We illustrate these techniques by enhancing two of the fastest existing algorithms for exact POMDP policy generation. The resulting order-of-magnitude speedups demonstrate the utility of our implementation techniques in facilitating the deployment of POMDPs within agents assisting human users.
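
The idea of restricting computation to the reachable part of the belief space can be illustrated with a small sketch. This is not the chapter's Lagrangian procedure; it is a simpler one-step illustration under stated assumptions: a single fixed action, a row-stochastic transition matrix `T[s][s2]`, and an observation matrix `O[s2][o]`. The function names and the toy model are hypothetical. The sketch relies on the fact that each component of the updated belief is a ratio of two linear functions of the prior belief, so its extremes over the belief simplex are attained at point beliefs (simplex vertices).

```python
from typing import List, Tuple

Matrix = List[List[float]]


def belief_update(b: List[float], T: Matrix, O: Matrix, o: int) -> List[float]:
    """Bayes filter for one fixed action: b'(s2) is proportional to
    O[s2][o] * sum_s b[s] * T[s][s2]."""
    n = len(b)
    unnorm = [O[s2][o] * sum(b[s] * T[s][s2] for s in range(n)) for s2 in range(n)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnorm]


def one_step_belief_bounds(T: Matrix, O: Matrix, o: int) -> Tuple[List[float], List[float]]:
    """Per-state lower and upper bounds on b'(s2) over all prior beliefs.

    Assumption of this sketch: the prior may be anywhere on the simplex,
    so it suffices to evaluate the update at each point belief.
    """
    n = len(T)
    lo, hi = [1.0] * n, [0.0] * n
    feasible = False
    for s in range(n):
        point = [1.0 if i == s else 0.0 for i in range(n)]
        try:
            post = belief_update(point, T, O, o)
        except ValueError:
            continue  # observation o cannot occur when starting from state s
        feasible = True
        lo = [min(x, y) for x, y in zip(lo, post)]
        hi = [max(x, y) for x, y in zip(hi, post)]
    if not feasible:
        raise ValueError("observation is impossible from every state")
    return lo, hi


# Hypothetical three-state "progress" chain: the process only moves forward.
T_toy = [[0.7, 0.3, 0.0],
         [0.0, 0.6, 0.4],
         [0.0, 0.0, 1.0]]
O_toy = [[0.8, 0.2],
         [0.5, 0.5],
         [0.1, 0.9]]

if __name__ == "__main__":
    lower, upper = one_step_belief_bounds(T_toy, O_toy, o=0)
    print("lower bounds:", lower)
    print("upper bounds:", upper)
```

In this toy model the upper bound on the first state is strictly below 1 after a single step, so a solver that only considers beliefs inside the bounded region has less of the simplex to cover. Repeating the bounding step from the resulting box, rather than from the full simplex, would need a different optimization and is outside the scope of this sketch.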

Author information

Authors and Affiliations

Department of Computer Science, University of Southern California, Los Angeles, CA, 90089
Pradeep Varakantham, Rajiv Maheswaran & Milind Tambe

Editor information

Editors and Affiliations

Department of Computer Science, University of Durham, DH1 3LE, Durham, UK
Rafael H. Bordini
Utrecht University, P.O. Box 80.089, 3508, Utrecht, TB, The Netherlands
Mehdi M. Dastani
Clausthal University of Technology, Julius-Albert-Str. 4, 38678, Clausthal-Zellerfeld, Germany
Jürgen Dix
Laboratoire Informatique de Paris 6, 104 avenue du Président Kennedy, 75016, Paris, France
Amal El Fallah Seghrouchni

About this paper

Cite this paper

Varakantham, P., Maheswaran, R., Tambe, M. (2006). Implementation Techniques for Solving POMDPs in Personal Assistant Agents. In: Bordini, R.H., Dastani, M.M., Dix, J., El Fallah Seghrouchni, A. (eds) Programming Multi-Agent Systems. ProMAS 2005. Lecture Notes in Computer Science, vol 3862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11678823_5

DOI: https://doi.org/10.1007/11678823_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32616-8
Online ISBN: 978-3-540-32617-5
eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science
