Arun Chaganty

Overengineering since 1989


Homomorphisms in Reinforcement Learning across Continuity and Partial Observability (2011 - Present)

(Mentor: Balaraman Ravindran)

Abstract

Markov Decision Process (MDP) homomorphisms map the states and actions of one MDP to those of another, providing a basis for both transfer of learning and MDP minimisation. Homomorphisms between discrete MDPs have been studied comprehensively, but work on defining them in continuous or partially observable domains has begun only recently. The objective of this work is to find homomorphisms, both exact and approximate, between continuous and/or partially observable MDPs.
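
To make the discrete case concrete, the sketch below checks the usual homomorphism conditions from the literature: a state map f and state-dependent action maps g_s must preserve rewards and the probability of moving into each block of states that f merges. This is only an illustration, not the method developed in this work; the function name is_homomorphism, the dictionary layout and the toy MDPs are all assumptions made for the example.

```python
def is_homomorphism(T, R, T2, R2, f, g, tol=1e-9):
    """Check the discrete MDP homomorphism conditions (illustrative sketch).

    T[s][a]  : dict mapping next state -> probability in the source MDP
    R[s][a]  : expected reward in the source MDP
    T2, R2   : the same structures for the image MDP
    f[s]     : state map
    g[s][a]  : state-dependent action map
    """
    for s in T:
        for a in T[s]:
            s_img, a_img = f[s], g[s][a]
            # Rewards must agree under the map.
            if abs(R[s][a] - R2[s_img][a_img]) > tol:
                return False
            # The probability mass sent into each block f^{-1}(t)
            # must equal the image MDP's transition probability to t.
            for t in T2:
                block_mass = sum(p for s2, p in T[s][a].items() if f[s2] == t)
                if abs(block_mass - T2[s_img][a_img].get(t, 0.0)) > tol:
                    return False
    return True


# Hypothetical toy source MDP: A and B are mirror images around a goal G.
T = {
    "A": {"left": {"A": 1.0}, "right": {"G": 1.0}},
    "B": {"left": {"G": 1.0}, "right": {"B": 1.0}},
    "G": {"left": {"G": 1.0}, "right": {"G": 1.0}},
}
R = {
    "A": {"left": 0.0, "right": 1.0},
    "B": {"left": 1.0, "right": 0.0},
    "G": {"left": 0.0, "right": 0.0},
}

# Smaller image MDP in which A and B collapse into a single state X.
T2 = {
    "X": {"away": {"X": 1.0}, "toward": {"Gq": 1.0}},
    "Gq": {"away": {"Gq": 1.0}, "toward": {"Gq": 1.0}},
}
R2 = {
    "X": {"away": 0.0, "toward": 1.0},
    "Gq": {"away": 0.0, "toward": 0.0},
}

f = {"A": "X", "B": "X", "G": "Gq"}
# The action map depends on the state: "right" leads toward the goal
# from A, while "left" leads toward the goal from B.
g = {
    "A": {"left": "away", "right": "toward"},
    "B": {"left": "toward", "right": "away"},
    "G": {"left": "toward", "right": "away"},
}

# Same state map, but B's actions are relabelled without the mirror flip.
g_bad = dict(g, B={"left": "away", "right": "toward"})

result_good = is_homomorphism(T, R, T2, R2, f, g)
result_bad = is_homomorphism(T, R, T2, R2, f, g_bad)
```

Here the mirrored states A and B collapse into one image state, which is the sense in which a homomorphism supports minimisation. The check relies on summing over finitely many states, so it does not carry over directly to continuous or partially observable MDPs.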

Additional Material

  • I am studying this problem as part of my Master's thesis at IIT Madras. A working draft of my thesis proposal can be found here.