Title: Formal Verification of Reinforcement Learning
Speaker: Mark Chevallier
In this talk, I will introduce reinforcement learning and show my work formally verifying it using Isabelle/HOL. I will show how I have adapted existing models for Markov Decision Processes and modelled rewards, policies, value functions and other reinforcement learning concepts. I will demonstrate results achieved already showing proofs of the existence of an optimal Q value function under certain conditions and how I intend to use this to show that Q Learning estimates converge to this optimal function over time.