Title: Formalisations of Q learning and logic as a loss function
Speaker: Mark Chevallier
Q learning is a form of reinforcement learning where we try to learn how to navigate through an environment where we can earn rewards by earning the biggest reward possible.
I have been working on its formalisation in the theorem prover Isabelle/HOL, and in this talk I will give details about the different ways I have tried to represent rewards and value, and the pitfalls I have discovered making progress along the way.
I will also be discussing a formalisation of a method of evaluating logical constraints as real numbers to use them as part of a loss function in a neural network. This method has been described by others, and by formalising their methods I uncovered a lack of rigour and a better way to approach some of the issues that arose.
I will demonstrate that I was able to use Isabelle/HOL to generate working code based on my formalisation to guarantee the code fulfils the specifications exactly and eliminate any possibility of human error in implementation.