Thoughts on Machine Learning, Representation Theory, and RoShamBo – Part I

I’ve recently been doing some reading on machine learning with a mind towards applying some of my prior knowledge of representation theory.  The initial realization that representation theory might have some interesting applications in machine learning came from discussions with Chris Olah at the Toronto HackLab a couple months ago; you can get interesting new insights by exploring new spaces!  Over winter break I’ve been reading Bishop’s ‘Pattern Recognition and Machine Learning‘ (slowly), alongside faster reads like ‘Building Machine Learning Systems with Python.‘  As I’ve read, I’ve realized that there is plenty of room for introducing group theory into machine learning in interesting ways.  (Note: This is the first of a few posts on this topic.)

There’s a strong tradition of statisticians using group theory, perhaps most famously Persi Diaconis, who used representation theory of the symmetric group to find the mixing time for card shuffling.  His notes ‘Group Representations in Probability and Statistics‘ are an excellent place to pick up the background material with a strong eye towards applications.  Over the next few posts I’ll make a case for the use of representation theory in machine learning, emphasizing automatic factor selection and Bayesian methods.

First, an extremely brief overview of what machine learning is about, and an introduction to using a Bayesian approach to play RoShamBo, or rock paper scissors.  In the second post, I’ll motivate the viewpoint of representation by exploring the Fourier transform and how to use it to beat repetitive opponents.  Finally, in the third post I’ll look at how we can use representations to select factors for our Bayesian algorithm by examining likelihood functions as functions on a group.

RPS25 is a generalized form of Rock Paper Scissors.  I'll leave generalizations of this post to the RPS25 case as an exercise to the reader...
RPS25 is a generalized form of Rock Paper Scissors. I’ll leave generalizations of this post to the RPS25 case as an exercise to the reader…

Continue reading