Learning Deterministic Finite Automata with Signed Examples: An Investigation into the Role of Entropy in Optimal Model Selection

John Fergus William Smiles

Abstract: The thesis presents an analysis of the hardness results for learning deterministic finite automata from signed examples. Focusing on the conditions under which learning is possible, we investigate the notion of a “fair” sample that would typically permit learning. Although we demonstrate that such a sample is needed to overcome the hardness of the learning problem, we also detail a major problem with the size a sample must reach in order to be considered fair. The major contribution of the thesis is to develop our understanding of the role of entropy in optimal model selection and learning, drawing connections to aspects of cognition and inference. We provide the necessary background on data compression, together with an analysis of the Minimum Description Length (MDL) principle for grammar induction, in order to prepare the reader to understand these relationships. We also present novel results that expose deeper problems with the process of DFA model selection. This contribution is specifically motivated by a similar result for the class of partial and total recursive functions [Adriaans (2020)]: we analyse data sets that have multiple optimal models under MDL, each of which shares little to no mutual information with the others. These so-called “polysemantic” data sets are shown to exist abundantly in the context of the much weaker class of regular languages. Their existence poses severe difficulties for any algorithmic method of reasoning inductively from low-quality data samples; there seems to be no way to avoid mutually irreconcilable interpretations of certain samples. The focus on polysemy brings some interesting considerations about learning to the fore, which we relate back to the central concerns of the thesis.
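For orientation, the following LaTeX sketch states the standard two-part MDL criterion underlying the model-selection claims above; the notation (the model class \mathcal{M} and the code lengths L(M) and L(D \mid M)) is the conventional textbook formulation, assumed here for illustration rather than taken verbatim from the thesis.

% Two-part MDL model selection over a class of candidate models
% (standard formulation; the symbols below are illustrative assumptions).
\[
  M_{\mathrm{MDL}} \;=\; \arg\min_{M \in \mathcal{M}} \bigl( L(M) + L(D \mid M) \bigr)
\]
% L(M): bits needed to describe the model, e.g. a DFA consistent with the
%        signed sample D (accepting its positive and rejecting its negative examples);
% L(D | M): bits needed to describe the data with the help of the model.
% In the sense of the abstract, a sample D is “polysemantic” when several
% models attain this minimum while sharing little to no mutual information.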