Thursday 14 June 2018 at 13:00
Venue | 10 Priory Road, G2
Deep neural networks have proved themselves capable of solving a wide variety of tasks, but the representations learned by these networks remain largely unclear. In this talk, I will discuss recent work which aims to better elucidate the structure of these representations, including 1) how this structure relates to generalization, 2) the factors which lead different networks to learn similar representations, and 3) the role of single units in computation. By investigating image classification networks, I will present evidence that networks which generalize are less reliant on small sets of single directions in representational space, and that batch normalization, but not dropout, regularizes against this reliance. Using Canonical Correlation Analysis (CCA) to compare representations across disparate networks, I will also show that, across a variety of circumstances, networks which learn generalizable solutions converge to substantially more similar representations than those which memorize the training data. Finally, I will present evidence that, at least for image classification networks, the class selectivity of individual units appears largely unrelated to the importance of those units to network computation.