The Model 2.0:
An Anatomically-Inspired Model of the Primate Ventral Stream
The log-polar mapping, when used as input to a convolutional neural network (CNN), provides two kinds of invariances. Scale is just a left-right shift in this representation (see images of Geoff Hinton (top row) and their log-polar representation (bottom row)). Similarly, rotation in the image plane is an up-down shift. Because CNNs are (somewhat) translation invariant, the network as a whole becomes scale and rotation invariant. However, translation invariance is lost. We make up for this by sampling from the image at multiple points, just as humans use multiple fixations to recognize a face (Hsiao & Cottrell, 2008). I end by explaining the puzzle of why a network that is rotation invariant shows a face inversion effect.
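The shift property described in the abstract can be checked numerically on a single point. The sketch below is illustrative only and is not taken from the model itself: the helper names (`to_log_polar`, `scale`, `rotate`, `angle_diff`) and the sample point are assumptions, and coordinates are taken relative to a hypothetical fixation point at the origin. Scaling moves the point only along the log-radius axis (the left-right direction in the abstract's description), and rotation moves it only along the angle axis (the up-down direction).

```python
import math

def to_log_polar(x, y):
    """Map a Cartesian point (relative to the fixation point) to (log r, theta)."""
    r = math.hypot(x, y)
    if r == 0.0:
        raise ValueError("the fixation point itself has no log-polar image")
    return math.log(r), math.atan2(y, x)

def scale(x, y, s):
    """Scale the point about the origin by factor s."""
    return s * x, s * y

def rotate(x, y, phi):
    """Rotate the point about the origin by phi radians (counterclockwise)."""
    c, s = math.cos(phi), math.sin(phi)
    return c * x - s * y, s * x + c * y

def angle_diff(a, b):
    """Difference a - b wrapped to the interval (-pi, pi]."""
    d = (a - b) % (2 * math.pi)
    return d - 2 * math.pi if d > math.pi else d

# Hypothetical sample point, chosen only for illustration.
point = (3.0, 4.0)
rho0, theta0 = to_log_polar(*point)
rho_s, theta_s = to_log_polar(*scale(*point, 2.0))
rho_r, theta_r = to_log_polar(*rotate(*point, math.pi / 6))

# Shift produced by scaling: (log 2, 0) up to rounding error.
scale_shift = (rho_s - rho0, angle_diff(theta_s, theta0))
# Shift produced by rotation: (0, pi/6) up to rounding error.
rotation_shift = (rho_r - rho0, angle_diff(theta_r, theta0))

print("scaling by 2     ->", scale_shift)
print("rotating by pi/6 ->", rotation_shift)
```

Because every point in the image is shifted by the same amount under a global scaling or rotation, the whole log-polar image translates rigidly, which is the kind of change a CNN tolerates reasonably well. The angle axis wraps around at ±π, so a practical implementation would need to handle that boundary (for example with circular padding), which this sketch does not attempt.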
Short Bio: Garrison W. (Gary) Cottrell is a Professor of Computer Science and Engineering and the Director of the Interdisciplinary Ph.D. Program in Cognitive Science at UC San Diego. He was a founding PI of the Perceptual Expertise Network, and directed the Temporal Dynamics of Learning Center, an NSF-sponsored Science of Learning Center comprising 40 PIs at 18 institutions in 4 countries. Professor Cottrell’s research is strongly interdisciplinary. His main interest is Cognitive Science and Computational Cognitive Neuroscience. He focuses on building working models of cognitive processes, and using them to explain psychological, developmental or neurological processes. In recent years, he has focused on anatomically-inspired deep learning models of the visual system. He has also worked on unsupervised feature learning (modeling precortical and cortical coding), face & object processing, visual salience, and visual attention. His other interest is applying AI to problems in other areas of science or engineering. Most recently he has been using deep learning to elucidate the structure of small (natural product) molecules from their NMR spectra in collaboration with Bill Gerwick at the Scripps Institution of Oceanography. He received his PhD in 1985 from the University of Rochester under James F. Allen (thesis title: A connectionist approach to word sense disambiguation). He then did a postdoc with David E. Rumelhart at the Institute of Cognitive Science at UCSD until 1987, when he joined the CSE Department.