A Data Format for Human Visual Object Recognition

Ken Nakayama, University of California, Berkeley, USA

Venue: Room 2D17, School of Psychological Science, Priory Road Complex

Neural network models are unusually successful at identifying and classifying objects. However, they can also make huge mistakes of a kind never seen in humans. They appear to rely less on object shape and more on texture. As such, they fail to provide a plausible account of human visual object recognition. Here I describe some work from very long ago indicating that surfaces, a viewer-centered intermediate level of representation, are what is likely to be missing (1,2,3). Pertinent to this, I propose two classes of object recognition tests to be given to present and future neural network models.

(1) Nakayama et al., 1989 Perception 18: 55-68.
(2) Nakayama et al., 1995 Visual surface representation: a critical link between lower-level and higher-level vision. In Kosslyn, S.M. and Osherson, D.N. Vision. In Invitation to Cognitive Science. M.I.T. Press, p. 1-70.
(3) Nakayama and Shimojo, 1992 Science 257: 1357-1363.