Shape Bias at a Glance: Comparing Human and Machine Vision on Equal Terms
4 November 2021 – 15:00 GMT – Note unusual time
Katherine Hermann, Department of Psychology, Stanford University, USA
Recent work has highlighted a seemingly sharp divergence between human and machine vision: studies have argued that, whereas people exhibit a shape bias, preferring to classify objects according to their shape (Landau et al. 1988, Kucker et al. 2019), standard ImageNet-trained CNNs prefer to use texture (Geirhos et al. 2018). On the CNN side, how prevalent is this bias, and where does it come from? I will present evidence that, while both model architecture and training objective affect a model’s level of texture bias, the statistics of the training data are the most important factor, and that naturalistic data augmentation schemes can ameliorate texture bias and improve generalization to out-of-distribution images. On the human side, existing studies have tested people under conditions different from those faced by a feedforward CNN; does the human–machine divergence remain when testing conditions are more fairly aligned? In experiments using brief stimulus presentations, we find that people still rely on shape more than texture, although texture information plays a larger role than previously reported. Comparing this more moderate shape bias to that of models trained with naturalistic data augmentations suggests that human and machine vision may not be as mismatched as previously believed.
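For context: texture-versus-shape bias is typically measured on cue-conflict stimuli (e.g., a cat-shaped image rendered with elephant texture), as in Geirhos et al. 2018, and "naturalistic" augmentations generally mean appearance-level transformations such as color distortion and blur. The snippet below is a minimal illustrative sketch of such an augmentation pipeline in torchvision; the specific transforms and parameter values are my assumptions for illustration, not the exact recipe used in this work.

```python
# Illustrative sketch (assumed parameters, not the talk's exact recipe) of
# appearance-based "naturalistic" augmentations of the kind that can reduce
# a CNN's reliance on local texture cues during ImageNet-style training.
from torchvision import transforms

naturalistic_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.4, contrast=0.4,
                           saturation=0.4, hue=0.1),   # perturb color statistics
    transforms.RandomGrayscale(p=0.2),                 # occasionally drop color entirely
    transforms.GaussianBlur(kernel_size=9,
                            sigma=(0.1, 2.0)),         # degrade fine texture detail
    transforms.ToTensor(),                              # convert PIL image to tensor
])
```

Intuitively, color jitter and blur perturb or remove fine local appearance cues while leaving an object's global outline intact, so a model trained under such augmentation is pushed toward shape-based classification.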
Short Bio: Katherine Hermann is a PhD candidate and NSF Graduate Research Fellow at Stanford University (Psychology Department). She studies representations in neural models, including how inductive biases and training data interact to shape learning, and when and how model behaviors and representations differ from those of people.