Low-Level Visual Psychophysics: A Bridge Between Psychologists and Deep Learning Engineers

1 June 2023 – 13:00 BST

Jesús Malo, University of Valencia, Spain

To those attending in person
Venue: Room 2D17, Staff Common Room, School of Psychological Science, Priory Road, Bristol

To those attending online | join via Zoom
Meeting ID: 932 9546 5286 | Passcode: 074795

Conventional training of artificial neural networks on benchmarks is not a good strategy for developing models of human vision. Specifically, Bowers et al. (BBS, 2022) review many failures of this strategy both in physiology (neural recordings) and in high-level behaviour (Gestalt, classification, and more). As a result, the reader may be left with an overly pessimistic view of the prospects for deep learning in this field, which can steer deep learning modellers away from human vision problems. Such a separation of the Machine Learning and Vision Science communities would eventually deprive visual neuroscience of valuable input, and would also be a problem for machine vision, which could benefit from psychology.

In this talk I will argue that low-level visual psychophysics (e.g. colour and texture discrimination, image quality…) can be a bridge between the two communities. On the one hand, I will show that pattern discrimination is enough to expose the limitations of the naive benchmark-only approach [1]: it suggests specific changes in architecture which are not obvious from conventional practice in deep learning [1,2], where deeper is not more human [3-5]. On the other hand, the low dimensionality of these low-level (image-patch) perception problems is also an advantage: I will show that the behaviour of artificial networks and image-probability models for patches is easier to compare with (and to benefit from!) classical early vision models [6]. In this way, low-level visual psychophysics can serve as common ground for positive interaction between the Machine Learning and Vision Science communities: it offers an independent way to evaluate artificial systems trained for other tasks, classical vision models can inspire modifications to network architectures, and non-Euclidean perceptual metrics can serve as effective regularisers (or priors) when there is not enough data to train machine learning models.
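The last point, using a perceptual metric as a regulariser, can be sketched in a few lines. This is a toy illustration only, not code from the talk or the cited papers: `perceptual_distance`, the decaying weight vector, and the mixing parameter `lam` are all hypothetical stand-ins for a proper perceptual distance.

```python
import numpy as np


def perceptual_distance(x, y, w):
    """Weighted (hence non-Euclidean) distance between two image patches:
    components with small w contribute less, loosely mimicking reduced
    visual sensitivity to those components."""
    d = x - y
    return float(np.sqrt(np.sum(w * d ** 2)))


def regularised_loss(pred, target, w, lam=0.1):
    """Ordinary MSE task loss plus a perceptual-distance penalty."""
    mse = float(np.mean((pred - target) ** 2))
    return mse + lam * perceptual_distance(pred, target, w)


# Illustrative use on a random flattened 8x8 patch, with a weight vector
# that decays with component index (a made-up sensitivity falloff).
rng = np.random.default_rng(0)
target = rng.standard_normal(64)
pred = target + 0.1 * rng.standard_normal(64)
w = 1.0 / (1.0 + np.arange(64))

print(regularised_loss(pred, target, w))    # small positive value
print(regularised_loss(target, target, w))  # zero: identical patches
```

In a data-poor regime, the penalty term plays the role of a prior: among predictions with similar task error, training is pushed towards those that are perceptually closer to the target.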


[1] M. Martínez, M. Bertalmío, J. Malo
In Praise of Artifice Reloaded: Caution With Natural Image Databases in Modelling Vision
Frontiers in Neuroscience, 13 (2019). https://doi.org/10.3389/fnins.2019.00008

[2] M. Bertalmío, A. Gómez-Villa, J. Malo et al.
Evidence for the intrinsically nonlinear nature of receptive fields in vision
Scientific Reports, 10:16277 (2020). https://doi.org/10.1038/s41598-020-73113-0

[3] A. Gómez-Villa, A. Martín, J. Vazquez, M. Bertalmío, and J. Malo.
Color illusions also deceive CNNs for low-level vision tasks: Analysis and implications.
Vision Research, 176:156–174 (2020). https://doi.org/10.1016/j.visres.2020.07.010

[4] Q. Li, A. Gómez-Villa, M. Bertalmío, and J. Malo.
Contrast sensitivity functions in autoencoders.
Journal of Vision, 22(6):8 (2022). https://doi.org/10.1167/jov.22.6.8

[5] P. Hernandez, J. Vila, V. Laparra and J. Malo
Human-like Perception in Deep Nets for Computer Vision
arXiv preprint (2023). https://isp.uv.es/docs/Deep_Image_Quality_arxiv_2302.13345.pdf

[6] A. Hepburn, V. Laparra, R. Santos-Rodriguez, J. Ballé, and J. Malo.
On the relation between statistical learning and perceptual distances.
Int. Conf. on Learning Representations, ICLR (2022). https://openreview.net/forum?id=zXM0b4hi5_B