Abstract: Visually indicated sounds

A Owens, P Isola, J H McDermott, A Torralba, E H Adelson and W T Freeman.

Published in the Conference on Computer Vision and Pattern Recognition, pp. 2405–2413, Jun 2016.

Materials make distinctive sounds when they are hit or scratched: dirt makes a thud; ceramic makes a clink. These sounds reveal aspects of an object's material properties, as well as the force and motion of the physical interaction. In this paper, we introduce an algorithm that learns to synthesize sound from videos of people hitting objects with a drumstick. The algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We demonstrate that the sounds generated by our model are realistic enough to fool participants in a "real or fake" psychophysical experiment, and that they convey significant information about the material properties in a scene.