Overview of perception as inference in a world model. A) Generative processes in the world produce sound, and these processes can be described with models of sound generation (acoustics). Perceptual systems could use an internal model of how causes generate signals (a "world model") to explain incoming data in terms of the causes that generated it. To generate signals from causes, this world model contains audio synthesizers. B) Example of inference in a visual world model (20). We perceive the image on the left as a set of solid blue cylinders lit from the left. This percept can be considered an explanation of the image in terms of interacting causal variables such as illumination and surface reflectance (shown in the middle). But there are many alternative causal explanations for this image, for instance that the cylinders are lit uniformly and have gradients painted onto their surfaces (right). Perception could avoid this alternative explanation because surfaces painted and illuminated in this way have low prior probability. C) High-level description of the proposed internal generative model of auditory scenes. Sources are sampled from a source prior distribution, and each source emits events that create sound. A source is thus an audio synthesizer. The model is expressed as a probabilistic program, which allows for scene descriptions that vary in dimensionality (the schematic here should not be mistaken for a graphical model, which would not allow for variable dimensionality).
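The structure described in panel C can be illustrated with a minimal sketch in plain Python. It is not the model proposed here: the source type (a pure tone), the priors, and the names `sample_source`, `sample_scene`, and `render` are all hypothetical choices made for illustration. The sketch only shows how a program that samples a random number of sources, each emitting a random number of events, yields scene descriptions whose dimensionality varies from sample to sample.

```python
import math
import random

SAMPLE_RATE = 8000    # Hz; arbitrary choice for this sketch
SCENE_DURATION = 1.0  # seconds; arbitrary choice for this sketch


def sample_source(rng):
    """Sample one hypothetical source: a pure tone emitting a random number of events."""
    # Log-uniform prior over frequency (an assumption, not the paper's source prior).
    freq = math.exp(rng.uniform(math.log(200.0), math.log(2000.0)))
    # At least one event; the count itself is random.
    n_events = 1 + int(rng.expovariate(1.0))
    events = []
    for _ in range(n_events):
        onset = rng.uniform(0.0, 0.9 * SCENE_DURATION)
        duration = rng.uniform(0.05, SCENE_DURATION - onset)
        events.append((onset, onset + duration))
    return {"freq": freq, "events": sorted(events)}


def sample_scene(rng):
    """Sample a scene: a variable-length list of sources drawn from the source prior."""
    n_sources = 1 + int(rng.expovariate(1.0))
    return [sample_source(rng) for _ in range(n_sources)]


def render(scene):
    """Synthesize a waveform by summing each source's events (overlaps simply add)."""
    n = int(SAMPLE_RATE * SCENE_DURATION)
    wave = [0.0] * n
    for source in scene:
        for onset, offset in source["events"]:
            start = int(onset * SAMPLE_RATE)
            stop = min(int(offset * SAMPLE_RATE), n)
            for i in range(start, stop):
                wave[i] += math.sin(2 * math.pi * source["freq"] * i / SAMPLE_RATE)
    return wave


rng = random.Random(0)
scene = sample_scene(rng)
wave = render(scene)
print(len(scene), "sources,", sum(len(s["events"]) for s in scene), "events")
```

Because the number of sources and events is itself sampled, two calls to `sample_scene` can return descriptions with different numbers of latent variables, which a fixed-structure graphical model could not express directly.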