Relative pitch representations and invariance to timbre

Malinda J. McPherson & Josh H. McDermott
(2023) Cognition, 232, 105327.

Paper Link


Information in speech and music is often conveyed through changes in fundamental frequency (f0), perceived by humans as "relative pitch". Relative pitch judgments are complicated by two facts. First, sounds can simultaneously vary in timbre due to filtering imposed by a vocal tract or instrument body. Second, relative pitch can be extracted in two ways: by measuring changes in constituent frequency components from one sound to another, or by estimating the f0 of each sound and comparing the estimates. We examined the effects of timbral differences on relative pitch judgments, and whether any invariance to timbre depends on whether judgments are based on constituent frequencies or their f0. Listeners performed up/down and interval discrimination tasks with pairs of spoken vowels, instrument notes, or synthetic tones, synthesized to be either harmonic or inharmonic. Inharmonic sounds lack a well-defined f0, such that relative pitch must be extracted from changes in individual frequencies. Pitch judgments were less accurate when vowels/instruments were different compared to when they were the same, and were biased by the associated timbre differences. However, this bias was similar for harmonic and inharmonic sounds, and was observed even in conditions where judgments of harmonic sounds were based on f0 representations. Relative pitch judgments are thus not invariant to timbre, even when timbral variation is naturalistic, and when such judgments are based on representations of f0.
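The distinction between harmonic and inharmonic sounds can be illustrated with a minimal Python sketch. It is not the synthesis code used in the paper: the function names, the jitter range of 0.3 × f0, and the choice to leave the lowest component unjittered are all assumptions made for illustration. The sketch shows that a pitch shift multiplies every component by the same ratio, so the direction of a shift can be read from individual frequencies even when the components are not integer multiples of a common f0.

```python
import random


def harmonic_frequencies(f0, n_partials):
    """Component frequencies of a harmonic tone: integer multiples of f0."""
    return [f0 * k for k in range(1, n_partials + 1)]


def inharmonic_frequencies(f0, n_partials, jitter=0.3, seed=0):
    """Illustrative inharmonic tone (assumed scheme, not the paper's parameters).

    Each component above the lowest is displaced by a random amount of up to
    +/- jitter * f0, so the components no longer share a common f0.
    """
    rng = random.Random(seed)
    freqs = [f0]  # lowest component left in place (an assumption)
    for k in range(2, n_partials + 1):
        freqs.append(f0 * (k + rng.uniform(-jitter, jitter)))
    return freqs


def shift_semitones(freqs, semitones):
    """Shift every component by the same ratio, 2 ** (semitones / 12)."""
    ratio = 2 ** (semitones / 12)
    return [f * ratio for f in freqs]


if __name__ == "__main__":
    first = inharmonic_frequencies(200.0, 5)
    second = shift_semitones(first, 1)  # one semitone up
    for a, b in zip(first, second):
        print(f"{a:8.2f} Hz -> {b:8.2f} Hz")
```

Reusing the same seed for both notes of a pair keeps the jitter pattern fixed, so the second note is a uniformly shifted copy of the first.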

All demos are best heard over headphones; please set the volume to low before listening.

Demo, Experiment 2: Same vs. Different Instrument Stimuli

Example trials for each instrument, demonstrating resynthesis.
Key results: For small pitch changes, harmonic and inharmonic stimuli are equally discriminable, even when comparing across same vs. different instruments. At a 9 semitone pitch change, harmonic stimuli were easier to discriminate than inharmonic stimuli.

Same Instruments; 1 Semitone Pitch Shift

Ukulele Pipe Organ Oboe Cello Baritone Sax
Harmonic (Key: Up, Down, Up, Up, Down)
Inharmonic (Key: Up, Up, Up, Down, Down)

Harmonic, Different Instruments (Key describes whether pitch shift was Up or Down, and whether the trial was 'Congruent' or 'Incongruent')
Cello/Ukulele (Key: Up, Incongruent) Pipe Organ/Ukulele (Key: Down, Congruent) Cello/Pipe Organ (Key: Up, Incongruent)
Inharmonic, Different Instruments (Key describes whether pitch shift was Up or Down, and whether the trial was 'Congruent' or 'Incongruent')
Cello/Ukulele (Key: Down, Congruent) Pipe Organ/Ukulele (Key: Down, Congruent) Cello/Pipe Organ (Key: Up, Incongruent)

Harmonic, 9 Semitone Pitch Shift
Baritone Sax Organ Ukulele/Cello Organ/Baritone Sax Ukulele/Baritone Sax

Inharmonic, 9 Semitone Pitch Shift
Baritone Sax Ukulele Oboe/Cello Cello/Baritone Sax Oboe/Ukulele

Demo, Experiment 3: Discrimination of synthetic tones with extreme spectral differences

Participants' pitch judgments are biased by changes in the spectral shape between notes.

The second note is always 1 semitone higher than the first.

Same Harmonics Congruent Incongruent

Demo, Experiment 5: Spectral invariance of musical interval perception

In this experiment, listeners heard two notes and were asked whether the notes matched the starting interval in "Happy Birthday", specifically, the two-semitone difference between the last syllable of "Happy" and the first syllable of "Birthday". The interval could be mistuned by ±0.5 or ±1 semitone.

Happy Birthday - 1 Semitone Happy Birthday -.5 Semitone Happy Birthday, Correct Interval Happy Birthday + .5 Semitone Happy Birthday + 1 Semitone
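As a rough illustration of the five interval conditions listed above, the following Python sketch converts each interval from semitones to a frequency ratio using the standard equal-tempered relation, ratio = 2^(semitones/12). The starting frequency of 200 Hz and the function names are hypothetical and are not taken from the experiment.

```python
CORRECT_INTERVAL = 2.0                    # semitones, as described above
MISTUNINGS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # semitones added to the correct interval


def interval_ratio(semitones):
    """Frequency ratio corresponding to an interval in semitones."""
    return 2 ** (semitones / 12)


def second_note_frequencies(first_hz):
    """Map each mistuning to the second note's frequency in Hz."""
    return {m: first_hz * interval_ratio(CORRECT_INTERVAL + m) for m in MISTUNINGS}


if __name__ == "__main__":
    # 200 Hz is an arbitrary example value for the first note.
    for mistuning, hz in second_note_frequencies(200.0).items():
        print(f"mistuning {mistuning:+.1f} st -> second note {hz:.2f} Hz")
```

The mistuned conditions therefore span intervals of 1 to 3 semitones around the correct 2-semitone interval.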

About the paper

Here is a link to the paper.

Email me with questions or requests for code.