The human voice contains in its acoustic structure a wealth of information on the speaker's identity and emotional state which we perceive with remarkable ease and accuracy. Although the perception of speaker-related features of voice plays a major role in human communication, little is known about its neural basis. Here we show, using functional magnetic resonance imaging in human volunteers, that voice-selective regions can be found bilaterally along the upper bank of the superior temporal sulcus (STS). These regions showed greater neuronal activity when subjects listened passively to vocal sounds, whether speech or non-speech, than to non-vocal environmental sounds. Central STS regions also displayed a high degree of selectivity by responding significantly more to vocal sounds than to matched control stimuli, including scrambled voices and amplitude-modulated noise. Moreover, their response to stimuli degraded by frequency filtering paralleled the subjects' behavioural performance in voice-perception tasks that used these stimuli. The voice-selective areas in the STS may represent the counterpart of the face-selective areas in human visual cortex; their existence sheds new light on the functional architecture of the human auditory cortex.