In this writeup I'll discuss how the human brain perceives sound. Complementary writeups exist under the nodes ear, perception, and sound.
The perceived qualities of sound can be divided into three categories--loudness, pitch, and timbre. Loudness is the perception of sound* amplitude. Pitch is the perception of the fundamental frequency of the sound. The term timbre refers to the perception of all other aspects of sound: the qualities that allow us to distinguish among musical instruments or to describe sounds with such adjectives as brilliant or mellow.
* Sound is a wave phenomenon. See waves for more information.
Perception of loudness
Humans perceive the loudness of a pure tone** as a logarithmic function of sound amplitude. The most common measurement unit of loudness is the decibel (dB), which is defined as follows.
Loudness (in dB) = 20 log10 (P / 20μPa), where P is the sound pressure amplitude.
** A pure tone is a sound wave of a single frequency. Since musical instruments invariably produce higher-order harmonics called overtones, pure tones are typically created electronically.
The reference pressure of 20μPa is slightly above the minimum pressure that the ear can perceive. Therefore the loudness in dB is almost always positive. However, the ear can perceive smaller wave pressures at frequencies around 4kHz, so negative loudness in dB is meaningful. The minimum difference in loudness that the ear can perceive is about 1dB.
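To make the decibel formula concrete, here is a minimal Python sketch. The function name spl_db and the example pressures are my own choices for illustration; only the formula and the 20μPa reference come from the text above.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (20 μPa), from the formula above


def spl_db(pressure_pa):
    """Loudness in dB for a sound pressure amplitude given in pascals."""
    return 20 * math.log10(pressure_pa / P_REF)


# Example pressures are arbitrary, chosen only to show the behaviour.
print(spl_db(20e-6))  # pressure equal to the reference: 0 dB
print(spl_db(2.0))    # 100,000 times the reference: about 100 dB
print(spl_db(10e-6))  # half the reference: a negative value, about -6 dB
```

Note that every factor of 10 in pressure adds 20dB, and that any pressure below the reference gives a negative result.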
The perception of loudness is a strong function of the frequency of sound. Maximum perceived loudness occurs near 4kHz. Equal-loudness curves (with loudness in dB on the y-axis and frequency on the x-axis) have local minima at around 4 and 13kHz (resonant frequencies of the ear canal). At low loudness levels, these curves show huge frequency variation. The equal-loudness curve of 10 phons (a phon is defined as the loudness in dB perceived at 1kHz) has a minimum of about 3dB at 4kHz, but rises to 80dB at 20Hz and 28dB at 12kHz. In other words, a 20Hz signal needs a pressure amplitude roughly 7,000 times that of a 4kHz signal (a 77dB difference) for the ear to perceive the two as equally loud.
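Since the decibel scale is logarithmic, a gap in dB between two points on an equal-loudness curve corresponds to a ratio of pressures. The sketch below inverts the decibel formula to do that conversion; the function name pressure_ratio is hypothetical, and the 80dB and 3dB readings are the ones quoted above for the 10-phon curve.

```python
def pressure_ratio(db_difference):
    """Ratio of two pressure amplitudes that differ by db_difference dB."""
    return 10 ** (db_difference / 20)


# 10-phon curve: about 80 dB at 20 Hz versus about 3 dB at 4 kHz.
gap_db = 80 - 3
print(gap_db)                          # 77
print(round(pressure_ratio(gap_db)))   # roughly 7,000
print(pressure_ratio(20))              # a 20 dB gap is a tenfold pressure ratio
```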
However, the equal-loudness curves become much flatter at higher loudness levels. The 120-phon equal-loudness curve has a spread of only 20dB over the range of frequencies the ear can perceive. This is why music tends to sound better when played at high volume: music is typically recorded at high sound levels, so when it is played back at a low volume, the loudness balance among frequencies is lost.
Perception of pitch
The perception of pitch is a bit more complicated than the perception of loudness. Children can perceive the pitch of sound waves with frequencies ranging from 20Hz to 20kHz. As humans reach adulthood, the upper limit drops to about 16kHz. As humans mature to retirement age, the upper limit continues to drop gradually to 8kHz. This phenomenon is known as presbycusis.
Musical instruments (percussion is an exception) produce sound waves of a fundamental frequency (the first harmonic) and several higher-order harmonics. You can picture a guitar string, whose vibration is a superposition of several normal modes (standing waves that meet the requirement that the oscillations at the ends of the string must be zero since the ends are fixed). The harmonics are waves with frequencies that are integer multiples of the fundamental. Our brains nonetheless perceive a single pitch, that of the fundamental frequency. As an example, the A above middle C has a fundamental frequency of 440Hz, but an instrument playing it produces waves of 440Hz, 880Hz (the second harmonic), 1320Hz (the third harmonic), etc.
An interesting fact is that even when the fundamental frequency is filtered out of the sound produced by an instrument, we still perceive only the fundamental frequency! It seems that our brains perceive the greatest common factor of the frequencies present. There are theories as to why this is, but they are complicated and unproven so I will not present them.
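The "greatest common factor" idea can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a model of what the brain actually does; the function names are my own, and I assume whole-number frequencies in Hz so that an integer gcd applies.

```python
from functools import reduce
from math import gcd


def harmonics(fundamental_hz, count):
    """First `count` harmonics (integer multiples) of a fundamental."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def implied_fundamental(frequencies_hz):
    """Greatest common factor of a list of integer frequencies in Hz."""
    return reduce(gcd, frequencies_hz)


full = harmonics(440, 4)      # [440, 880, 1320, 1760]
filtered = full[1:]           # fundamental removed: [880, 1320, 1760]
print(implied_fundamental(filtered))  # 440
```

Even with the 440Hz component removed, the remaining harmonics still share 440 as their greatest common factor, which matches the pitch we report hearing.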
It is interesting to consider how our brain perceives two simultaneous sound waves, since this would help explain our notion of harmony. Two sound waves with identical frequencies sound the same as one (though louder). As the difference between the frequencies is increased a bit, we start to hear beats--we hear a single pitch that repeatedly fades in and out.
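Beats fall out of a standard trigonometric identity: the sum of two equal-amplitude sine waves equals a sine at the average frequency multiplied by a slowly varying envelope, and the loudness swells and fades at a rate equal to the difference of the two frequencies. Here is a minimal sketch that checks this numerically. The function names and the 440Hz/444Hz example are my own choices, and equal amplitudes are assumed.

```python
import math


def beat_frequency(f1_hz, f2_hz):
    """Number of loudness swells per second."""
    return abs(f1_hz - f2_hz)


def average_frequency(f1_hz, f2_hz):
    """Frequency of the single tone heard under the beating envelope."""
    return (f1_hz + f2_hz) / 2


def two_tones(f1_hz, f2_hz, t):
    """Two equal-amplitude pure tones added together at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)


def beating_tone(f1_hz, f2_hz, t):
    """The same signal written as slow envelope times fast carrier."""
    envelope = 2 * math.cos(math.pi * (f1_hz - f2_hz) * t)
    carrier = math.sin(2 * math.pi * average_frequency(f1_hz, f2_hz) * t)
    return envelope * carrier


print(beat_frequency(440, 444))     # 4 beats per second
print(average_frequency(440, 444))  # 442.0
```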
As the difference between the frequencies of the two simultaneous sound waves rises above 15Hz, we no longer hear a single tone with beats. Instead we hear discordant roughness. When the difference between the frequencies reaches a level known as the critical bandwidth, we hear smooth, separate tones. It turns out that the critical bandwidth is close to a linear function of the center frequency (the average frequency of the two waves). The critical bandwidth is always 2 semitones or larger, which explains why adjacent notes (on the Western musical scale) played simultaneously sound dissonant. The critical bandwidth vs. frequency curve is not perfectly linear, however: it is somewhat parabolic, with a concave-up inflection point at about 2kHz. At a center frequency of 100Hz, the critical bandwidth is 7 semitones! This helps explain why intervals and chords at very low or very high frequencies sound unpleasant.
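Because the critical bandwidth is quoted in semitones, it helps to be able to convert between frequency ratios and semitones. The sketch below assumes twelve-tone equal temperament, where each semitone is a frequency ratio of 2^(1/12); the function names are hypothetical.

```python
import math


def semitones_between(f1_hz, f2_hz):
    """Interval from f1 up to f2, in equal-tempered semitones."""
    return 12 * math.log2(f2_hz / f1_hz)


def shift_by_semitones(f_hz, semitones):
    """Frequency reached by moving `semitones` up (or down, if negative)."""
    return f_hz * 2 ** (semitones / 12)


print(semitones_between(440, 880))  # an octave: 12 semitones
print(shift_by_semitones(440, 1))   # the note adjacent to 440 Hz, about 466 Hz
```

Two adjacent notes are 1 semitone apart, which is inside the minimum critical bandwidth of 2 semitones quoted above.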
One might wonder why two-note intervals are perceived the way they are. This is quite a complicated question, but perhaps the beginning of an explanation is the following. The way a combination of two notes is perceived depends on the consonance or dissonance of the interval formed by each pair of their harmonics. For example, a minor third has more dissonant higher-order harmonics than a major third. I think most of us would agree that a major third is more pleasant than a minor third. Much additional psychoacoustic research will be necessary for us to have a better understanding of harmony. Scientific explanation of our perception of melodies is an even more challenging and unexplored area of research!
Perception of timbre
As mentioned above, every musical instrument produces a fundamental frequency (first harmonic) and higher harmonics. However, the way energy is distributed among the harmonics is unique to each instrument. The presence of high-order harmonics (those above roughly the fifth to seventh) makes instruments sound brilliant and shrill. Instruments such as the violin, pipe organ, trumpet, and tenor saxophone produce a great deal of energy in high harmonics. Instruments such as the tuba, clarinet, oboe, and flute sound more mellow since they do not produce much energy in high-order harmonics.
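One rough way to put a number on "brilliant" versus "mellow" is the fraction of a tone's energy that lies in its high-order harmonics. The sketch below is only an illustration: the function name is my own, the cutoff of 6 is an assumption picked from the middle of the 5-7 range, and the two amplitude lists are made-up spectra, not measurements of any real instrument. It relies on the fact that the energy of a wave is proportional to the square of its amplitude.

```python
def high_harmonic_energy_fraction(amplitudes, cutoff=6):
    """Fraction of total energy in harmonics numbered above `cutoff`.

    amplitudes[0] is the first harmonic (the fundamental), amplitudes[1]
    the second harmonic, and so on. Energy is taken as amplitude squared.
    """
    energies = [a * a for a in amplitudes]
    total = sum(energies)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(energies[cutoff:]) / total


# Hypothetical spectra, invented for illustration only.
bright = [1.0, 0.8, 0.7, 0.6, 0.6, 0.5, 0.5, 0.4, 0.4, 0.3]
mellow = [1.0, 0.5, 0.25, 0.1, 0.05, 0.02, 0.01, 0.0, 0.0, 0.0]

print(high_harmonic_energy_fraction(bright))
print(high_harmonic_energy_fraction(mellow))
```

The first spectrum keeps a sizeable share of its energy above the sixth harmonic, while the second keeps almost none there.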
Immediately after a musician begins a note, the instrument goes through an onset phase that lasts from tens to hundreds of milliseconds. During the onset phase, an instrument begins producing each harmonic in a characteristic chronological pattern. For example, pipe organs produce the second harmonic approximately 30ms before the first and third harmonics. Almost all of our ability to distinguish among instruments is due to the differences in the onset phases! Studies have shown that we have great difficulty distinguishing among instruments if we begin listening to their outputs after the onset phase.
Acoustics and Psychoacoustics by David Howard and James Angus is a wonderful reference on this topic.