small child. From there, this more-or-less constant sound source is filtered, so that only a subset of its many frequencies makes it through. For those who like visual analogies, imagine producing a perfect white light and then applying a filter, so that only part of the spectrum shines through. The vocal tract works on a similar 'source and filter' principle. The lips, the tip of the tongue, the tongue body, the velum (also known as the soft palate), and the glottis (the opening between the vocal folds) are known collectively as articulators. By varying their motions, these articulators shape the raw sound stream into what we know as speech: you vibrate your vocal cords when you say 'bah' but not 'pah'; you close your lips when you say 'mah' but move your tongue to your teeth when you say 'nah.'
Respiration, phonation, and articulation are not unique to humans. Since fish walked the land, virtually all vertebrates, from frogs to birds to mammals, have used vocally produced sound to communicate. Human evolution, however, depended on two key enhancements: the lowering of our larynx (not unique to humans but very rare elsewhere in the animal kingdom) and increased control of the ensemble of articulators that shape the sound of speech. Both have consequences.
Consider first the larynx. In most species, the larynx consists of a single long tube. At some point in evolution, our larynx dropped down. Moreover, as we changed posture and stood upright, it took a 90-degree turn, dividing into two tubes of more or less equal length, which endowed us with considerably more control of our vocalizations — and radically increased our risk of choking. As first noted by Darwin, 'Every particle of food and drink which we swallow has to pass over the orifice of the trachea, with some risk of falling into the lungs' — something we're all vulnerable to.*
Maybe you think the mildly increased risk of choking is a small price to pay, maybe you don't. It certainly didn't
In any event, the descended larynx was only half the battle. The real entrée into speech came from significantly increased control over our articulators. But here too the system is a bit of a kluge. For one thing, the vocal tract lacks the elegance of the iPod, which can play back more or less any sound equally well, from Moby's guitars and flutes to hip-hop's car crashes and gunshots. The vocal tract, in contrast, is tuned only to words. All the world's languages are drawn from an inventory of 90 sounds, and any particular language employs no more than half that number, an absurdly tiny subset when you think of the many distinct sounds the ear can recognize.
Imagine, for example, a human language that would refer to something by reproducing the sound it makes. I'd refer to my favorite canine, Ari, by reproducing his woof, not by calling him a dog. But the three-part contraption of respiration, phonation, and articulation can only do so much; even where languages allegedly refer to objects by their sounds (the phenomenon known as onomatopoeia) the 'sounds' we refer to sound like, well,
*According to a recent article in
Tongue-twisters emerge as a consequence of the complicated dance that the articulators perform. It's not enough to close our mouth or move our tongue in a basic set of movements; we have to coordinate each one in precisely timed ways. Two words can be made up of exactly the same physical motions performed in a slightly different sequence.
And that timer, which evolved long before language, is really good at only very simple rhythms: keeping things either exactly in phase (clapping) or exactly out of phase (alternating steps in walking, alternating strokes in swimming, and so forth). All that is fine for walking or running, but not if you need to perform an action with a more complex rhythm. Try, for example, to tap your right hand at twice the rate of your left. If you start out slow, this should be easy. But now gradually increase the tempo. Sooner or later you will find that the rhythm of your tapping will break down (the technical term is
Which returns us to tongue-twisters. Saying the words
The peculiar nature of our articulatory system and how it evolved leads to one more consequence: the relation between sound waves and phonemes (the smallest distinct speech sounds, such as
Why such a complex system? Here again, evolution is to blame; once it locked us into producing sounds by articulatory choreography, the only way to keep up the speed of communication was to cut corners. Rather than