Wednesday 22 August 2007

minimal pair

Compared to other companies, McDonald's "I'm loving it" ad campaign is only tolerable, though I must say they are heading in the right direction when their ads endorse you wearing that second-hand shirt. To be thrifty is a "cool" thing, today anyway. The ad in particular that I wish to comment on (and which I have taken so long to get to) concerns that part where someone knocks on a girl's door. When it opens, a rapper immediately starts belting out several (impressive) lines in Spanish. The girl acts all confused, and her friends helping her move into her house stop, half out of curious shock. The boy next to the rapper then appends his own line, greeting his girlfriend. At first, it might seem that the rap didn't suit the girl's tastes. However, the girl says, "I asked you to bring me a wrap, not a rap!" The boyfriend smiles cheekily and reveals that he has that too. Cue McDonald's plugging their chicken mustard wraps or whatever, and happy youths plonked down on the couch eating.

The part that gets my attention linguistically is not the pun, but rather the different stresses on "wrap" and "rap". Under the traditional analysis of the English dialects, nearly all dialects treat the two as homophones, save true mavericks like Scots (and not just Scottish English!). In linguistic speak, it is generally said that for the vast majority of English dialects, the two words do not form a minimal pair: nothing distinguishes them phonemically. Compare "but" and "putt", which we can tell apart -- they form a minimal pair distinguishing /b/ and /p/ (specifically, regarding voicing/aspiration). In contrast, according to the mainstream analysis of the English dialects, there usually is no minimal pair for /wr/ and /r/.
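For anyone who likes to see the idea of a minimal pair spelled out, here is a small sketch in Python. The mini-lexicon and its broad transcriptions are my own assumptions for illustration (one symbol per segment, with "wrap" and "rap" transcribed identically, as the mainstream analysis would have it), and the function names are made up for this post.

```python
from itertools import combinations

# Hypothetical mini-lexicon: broad phonemic transcriptions, one symbol per segment.
LEXICON = {
    "but": ("b", "ʌ", "t"),
    "putt": ("p", "ʌ", "t"),
    "rap": ("r", "æ", "p"),
    "wrap": ("r", "æ", "p"),  # assumption: mainstream analysis, same as "rap"
}

def differing_positions(a, b):
    """Indices where two equal-length transcriptions differ; None if lengths differ."""
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

def classify(word1, word2, lexicon=LEXICON):
    """Label a pair of words as 'homophones', 'minimal pair' or 'neither'."""
    diff = differing_positions(lexicon[word1], lexicon[word2])
    if diff is None or len(diff) > 1:
        return "neither"
    if not diff:
        return "homophones"
    return "minimal pair"

def minimal_pairs(lexicon=LEXICON):
    """All word pairs in the lexicon that differ in exactly one segment."""
    return [(w1, w2) for w1, w2 in combinations(sorted(lexicon), 2)
            if classify(w1, w2, lexicon) == "minimal pair"]

if __name__ == "__main__":
    print(classify("but", "putt"))
    print(classify("rap", "wrap"))
    print(minimal_pairs())
```

If the commercial's girl really does use a different consonant in "wrap", the transcription of "wrap" would change in its first segment and the pair would move from "homophones" to "minimal pair".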

In the commercial, one can rule out suprasegmental stress, e.g. "I asked you to bring me A, not B," and stressing the A. If we had another example that went, "I asked you to bring me flour, not a flower!" the stress might even be placed on the second item, rather than the first. Furthermore, on second analysis, the girl does more than merely stress the word "wrap"; she seems to employ extra secondary articulation, if not a different consonant altogether.

This seems to imply that perhaps there is some distinction, even in the common dialects, considering that we can find this distinction in a McDonald's commercial. Yes, part of an ad campaign that McDonald's spends tens of millions of dollars on in order to get a rather simplistic observation of the youth demographic, while ironically seeming to support thriftiness. Anyway, the basic question to ask is, do the standard dialects (General American, Londoner, even "Singaporean Standard English" etc.) make a distinction, however fine, between /wr/ and /r/?

In Old English, the distinction was by lip rounding. For example, "right" and "write" would be distinguished by the fact that in the first word, the consonant /r/ would be pronounced with the lips relatively relaxed, while the /wr/ of "write" would be articulated with the lips tensed in a circle (rounded). This, however, does not seem to be the distinction made today. (This ignores the other distinction in Old English that would have been made between "right" and "write" -- the presence of the velar fricative in "right" [hence the H] and the absence of it in "write". But we're not talking about that, yo.)

Before I became interested in linguistics, I always thought there was some sort of fine distinction between "night" and "nite" (I later learned the distinction was more than subtle during Old English), "sign" and "sine", etc. The presence of silent "g" in words always made me tense my lips more -- a half-conscious strategy used to distinguish homophones while reading as a child. This however is artificial, as the distinction is inspired by writing, and usually is not noticeable soundwise in speech, save to the speaker making the distinction. "Wrap" and "rap" are perhaps the exception: a distinction inspired first by spelling that has perhaps since entered speech. Because today's /r/ tends to be rounded or labialised anyway, regardless of whether a /w/ precedes it, a distinction between /wr/ and /r/ is a subtle one to make. But that doesn't mean it isn't there. /wr/ can be distinguished from /r/ by rounding the lips even further. Distinguishing three levels of rounding is rare, but not impossible -- it occurs in Swedish, for example.

One thing to note while viewing the IPA chart of consonants is that native English speakers can choose from two different realisations of R. Even now I realise that I may articulate the word "realise" itself with a labialised consonant, but considerably less labialised than in the word "writing", for example. There's the alveolar approximant, and there's the retroflex approximant. The retroflex approximant supposedly occurs in some American English dialects only, but it is my suspicion that many English speakers, even non-American ones, may "push" their alveolar approximant R's back towards the retroflex position when they aRe tRying to stRess the R-ness of something. (The retroflex position is the area immediately behind the "alveolar ridge", which itself sits behind the gums and teeth, but in front of the palate.) You know the Beijing Mandarin speakers with their R's (shir arh) -- one of the distinctions, besides the centralisation of some of their front vowels, is the use of the retroflex R over the alveolar R that Singaporeans tend to use more often.

So my argument after all these paragraphs is this, and perhaps an interesting tidbit of a question for linguistic fanatics like me to look into: do native English speakers -- or at least a significant lot of them -- make a phonetic, if not phonemic, distinction between /wr/ and /r/? How is this articulated? My own suspicion is that it is a mix of both even further labialisation and the use of the retroflex approximant over the alveolar.

Don't try saying that I'm reading too much into a McDonald's commercial. This be linguistics we be talkin' bout here, 'yo.

(edited and reposted from my personal blog)

Friday 10 August 2007

it woyz only oy hopeless foyncy, it poyssed loyke oyn oygust doy

By themselves, /ɑ/ and /a/ (both open vowels, one in the back of the mouth and one in the front) don't sound that different. I mean, say "haha" but in the back of your mouth, as far back as possible. Doesn't sound much different, does it? Sure, people whose English dialects use the first vowel prominently sound like they have something in their mouth. Moy foyther oylways used to get up oyt foyr ay-emme.

No offence intended to the diverse English speakers out there: the only dialects I speak naturally are Singlish and rhotic New Englandic. I mean, from a relative point of view, people who use the "back-A" to replace phonemes where I would use /a/ and sometimes /ɔ/ (cot) do indeed sound like they have something in their mouth, preventing them from opening their mouth fully. Whereas in contrast, they may (subconsciously) view me as speaking lazily. And for the Southerners, who love to diphthongise what I would normally leave as a monophthong, it seems to me (as a perception I can't control) like they can't close their mouth enough, what with all those vowel glides!

These are the sorts of perceptions and prejudices people do not consciously exert, but they sort of cannot be ignored. As long as we don't really believe that Southern twangers speak with their mouths hanging open (or, for RP speakers, that my tongue can't move properly to make the appropriate distinctions), etc. etc., no harm done. And plus, it makes a fascinating area of study for psycholinguistics.

So anyway, compared to /u/ (sue) and /i/ (see), /e/ (say) and /o/ (tote), etc., /ɑ/ and /a/ don't seem that different. There is a noticeable difference, naturally. But compare (for English speakers) someone saying "I've got the flea" versus "I've got the flu": the difference is perceived immediately, far more so than when contrasting, say, papa said in the front of the mouth and in the back of the mouth.

Phonetics has some explanations for this. You might point out for example, that /i/ and /e/ are unrounded, while /u/ and /o/ are rounded. (As an explanation to the others, this means the lips are tensed to produce a circular shape; one could guess that our lips are flexible for the purpose of rounding vowels, in the same way chimps' lips are.)

But roundedness only accounts for some of it (and why this rounding distinction occurs in the first place is dealt with in a later point). For example, veux in French (rounded front mid-close vowel) contrasts with vaut (rounded BACK mid-close vowel, or just plain English /o/ [ohhhhh!]), and they sound very different, despite both being rounded. And the /y/ - /u/ distinction, a distinction that both Mandarin Chinese and French make, distinguishes a rounded close vowel in the front of the mouth from one in the back. For example, French tu sounds very different from French tout. (Last "t" is silent.) Ask any Frenchman! And if you know Mandarin, you might know the distinction between 努 (/nu/, or pinyin nǔ) and 女 (/ny/, or pinyin nǚ). They sound very distinct from each other, though they both have the same tone and are both rounded close vowels. What they differ in is their backness.

So what else accounts for it? If you see the IPA vowel chart, you can see there is a large distance between /u/ (loo) and /i/ (lee). You can fit central vowels of the same height in there, complete with a rounded/unrounded pair. But down below, the central vowel closest to the A's (both back and front) is one step higher in vowel height (the amount the tongue is raised by when pronouncing a vowel) compared to both of them. And there is no room for a rounded/unrounded distinction for that vowel. Making a distinction for features of vowel backness and roundedness gets more difficult and subtle the more open your vowels become.

This is something you can confirm for yourself: when your tongue is high and close to the roof, it can go a lot of places, front, back, wiggle side-to-side. When it is close to the mouth floor (the height where you would pronounce a "low" or an "open" vowel), tongue movement becomes more and more restricted.

This is probably why, for example, it seems more common for languages to contrast /i/ and /y/ (though English does not do it, German, Mandarin and French do), while I know of few languages that contrast /ɑ/ and /a/ as a minimal pair (in fact, I can't name any off the top of my head).

But there is something interesting (and I did not plan on taking so many paragraphs to come to this!). When you nasalise /a/ versus /ɑ/, suddenly, the pronunciation seems very different. In Parisian "street" French for example (and not the "metropolitan French" they teach as an academic standard), vin is pronounced as a nasalised /a/. (The vowel sounds like the one in han, as in hanyu pinyin, only don't let the tongue touch your teeth or the roof of your mouth while pronouncing the /n/.)

If you were to replace the nasalised /a/ with a nasalised /ɑ/ for example, you get vent (or "vant" as in vante, "boasts", if you discard the /t/ sound). Even if you do not know a scrap of French, it should sound drastically different. Nasalised /ɑ/ is the same vowel found in the imitation posh pronunciation of "lingerie" (which actually should use the nasalised /a/ if you are speaking street French or /ɛ/ [bed] if you want to speak "higher class" French).

The history of French nasals is interesting, and they have a tendency to go all over the place. After all, -en- and -an- are a merged pair in French, save for certain exceptions where I have heard native speakers pronounce Catalan, Verlan, etc. with a nasal /a/ rather than a nasal /ɑ/.

And after you factor in the general Romance sound changes from Vulgar Latin to French (abolere to abolir, etc.), you still have something interesting: why is "-in-" using a nasalised /a/ or /ɛ/, rather than something closer to a nasalised /i/?

Something that has piqued my interest for quite some time now is the concept of formants. Formants take the step forward from a general theory of phonemes and methods of articulation into sound physics. When you play music on a music player, such as with WinAmp, Windows Media Player, XMMS (or whatever proprietary, open source, etc. software you use), those bars bouncing up and down show how much energy the sound has in each band of frequencies, and formants are the peaks in that sort of picture. When you examine an mp3 or a PCM .wav file, you can see formants to a degree, though not as clearly as you would see them on a formant chart, like the pulsing beat that pops up at regular intervals of, say, a song like Black Eyed Peas' Pump It. The car player displaying the "musical bars" at the beginning of the video displays the frequency content of the song as it plays, sampled at the frequencies generally most pertinent to music. You can see for example that the first instrument (I am aware that the original musical idea came from Misirlou) raises only a few bars at the right, before it breaks into the special guitar playing that affects the rest of the bars. By viewing the "musical bars" of a song, you are seeing a sort of spectrogram (for the end-user) of the song's harmonic frequencies.
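To make the "musical bars" idea a bit more concrete, here is a rough sketch of what such a display computes. It builds a toy signal out of two sine waves at 300 Hz and 2300 Hz (numbers I picked to sit near the first two formants often quoted for /i/; this is not real speech), runs a naive Fourier transform over it, and adds the energy up into 500 Hz bands. The function names, sample rate and band width are all my own assumptions.

```python
import cmath
import math

def synth(freqs_amps, sample_rate=8000, n=400):
    """Sum of sine waves: a toy stand-in for a sound with energy at a few frequencies."""
    return [sum(a * math.sin(2 * math.pi * f * t / sample_rate) for f, a in freqs_amps)
            for t in range(n)]

def magnitude_spectrum(samples):
    """Naive discrete Fourier transform; magnitudes for frequency bins 0..n/2."""
    n = len(samples)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples)))
            for k in range(n // 2 + 1)]

def bars(samples, sample_rate=8000, band_hz=500):
    """Add spectrum magnitudes up into fixed-width bands, like a player's bar display."""
    n = len(samples)
    out = {}
    for k, mag in enumerate(magnitude_spectrum(samples)):
        freq = k * sample_rate / n               # centre frequency of bin k in Hz
        band = int(freq // band_hz) * band_hz    # lower edge of the band it falls in
        out[band] = out.get(band, 0.0) + mag
    return out

if __name__ == "__main__":
    signal = synth([(300, 1.0), (2300, 0.5)])
    for band, level in sorted(bars(signal).items()):
        print(f"{band:>4} Hz  {'#' * int(level // 10)}")
```

Only the bands that contain 300 Hz and 2300 Hz light up; on a real vowel the peaks are broader, but the picture is the same sort of thing.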

(The exact physics of a formant are covered in the Wikipedia article -- it pertains to resonant frequency -- and I will enjoy torturing Mr. Weirich with formants in AP Physics next year. Linguistics is such a brilliant marriage of the humanities and lab science.)

Each human language sound is a combination of formants -- the brain analyses parts of speech (and I literally mean "parts of speech" -- the sound information contained in each articulation, not the stuff they teach you as "verbs", "nouns" and so forth) and breaks them down into their appropriate formants. All the special features of speech can often be found to be raising or lowering specific formants. For example, /i/ really does seem to share something with /u/ -- they sound like they have a higher "pitch" in some sense. That is because although the other formants are different, they have similar formants that correspond to vowel height. /a/ after all sounds less energetic or "high-pitched" than /i/. And /y/ (as in 女), which is basically the same vowel as /i/ except with the lips rounded does sound a bit less "high-pitched" than /i/ because roundedness lowers some formants. There is a basic guideline for some of the more fundamental characteristics of sound production, but I'll be lazy and quote from wiki:

Most often the two first formants, f1 and f2, are enough to disambiguate the vowel. These two formants are primarily determined by the position of the tongue. f1 has a higher frequency when the tongue is lowered, and f2 has a higher frequency when the tongue is forward.
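As a sketch of what "enough to disambiguate the vowel" can mean in practice, the snippet below guesses a vowel from an (F1, F2) measurement by picking the nearest entry in a small table. The numbers are rough averages for adult male speakers of American English of the sort quoted in phonetics textbooks; treat them, and the function name, as illustrative assumptions rather than measurements of anyone in particular.

```python
import math

# Rough textbook-style average (F1, F2) values in Hz; illustrative assumptions only.
FORMANTS = {
    "i": (270, 2290),   # "see"
    "ɛ": (530, 1840),   # "bed"
    "æ": (660, 1720),   # "bat"
    "ɑ": (730, 1090),   # "father"
    "u": (300, 870),    # "sue"
}

def nearest_vowel(f1, f2, table=FORMANTS):
    """Return the vowel whose (F1, F2) point lies closest to the measurement."""
    return min(table, key=lambda v: math.dist((f1, f2), table[v]))

if __name__ == "__main__":
    print(nearest_vowel(280, 2200))
    print(nearest_vowel(320, 900))
    print(nearest_vowel(700, 1150))
```

The table also shows the two rules of thumb from the quote: the open vowel /ɑ/ has a much higher F1 than close /i/, and front /i/ has a much higher F2 than back /u/.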

As formants are part of wave physics, there is interference between formants, and not every feature corresponds cleanly to one formant (or just one set of formants). For example, a higher vowel height tends to push the F2 formant up too (even though the two vowels would still have the same backness), such that if you buy into a direct relationship it would seem that the vowel is more front than it really is. This seems a bit natural -- as your tongue gets higher, it distorts the feature of backness somewhat, since the volume of the space behind the tongue changes too. It is because of formants that one vowel can seem "higher" or "lower" than another.

Consider for example, "messed" versus "most". The English "short e" vowel (/ɛ/) seems "lower" because of its openness, and /o/ seems more "well-defined in pitch" in the most fundamental aspect (the first formant). And yet on another level /ɛ/ seems higher because /o/ sounds "deeper". This is because of the nature of their formants: /ɛ/ has a higher F1 value (it is the more open vowel), but it also has higher F2 and F3 values than the other vowel /o/, due to rounding and backness (or lack thereof: after compensating for the raising of F2 that vowel height brings). A formant with a lower value represents a resonance at a lower frequency, and that is why it seems more "fundamental" (hence why /o/ sounds "cleanly" high while /ɛ/ seems "messily high"). Another thing you may notice is that vowels next to /r/ sound "lower" on another scale -- this is because /r/ lowers some formants, generally F3, while nasal vowels often raise formants to a very high extent. Which brings us back to French phonological history.

/i/ is a good example of a vowel that is "high" because it seems to resonate so easily: it is one of the vowels with the highest F2 of all, as it is both a close (high) and a front vowel. And if you nasalise it, those are some really really really high formants! Ever thought about the piercing ability of the word "sheen"? It is very high, almost annoying if you say it a certain way, the way it cuts into your brain like a high-pitched note. And this is normal speech -- you are not even singing yet. (This in part explains how jingles like Mr Clean Mr Clean can be so catchy, since they are actually sung.) In English "sheen/clean", the /n/ nasalisation only occurs at the very end of the vowel ... but in French phonology, graphemes like -in (like in voisin) signify that the vowel is fully nasalised while the /n/ itself is omitted. I suspect this arrangement was quite unstable, and the vowel was so high that lowering it a bit while nasalised didn't appear to compromise too much (there were few minimal pairs in regard to nasal vowels) while making the vowel more aesthetic. So over the years it got lowered to /e/, then /ɛ/, then in street French /a/. Nasalisation also has the effect of appearing to "raise" vowel height by one level due to the entire effect of formant-raising. For example, many people seem to perceive "en avion" as "on aviohn", even though "en" uses a nasalised /ɑ/ rather than nasalised /ɔ/, and "avion" itself uses nasalised /ɔ/ rather than nasalised /o/.

And ultimately, the original question. The slight difference in /a/ and /ɑ/ becomes extremely magnified with nasalisation. Although formants are most often used for computer applications of linguistics (like making voice recognition programs -- now you have a rough idea of how they work, by analysing formants and identifying phonemes by formants' features), they allow people to scientifically quantify what would have otherwise been a subjective perception. After all, saying that /i/ sounds "higher" than /u/ is vague. High in what way? Formants allow us to quantify this perception.
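As a tiny example of that sort of quantifying, the sketch below measures a front-versus-back contrast as the gap in the second formant, once for a pair of close vowels and once for a pair of open vowels. The values are rough textbook-style averages for adult male speakers (assumptions, not my own measurements), /æ/ stands in for a front open vowel like /a/ (also an assumption), and the figures are for plain oral vowels, so they say nothing about nasalisation.

```python
# Rough textbook-style average (F1, F2) values in Hz; illustrative assumptions only.
# /æ/ is used as a stand-in for a front open vowel such as /a/.
FORMANTS = {
    "i": (270, 2290),
    "u": (300, 870),
    "æ": (660, 1720),
    "ɑ": (730, 1090),
}

def f2_gap(front, back, table=FORMANTS):
    """Front-back contrast measured as the difference in second formant (Hz)."""
    return table[front][1] - table[back][1]

close_gap = f2_gap("i", "u")   # gap between the two close vowels
open_gap = f2_gap("æ", "ɑ")    # gap between the two open vowels

if __name__ == "__main__":
    print("close vowels /i/-/u/:", close_gap, "Hz")
    print("open vowels  /æ/-/ɑ/:", open_gap, "Hz")
```

On these numbers the gap between the close vowels comes out more than twice as wide as the gap between the open ones, which fits the earlier point that backness distinctions get subtler the more open the vowel is.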