Language: An Introduction to the Study of Speech

Chapter 5

3,816 words. Public domain.

[Footnote 13: By "quality" is here meant the inherent nature and resonance of the sound as such. The general "quality" of the individual's voice is another matter altogether. This is chiefly determined by the individual anatomical characteristics of the larynx and is of no linguistic interest whatever.]

The lungs and bronchial tubes are organs of speech only in so far as they supply and conduct the current of outgoing air without which audible articulation is impossible. They are not responsible for any specific sound or acoustic feature of sounds except, possibly, accent or stress. It may be that differences of stress are due to slight differences in the contracting force of the lung muscles, but even this influence of the lungs is denied by some students, who explain the fluctuations of stress that do so much to color speech by reference to the more delicate activity of the glottal cords. These glottal cords are two small, nearly horizontal, and highly sensitive membranes within the larynx, which consists, for the most part, of two large and several smaller cartilages and of a number of small muscles that control the action of the cords.

The cords, which are attached to the cartilages, are to the human speech organs what the two vibrating reeds are to a clarinet or the strings to a violin. They are capable of at least three distinct types of movement, each of which is of the greatest importance for speech. They may be drawn towards or away from each other, they may vibrate like reeds or strings, and they may become lax or tense in the direction of their length. The last class of these movements allows the cords to vibrate at different "lengths" or degrees of tenseness and is responsible for the variations in pitch which are present not only in song but in the more elusive modulations of ordinary speech. The two other types of glottal action determine the nature of the voice, "voice" being a convenient term for breath as utilized in speech. If the cords are well apart, allowing the breath to escape in unmodified form, we have the condition technically known as "voicelessness." All sounds produced under these circumstances are "voiceless" sounds. Such are the simple, unmodified breath as it passes into the mouth, which is, at least approximately, the same as the sound that we write _h_, also a large number of special articulations in the mouth chamber, like _p_ and _s_. On the other hand, the glottal cords may be brought tight together, without vibrating. When this happens, the current of breath is checked for the time being. The slight choke or "arrested cough" that is thus made audible is not recognized in English as a definite sound but occurs nevertheless not infrequently.[14] This momentary check, technically known as a "glottal stop," is an integral element of speech in many languages, as Danish, Lettish, certain Chinese dialects, and nearly all American Indian languages. Between the two extremes of voicelessness, that of completely open breath and that of checked breath, lies the position of true voice. 
In this position the cords are close together, but not so tightly as to prevent the air from streaming through; the cords are set vibrating and a musical tone of varying pitch results. A tone so produced is known as a "voiced sound." It may have an indefinite number of qualities according to the precise position of the upper organs of speech. Our vowels, nasals (such as _m_ and _n_), and such sounds as _b_, _z_, and _l_ are all voiced sounds. The most convenient test of a voiced sound is the possibility of pronouncing it on any given pitch, in other words, of singing on it.[15] The voiced sounds are the most clearly audible elements of speech. As such they are the carriers of practically all significant differences in stress, pitch, and syllabification. The voiceless sounds are articulated noises that break up the stream of voice with fleeting moments of silence. Acoustically intermediate between the freely unvoiced and the voiced sounds are a number of other characteristic types of voicing, such as murmuring and whisper.[16] These and still other types of voice are relatively unimportant in English and most other European languages, but there are languages in which they rise to some prominence in the normal flow of speech.

[Footnote 14: As at the end of the snappily pronounced _no!_ (sometimes written _nope!_) or in the over-carefully pronounced _at all_, where one may hear a slight check between the _t_ and the _a_.]

[Footnote 15: "Singing" is here used in a wide sense. One cannot sing continuously on such a sound as _b_ or _d_, but one may easily outline a tune on a series of _b_'s or _d_'s in the manner of the plucked "pizzicato" on stringed instruments. A series of tones executed on continuant consonants, like _m_, _z_, or _l_, gives the effect of humming, droning, or buzzing. The sound of "humming," indeed, is nothing but a continuous voiced nasal, held on one pitch or varying in pitch, as desired.]

[Footnote 16: The whisper of ordinary speech is a combination of unvoiced sounds and "whispered" sounds, as the term is understood in phonetics.]

The nose is not an active organ of speech, but it is highly important as a resonance chamber. It may be disconnected from the mouth, which is the other great resonance chamber, by the lifting of the movable part of the soft palate so as to shut off the passage of the breath into the nasal cavity; or, if the soft palate is allowed to hang down freely and unobstructively, so that the breath passes into both the nose and the mouth, these make a combined resonance chamber. Such sounds as _b_ and _a_ (as in _father_) are voiced "oral" sounds, that is, the voiced breath does not receive a nasal resonance. As soon as the soft palate is lowered, however, and the nose added as a participating resonance chamber, the sounds _b_ and _a_ take on a peculiar "nasal" quality and become, respectively, _m_ and the nasalized vowel written _an_ in French (e.g., _sang_, _tant_). The only English sounds[17] that normally receive a nasal resonance are _m_, _n_, and the _ng_ sound of _sing_. Practically all sounds, however, may be nasalized, not only the vowels--nasalized vowels are common in all parts of the world--but such sounds as _l_ or _z_. Voiceless nasals are perfectly possible. They occur, for instance, in Welsh and in quite a number of American Indian languages.

[Footnote 17: Aside from the involuntary nasalizing of all voiced sounds in the speech of those that talk with a "nasal twang."]

The organs that make up the oral resonance chamber may articulate in two ways. The breath, voiced or unvoiced, nasalized or unnasalized, may be allowed to pass through the mouth without being checked or impeded at any point; or it may be either momentarily checked or allowed to stream through a greatly narrowed passage with resulting air friction. There are also transitions between the two latter types of articulation. The unimpeded breath takes on a particular color or quality in accordance with the varying shape of the oral resonance chamber. This shape is chiefly determined by the position of the movable parts--the tongue and the lips. As the tongue is raised or lowered, retracted or brought forward, held tense or lax, and as the lips are pursed ("rounded") in varying degree or allowed to keep their position of rest, a large number of distinct qualities result. These oral qualities are the vowels. In theory their number is infinite, in practice the ear can differentiate only a limited, yet a surprisingly large, number of resonance positions. Vowels, whether nasalized or not, are normally voiced sounds; in not a few languages, however, "voiceless vowels"[18] also occur.

[Footnote 18: These may be also defined as free unvoiced breath with varying vocalic timbres. In the long Paiute word quoted on page 31 the first _u_ and the final _ü_ are pronounced without voice.]

The remaining oral sounds are generally grouped together as "consonants." In them the stream of breath is interfered with in some way, so that a lesser resonance results, and a sharper, more incisive quality of tone. There are four main types of articulation generally recognized within the consonantal group of sounds. The breath may be completely stopped for a moment at some definite point in the oral cavity. Sounds so produced, like _t_ or _d_ or _p_, are known as "stops" or "explosives."[19] Or the breath may be continuously obstructed through a narrow passage, not entirely checked. Examples of such "spirants" or "fricatives," as they are called, are _s_ and _z_ and _y_. The third class of consonants, the "laterals," are semi-stopped. There is a true stoppage at the central point of articulation, but the breath is allowed to escape through the two side passages or through one of them. Our English _d_, for instance, may be readily transformed into _l_, which has the voicing and the position of _d_, merely by depressing the sides of the tongue on either side of the point of contact sufficiently to allow the breath to come through. Laterals are possible in many distinct positions. They may be unvoiced (the Welsh _ll_ is an example) as well as voiced. Finally, the stoppage of the breath may be rapidly intermittent; in other words, the active organ of contact--generally the point of the tongue, less often the uvula[20]--may be made to vibrate against or near the point of contact. These sounds are the "trills" or "rolled consonants," of which the normal English _r_ is a none too typical example. They are well developed in many languages, however, generally in voiced form, sometimes, as in Welsh and Paiute, in unvoiced form as well.

[Footnote 19: Nasalized stops, say _m_ or _n_, can naturally not be truly "stopped," as there is no way of checking the stream of breath in the nose by a definite articulation.]

[Footnote 20: The lips also may theoretically so articulate. "Labial trills," however, are certainly rare in natural speech.]

The oral manner of articulation is naturally not sufficient to define a consonant. The place of articulation must also be considered. Contacts may be formed at a large number of points, from the root of the tongue to the lips. It is not necessary here to go at length into this somewhat complicated matter. The contact is either between the root of the tongue and the throat,[21] some part of the tongue and a point on the palate (as in _k_ or _ch_ or _l_), some part of the tongue and the teeth (as in the English _th_ of _thick_ and _then_), the teeth and one of the lips (practically always the upper teeth and lower lip, as in _f_), or the two lips (as in _p_ or English _w_). The tongue articulations are the most complicated of all, as the mobility of the tongue allows various points on its surface, say the tip, to articulate against a number of opposed points of contact. Hence arise many positions of articulation that we are not familiar with, such as the typical "dental" position of Russian or Italian _t_ and _d_; or the "cerebral" position of Sanskrit and other languages of India, in which the tip of the tongue articulates against the hard palate. As there is no break at any point between the rims of the teeth back to the uvula nor from the tip of the tongue back to its root, it is evident that all the articulations that involve the tongue form a continuous organic (and acoustic) series. The positions grade into each other, but each language selects a limited number of clearly defined positions as characteristic of its consonantal system, ignoring transitional or extreme positions. Frequently a language allows a certain latitude in the fixing of the required position. This is true, for instance, of the English _k_ sound, which is articulated much further to the front in a word like _kin_ than in _cool_. We ignore this difference, psychologically, as a non-essential, mechanical one. 
Another language might well recognize the difference, or only a slightly greater one, as significant, as paralleling the distinction in position between the _k_ of _kin_ and the _t_ of _tin_.

[Footnote 21: This position, known as "faucal," is not common.]

The organic classification of speech sounds is a simple matter after what we have learned of their production. Any such sound may be put into its proper place by the appropriate answer to four main questions:--What is the position of the glottal cords during its articulation? Does the breath pass into the mouth alone or is it also allowed to stream into the nose? Does the breath pass freely through the mouth or is it impeded at some point and, if so, in what manner? What are the precise points of articulation in the mouth?[22] This fourfold classification of sounds, worked out in all its detailed ramifications,[23] is sufficient to account for all, or practically all, the sounds of language.[24]
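The four questions can be read as four fields of a record. The following is a minimal sketch in Python, not part of the original text: the class name `Sound`, the field names, and the label strings are illustrative assumptions, and the entries cover only consonants whose articulation is described in this chapter.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Sound:
    """One answer to each of the four questions (field names are hypothetical)."""
    glottis: str   # position of the glottal cords: "voiced" or "voiceless"
    nasal: bool    # does the breath also stream into the nose?
    manner: str    # how the breath is impeded: "stop", "spirant", "lateral", ...
    place: str     # point of articulation in the mouth


# A tiny illustrative inventory; the labels are assumptions made for this sketch.
INVENTORY = {
    "p": Sound("voiceless", False, "stop", "two lips"),
    "b": Sound("voiced", False, "stop", "two lips"),
    "m": Sound("voiced", True, "stop", "two lips"),
    "d": Sound("voiced", False, "stop", "tongue and palate"),
    "l": Sound("voiced", False, "lateral", "tongue and palate"),
    "f": Sound("voiceless", False, "spirant", "teeth and lip"),
}


def differing_answers(a: str, b: str) -> List[str]:
    """Return the names of the questions on which two sounds are answered differently."""
    x, y = INVENTORY[a], INVENTORY[b]
    return [f.name for f in fields(Sound) if getattr(x, f.name) != getattr(y, f.name)]
```

Under these assumed labels, `differing_answers("b", "m")` isolates the nasal question alone, and `differing_answers("d", "l")` isolates the manner of articulation, which matches the description of _l_ as having the voicing and the position of _d_.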

[Footnote 22: "Points of articulation" must be understood to include tongue and lip positions of the vowels.]

[Footnote 23: Including, under the fourth category, a number of special resonance adjustments that we have not been able to take up specifically.]

[Footnote 24: In so far, it should be added, as these sounds are expiratory, i.e., pronounced with the outgoing breath. Certain languages, like the South African Hottentot and Bushman, have also a number of inspiratory sounds, pronounced by sucking in the breath at various points of oral contact. These are the so-called "clicks."]

The phonetic habits of a given language are not exhaustively defined by stating that it makes use of such and such particular sounds out of the all but endless gamut that we have briefly surveyed. There remains the important question of the dynamics of these phonetic elements. Two languages may, theoretically, be built up of precisely the same series of consonants and vowels and yet produce utterly different acoustic effects. One of them may not recognize striking variations in the lengths or "quantities" of the phonetic elements, the other may note such variations most punctiliously (in probably the majority of languages long and short vowels are distinguished; in many, as in Italian or Swedish or Ojibwa, long consonants are recognized as distinct from short ones). Or the one, say English, may be very sensitive to relative stresses, while in the other, say French, stress is a very minor consideration. Or, again, the pitch differences which are inseparable from the actual practice of language may not affect the word as such, but, as in English, may be a more or less random or, at best, but a rhetorical phenomenon, while in other languages, as in Swedish, Lithuanian, Chinese, Siamese, and the majority of African languages, they may be more finely graduated and felt as integral characteristics of the words themselves. Varying methods of syllabifying are also responsible for noteworthy acoustic differences. Most important of all, perhaps, are the very different possibilities of combining the phonetic elements. Each language has its peculiarities. The _ts_ combination, for instance, is found in both English and German, but in English it can only occur at the end of a word (as in _hats_), while it occurs freely in German as the psychological equivalent of a single sound (as in _Zeit_, _Katze_). Some languages allow of great heapings of consonants or of vocalic groups (diphthongs), in others no two consonants or no two vowels may ever come together. 
Frequently a sound occurs only in a special position or under special phonetic circumstances. In English, for instance, the _z_-sound of _azure_ cannot occur initially, while the peculiar quality of the _t_ of _sting_ is dependent on its being preceded by the _s_. These dynamic factors, in their totality, are as important for the proper understanding of the phonetic genius of a language as the sound system itself, often far more so.

We have already seen, in an incidental way, that phonetic elements or such dynamic features as quantity and stress have varying psychological "values." The English _ts_ of _hats_ is merely a _t_ followed by a functionally independent _s_; the _ts_ of the German word _Zeit_ has an integral value equivalent, say, to the _t_ of the English word _tide_. Again, the _t_ of _time_ is indeed noticeably distinct from that of _sting_, but the difference, to the consciousness of an English-speaking person, is quite irrelevant. It has no "value." If we compare the _t_-sounds of Haida, the Indian language spoken in the Queen Charlotte Islands, we find that precisely the same difference of articulation has a real value. In such a word as _sting_ "two," the _t_ is pronounced precisely as in English, but in _sta_ "from" the _t_ is clearly "aspirated," like that of _time_. In other words, an objective difference that is irrelevant in English is of functional value in Haida; from its own psychological standpoint the _t_ of _sting_ is as different from that of _sta_ as, from our standpoint, is the _t_ of _time_ from the _d_ of _divine_. Further investigation would yield the interesting result that the Haida ear finds the difference between the English _t_ of _sting_ and the _d_ of _divine_ as irrelevant as the naïve English ear finds that of the _t_-sounds of _sting_ and _time_. The objective comparison of sounds in two or more languages is, then, of no psychological or historical significance unless these sounds are first "weighted," unless their phonetic "values" are determined. These values, in turn, flow from the general behavior and functioning of the sounds in actual speech.
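The "weighting" of sounds described above can be pictured as a lookup from objective sounds to the values a language assigns them. The sketch below is an illustration only, covering just the three sounds discussed in this paragraph: the sound labels (`t_plain`, `t_aspirated`, `d`), the value labels, and the function name are all hypothetical.

```python
# Each language maps the same three objective sounds onto its own "values".
# Sounds that share a value label are felt as the same by speakers of that language.
VALUES = {
    "English": {
        "t_plain": "T",       # the t of "sting"
        "t_aspirated": "T",   # the t of "time": objectively distinct, same value
        "d": "D",             # the d of "divine"
    },
    "Haida": {
        "t_plain": "T",       # the t of sting "two"
        "t_aspirated": "Th",  # the t of sta "from": a distinct value
        "d": "T",             # heard as no different from the plain t
    },
}


def same_value(language: str, a: str, b: str) -> bool:
    """True if the language treats the two objective sounds as one value."""
    table = VALUES[language]
    return table[a] == table[b]
```

Comparing `same_value("English", "t_plain", "t_aspirated")` with `same_value("Haida", "t_plain", "t_aspirated")` shows the same objective difference being irrelevant in one language and functional in the other.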

These considerations as to phonetic value lead to an important conception. Back of the purely objective system of sounds that is peculiar to a language and which can be arrived at only by a painstaking phonetic analysis, there is a more restricted "inner" or "ideal" system which, while perhaps equally unconscious as a system to the naïve speaker, can far more readily than the other be brought to his consciousness as a finished pattern, a psychological mechanism. The inner sound-system, overlaid though it may be by the mechanical or the irrelevant, is a real and an immensely important principle in the life of a language. It may persist as a pattern, involving number, relation, and functioning of phonetic elements, long after its phonetic content is changed. Two historically related languages or dialects may not have a sound in common, but their ideal sound-systems may be identical patterns. I would not for a moment wish to imply that this pattern may not change. It may shrink or expand or change its functional complexion, but its rate of change is infinitely less rapid than that of the sounds as such. Every language, then, is characterized as much by its ideal system of sounds and by the underlying phonetic pattern (system, one might term it, of symbolic atoms) as by a definite grammatical structure. Both the phonetic and conceptual structures show the instinctive feeling of language for form.[25]

[Footnote 25: The conception of the ideal phonetic system, the phonetic pattern, of a language is not as well understood by linguistic students as it should be. In this respect the unschooled recorder of language, provided he has a good ear and a genuine instinct for language, is often at a great advantage as compared with the minute phonetician, who is apt to be swamped by his mass of observations. I have already employed my experience in teaching Indians to write their own language for its testing value in another connection. It yields equally valuable evidence here. I found that it was difficult or impossible to teach an Indian to make phonetic distinctions that did not correspond to "points in the pattern of his language," however these differences might strike our objective ear, but that subtle, barely audible, phonetic differences, if only they hit the "points in the pattern," were easily and voluntarily expressed in writing. In watching my Nootka interpreter write his language, I often had the curious feeling that he was transcribing an ideal flow of phonetic elements which he heard, inadequately from a purely objective standpoint, as the intention of the actual rumble of speech.]

FORM IN LANGUAGE: GRAMMATICAL PROCESSES

The question of form in language presents itself under two aspects. We may either consider the formal methods employed by a language, its "grammatical processes," or we may ascertain the distribution of concepts with reference to formal expression. What are the formal patterns of the language? And what types of concepts make up the content of these formal patterns? The two points of view are quite distinct. The English word _unthinkingly_ is, broadly speaking, formally parallel to the word _reformers_, each being built up on a radical element which may occur as an independent verb (_think_, _form_), this radical element being preceded by an element (_un-_, _re-_) that conveys a definite and fairly concrete significance but that cannot be used independently, and followed by two elements (_-ing_, _-ly_; _-er_, _-s_) that limit the application of the radical concept in a relational sense. This formal pattern--(b) + A + (c) + (d)[26]--is a characteristic feature of the language. A countless number of functions may be expressed by it; in other words, all the possible ideas conveyed by such prefixed and suffixed elements, while tending to fall into minor groups, do not necessarily form natural, functional systems. There is no logical reason, for instance, why the numeral function of _-s_ should be formally expressed in a manner that is analogous to the expression of the idea conveyed by _-ly_. It is perfectly conceivable that in another language the concept of manner (_-ly_) may be treated according to an entirely different pattern from that of plurality. The former might have to be expressed by an independent word (say, _thus unthinking_), the latter by a prefixed element (say, _plural[27]-reform-er_). There are, of course, an unlimited number of other possibilities. Even within the confines of English alone the relative independence of form and function can be made obvious. 
Thus, the negative idea conveyed by _un-_ can be just as adequately expressed by a suffixed element (_-less_) in such a word as _thoughtlessly_. Such a twofold formal expression of the negative function would be inconceivable in certain languages, say Eskimo, where a suffixed element would alone be possible. Again, the plural notion conveyed by the _-s_ of _reformers_ is just as definitely expressed in the word _geese_, where an utterly distinct method is employed. Furthermore, the principle of vocalic change (_goose_--_geese_) is by no means confined to the expression of the idea of plurality; it may also function as an indicator of difference of time (e.g., _sing_--_sang_, _throw_--_threw_). But the expression in English of past time is not by any means always bound up with a change of vowel. In the great majority of cases the same idea is expressed by means of a distinct suffix (_die-d_, _work-ed_). Functionally, _died_ and _sang_ are analogous; so are _reformers_ and _geese_. Formally, we must arrange these words quite otherwise. Both _die-d_ and _re-form-er-s_ employ the method of suffixing grammatical elements; both _sang_ and _geese_ have grammatical form by virtue of the fact that their vowels differ from the vowels of other words with which they are closely related in form and meaning (_goose_; _sing_, _sung_).
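The relative independence of form and function can be shown by sorting the same four words in two ways. This is a small sketch, assuming a flat table of (word, process, function) entries; the table layout, the label strings, and the helper `group_by` are inventions of the example, not a notation used in the text.

```python
from collections import defaultdict

# (word, grammatical process, function expressed), taken from the examples above.
WORDS = [
    ("died", "suffixing", "past time"),
    ("sang", "vocalic change", "past time"),
    ("reformers", "suffixing", "plurality"),
    ("geese", "vocalic change", "plurality"),
]


def group_by(index: int) -> dict:
    """Group the words by column 1 (formal process) or column 2 (function)."""
    groups = defaultdict(list)
    for entry in WORDS:
        groups[entry[index]].append(entry[0])
    return dict(groups)


by_form = group_by(1)      # died with reformers, sang with geese
by_function = group_by(2)  # died with sang, reformers with geese
```

The two groupings cut across each other: no word keeps the same partner in both, which is the point of arranging the words "quite otherwise" on formal grounds.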

[Footnote 26: For the symbolism, see chapter II.]

[Footnote 27: "_Plural_" is here a symbol for any prefix indicating plurality.]