Wednesday, 9 December 2009

Nepali, Nez Perce, and Na'vi: On alien-language in Cameron's Avatar, with remarks on etymology and "Universal Grammar"

In order to lend authenticity to his film Avatar, James Cameron had the language of the alien Na'vi people designed by linguist Paul Frommer, as reported by Benjamin Zimmer[1] in his 4 December article "Skxawng!" (in his New York Times column "On Language"). Cameron apparently chose Frommer partly on the basis of his co-authored textbook Looking at Languages[2], where one of the exercises involves deciphering Klingon word order [spoiler: it's object-verb-subject] (Klingon is another linguist-designed language).

An interview with Frommer is available at the Unidentified Sound Object blog, in which Frommer reports on some interesting features of the language he developed. The language of the Na'vi (who look sort of like blue cat-people) involves some typologically unusual linguistic features, including the presence of ejectives in the phonological inventory, specifically [k'], [t'], and [p'], and (more interesting to syntacticians and morphologists) a tripartite system of case marking.

Like ejectives, tripartite case-marking is present but rare in human languages, found in the Australian languages Wangkumara and Kala Lagaw Ya (though these two languages are apparently unrelated) as well as in the Amerindian language Nez Percé spoken in the northwest of the USA (on which see further Cash Cash[3]). The tripartite case-marking system involves differences in morphological case-marking on (a) agents of transitive verbs [agentive/ergative case], (b) objects of transitive verbs [objective/accusative case], and (c) subjects, i.e. the sole arguments, of intransitive verbs [absolutive/nominative case].
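
To make the three-way split concrete, here is a minimal sketch in Python that contrasts a tripartite system with the two more familiar alignment types. The role labels A (agent of a transitive verb), O (object of a transitive verb) and S (sole argument of an intransitive verb), as well as the names ALIGNMENTS, case_for and distinct_cases, are assumptions made for this illustration; nothing here is taken from Frommer's own description of Na'vi.

```python
# Hypothetical illustration of case alignment types.
# Roles: "A" = agent of a transitive verb,
#        "O" = object of a transitive verb,
#        "S" = sole argument of an intransitive verb.
# The case labels follow the ones used in this post.
ALIGNMENTS = {
    # S patterns with A (as in English pronouns: "he came", "he saw him")
    "accusative": {"A": "nominative", "S": "nominative", "O": "accusative"},
    # S patterns with O
    "ergative": {"A": "ergative", "S": "absolutive", "O": "absolutive"},
    # A, S and O each receive their own case
    "tripartite": {"A": "ergative", "S": "nominative", "O": "accusative"},
}


def case_for(role, alignment="tripartite"):
    """Return the case label assigned to a role under the given alignment."""
    return ALIGNMENTS[alignment][role]


def distinct_cases(alignment):
    """Count how many different cases the alignment uses for A, S and O."""
    return len(set(ALIGNMENTS[alignment].values()))


if __name__ == "__main__":
    for name in ALIGNMENTS:
        summary = ", ".join(f"{role}={case_for(role, name)}" for role in "ASO")
        print(f"{name}: {summary} ({distinct_cases(name)} distinct cases)")
```

On this toy model, only the tripartite table keeps all three roles apart, which is what makes the system typologically unusual: the other two alignments each collapse S with one of the transitive roles.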

However, there are languages which are much less exotic (at least to me) that could also be seen as employing tripartite case-marking, including many Indo-Aryan languages such as Hindi and Nepali; see the examples below (where nom=nominative/absolutive case; acc=accusative/objective case; erg=ergative/agentive case).
(1) लड़का कल आया
laṛkā-[Ø] kal āyā
boy-nom yesterday came
"The boy came yesterday."

(2) लड़के ने लड़की को देखा
laṛke-ne laṛkī-ko dekhā
boy-erg girl-acc saw
"The boy saw the girl."

(3) केटा हिजो आयो
keṭā-[Ø] hijo āyo
boy-nom yesterday came
"The boy came yesterday."

(4) केटाले केटीलाई हेर्यो
keṭā-le keṭī-lāī heryo
boy-erg girl-acc saw
"The boy saw the girl."
[Though the "accusative" case-marker (Hindi ko, Nepali lāī) in Indo-Aryan is not straightforwardly a marker of objects of transitive verbs; rather, it tends to occur particularly on objects which are animate and/or specific; see Bhatia[4].]

The fact that the Na'vi language shares this feature with Indo-Aryan perhaps makes it all the more appropriate that the name of Cameron's film is also Indo-Aryan. Avatar, from Sanskrit अवतार (avatāra), is usually translated into English as "incarnation", used to refer to gods assuming human bodies (e.g. the god Vishnu becoming Krishna). It also has an extended use in the world of cyberspace, where it refers to the graphic representation of a user or his alter ego. The sense in Cameron's film, I take it, actually draws on both of these meanings, as some of the human characters control Na'vi-appearing bodies.

Interestingly, in the Mahabharata, one of the two major Indian epic poems, where avatars are a central concept, the term avatāra is actually never employed (Sutton[5]:156-7); however, the concept is frequently alluded to (Biardeau[6]:1621n2, Hiltebeitel[7]:109n56) by usages of the verb avatr̥̄-, which literally means something like "stepping down" (prefix ava- "down, off" + √tr̥̄ "to cross over"). In fact, the verb avatr̥̄- is conventionally used in the Mahabharata to refer to people "stepping down from their chariots" (Hiltebeitel[7]:232).

Returning to Na'vi, in his Unidentified Sound Object interview, Frommer remarks that:
As I mentioned, there’s nothing in Na’vi that couldn’t be found in some human language—and that’s important, since humans have learned to speak it.
I found this idea that the Na'vi language is learnable by humans rather intriguing, since part of the Chomskian notion of (natural human) language is that it relies on biocognitive structures which are unique (at least on Earth) to humans (i.e. not present in any other Terran creatures). Would/could language as developed in an extraterrestrial species rely on biocognitive structures which would be equivalent to those underlying human language?

This reminds me of a story that Prof. Peter Lasersohn told in one of his semantics courses; paraphrased (as well as I can remember it):
Logicians and philosophers had long treated human language as not being expressible in terms of formal logic. Richard Montague famously developed a system of formal semantics for language (Montague[8,9,10]); in one of the earlier accounts he states: "There is in my opinion no important theoretical difference between natural languages and the artificial languages of logicians; indeed, I consider it possible to comprehend the syntax and semantics of both kinds of language within a single natural and mathematically precise theory. On this point I differ from a number of philosophers, but agree, I believe, with Chomsky and his associates" (Montague[8]).
However, the Chomskian notion of "Universal Grammar" involves an abstract (but biocognitively instantiated) system which underlies all human language but is unique to humans. Montague, on the other hand, used "Universal Grammar" in the sense of a formal syntax and semantics which would be truly "universal", that is, applicable to any language, human or otherwise.
When Barbara Partee (a semanticist who was instrumental in popularising Montague Grammar among generative linguists) explained Chomsky's sense of "Universal Grammar" to Montague, he was perplexed, remarking that he did not understand why linguists would adopt a human-only conception of "Universal Grammar" which would thus automatically disqualify them from being the ones the world would turn to, in the event of humans making contact with aliens, for the decryption of extraterrestrial language.
[1]Zimmer, Benjamin. 2009. "Skxawng!" On Language, New York Times, 4 December 2009.
[2]Frommer, Paul R. & Finegan, Edward. 2004. Looking at languages: A workbook in elementary linguistics. Boston: Wadsworth, 3rd edn.
[3]Cash Cash, Phillip. 2004. "Nez Perce verb morphology". Ms., University of Arizona, Tucson.
[4]Bhatia, Archna. 2008. "Animacy, specificity and overt object case marking in Hindi". Ms., University of Illinois, Urbana-Champaign.
[5]Sutton, Nicholas. 2000. Religious doctrines in the Mahābhārata. Delhi: Motilal Banarsidass.
[6]Biardeau, Madeleine. 1999. Le Rāmāyaṇa de Vālmīki. Paris: Gallimard.
[7]Hiltebeitel, Alf. 2001. Rethinking the Mahābhārata: A reader’s guide to the education of the dharma king. New Delhi: Oxford University Press [Indian edition].
[8]Montague, Richard. 1970a. “Universal grammar”. Theoria 36: 373-398.
[9]Montague, Richard. 1970b. “English as a formal language”. In Bruno Visentini et al. (eds.), Linguaggi nella società e nella tecnica. Milan: Edizioni di Comunità, 188-221.
[10]Montague, Richard. 1973. “The proper treatment of quantification in ordinary English”. In K.J.J. Hintikka, J.M.E. Moravcsik, & P. Suppes (eds.), Approaches to natural language. Dordrecht: Reidel, 221-242.


  1. Excellent post--and I remember Lasersohn telling that story. Good times with extraterrestrial languages.

  2. Of course, as I'm sure you know, Hindi is far more complicated than that:

    "The boy saw A [not THE] girl" would be "laṛke-ne laṛkī-[Ø] dekhī" with a zero marker on the direct object and verbal agreement with the object rather than the subject.

  3. Perhaps the Na'vi have devised this language for the purpose of having a language learnable by human beings.

  4. @Matt: I don't know if the Partee-Montague story exists in a written form elsewhere. I thought it was worth recording.

    @vp: Yes, the Bhatia paper I refer to in the []s discusses the complexity of the Hindi data. There's also a certain amount of variation in Hindi regarding ko-marking. I have a sense that older speakers tend to use ko on any and all animate direct objects, whereas younger speakers can make the distinction you refer to (and make non-animate direct objects specific by adding ko, e.g. laṛke-ne kitāb-ko dekhā "the boy saw THE book", whereas older speakers don't like to add ko to inanimate direct objects). There's an additional complication in that ko also marks indirect objects...

    @John: I don't know the plot of Avatar, but my null hypothesis would be that the Na'vi linguistic ability would have been the result of evolution, like that of humans, and thus they wouldn't have devised their language at all. (Unless it's a sort of trade jargon/pidgin resulting from Na'vi-Human contact.)

  5. I ought to give up Classics and take up professional conlanging for the movie industry...


  6. be_slayed: There are a number of alternatives: a pidgin, as you say; a conlang devised by a Na'vi (or a committee of them) for the purpose of talking to humans, which is what I had in mind; Na'vi baby talk; the equivalent of talking LOUDLY and SLOWLY to foreigners who don't understand you (in this kind of Italian, e.g., all verbs are in the infinitive form).

    Mattitiahu: Stick with Classics. You'll get rich a lot faster.

  7. "Stick with Classics. You'll get rich a lot faster."


  8. Yes, you got the Partee-Montague story right, and I'm tickled to find it here. @be_slayed: Yes, it's written down, in my "Reflections of a Formal Semanticist", in both the published version and in the longer version available here: .