Thursday, 16 September 2010

Philology and (La)TeX: on Proto-Indo-European dragon-slaying and Hittite ḫ

A couple of years ago I made the switch from Word to LaTeX. At the time I was in the middle of writing a paper on formulaic language in Proto-Indo-European, specifically working on the reconstruction of formulae connected with the PIE dragon-slaying mytheme. Though the (first draft of the) paper was mostly written, I decided I would reset it in LaTeX. This was a rather laborious task, but resulted in a much more aesthetically pleasing document, and LaTeX allows for a much easier system of referring to numbered examples than does Word (amongst other benefits of the LaTeX typesetting system). [I use Wolfgang Sternefeld's linguex package for example numbering.]

As this was a philological paper dealing with a number of different languages (Old Irish, Old English, Old Saxon, Gothic, Vedic Sanskrit, Classical Greek, Avestan, Pahlavi, and Hittite), special diacritics and characters were required. Rei Fukui's TIPA package handles almost all of the characters/diacritics which were needed. The exceptions were the Hittite "laryngeal" ḫ and polytonic classical Greek.

I. How to typeset Hittite in LaTeX:
The character may be defined by the following macro (assuming that the TIPA package has been loaded in the preamble by \usepackage{tipa}):
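One way such a macro might look, offered as a sketch rather than as the definition used for the paper. It assumes TIPA's generic \tipaloweraccent command and that slot 8 of TIPA's T3 encoding holds the breve (as it does in T1); both assumptions are worth checking against the TIPA manual.

```latex
\documentclass{article}
\usepackage{tipa}

% Sketch only: place a breve beneath "h" using TIPA's generic
% lower-accent command. The slot number 8 (breve) is an assumption.
\newcommand{\hith}{\tipaloweraccent{8}{h}}

\begin{document}
tara{\hith}{\hith}\={u}wan
\end{document}
```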
Then whenever ḫ is required, it may be called via the command {\hith}, as in the following text:
n=an=za namma \super{\sc{mu\v{s}}} illuyanka$[$n$]$ tara{\hith}{\hith}\={u}wan d\=ai\v{s}
which results in:
n=an=za namma MUŠilluyanka[n] taraḫḫūwan dāiš
(meaning "He (the storm god) began to overcome the serpent"; from KBo. 3.7 iii 24-5)

II. How to typeset classical Greek in LaTeX:
In the philological tradition, the only language using a non-Latinate script which is not transliterated is Greek (I've always found this a bit unfair: why isn't Sanskrit rendered in Devanagari?). To typeset polytonic (ancient) Greek in LaTeX, we'll need the following packages: babel, teubner, fontenc, cbgreek. Defining a macro \greekfont then allows us to switch to polytonic Greek.

A minimal example illustrating the usage:
\usepackage{mathptmx} %OPTIONAL, in order to set Latin/English in Times font
\usepackage{tipa} %OPTIONAL, for typesetting diacritics/special characters for other lgs.


{\noindent}From Pindar's \textit{Olympian} 13.63--4:
{\noindent}\greekfont{\Ar{o}c t\cap{a}c \s{o}fi\'hdeos u\r{i}\'on pote Gorg\'onos \cap{\s{h}} p\'oll> \s{a}mf\`i krouno\cap{i}c\\P\'agason ze\cap{u}xai poj\'ewn \Gs{e}pajen}


(meaning "who (rel. pro.) beside the Springs, striving to break the serpent Gorgon's child, Pegasos, endured much hardship")

You'll need to make sure you have the full cbgreek package, otherwise the Greek font will be blurry and ugly.
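
A compact, self-contained skeleton of such a document might look as follows. It is a sketch under assumptions rather than the preamble used for the paper: the babel option polutonikogreek, the argument-taking definition of \greekfont via \foreignlanguage, and the sample words are all illustrative choices.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}                      % Latin-script text
\usepackage[polutonikogreek,english]{babel}   % the last option (english) is the main language
\usepackage{teubner}                          % philological extras for Greek; loaded after babel

% Hypothetical definition: set the argument in polytonic Greek.
\newcommand{\greekfont}[1]{\foreignlanguage{polutonikogreek}{#1}}

\begin{document}
\noindent The word \greekfont{l'ogos} is typed in Latin transliteration;
\greekfont{>'anjrwpos} adds a smooth breathing and an acute accent.
\end{document}
```

Defining \greekfont with an argument keeps the language switch local, so the surrounding English text needs no explicit switch back.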

III. Post scriptum
Here's the rub (of course): having produced a beautifully typeset document, I submitted it to Historische Sprachforschung (Adalbert Kuhn's old Zeitschrift für vergleichende Sprachforschung). It was accepted, but the journal could only process Word documents. So I had to go back and retypeset the whole thing in Word (again). This involved a lot of find-and-replace (to turn LaTeX code/macros into Unicode characters, Word formatting, example numbers etc.). Unfortunately, this also meant that the table I had managed to fit on a single page (in order that the various formulae could be easily compared), by using smaller font sizes and rotating it into landscape orientation with the package lscape, is in HS split across three pages...
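
For readers who have not used lscape, the general shape of such a rotated, small-type table is sketched below. The column layout and the placeholder row are hypothetical and stand in for the actual table of formulae.

```latex
\documentclass{article}
\usepackage{lscape}   % provides the landscape environment

\begin{document}
\begin{landscape}
\footnotesize          % smaller type so that more material fits on the page
\begin{tabular}{lll}
\hline
Language & Formula & Gloss \\
\hline
(language) & (formula) & (gloss) \\   % placeholder row, not real data
\hline
\end{tabular}
\end{landscape}
\end{document}
```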

Happily, the original LaTeX-produced version did in fact appear earlier in Studies in the Linguistic Sciences: Illinois Working Papers 2009, which does accept LaTeX submissions (since I designed a LaTeX style file for the journal...).


[1] Slade, Benjamin. 2008[2010]. How (exactly) to slay a dragon in Indo-European? PIE *bheid- {h3égwhim, kwŕ̥mi-}. Historische Sprachforschung 121: 3-53.
[2] Slade, Benjamin. 2009. Split serpents and bitter blades: Reconstructing details of the PIE dragon-combat. Studies in the Linguistic Sciences: Illinois Working Papers 2009: 1-57.

Wednesday, 8 September 2010

English "like" can, like, function like Sanskrit "इति"

In a recent blog post, "How Old is Parasite 'Like'?", Oxford Etymologist Anatoly Liberman explores the history of (modern) English like when used as a type of filler/discourse-marker. However, modern English like has another function, which I think is often conflated with filler like (presumably because it commonly occurs in the speech of people who also use filler like): namely, as a sort of quotative marker (also noted by commenters Mike Gibson and Charles Wells).
She was like OMG! And then I was like wow!
The above sentences might be "translated" as:
She said, "Oh my god!" And then I said, "Wow!"
Or (since there seems to be some ambiguity):
She said, "Oh my god!" And then I thought, "Wow!"
This use as a sort of quotation mark is a separate function from its pragmatic discourse-marking use (which Liberman focusses on) in examples like:
You wanna, like, go see a movie?
Which might be uttered by a teenage boy asking a girl out on a date, where like can act either as a "hedge", foreseeing the possibility of rejection ("'s ok if you don't want to"), or as a way of allowing for the possibility of other activities ("...or get some ice cream").

Both uses of like are stigmatised; again, the stigmatisation of "quotative" like is probably via guilt-by-association with the discourse-marking/filler like. I'll admit that, like Liberman, I find both uses rather aesthetically displeasing (which doesn't mean that I never use them---they are, as Liberman suggests, somewhat viral). But the "quotative" like is interesting. Liberman remarks that:
Particularly disconcerting is the fact that the analogs of like swamped other languages at roughly the same time or a few decades later. Germans have begun to say quasi in every sentence. Swedes say liksom, and Russians say kak by; both mean “as though.” In this function quasi, liksom, and kak by are recent. The influence of American like is out of the question, especially in Russian. So why, and why now? Delving into the depths of Indo-European and Proto-Germanic requires courage and perspicuity. But here we are facing a phenomenon of no great antiquity and are as puzzled as though we were trying to decipher a cuneiform inscription.
Interestingly, however, the "quotative" function of English like has a couple of parallels in Sanskrit. One of these---the one most closely resembling English "quotative" like, at least in its frequency---is the Sanskrit particle iti (इति).

Both Sanskrit iti and English like can occur in the following contexts:
A. When quoting words actually uttered, alongside a verb of speaking:
(Skt-1) kathitam avalokitayā "madanodyānam gato mādhava" iti
"Avalokita had told me that Madhava was gone to the grove of Kama." [Mālatīmādhava I, p. 11; cited from Speijer[1]:§493a]

(Eng-1) "She said like 'I want to go too'."

B. Expressing the contents of one's thought:
(Skt-2) manyate pāpakam kṛtvā "na kaścid vetti mām" iti
"After committing some sins, one thinks 'nobody knows me'." [Mahabharata 1.74.29; cited from Speijer[1]:§493b]

(Eng-2) “And I thought like 'wow, this is for me'.” [OED, 2nd Supplement[2]; 1970, no earlier citations]

C. More general setting forth of motives, emotions, judgements etc.:
(Skt-3) vyāghro mānuṣam khādati iti lokāpavādaḥ
"'The tiger eats the man' is slanderous gossip." [Hitopadesha 10; cited from Speijer[1]:§493c]

(Eng-3) "I was like 'wow'!"
There are obvious differences between English quotative like and Sanskrit iti, including the fact that English quotative like precedes the "quotation", while Sanskrit iti follows it (in conformity with the general left-branching nature of Sanskrit syntax).

Further, Sanskrit iti doesn't have any of the other functions or meanings associated with English like. English like derives ultimately from Proto-Germanic *lîko- "body, form, appearance", while Sanskrit iti is built from the pronominal stem i-. In fact, iti still has pronominal uses, even in Classical Sanskrit, as in the following example.
(Skt-4) tebhyas pratijñāya nalaḥ kariṣya iti
"Nala promised them he would do thus." [Nala 3,1; cited from Speijer[1]:§492]
Amusingly, I find that (pretending that a parallel development has taken place in English) replacing "quotative" like with thus actually seems grammatical to me---though wholly unidiomatic, e.g.:
(Eng-4) "I was thus: 'Wow!'"
(Somehow I imagine that if thus had been recruited as a quotative in English rather than like, the use of a quotative marker wouldn't be so stigmatised, since there would be no association with filler like and, moreover, thus is largely used in formal registers of English.)

However, there is another element in Sanskrit which---though not as frequently used in this function as iti---actually is more similar to English quotative like in its syntax and semantics: yathā. Yathā is, properly speaking, a relative pronoun and is often part of relative-correlative constructions of the form yathā X...tathā Y "As X...., so Y". However, it can occur without correlative tathā, and in fact can have the meaning "like", as in the following example:
(Skt-5) mansyante mām yathā nṛpam
"They will consider me like a king." [Mahabharata 4.2.5; cited from Speijer[1]:§470a]
Yathā can also function as a sort of quotative, but---unlike iti, and like English like---it precedes rather than follows the quoted discourse:
(Skt-6) viditam eva yathā "vayam malayaketau kimcitkālāntaram uṣitāḥ".
"It is certainly known (to you) that I stayed for some time with Malayaketu." [Mudrarakshasa VII; cited from Speijer[1]:§494]
(Or, maybe: "You certainly know, like, 'I stayed for some time with Malayaketu'.")
(Yathā and iti (since they occupy different syntactic positions) can also co-occur.)

So there is at least one antique parallel for the development of modern English like as a quotative marker.

Returning to the more commonly used iti, the following Sanskrit example---occurring when one of the heroes of the Mahabharata has performed an act of generosity so great that even the gods are impressed---I think is a great parallel for examples like "I was like, 'Wow!'":

(Skt-7) tato 'ntarikṣe vāg āsīt "sādhu sādhv" iti
"Then a voice in the sky was like 'Wow! Wow!'" [Mahabharata 14.91.15]
This line might be more usually translated as "then a voice in the sky said 'Bravo! Bravo!'", but there is actually no verb of speaking: āsīt means "was".

[1] Speijer, J.S. 1886. Sanskrit syntax. Leiden: E.J. Brill. [reprinted, Delhi: Motilal Banarsidass, 1973.]
[2] The Oxford English Dictionary, September 2009 rev. ed.

Monday, 30 August 2010

Co-ordination fail

One of the first topics in intro syntax classes is the notion of constituency, including a variety of tests which can be used to determine constituency. One of these tests is the co-ordination test: generally only items of the same syntactic category can be conjoined. Thus the following examples are fine: fresh and clean (coordination of adjectives), mad dogs and Englishmen (coordination of nouns [DPs]), (to) serve and protect (coordination of infinitive verbs). But verbs can't be conjoined in the same way with nouns, e.g. *I like mad dogs and to serve is bad; and prepositional phrases don't conjoin with nouns, e.g. *I like mad dogs and on top of the Empire State Building is also bad.

Here's a label I noticed which violates the co-ordination constraint:
*[DP Side Dish], [DP Soup Mix], [PP Over Rice]

Tuesday, 3 August 2010

Computational approaches to understanding language evolution [video]

In her recent ILLS talk, Tandy Warnow discusses computational approaches for inferring language evolution and linguistic relationships:

Computational methods for inferring evolutionary histories of languages

A page with links to various publications associated with this project (and the datasets used) is available here:

One of the interesting points of this study is the relationship of Germanic with respect to the other branches of Indo-European. Germanic, at least when the morphological data is given more weight, is not particularly closely related to Italic or Celtic, though it shares a number of lexical similarities with these groups. This is suggestive of a later migration of Germanic-speaking peoples into an area where they came into contact with Italo-Celtic speakers. In any case, it's an interesting approach to historical data.

For more ILLS2 videos, see this link:

Friday, 12 February 2010

Logic, Language, and Information: Summer course at Bloomington

The North American Summer School in Logic, Language, and Information (NASSLLI) is a summer school with classes in the interface between computer science, linguistics, and logic.

After previous editions at Stanford University, Indiana University, and UCLA, NASSLLI will return to Bloomington, Indiana, June 20–26, 2010. The summer school, loosely modeled on the long-running ESSLLI series in Europe, will consist of a number of courses and workshops, selected on the basis of the proposals. Courses and workshops meet for 90 or 120 minutes on each of five days, June 21–25, and there will be tutorials on June 20 and a day-long workshop on June 26. The instructors are prominent researchers who volunteer their time and energy to present basic work in their disciplines. Many are coming from Europe just to teach at NASSLLI.

NASSLLI courses are aimed at graduate students and advanced undergraduates in a wide variety of fields. The instructors know that people will be attending from a wide range of disciplines, and they all are pleased to be associated with an interdisciplinary school. The courses will also appeal to post-docs and researchers in all of the relevant fields.

We hope to have 100-150 participants. In addition to classes in the daytime, the evenings will have social events and plenary lectures. Bloomington is a wonderful place to visit, known for arts, music, and ethnic restaurants. All of this is within a 15-minute walk of campus. We aim to make NASSLLI fun and exciting.

Tuesday, 2 February 2010

Ill Linguistics and Novel Technologies (Call for papers: 28 Feb '10 deadline)

ILLS2 (28-30 May 2010) is a student-run conference at the University of Illinois at Urbana-Champaign. The theme for this year's conference is Novel Technologies and Methodologies in Linguistics Research. The purpose of this theme is to inspire ideas and create enthusiasm for the ways in which we pursue research in Linguistics. Talks will involve the creation of new tools for Linguistic research, the novel use of old tools, experimental methods, studies of validity or authenticity, and, otherwise, studies that cause reflection in Linguistic research.

Talks from all subfields of Linguistics are welcome.

Invited Speakers:
Wayne Cowart (University of Southern Maine)
Bryan Gick (University of British Columbia)
Tania Ionin (University of Illinois Urbana-Champaign)
Richard Sproat (Oregon Health and Science University)
Tandy Warnow (University of Texas at Austin)

Conference Chairs:
Tim Mahrt (University of Illinois Urbana-Champaign)
Megan Osfar (University of Illinois Urbana-Champaign)

Call for Papers
Call Deadline: 28-Feb-2010

The online submission form can be found on the conference website:

ILLS welcomes the submission of general empirical and theoretical papers relevant to the field of linguistics and the language sciences. Special consideration will be given to applicants whose research fits within the conference theme of Novel Technologies and Methodologies in Linguistics Research. Relevant talks for this theme would involve at least one of the following: the use of new tools for Linguistic research, the novel use of old tools, experimental methods, studies of validity or authenticity, and, otherwise, studies that cause reflection.

ILLS requests the submission of abstracts summarizing the main points of the research paper, including hypotheses, methods, and conclusions.

ILLS also welcomes the submission of workshop proposals on advanced, emerging, or domain-specific applications, particularly where there is little available existing documentation. Where applicable, we invite those with a related paper to consider submitting a workshop proposal--however, independent workshops are just as welcome.

Suitable topics could involve technologies such as Praat, eyetrackers, or EMA.

Abstracts are to be submitted in PDF format, and should be no more than 500 words in length, including examples (encouraged) and in-text citations. Full references are not necessary; please use the (Author, Year) format.

See the LSA model abstracts page for guidance in building an acceptable abstract.

You may submit at most: one single-author abstract and one multi-author abstract, or two multi-author abstracts. Additionally you may submit one workshop proposal. For abstracts co-authored with a faculty member, the student should be the primary author and should have carried out the bulk of the research and analysis. In addition, the student will be responsible for the presentation of the paper at the conference.

Abstracts are to be uploaded through the conference interface on the Abstract page.

Sunday, 24 January 2010

Accidental sarcasm: On focus semantic values and hotel lifts

Earlier this month I was in Baltimore for the annual LSA conference, and gave a talk on why wh-words (e.g. who, what, when etc.) need ordinary as well as focus semantic values (in contrast to Beck 2006, Cable 2007). Relevant to the calculation of focus semantic values, I spotted this notice in the conference hotel elevator:
The relevant bit: "protecting and empowering SO many ways".

Here so is focussed; presumably to emphasise the variety of ways in which Marylanders are protected and empowered. However, I can't help but read the focus on so as sarcastic (I can imagine someone saying "Oh yeah, you were SO helpful to me."), though I'm not sure why.

[Update (30.1.10): Does DLLR monitor the activities of this construction site?]

Beck, Sigrid. 2006. Intervention effects follow from focus interpretation. Natural Language Semantics 14:1–56.
Cable, Seth. 2007. The grammar of Q: Q-particles and the nature of Wh-fronting, as revealed by the Wh-questions of Tlingit. Doctoral Dissertation, MIT, Cambridge, MA.