A Comparative Study of Andean Languages
Research Questions, Methods and Data
Questions: Looking Into the Pre-History of the
Our study involves two different types of comparison, addressing two different types of research question.
Firstly, we aim to provide additional data for questions of the internal classification of each of the two main language families of the
• a comparison of all the varieties of Quechua amongst themselves;
• a separate comparison of all the varieties of Aymara amongst themselves (i.e. including Jaqaru/Kawki).
Secondly, we aim to look into what quantified comparative data tell us that might help elucidate the thorny issue of the nature of the relationship between the two language families of Quechua and Aymara: the famous Quechumara question of whether they are ultimately related ‘genealogically’, or whether the striking parallels between them go back only to convergence through intense and prolonged contact throughout their histories. This involves making comparisons between language varieties that we are not sure share a common origin in the first place, comparisons of any variety of Quechua against any variety of Aymara.
While these issues are essentially linguistic issues, since
we have no written history of the
• What are the most plausible ranges of dates for when each family first began to break up?
• In what stages through history did they expand to reach the regions where they are now spoken?
• Where are the most plausible locations for the original Quechua and Aymara ‘homelands’?
Our Methodological Approach: Measuring Language Similarity
This comparative study of the Andean languages is part of a larger research project. Our approach to such issues of classification is to measure the similarity and relationships between languages: that is, we seek to quantify just how similar or how different various language varieties are relative to each other.
The Quechua and Aymara families together present a continuum of degrees of difference/similarity between language varieties, from certain only minimally different regional ‘accents’ of Quechua, to entirely different languages which may not even be related to each other. We produce quantified comparisons over various spans of this continuum – that is, comparisons between pairs of language varieties showing all the various possible degrees of difference. Naturally this is a gradual scale of degree of difference, though in the familiar terminology one might talk of the following four distinct levels, i.e. we make comparisons between:
• Regional variation within the same ‘dialect’: for example between the Quechua spoken in the
• Varieties which, while markedly different, are generally still considered ‘dialects’ belonging to the ‘same language’: for example Ayacucho, Bolivian, and the even more different Ecuadoran Quechua.
• Different languages, though clearly genealogically (‘genetically’) related ones, i.e. from the same family (Quechua or Aymara): for example the mutually unintelligible Quechua of Cuzco and that of Huancayo.
• Quite different languages, where it is not yet clear whether they share an ultimate common origin within a single family or not, or show similarities only due to prolonged and deep contact: Aymara and Quechua.
Which Data? Lexis, Phonetics and Morphosyntax
Data have been collected in order to make a detailed comparison of these varieties in two aspects:
• In their basic lexicon, based on a list of 150 word-meanings adapted to the cultural and linguistic context of the Andes (and to a certain extent also
• In their phonetics, based on the pronunciation of a sample list of some 100 ‘pan-Quechua’ cognates (and a different 100 ‘pan-Aymara’ ones), many of which overlap with cognates found for the 150 meanings in the lexical comparison.
A further possibility for which we have also developed a method, though not yet collected data for the Andean languages, is to measure their similarity in certain aspects of their basic inflectional morphosyntax, which in these highly agglutinating languages principally means their morphology.
Methods: Producing and Analysing Measures of Similarity
All the methods we use to produce quantifications of similarity in these three fields of language (lexis, phonetics and basic morphosyntax) are set out in full in the book Measured Language, Heggarty (in preparation), to be published by Blackwell in late 2005. (This is a full revision and expansion based on Heggarty’s Ph.D. thesis.)
The method for similarity in lexis, and its specific application to the Andean languages using the data in our study, are due to appear in January 2005 in Heggarty (forthcoming).
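As a rough illustration only, and not the method set out in Heggarty (forthcoming), the simplest way to put a number on lexical similarity over a fixed list of word-meanings is to count the share of meanings for which two varieties use cognate forms. The sketch below assumes that each variety has already been coded with one cognate-class label per meaning. The function name, the variety names and the labels are all invented for the example and are not data from this study.

```python
def lexical_similarity(cognates_a, cognates_b):
    """Percentage of shared meanings whose forms belong to the same cognate class.

    Each argument maps a meaning (e.g. "WATER") to an arbitrary cognate-class
    label. Meanings missing from either variety are ignored.
    """
    shared = [meaning for meaning in cognates_a if meaning in cognates_b]
    if not shared:
        raise ValueError("the two varieties have no meanings in common")
    matches = sum(1 for meaning in shared
                  if cognates_a[meaning] == cognates_b[meaning])
    return 100.0 * matches / len(shared)


# Invented cognate-class labels for two hypothetical varieties (not real data).
variety_1 = {"ONE": "a", "TWO": "a", "WATER": "a", "SUN": "a"}
variety_2 = {"ONE": "a", "TWO": "a", "WATER": "b", "SUN": "a"}

print(lexical_similarity(variety_1, variety_2))  # 3 of 4 meanings match: 75.0
```

A plain match/no-match count like this treats every meaning as equally informative and ignores partial overlaps in meaning, which is one reason more refined measures are of interest.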
Details on the method we use to produce quantifications of phonetic similarity, and examples of the results it produces for Romance varieties and a set of Indo-European languages, have already been published in Heggarty (2000) and will also appear in McMahon, Heggarty, McMahon & Slaska (in press).
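For readers who want a concrete picture of what a phonetic similarity score over a list of cognates might look like, the sketch below uses a deliberately simple stand-in: normalised edit (Levenshtein) distance between segment sequences, averaged over cognate pairs. This is not the method of Heggarty (2000). The function names and the segment sequences are made up for the example and are not real Quechua or Aymara forms.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of segments."""
    prev = list(range(len(b) + 1))
    for i, seg_a in enumerate(a, 1):
        curr = [i]
        for j, seg_b in enumerate(b, 1):
            cost = 0 if seg_a == seg_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phonetic_similarity(cognate_pairs):
    """Mean normalised similarity (0-100) over pairs of cognate pronunciations."""
    if not cognate_pairs:
        raise ValueError("no cognate pairs supplied")
    scores = []
    for a, b in cognate_pairs:
        longest = max(len(a), len(b))
        scores.append(1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest)
    return 100.0 * sum(scores) / len(scores)


# Made-up segment sequences standing in for transcribed cognates.
pairs = [
    (["p", "a", "t", "a"], ["p", "a", "t", "a"]),
    (["k", "a", "n"], ["q", "a", "n"]),
    (["s", "u", "m", "a"], ["s", "u", "m"]),
]
print(round(phonetic_similarity(pairs), 1))  # 80.6
```

The sketch counts every segment difference as equally large, so that a change from [k] to [q] costs as much as a change from [k] to [a]. That is a strong simplification; the cited publications describe the method actually used.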
Having produced our quantifications of language similarity, stage two in our research approach is to process these figures using various ‘family tree‑drawing’ programmes, initially devised for similar uses in biology, particularly genetics. These include PHYLIP by Felsenstein (2001), Network by Bandelt et al. (1995), and especially the very recent NeighborNet by Bryant & Moulton (2002). These are explained in McMahon & McMahon (in preparation).
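The hand-over from similarity figures to a tree-drawing programme can be sketched as follows. The example converts a table of percentage similarities into distances and writes them out as a square distance matrix of the kind read by PHYLIP's distance-based programs (a first line giving the number of taxa, then one row per taxon beginning with a name of at most ten characters). The conversion 1 - s/100, the function name and all the figures are assumptions made for this illustration, not results or procedures from our study.

```python
def to_phylip_matrix(names, similarity):
    """Return a square PHYLIP-style distance matrix as a string.

    `similarity[i][j]` is a percentage similarity (0-100) between taxa i and j;
    each distance is taken to be 1 - similarity / 100 (an assumption).
    """
    n = len(names)
    if any(len(row) != n for row in similarity) or len(similarity) != n:
        raise ValueError("similarity table must be square and match the names")
    lines = [f"{n:5d}"]
    for i, name in enumerate(names):
        if len(name) > 10:
            raise ValueError(f"name too long for PHYLIP format: {name!r}")
        row = " ".join(f"{1.0 - similarity[i][j] / 100.0:.4f}" for j in range(n))
        # Name padded to ten characters, then the row of distances.
        lines.append(f"{name:<10} {row}")
    return "\n".join(lines) + "\n"


# Invented similarity percentages for three hypothetical varieties.
names = ["var_A", "var_B", "var_C"]
similarity = [
    [100.0, 90.0, 60.0],
    [90.0, 100.0, 55.0],
    [60.0, 55.0, 100.0],
]
matrix_text = to_phylip_matrix(names, similarity)
print(matrix_text)
```

The resulting text can be saved to a file and passed to a distance-based tree or network program; which program and which settings are appropriate depends on the data.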
Our first publications specifically on our results for the Andean
languages, starting with the lexical data, will appear in January 2005 in Heggarty (forthcoming). This shows how we make use of these combined techniques to bring new insights to the
analysis of linguistic data in problematic cases such as those of the
In the meantime, a full list of the papers already published by our research group is available.
References and Bibliography
Any work cited on these webpages that forms part of our main online bibliography for the Andean languages appears as a clickable link that takes you to the full bibliographical entry for it on our bibliography webpage. (We plan to replace this system later with a frames version so that the entry appears in a window on the page you clicked from.) The references given below are for other more general linguistics works we cite that are not in our Andean bibliography.
Bandelt, H-J., P. Forster, B. C. Sykes & M. B. Richards (1995) Mitochondrial portraits of human populations using median networks
in: Genetics - 141: 743-753
Bryant, David & V. Moulton (2002) NeighborNet: an agglomerative method for the construction of planar phylogenetic networks
in: Proceedings of the Workshop in Algorithms for Bioinformatics
Dyen, Isidore, Joseph B. Kruskal & Paul Black (1992) An Indoeuropean classification: a lexicostatistical experiment
in: Transactions of the American Philosophical Society - 82
data available at: www.ldc.upenn.edu
Sheila M., (1986)
Felsenstein, J. (2001) PHYLIP: Phylogeny Inference Package. Version 3.6
Department of Genetics,
Forster, Peter & Alfred Toth (2003) Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European
in: Proceedings of the National Academy of Sciences - 100:15: 9079-9084
Heggarty, P.A. (2000) Quantifying Change Over Time in Phonetics
in: Renfrew, C., A. McMahon & L. Trask (eds): Time-Depth in Historical Linguistics - 2: 531-562
McDonald Institute for Archaeological Research:
Heggarty, Paul A. (forthcoming)
Enigmas en los orígenes de los idiomas andinos: aplicando nuevos métodos a las preguntas aún no resueltas
Revista Andina, 40
Heggarty, Paul A. (in preparation)
Measured Language: From First Principles to New Techniques for Putting Numbers on Language Similarity
McMahon, April & Robert McMahon (in preparation) Language Classification by Numbers
McMahon, April, Paul Heggarty, Robert McMahon & Natalia Slaska (in press)
Swadesh sublists and the benefits of borrowing: an Andean case study
in: McMahon, April (ed.): Quantitative Methods in Language Comparison
Transactions of the Philological Society, 103.2
(1999) Methods for the Genetic Classification of Languages
in: Unpublished PhD thesis,
Starostin, Sergei A. (1991) Altaiskaia problema i proiskhozhdenie iaponskogo
Nauka, Glavnaia redaktsiia vostochnoi literatury:
Swadesh, Morris (1952) Lexico-statistical dating of prehistoric ethnic contacts: With special reference to North American Indians and Eskimos.
in: Proceedings of the American Philosophical Society - 96: 452-463