A Comparative Study of Andean Languages
Research Questions, Methods and Data
Contents
Research Questions: Looking Into the Pre-History of the Andes
Our Methodological Approach: Measuring Language Similarity
Which Data? Lexis, Phonetics and Morphosyntax
Methods: Producing and Analysing Measures of Similarity
Research Questions: Looking Into the Pre-History of the Andes
Our study involves two different types of comparison, addressing two different types of research question.
Firstly, we aim to provide additional data on the internal classification of each of the two main language families of the Andes:
• a comparison of all the varieties of Quechua amongst themselves;
• a separate comparison of all the varieties of Aymara amongst themselves (i.e. including Jaqaru/Kawki).
Secondly, we aim to look into what quantified comparative data tell us that might help elucidate the thorny issue of the nature of the relationship between the two language families of Quechua and Aymara: the famous Quechumara question of whether they are ultimately related ‘genealogically’, or whether the striking parallels between them go back only to convergence through intense and prolonged contact throughout their histories. This involves making comparisons between language varieties that we are not sure share a common origin in the first place, comparisons of any variety of Quechua against any variety of Aymara.
While these issues are essentially linguistic issues, since we have no written history of the
• What are the most plausible ranges of dates for when each family first began to break up?
• In what stages through history did they expand to reach the regions where they are now spoken?
• Where are the most plausible locations for the original Quechua and Aymara ‘homelands’?
Our Methodological Approach: Measuring Language Similarity
This comparative study of the Andean languages is part of a larger research project, Quantitative Methods in Language Classification. As that title suggests, we approach such issues of classification by seeking to measure the similarity and relationships between languages. That is, we seek to quantify just how similar or how different various language varieties are relative to each other.
The Quechua and Aymara families together present a continuum of degrees of difference/similarity between language varieties, from certain only minimally different regional ‘accents’ of Quechua, to entirely different languages which may not even be related to each other. We produce quantified comparisons over various spans of this continuum – that is, comparisons between pairs of language varieties showing all the various possible degrees of difference. Naturally this is a gradual scale of degree of difference, though in the familiar terminology one might talk of the following four distinct levels, i.e. we make comparisons between:
• ‘Accents’, or regional variation within the same ‘dialect’: for example between the Quechua spoken in the
• Varieties which, while markedly different, are generally still considered ‘dialects’ belonging to the ‘same language’: for example Ayacucho, Bolivian, and the even more different Ecuadoran Quechua.
• Different languages, though clearly genealogically (‘genetically’) related ones, i.e. from the same family (Quechua or Aymara): for example the mutually unintelligible Quechua of Cuzco, and that of Huancayo.
• Quite different languages, where it is not yet clear whether they share an ultimate common origin within a single family or not, or show similarities only due to prolonged and deep contact: Aymara and Quechua.
Which Data? Lexis, Phonetics and Morphosyntax
Data have been collected in order to make a detailed comparison of these varieties in two aspects:
• In their basic lexicon, based on a list of 150 word-meanings adapted to the cultural and linguistic context of the Andes (and to a certain extent also
• In their phonetics, based on the pronunciation of a sample list of some 100 ‘pan-Quechua’ cognates (and a different 100 ‘pan-Aymara’ ones), many of which overlap with cognates found for the 150 meanings in the lexical comparison.
A further possibility, for which we have also developed a method though we have not yet collected data for the Andean languages, is to measure their similarity in certain aspects of their basic inflectional morphosyntax, which in these highly agglutinating languages principally means their morphology.
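To make the idea of a lexical comparison over a fixed list of word-meanings concrete, here is a minimal sketch of a plain lexicostatistical score: the share of meanings for which two varieties are coded as having cognate words. This is a simplified stand-in, not the method set out in Heggarty's work. The variety labels, the meanings and the cognate-class codes are all hypothetical placeholders rather than data from the study.

```python
from itertools import combinations

# Hypothetical cognate-class codes (NOT data from the study).
# For each variety: meaning -> cognate-class label; None means "no data".
COGNATE_CLASSES = {
    "variety_A": {"water": 1, "sun": 1, "stone": 1, "dog": 1},
    "variety_B": {"water": 1, "sun": 1, "stone": 2, "dog": 1},
    "variety_C": {"water": 2, "sun": 1, "stone": 2, "dog": None},
}


def lexical_similarity(a, b):
    """Proportion of meanings attested in both varieties that share a class."""
    shared = [m for m in a if a[m] is not None and b.get(m) is not None]
    if not shared:
        raise ValueError("no meanings attested in both varieties")
    matches = sum(1 for m in shared if a[m] == b[m])
    return matches / len(shared)


def similarity_table(data):
    """Score every unordered pair of varieties, keyed by a sorted name pair."""
    return {
        (x, y): lexical_similarity(data[x], data[y])
        for x, y in combinations(sorted(data), 2)
    }


TABLE = similarity_table(COGNATE_CLASSES)
for pair, score in TABLE.items():
    print(pair, round(score, 3))
```

Meanings with no data in either variety are left out of the denominator, so a gap in the word list does not count as a difference. A graded measure, which scores partial matches rather than a strict yes or no, would replace the equality test inside `lexical_similarity`.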
Methods: Producing and Analysing Measures of Similarity
All the methods we use to produce quantifications of similarity in these three fields of language (lexis, phonetics and basic morphosyntax) are set out in full in the book Measured Language, Heggarty (in preparation), to be published by Blackwell in late 2005. This is a full revision and expansion of Heggarty’s Ph.D. thesis.
The method for similarity in lexis, and its specific application to the Andean languages using the data in our study, are due to appear in January 2005 in Heggarty (forthcoming).
Details on the method we use to produce quantifications of phonetic similarity, and examples of the results it produces for Romance varieties and a set of Indo-European languages, have already been published in Heggarty (2000) and will also appear in McMahon, Heggarty, McMahon & Slaska (in press).
Having produced our quantifications of language similarity, stage two in our research approach is to process these figures using various ‘family tree‑drawing’ programmes, initially devised for similar uses in biology, particularly genetics. These include Phylip by Felsenstein (2001), Network by Bandelt et al. (1995), and especially the very recent NeighborNet by Bryant & Moulton (2002). These are explained in McMahon & McMahon (in preparation).
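As an illustration of how similarity figures might be handed over to such programmes, the sketch below converts pairwise similarities into distances and lays them out as a square distance matrix in the plain-text format read by PHYLIP's distance-based programs. Taking distance as 1 minus similarity is an assumption made here for simplicity, and the variety labels and figures are hypothetical placeholders, not results from the study.

```python
def phylip_distance_matrix(names, similarity):
    """Return a square distance matrix in PHYLIP's text layout.

    names: list of variety labels (hypothetical placeholders here).
    similarity: dict mapping frozenset({a, b}) to a similarity in [0, 1].
    Distance is taken as 1 - similarity, an assumption of this sketch.
    """
    short = [name[:10] for name in names]  # PHYLIP names occupy 10 characters
    if len(set(short)) != len(short):
        raise ValueError("names must be unique within their first 10 characters")
    lines = [f"{len(names):5d}"]  # first line: number of taxa
    for a, label in zip(names, short):
        cells = []
        for b in names:
            d = 0.0 if a == b else 1.0 - similarity[frozenset((a, b))]
            cells.append(f"{d:8.4f}")
        lines.append(f"{label:<10}" + "".join(cells))
    return "\n".join(lines) + "\n"


NAMES = ["variety_A", "variety_B", "variety_C"]
SIMILARITY = {  # hypothetical figures for illustration only
    frozenset(("variety_A", "variety_B")): 0.75,
    frozenset(("variety_A", "variety_C")): 0.40,
    frozenset(("variety_B", "variety_C")): 0.50,
}
MATRIX_TEXT = phylip_distance_matrix(NAMES, SIMILARITY)
print(MATRIX_TEXT, end="")
```

The tree or network construction itself is left to the programmes named above. The sketch only prepares their input, so the choice of how to turn a similarity into a distance stays visible and easy to change.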
Our first publications specifically on our results for the Andean languages, starting with the lexical data, will appear in January 2005 in Heggarty (forthcoming). This shows how we make use of these combined techniques to bring new insights to the analysis of linguistic data in problematic cases such as those of the
In the meantime, a full list of the papers already published by our research group is available separately.
References and Bibliography
Any work cited on these webpages that forms part of our main online bibliography for the Andean languages appears as a link to its full bibliographical entry on our bibliography webpage.
The references given below are for other more general linguistics works we cite that are not in our Andean bibliography.
Bandelt, H-J., P. Forster, B. C. Sykes & M. B. Richards (1995) Mitochondrial portraits of human populations using median networks. in: Genetics - 141: 743-753
Bryant, David & V. Moulton (2002) NeighborNet: an agglomerative method for the construction of planar phylogenetic networks. in: Proceedings of the Workshop in Algorithms for Bioinformatics. Programme can be downloaded at: http://www-ab.informatik.uni-tuebingen.de/software/jsplits/welcome_en.html
Dyen, Isidore, Joseph B. Kruskal & Paul Black (1992) An Indoeuropean classification: a lexicostatistical experiment. in: Transactions of the American Philosophical Society - 82. Data available at: www.ldc.upenn.edu
Embleton, Sheila M. (1986) Brockmeyer:
Felsenstein, J. (2001) PHYLIP: Phylogeny Inference Package. Version 3.6. Department of Genetics,
Forster, Peter & Alfred Toth (2003) Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European. in: Proceedings of the National Academy of Sciences - 100:15: 9079-9084
Heggarty, P.A. (2000) Quantifying Change Over Time in Phonetics. in: Renfrew, C., A. McMahon & L. Trask (eds): Time-Depth in Historical Linguistics - 2: 531-562. McDonald Institute for Archaeological Research:
Heggarty, Paul A. (forthcoming) Enigmas en los orígenes de los idiomas andinos: aplicando nuevos métodos a las preguntas aún no resueltas. in: Revista Andina, 40
Heggarty, Paul A. (in preparation) Measured Language: From First Principles to New Techniques for Putting Numbers on Language Similarity
McMahon, April & Robert McMahon (in preparation) Language Classification by Numbers
McMahon, April, Paul Heggarty, Robert McMahon & Natalia Slaska (in press) Swadesh sublists and the benefits of borrowing: an Andean case study. in: McMahon, April (ed.): Quantitative Methods in Language Comparison. Transactions of the Philological Society, 103.2
Lohr, Marisa (1999) Methods for the Genetic Classification of Languages. Unpublished PhD thesis,
Starostin, Sergei A. (1991) Altaiskaia problema i proiskhozhdenie iaponskogo iazyka. Nauka, Glavnaia redaktsiia vostochnoi literatury:
Swadesh, Morris (1952) Lexico-statistical dating of prehistoric ethnic contacts: With special reference to North American Indians and Eskimos. in: Proceedings of the American Philosophical Society - 96: 452-463