Everything2

Swadesh list

created by Mercuryblues

(idea) by Gritchka, Sun Jul 04 2004 at 16:33:31

In the early 1950s the linguist Morris Swadesh proposed to study the dating of language change by using a list of one hundred basic concepts, assumed to exist in all languages, and comparing how many items were different between related languages.

The approach he founded is called glottochronology, or sometimes lexicostatistics. It makes two key assumptions. The first is that these basic terms (earth, man, egg, sit, bite, I, this) are likely to be more conservative, and less likely to be replaced by borrowings or meaning shifts than other, more cultural items would be. On this view the Swadesh 100 list tracks the oldest and most stable core of a language.

The second assumption is that replacement of these items happens at some fairly identifiable constant rate, so that the amount of difference between two languages translates directly into the time since they parted from their common ancestor. Swadesh calibrated his list on the best established dates then known, basically those of Indo-European languages, and came up with a figure of 14% replacement per millennium. This also means that language relationships can be traced no further back than about 7000 years, since by then the languages resemble each other no more than random noise would, even if they are in fact related.
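The arithmetic behind this calibration can be sketched in a few lines. If each language independently keeps a fraction r of the list per millennium (r = 0.86 for the 14% replacement figure above), two sister languages are expected to share about c = r^(2t) of the items after t millennia, so t = ln c / (2 ln r). This is the usual textbook form of the glottochronological formula; the function and constant names below are made up for illustration.

```python
import math

# 14% replacement per millennium means 86% retention per millennium.
RETENTION_PER_MILLENNIUM = 0.86


def expected_shared_fraction(millennia, r=RETENTION_PER_MILLENNIUM):
    """Fraction of list items two sister languages still share after `millennia`."""
    # Both languages lose items independently, hence the factor of 2.
    return r ** (2 * millennia)


def divergence_time(shared_fraction, r=RETENTION_PER_MILLENNIUM):
    """Millennia since separation implied by a shared fraction (constant-rate model)."""
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in (0, 1]")
    return math.log(shared_fraction) / (2 * math.log(r))


if __name__ == "__main__":
    for t in (1, 3, 7):
        print(t, "millennia:", round(expected_shared_fraction(t), 3))
```

Under these assumptions two languages share only about 12% of the list after seven millennia, which is roughly where the signal becomes hard to tell apart from chance resemblance.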

Unfortunately this second assumption is wrong: languages change at variable rates. They all change continuously, but word replacement need not proceed at the same rate as change in grammar or pronunciation. For example, Icelandic has changed in pronunciation at a normal rate over the last thousand years, but its Swadesh 100 list is almost identical over that period. By contrast, in the last thousand years English has changed greatly: the massive influx from Norman French affects about half the dictionary, though in the basic 100 list only a few words are from French (mountain, person, round).

The first assumption might be okay. Comparison of Swadesh lists probably does give us a rough measure of the degree of relatedness. We can see that English and German are more closely related to each other than either is to Danish, and that they are all equally distant from French or Russian, and unrelated to Arabic. It is only the calibration to specific time depths that has been discredited, and very few if any linguists still try to use it for dating these days.
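As a rough illustration of how such a comparison is counted, here is a minimal sketch. It assumes that each concept has already been coded by hand into cognate classes, and it uses three placeholder languages A, B and C with invented codes, not real data. The table and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical cognate-class codes: within one concept, languages carrying
# the same number use words judged to be cognate. All values are invented.
COGNATE_CLASSES = {
    "all":   {"A": 1, "B": 1, "C": 2},
    "bird":  {"A": 1, "B": 2, "C": 3},
    "dog":   {"A": 1, "B": 2, "C": 2},
    "two":   {"A": 1, "B": 1, "C": 1},
    "water": {"A": 1, "B": 1, "C": 2},
}


def shared_fraction(lang_x, lang_y, table=COGNATE_CLASSES):
    """Share of concepts for which the two languages use cognate words."""
    matches = sum(
        1 for classes in table.values() if classes[lang_x] == classes[lang_y]
    )
    return matches / len(table)


def pairwise_shared(table=COGNATE_CLASSES):
    """Shared fraction for every pair of languages in the table."""
    languages = sorted(next(iter(table.values())))
    return {
        (x, y): shared_fraction(x, y, table)
        for x, y in combinations(languages, 2)
    }


if __name__ == "__main__":
    for pair, fraction in pairwise_shared().items():
        print(pair, fraction)
```

In this toy table A and B share the most items, so they would be grouped together first. With a real list the hard part is the cognate coding itself, which the sketch simply takes as given.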

The European languages have been written down for a long time, and this disguises differences: in print English all and German alle are almost identical, but the main use of the Swadesh list would be on much less well-known languages, such as unwritten Amazonian ones, where we want to reconstruct their history. So we should compare English [O:l] and German [al@].

The list of 100 basic words:

all
ashes
bark
belly
big
bird
bite
black
blood
bone
breast
burn
claw
cloud
cold
come
die
dog
drink
dry
ear
earth
eat
egg
eye
fat
feather
fire
fish
fly
foot
full
give
good
green
hair
hand
head
hear
heart
horn
I
kill
knee
know
leaf
lie
liver
long
louse
man
many
meat
moon
mountain
mouth
name
neck
new
night
nose
not
one
person
rain
red
road
root
round
sand
say
see
seed
sit
skin
sleep
small
smoke
stand
star
stone
sun
swim
tail
that
this
tongue
tooth
tree
two
walk
warm
water
we
what
white
who
woman
yellow
you

The number 100 is arbitrary; there are also lists of 200 words or thereabouts, and these contain a few terms like 'rain' and 'snow' that won't be found everywhere. The principle of translation of the lists is to try to find a general term rather than a specific one (so 'woman' rather than 'wife'). It should also be noted that several of the English words are ambiguous, in the sense that only one of their senses is the one to be translated: 'child' means young human, not offspring; 'man' means male adult; 'skin' means human skin; 'know' is knowing a fact, not a person; 'you' is singular.

You also use the ordinary word, and don't hunt out less usual near-synonyms that you know to be cognate, because this would distort the list in favour of the known relationships. In the nature of things, we calibrate these lists on language groups (like Indo-European) where we have some independent idea of time depths; but we use them to try to extract time-depth information from groups we don't have a history for. So we want to avoid a bias towards cognates that have drifted apart.

A recent issue of Nature (vol. 426, 27 November 2003) contained a brief paper "Language Tree Divergence Times Support the Anatolian Theory of Indo-European Origin" by two biologists, Russell Gray and Quentin Atkinson, claiming to have used improved statistical techniques to extract usable dating from Indo-European comparisons. The reaction of the linguistic community was one of puzzlement and scepticism on the whole: the paper itself was not clear enough about the techniques to judge if they could really control for the random fuzziness and variable rates across the family.

For more on the Gray-Atkinson paper see http://itre.cis.upenn.edu/~myl/languagelog/archives/000208.html

