
Uyghur’s myriad verb forms produce over 42 million words

In a previous post I wrote about the latest candidate for longest word in the Uyghur language. In that post we saw that agglutinative languages like Uyghur can produce extremely long words. Another feature of such languages is the sheer number of verb forms that can be produced. Mood, voice, aspect, tense, person and number are all conveyed by suffixes, which must be attached to the verb root in the appropriate order.

Recently Alim Ahat, founder of Uighursoft, wrote an article suggesting that Uyghur may have a store of some 50 million words when you include all the possible forms of every verb. Alim was working on a new edition of Uighursoft’s spell-checker when he took the opportunity to carry out some mathematical analysis of Uyghur verb inflections.

By his calculations, the total number of possible verb forms in modern literary Uyghur is 8,455. When he applied that to all the verbs available in his software, he came up with a store of an astonishing 42,613,200 synthetic words!

Alim says that no matter what form of what verb you write, his spell-checker will find it 99.99% of the time, and the total number of words and expressions stored in his software tool approaches 50 million.

In case you’re having trouble believing all this, he provides an example of what happens when you enter the verb root aptomatlash- (“to be automated”) into the software. With the press of a button it generates a list of 5040 different forms. Here are the first ten, and the 5040th:

1. aptomatlashmaq
2. aptomatlishish
3. aptomatliship
4. aptomatlashqan
5. aptomatlishidu
6. aptomatlashti
7. aptomatlishidighan
8. aptomatlishe
9. aptomatlishishqa
10. aptomatlishishi
5040. aptomatlashturulghudekla
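
As a quick sanity check on these figures (my own arithmetic, not part of Alim's article), the short Python sketch below divides the quoted total by the quoted number of verb forms. The variable names are mine, and the numbers are simply the ones quoted above.

```python
# Figures as quoted in this post; the variable names are my own.
total_words = 42_613_200   # synthetic words reported for the spell-checker
forms_quoted = 8_455       # quoted as the total number of possible verb forms
example_forms = 5_040      # forms generated for the root aptomatlash-

# If the total is a simple product, the division should leave no remainder.
quotient, remainder = divmod(total_words, forms_quoted)
print(quotient, remainder)  # prints: 5040 0
```

The total divides evenly, and the quotient is 5,040, the same number as the count of forms generated for aptomatlash-. So the total looks like a straight multiplication of 8,455 by 5,040. Which of those two numbers counts verb roots and which counts forms per root is not something the arithmetic alone can settle.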

Perhaps most astonishing of all is that as I look down the list I have yet to see a form that I do not recognize as a possible form of the word. In other words, a person who knows the language could (in theory) form and use any of these words. It seems the human brain is indeed “hard-wired” for language (Chomsky?) and has an amazing capacity to synthesize language according to the “rules”.

How long can one word be?

One of the fun little sidelines that come with learning an agglutinative language like Uyghur is competing to find the longest possible word. Things that need a whole sentence to say in English can be expressed in Uyghur with one long word.

I remember talking some years ago with a linguist who was studying the Kyrgyz language. Kyrgyz morphological rules are so utterly strict and consistent that he was able to devise a computer program to produce for him the longest possible word in Kyrgyz. (He never told me the actual word and I am afraid I do not remember how many syllables it had.) In any case, once he had the word, the next question was: Is this word a real part of the living Kyrgyz lexicon or is it purely hypothetical?

To find the answer, the linguist went to a remote part of Kyrgyzstan and sought out an elderly Kyrgyz man. He presented a scenario involving a certain person carrying out a certain action in a certain way in a certain context, and so on. He then asked how this man would sum up the situation in his own words. Sure enough, so the linguist claimed, this senior Kyrgyz spontaneously used the very same word the computer had come up with.

Uyghur has some features that would make it harder to program than Kyrgyz. Nevertheless, the search is always on for the longest word. Recently I came across a contender that was put forward with the offer of a prize to anyone who could beat it. The word consists of 47 letters and 18 syllables.

Here it is in kona yeziq and UKY:



I am not convinced that the construction “-mayliwat-” is valid.  If we reduce it to “-maywat-” then the word looks correct to me.  Even then, this word still weighs in at a respectable 17 syllables.

So, what does the word actually mean?

Here’s my stab at it:

I dunno, maybe it’s because you all are unable to bring them together.

What do you think?