Ever since I started working in computational linguistics, I have been interested in constructed languages (‘conlangs’). When you try to process any natural language with computer programs, you constantly run into inconvenient exceptions in morphology, syntax, and so on. A conlang such as Esperanto promises a great simplification: as it is completely regular, NLP software for it is likely to be much simpler. However, given that Esperanto is a full-scale language, it’s still not trivial to work with. It’s got a large vocabulary, and the syntax is not that easy to parse.
In a post on the Esperanto Language Stack Exchange I then came across another conlang, Toki Pona. This is billed as a minimalist language: it has only about 120 words and a very fixed, simple sentence structure. Unlike Esperanto, it is not really a proper language for everyday use, but more of a philosophical experiment. How does your language influence the way you think about the world? This is related to the Sapir-Whorf Hypothesis. When you have to describe or paraphrase everything with a limited vocabulary, you need to reflect more on what it is you’re talking about. For example, a friend is a jan pona (‘good person’), and a bad person is a jan ike. So how do you say ‘bad friend’? You are not able to say someone is both good and bad at the same time. So a friend cannot be a bad person, or vice versa.
Toki Pona is arguably only a toy language. There is just one word for fruit, vegetable, etc.: kili. If you want to say banana, you say kili jelo (“yellow fruit”). If you want to talk about lemons as well, you’re out of luck. It’s very context-dependent, and definitely not useful for a scientific treatise, or even recipes. But it’s fine for basic stories, myths and legends, and so on. And, most importantly, it’s easy to learn. No morphology. Very limited syntax. Small vocabulary. The main difficulty is expressing yourself with those limited means. But we grow only when challenged!
I’m interested in doing NLP with Toki Pona, as it is so limited. It should be possible to quickly get to the semantic or pragmatic levels, as morphology and syntax will be dealt with easily. Analysing and generating sentences should be extremely easy. More on that as it materialises.
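To give a rough idea of how little machinery the syntax might need, here is a minimal Python sketch. It rests on assumptions that go beyond what is described above: that the particle li introduces the predicate, that e introduces the direct object, and that li is dropped when the subject is just mi or sina. The function name split_sentence and the word moku (‘eat’) are only there for illustration, and real sentences with several predicates, prepositions or questions are not handled.

```python
def split_sentence(sentence):
    """Split a simple Toki Pona sentence into subject, predicate and object.

    Assumes at most one 'li' and one 'e' phrase; anything more elaborate
    is outside the scope of this sketch.
    """
    words = sentence.lower().strip(" .!?").split()

    # Assumption: a bare 'mi' or 'sina' subject takes no 'li',
    # so we insert one to normalise the sentence.
    if words and words[0] in ("mi", "sina") and "li" not in words:
        words.insert(1, "li")

    parts = {"subject": [], "predicate": [], "object": []}
    slot = "subject"
    for word in words:
        if word == "li":        # particle: the predicate starts here
            slot = "predicate"
        elif word == "e":       # particle: the direct object starts here
            slot = "object"
        else:
            parts[slot].append(word)
    return parts


if __name__ == "__main__":
    print(split_sentence("jan pona li moku e kili jelo."))
    print(split_sentence("mi moku e kili."))
```

As there is no morphology, splitting on whitespace is all the tokenisation this needs; the two particles do the rest of the work.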
If you’re interested in languages, and want to explore the way you express meaning, give Toki Pona a try. There are various online resources available, plus a textbook, Toki Pona – The Language of Good.
I’ll post more on this at a later time…