URO: A new language for Europe

How are new words made?

Raymond Meester
7 min read · Jan 4, 2023

The vocabulary of Esperanto is mostly derived from Latin and German. Interlingua already draws on more languages: primarily English, French, Italian, Spanish, and Portuguese, and secondarily German and Russian. In Uro we use even more languages, including the Slavic languages.

In this last chapter, we define the rules to create new words in Uro. Which European languages does Uro use as ‘control languages’? Think for example of the word “mobile phone”. This is a relatively new word. Still, it’s interesting how differently it is rendered in various languages:

  • Mobile phone (English — UK)
  • Mobiel (Dutch)
  • Handy (German)
  • Móvil (Spanish)
  • Portable (French)
  • Cellulare (Italian)
  • Komórka (Polish)

I’m not denying that it is a beautiful thing that every language is different and that each one has found its own name for the device in our pockets. My question is: can we agree on one international variant? In Esperanto the word is poŝtelefono (literally “pocket telephone”), but this makes little sense, since it resembles none of the words above. In this case “mobil” would be much more sensible, as it is widely understood and can be derived from English, Spanish, Dutch as well as most Scandinavian and Balkan languages.

The rules

The word “mobil” was derived by comparing the word for the device across various European languages. There are different ways to decide which languages to take into account when constructing words.

Examples are:

  1. Take all European languages (EU and outside) into account. This would include regional languages like Catalan or Frisian. There are about 200 spoken languages in Europe.
  2. Take only the official European Union languages into account. Currently, there are 24 (https://en.wikipedia.org/wiki/Languages_of_the_European_Union).
  3. Use the 10 most spoken languages in Europe. These are (in order) Russian, German, French, English, Turkish, Italian, Spanish, Ukrainian, Polish and Dutch.
  4. Use the 6 EU languages with the most native speakers. These are German, English, French, Italian, Spanish and Polish. Together they account for 70% of the native speakers in the Union.
  5. Use the top 3 secondary languages (English 43%, German 16% and French 16%).
  6. Take English as the base language.

There are many more approaches. In Uro we don’t have a single or primary language that we borrow words from; all European languages can be used as control languages. The basic rules for word derivation are:

1. International words

If there is an internationally recognized word, then this word is preferred.

2. If there is no such word, find out whether there is a common root between the languages.

3. If a common root between the languages can’t be found, then give weight to each language by how much it’s spoken as a secondary language.

4. If it is not possible to find an international word or a common root, then choose English.
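
The four rules can be read as a fallback chain. The sketch below is one possible way to write that chain down in Python; it is not part of Uro itself. The function names, the language codes, and the three-letter test for a “common root” are assumptions made for illustration. Only the weights (English 43%, German 16%, French 16%) come from the figures quoted above.

```python
from collections import Counter, defaultdict

# Secondary-language shares quoted in the article; the codes are illustrative.
SECONDARY_WEIGHT = {"en": 0.43, "de": 0.16, "fr": 0.16}


def common_prefix(words):
    """Return the longest prefix shared by all words in a non-empty list."""
    prefix = words[0]
    for word in words[1:]:
        while not word.startswith(prefix):
            prefix = prefix[:-1]
    return prefix


def derive_word(translations, international=None, weights=SECONDARY_WEIGHT):
    """Return (word, rule_number) for a dict of {language_code: word}.

    `translations` must be non-empty unless `international` is given.
    """
    # Rule 1: an internationally recognized word wins outright.
    if international:
        return international, 1

    words = {lang: word.lower() for lang, word in translations.items()}

    # Rule 2 (assumed heuristic): a root counts as "common" when more than
    # half of the languages share the same first three letters.
    starts = Counter(word[:3] for word in words.values())
    start, count = starts.most_common(1)[0]
    if count > len(words) / 2:
        group = [word for word in words.values() if word.startswith(start)]
        return common_prefix(group), 2

    # Rule 3: weighted vote, each language counting by its secondary share.
    votes = defaultdict(float)
    for lang, word in words.items():
        votes[word] += weights.get(lang, 0.0)
    best = max(votes, key=votes.get)
    if votes[best] > 0:
        return best, 3

    # Rule 4: fall back to English (None if no English word was supplied).
    return words.get("en"), 4
```

For the mobile-phone example, `derive_word({"en": "mobile", "sv": "mobil", "da": "mobil", "de": "Handy", "fr": "portable"})` returns `("mobil", 2)`: three of the five words start with “mob”, and their longest common prefix is “mobil”. A more serious version would also have to normalize accents (so that “móvil” can be compared with “mobil”) and look for roots that are not prefixes.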

Several websites let you compare words across languages, for example:

  1. https://www.indifferentlanguages.com/
  2. https://translated-into.com/comparison
  3. https://omegawiki.org

I used the first, because it lists European languages as a separate category. I also created a script that, for each character position in a word, finds the most common letter across the languages.

As discussed in the previous chapter on the design of the language, we will use two approaches to derive the words: a manual approach and one that’s AI-driven. The derived words for each approach are listed below. These lists can also be found on GitHub.
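
A minimal sketch of such a script, not the author’s original: it assumes that “most common letters” means a vote per character position, and the function name `most_common_letters` is made up for this illustration.

```python
from collections import Counter
from itertools import zip_longest


def most_common_letters(words):
    """Build a word from the most frequent letter at each character position.

    Shorter words are padded with "" so that every position can be compared.
    The result ends as soon as "no letter" is the most common entry.
    Ties go to the letter that was seen first.
    """
    lowered = [word.lower() for word in words]
    result = []
    for letters in zip_longest(*lowered, fillvalue=""):
        winner, _count = Counter(letters).most_common(1)[0]
        if winner == "":
            break
        result.append(winner)
    return "".join(result)


if __name__ == "__main__":
    # Words for "mobile phone" taken from the list earlier in the article.
    print(most_common_letters(["mobile", "mobiel", "handy", "portable"]))
```

Note that a purely positional vote treats accented letters such as “ó” as different from “o” and does not align words of different lengths, so its output is a starting point for a manual choice rather than a finished word.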

Manual approach

The manual word lists cover the following categories: basic words, international words, to be, to have, colors, months, days, family, and seasons.

The complete list can be found on GitHub.

If you want to add a list of words, just create a pull request.

Interim thoughts

The whole list is of course very subjective. It is my personal choice based on the basic rules I created. If you want to create a language, then it would probably be good to formalize various ways to derive words.

What I find interesting is not so much the formalization of word building, but how speakers of various European languages perceive those words, and how easily they recognize their meaning and learn to create sentences. In the end, you want something that feels natural to most people in all the different European countries.

Besides programming languages and NLP (Natural Language Processing) tools, there are other ways to create the vocabulary. You could start an open source project where people propose new words and other people rate them. After a certain period, you take the best-rated words and release a 1.0 version.
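
As a rough illustration of that last step, the sketch below picks one word per concept from a list of community ratings. Everything in it is hypothetical: the function name, the 1 to 5 rating scale, and the minimum number of votes are assumptions, not an existing project.

```python
from collections import defaultdict
from statistics import mean


def select_vocabulary(ratings, min_votes=3):
    """Pick the best-rated candidate word for every concept.

    `ratings` is an iterable of (concept, candidate_word, score) tuples.
    Candidates with fewer than `min_votes` ratings are ignored, so that a
    single enthusiastic vote cannot decide the outcome.
    """
    scores = defaultdict(list)
    for concept, candidate, score in ratings:
        scores[(concept, candidate)].append(score)

    best = {}  # concept -> (candidate, average score)
    for (concept, candidate), values in scores.items():
        if len(values) < min_votes:
            continue
        average = mean(values)
        if concept not in best or average > best[concept][1]:
            best[concept] = (candidate, average)

    return {concept: candidate for concept, (candidate, _avg) in best.items()}
```

Called with three ratings of 5, 4 and 5 for “mobil”, three ratings of 2, 3 and 2 for “handy”, and a single rating of 5 for “portable”, all for the concept “phone”, the function returns `{"phone": "mobil"}`: “portable” is skipped because it has too few votes.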

AI Approach

The next approach to derive words is to use AI. ChatGPT, for example, is already multilingual: you can ask it in Polish to write a poem in French, and then ask it in German to translate the poem into English. Here we ask ChatGPT (based on the language model GPT-3.5) to derive words for Uro.

Note that we limit ourselves to a number of languages: four languages from each major language group (Germanic, Romance and Slavic).

Start of the conversation with ChatGPT.

Based on these rules, I asked the chatbot to create a list for the same categories as we did with the manual approach and compare it to the control languages. Uro is added as the last column.

The AI-derived lists cover the following categories: numbers, basic words, international words, seasons, days of the week, months, colors, to be, to have, and family.

I also asked it to translate the Lord’s Prayer into Uro. The result, however, was mostly Esperanto.

Final note

It was fun to experiment with the various ways to create words and in general to construct a language for Europe. I am under no illusion that we will all speak Uro tomorrow, but I do agree with the Esperanto creator’s message that we should all be a little more tolerant of each other. Amen.
