First Steps in Lojban

Lesson 31. Sound equals text — Lojban's audio-visual design

Sora

So wait — Sevan showed me earlier that you can type .o'imuXAGjisofyBAKnicuZVAtilePURdi (all one string, no spaces) and a Lojban parser can still tell what every word is?

Sevan

Yes. That's not a trick — it's a core design property called audio-visual isomorphism (AVi for short).

Sora

Iso-what-now?

Sevan

"Isomorphism" means "same shape." The idea is:

  • Any properly spoken Lojban utterance can be uniquely written down.
  • Any properly written Lojban text can be uniquely read aloud.

This is stricter than most languages. In Japanese, for example, a string of hiragana like くるまでまとう could parse as "waiting by car" or "waiting until [he] comes" — the boundary between words is ambiguous. Lojban forbids that.

How word shapes make boundaries unambiguous

Sevan

Recall the three word types (Lesson 3):

  • brivla: has a consonant cluster within the first five letters and ends in a vowel.
  • cmavo: has at most one consonant, and only as the very first letter; ends in a vowel.
  • cmevla: ends in a consonant.

These shapes were engineered so that once you read a stream of sounds (including where the stress falls), there is exactly one way to cut it into words. Every word boundary is recoverable from the sound alone.
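
The three shape tests above can be sketched in a few lines of Python. This is only an illustration, not a real morphology checker: the function name classify is hypothetical, and skipping y and the apostrophe when counting the first five letters is an assumption borrowed from the usual statement of the brivla rule, not something this lesson spells out. It sorts a word that has already been separated; it does not check that the word is valid Lojban.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")

def classify(word: str) -> str:
    """Sort one already-separated word into cmevla, brivla, or cmavo by shape."""
    w = word.strip(".").lower()          # dots are boundary markers, not letters
    if not w:
        raise ValueError("empty word")
    if w[-1] in CONSONANTS:              # cmevla: ends in a consonant
        return "cmevla"
    # Assumption: y, the apostrophe, and commas are skipped when counting letters.
    core = [ch for ch in w if ch not in "y',"][:5]
    for a, b in zip(core, core[1:]):     # brivla: two adjacent consonants early on
        if a in CONSONANTS and b in CONSONANTS:
            return "brivla"
    return "cmavo"                       # no cluster, ends in a vowel (or y)

for w in [".o'i", "mu", "xagji", "sofybakni", "cu", ".soran."]:
    print(w, "->", classify(w))
```

Run on the words above, the sketch labels xagji and sofybakni as brivla, .soran. as cmevla, and the rest as cmavo.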

Sora

That's why you can ditch spaces if you include stress marking?

Sevan

Exactly. Spaces are a courtesy to the reader, not a grammatical necessity. Lojban's design makes them optional — as long as stress is explicit, the string is still unambiguous.

(That said: please use spaces when writing to humans. Reserve the spaceless form for puzzles and Twitter character limits.)

Why dots exist

Sevan

Now you can fully understand why certain dots are required:

  1. Before vowel-initial words: .i, .abu, .iu, etc. Without the dot, a preceding final vowel might run into the opening vowel and create ambiguity about where one word ends and the next begins.
  2. After words ending in y: .y. is a cmavo; the trailing dot keeps it from being read as the start of the next word.
  3. Around cmevla: la .soran. has a dot at each end, clearly marking the name as a single unit separate from surrounding words.

All three rules are there for the same reason: to keep word boundaries clear.
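
As a rough sketch, the three dot rules can be written as one small Python function. The name add_dots is hypothetical, and treating y as a vowel for the "vowel-initial" test is an assumption made so that y comes out as .y. the way the lesson writes it. The function only places dots on a single word; it does no parsing.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")

def add_dots(word: str) -> str:
    """Add the boundary dots described by the three rules to one bare word."""
    w = word.strip(".")
    first, last = w[0].lower(), w[-1].lower()
    is_cmevla = last in CONSONANTS               # rule 3: names end in a consonant
    starts_with_vowel = first in "aeiouy"        # rule 1 (assumption: y counts here)
    prefix = "." if starts_with_vowel or is_cmevla else ""
    suffix = "." if is_cmevla or last == "y" else ""   # rules 3 and 2
    return prefix + w + suffix

print(" ".join(add_dots(w) for w in "la soran cu klama".split()))
# prints: la .soran. cu klama
```

Words that start with a consonant and end in an ordinary vowel, such as cu or klama, come back unchanged, since none of the three rules applies to them.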

Sora

So "omajinaï" (Koshon's word for the dots back in Lesson 2) was actually "word boundary markers"!

Sevan

Right. Koshon skipped the explanation to keep Lesson 2 simple. Now you know.

The bigger picture

Sevan

This design has a few practical consequences:

What AVi enables:

  • Parsers work cleanly. Because word boundaries are always recoverable, programs like ilmentufa (a camxes-based parser) can parse Lojban text exactly as a speaker would hear it.
  • Lojban creates new words carefully. Every new gismu and lujvo has to pass morphological checks — forbidden clusters, uniqueness, etc. — precisely to preserve this property.
  • Spoken Lojban matches written Lojban. There is no difference between "formal written" and "informal spoken" grammar the way there is in many natural languages.

Sora

So the word-shape rules I grumbled about learning aren't arbitrary — they're load-bearing architecture.

Sevan

Exactly. Lojban is less like a collection of rules-for-rules'-sake and more like an engineered system where each constraint earns its keep. Take out one rule and something else breaks.

Sora

That's actually… kind of cool? It's more like a proof than a language.

Sevan

Some people say exactly that. Whether that's a feature or a bug is left as an exercise for the learner.

True or false

Pick whether each statement is true or false according to the lesson.

  1. Lojban's audio-visual isomorphism means every properly spoken utterance can be uniquely written, and every properly written text can be uniquely read aloud.

  2. Spaces between Lojban words are always grammatically required.

  3. Lojban places a dot before vowel-initial words partly to keep word boundaries unambiguous.

  4. A cmavo ending in y (like .y.) needs a dot after it for the same word-boundary reason.