Voynich: natural language or not? (1)

I've had many encouraging comments about my paper on the Voynich script, but also some puzzling ones.

Some people insist that the analysis “couldn’t be possible”, so they don’t bother to consider it, because the script “couldn’t represent a 1:1 mapping” onto real words, or the signs “couldn’t represent a 1:1 mapping” onto letters. Therefore they are sure that the script must be some kind of complex code.

As a linguist, I am perplexed by this, so I thought I'd explore it in this posting. To start with, this position hides a lot of assumptions. Firstly, it seems to assume that 'normal' scripts do or should have a 1:1 mapping of letters to sounds. Well, English doesn't for a start, as the word "thorough" ('TH U R A', roughly) can demonstrate – 4 sounds, 8 letters. The Arabic verb meaning 's/he wrote' is written with three Arabic signs (K T B), though it is read with vowels which the reader must supply from memory (roughly 'K A T A B A' ) – i.e. 6 sounds, 3 letters. No sign of a 1:1 mapping there then.
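To make the arithmetic concrete, here is a small Python sketch that simply counts written signs against sounds for the two examples above. The sound segmentations are the rough ones I gave in the text, not a rigorous phonemic analysis, and the names `EXAMPLES` and `mapping_summary` are made up for illustration. Note that equal counts would only be a necessary condition for a 1:1 mapping, not proof of one.

```python
# Rough illustration only: the sound segmentations below are the approximate
# ones given in the text, not a rigorous phonemic analysis.
EXAMPLES = {
    "thorough": {"signs": list("thorough"), "sounds": ["TH", "U", "R", "A"]},
    "kataba": {"signs": ["K", "T", "B"], "sounds": ["K", "A", "T", "A", "B", "A"]},
}


def mapping_summary(signs, sounds):
    """Return (number of signs, number of sounds, whether the counts match)."""
    return len(signs), len(sounds), len(signs) == len(sounds)


for word, parts in EXAMPLES.items():
    n_signs, n_sounds, counts_match = mapping_summary(parts["signs"], parts["sounds"])
    print(f"{word}: {n_signs} signs, {n_sounds} sounds, counts match = {counts_match}")
```

In both cases the counts differ, in opposite directions: English uses more signs than sounds here, while Arabic uses fewer.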

In terms of words: Arabic 'ana fi'lbeet', written as 2 'words' in Arabic script, is semantically 4 words: 'ana fi al beet' = "I am in the house". So in terms of 'script words' to 'semantic words', no 1:1 mapping there either.

A thoughtful account of the argument concerning 1:1 mapping and the Voynich manuscript has been offered by Elmar Vogt (http://voynichthoughts.wordpress.com/) as follows:

"Any "brute force" attack [by computer] would presume that the ciphertext words of the VM are mapped 1:1 from the plaintext words.

  1. The ciphertext alphabet seems to consist of around 17 frequent letters, plus a large number of rare "weirdos" in these words. That maps poorly to a Latin alphabet.
  2. Some frequent letter groups show up almost exclusively word-initial ("qo") or word-terminal ("dy"). That's unknown for any Central European language.
  3. Word-length distribution is odd: There is a shortage of both very short and very long words; words have a comparatively uniform length — Again, this is unusual for Central European languages.
  4. Overall, the words exhibit a very regular structure — check out Stolfi’s “Core-Mantle-Crust” paradigm. (Yes, it’s a tough read, but worth working it through if you want to understand the VM.) they are composed by a fairly rigid “grammar”, the like of it is unknown for European languages.
  5. Nobody has been able to identify particles and articles (“a”, “and”, “with”…) in the VM.

All of these differences between natural languages and the VM make it highly unlikely that the enciphering mechanism simply always turned plaintext word “A” into ciphertext word “X”, and “B” into “Y”. I’m convinced that one VM word is not equivalent to a plaintext word, but rather that it only represents a few letters.”

It is interesting that Elmar mentions European languages four times, then says “All of these differences between natural languages and the VM…”. I have two issues with this:

a) There’s a big confusion here between language and script. In theory just about any language could be written in any script – the two terms are not synonymous. So it is wrong to mix them up, making a point about a script and then using it to make a deduction about a language.

b) We need to consider non-European scripts and languages. It is perfectly possible that the underlying language of the VM script is non-European. And a non-European script – for example an ‘abjad’ like Arabic, or a script which in other ways does not represent all vowel sounds – could well explain most of the points which Elmar mentions.

As for his final point: “I’m convinced that one VM word is not equivalent to a plaintext word, but rather that it only represents a few letters”, well, in an abjad like Arabic, with vowels omitted, that is exactly what we get – ‘a word represented by only a few letters’ because the reader must fill in the missing vowels herself.
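A toy Python sketch may help to show what vowel omission does to a written word. It is a deliberate simplification: it works on a Latin-letter romanisation, treats only a, e, i, o, u as vowels, and ignores the fact that real abjads such as Arabic do write some (long) vowels. The function name `abjad_skeleton` is invented for this illustration.

```python
# Hypothetical helper: mimics an abjad by dropping vowel letters from a
# romanised word. This is a simplification; real abjads are more nuanced.
VOWELS = set("aeiou")


def abjad_skeleton(romanised):
    """Return the consonant skeleton of a romanised word, in capitals."""
    return "".join(ch for ch in romanised.lower() if ch not in VOWELS).upper()


print(abjad_skeleton("kataba"))  # 's/he wrote'
print(abjad_skeleton("kutiba"))  # 'it was written' (passive of the same verb)
```

Both forms reduce to the same three consonants, K T B, which is why the reader has to supply the vowels from context and memory.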

Why do we Voynich people torture ourselves and insist on complicating things, when what we need to do is work out the script step-by-step, NOT closing off options by assuming that it must be European, or that it cannot be a 1:1 matching?

In fact in my paper I conclude that the words I have identified also do not represent a plain 1:1 match. For example the word for the plant (genus) Centaury is identified as KNTAIRN, with no written vowel after the first consonant, nor before the last consonant. No 1:1 match there of sign to sound, but why on earth should there be?

More to follow on this methinks…


  1. Peter

    Writing experiment

    Write this character ten times quickly, without thinking and with your eyes closed, on a piece of paper.

    What do you get ?

  2. Peter

    On one occasion I show my work to get the signature (code) on the track.
    Here’s a little insight.


  3. Hatice

    Hello Stephen! Really good work, congratulations! I am sure you have considered it already (I might have missed it), but will ask away anyway: Have you looked at the most frequent words in the botanical pages and tried to match them with words such as “plant, flower, leaf, stem, root, scent, water, sun/sunlight, earth/soil, etc.” I mean, the words you would expect to be mentioned while writing about plants.

    • Stephen Bax

      I believe that has been tried, but the problem is that we have no idea which Voynich words might match the words we choose, nor which language they might be in. So it is impossible even to make a start with a strategy like that, sorry to say!

      • Peter

        Perhaps we should not search for keywords in the plant part, but for common ground. Wherever there are berries, we should look for a shared word (berry).
        Or ‘cordate’ for leaves, ‘cup-shaped’ for flowers, etc.

  4. V. H.M.

    P.S. Noticed I wrote bookman and him… What about a woman? 😉

  5. V. H.M.

    When I first stumbled over the existence of the V.M. it instantly reminded me of the Heidelberg Prinzhorn Sammlung, which is a collection of “art brut” by mentally ill patients (mostly diagnosed as “schizophrenic” at the time).
    See some example pictures that are similar in the widest sense.
    Many of the pieces in this collection have in common the sometimes repetitive, often pedantic, meticulous drawings, texts, plans, lists, systems or calendars. The patients usually were locked up in prison-like cells and used what they had at hand to express themselves in their illness. Amazing, and touching, is the jacket of Agnes Richter, which is stitched over and over on the inside with seemingly nonsensical and contradictory texts, obviously related to her life, and in the quite complicated German Sütterlin handwriting: http://prinzhorn.ukl-hd.de/index.php?id=50&L=1
    Thus I always imagined the V.M. could have been created by a locked up, imprisoned or hermit living wealthy or noble person, perhaps mentally ill as mentioned above or with some sort of Asperger’s syndrome and yet with ample access to expensive parchment and colors at the time and with the freedom to create perhaps even a “nonsense” or secret piece we have here as the manuscript.
    The so called Cosmologic section reminds me, being a biologist, of graphically simplified drawings of plant stem sections under the microscope. Not likely invented at the time, but who knows what this bookman – or call him a geek – else invented in his officina?

  6. Tricia

    I believe the Voynich Manuscript may well be recording an Ancient
    Technology (too complicated to go into here) – the Flower shows
    red and Blue (Positive and Negative), the two Fish represent
    Electromagnetism travelling in both directions at the same time, and
    the name at the top of the Page is very similar to Khufu (whose name
    although translated, may also represent this Technology). I hope
    this helps.

  7. I appreciate your work a lot. The world is not Boolean, built on “natural languages vs. hoax.” Languages are cultural constructions, and I think you are on the right track. Please consider languages planned for esoteric purposes such as Balaibalan, discovered by my academic grandfather, the interlinguist Alessandro Bausani. Perhaps this domain-specific language was in part planned in order to preserve secrecy among adepts. Feel free to contact me if you want more details on it. Best, Federico

    • Stephen Bax

      Thanks very much. I’ll look into it.

      Best wishes

  8. Paul

    If there’s anything to it…

    I’m going to guess that it’s a Uralic language from along one of the northerly Silk Road trade routes. (I don’t know them, but the patterning looks similar when comparing to language examples found on the internet, even if the letters used are different.) It makes sense the way you worked it out with some parts of letter forms acting as inflections and diacritics, which explains the repetitive patterns in most letter shapes. The language being from a region along a trade route would also explain the Middle Eastern, Persian, or Indian loan words which you appear to be finding. Then on top of that you have slight variations in most words for plurals, possessives, tenses, etc., which adds another layer of variation to what are essentially the same words.

    So with a lack of familiarity and a language that might be dead or evolved with a changed alphabet (if a variation is still in use, it’s probably written in Cyrillic now), it would make it seem fairly cryptic by the time historians get to it. Particularly if there’s something like 27 ways to say “dog” based on who it belonged to, when it belonged to them, whether or not people liked the dog, etc.

    As for why the book ended up in Western Europe? It’s probably a copy of some medical writings brought in during a plague outbreak. Obviously people then would have been desperately looking for anything that might offer some kind of relief or cure.

  9. Judy Eschmann

    Loved the video and found it very interesting. You, sir, are amazing; good work. Looking at the manuscript and also http://www.edithsherwood.com it is possible that the manuscript could be a sort of fertility guide. A lot of the drawings of the women seem to have a look of pregnancy. They also appear to be bathing/showering in what appear to be the fluids from some of these plants/flowers. A lot of these herbs/plants are used, for good and bad, in natural fertility treatments. This would also explain what may be recipes and the astrological charts.
    Just a thought.

  10. Dolores J. Nurss

    I’d like to propose an idea which might have nothing to it at all, but it can’t hurt to suggest it. Could this be a trade language, used among merchants who know each other, who have learned a number of languages in their travels and privately speak among themselves in bits and pieces of them all? The book might then be a medical text for taking care of their ills while on the road, using the herbs that would come to hand, in the wild spaces between towns.

    Just a thought.

  11. WDCRutherford

    I’ve just finished watching your video, and reading through the article which somehow made its way onto my facebook page. I’m an Arabic/Urdu/Hindi linguist with a tiny bit of experience in Latin, and I found a couple of possible trends based on your work/suggestions, and am in the process of building a few hypotheses on top of these. I’ve been interested in the Manuscript for a few years now, and it’s great, as well as really helpful, to see that someone is appearing to make headway into it.

    I just want to say congratulations on what I’m sure was an electrifying find, and I hope to be able to contribute to the good fight in the near future.

    Alif shukur,

    • Stephen Bax

      Shukran aleyk, ya William!

  12. Hi,
    I just watched your video – at 26:52, Hellebore = hunyor (Hungarian; see Google Translate and vocalize it). Is it closer to “KAaur”?
    Now I’ll watch more of the video 🙂
    and congratulations!!
    ps: the website I recommend is a blog, konteo = konspirations theorien; a lot of people there are cheering you on 🙂

    • Stephen Bax

      Thanks Ildiko… keep working on the Hungarian connection!

      • Mia Mai

        The Hungarian language contains numerous Turkish derivatives, both from ancient Turkish (3rd-5th century) and from the Middle Ages (12th-16th century A.D.). It could be helpful in comparative research with Arabic.
        Congratulations on your research and good luck further on!

  13. Josh Humble

    Good Evening,
    Is it possible the text was written in a type of shorthand? If so, would that account for some of the translation difficulties?

    -Josh Humble

    • Stephen Bax

      Yes, some have thought of that, but there are no obvious easy parallels in the 15th century. But it is still worth considering.

  14. Eric

    I am excited to learn of any breakthrough with Voynich. I’ve been thinking about that text myself, and indeed it occurred to me some time ago as well that one important key in deciphering it would lie in the names of the houses of the zodiac depicted on several pages of it. I know that Pisces is in there once or twice, among several others, and once some correlation might be discovered between known Medieval names for these and the Voynich script, the remaining text is bound to become clear enough for its language to become apparent, making the rest of the decoding likely quite easy provided that the language is still recorded to some extent elsewhere or has a surviving relative. One thing that has struck me about Voynich’s script is that it seems to bear some resemblance to written runes. I would therefore guess that it was in fact derived from a runic script.

    • Stephen Bax

      Thanks Eric – one problem is that although there are ‘names’ beside the zodiac symbols, they are not in the Voynich script but apparently in a script/language reminiscent of French!

      But many theories are still in play, of course!

  15. John Stroud

    I think you are on the right track. It just shows that logical and critical thinking can go a long way. Sometimes it may take 600 years, as in this case. It proves that it isn’t always over even when the fat lady gets up to sing. 🙂

    • Stephen Bax

      Thanks – I’m not that fat actually 🙂

  16. Juan Carlos Ferrández


    Sorry for my English! I’m learning it now! 🙂 I’m from a little city in Spain.

    I have read this (http://www.beds.ac.uk/news/2014/february/600-year-old-mystery-manuscript-decoded-by-university-of-bedfordshire-professor) five minutes ago, and I want to congratulate you on your findings.

    I have been passionate about the manuscript for a long time, and your results are incredible. I know that it is hard work, but it seems so easy!! You are a genius!!

    I will follow all your work from now on. Thank you for your work! I’m excited hehehe 🙂

    Juan Carlos.

