Voynich phonetics – Derek Vogt

Derek Vogt has kindly provided an update of his scheme, so the one on this page is now outdated.

Please see his revised and updated June 2015 scheme here and add any comments or suggestions onto that page.


Here I am posting the Voynichese phonetic system which Derek Vogt has been working on, drawing on approaches used in my earlier paper which attempted to sketch out a few sound-symbol relationships in the Voynich script. Derek has been drawing on a few other resources:

For plant identifications from their pictures by other people:

And for names of the plants identified there & translations of some of the words in their names:

Any comments or suggestions welcome.



  1. Derek Vogt


    • Derek Vogt

      The purpose of working out the phonetic system is to find a relationship between Voynichese and a known language or family of languages which can be used as a model for translation (presuming it has any relatives that are known).

      One way that could happen is by comparison of their phoneme inventories. Closely related languages usually use similar sets of phonemes, and very different sets of phonemes are used by more distantly related or unrelated languages. Fortunately, Voynichese’s inventory seems to be rather unusual, so a good match would stand out from the crowd. Even if we never found a perfect match, at least this would be a way of narrowing down the list of candidates to the ones that come closest. Wikipedia’s articles on languages usually have lists/tables of phonemes, like this one, but not all, and some of the languages/families in the same geographic region as Voynichese don’t even have Wikipedia articles. So information on the phoneme inventories of some candidate languages/families would need to be found somewhere else.
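As a rough illustration of the inventory comparison described above, here is a minimal sketch that scores candidate languages by how much their phoneme sets overlap with a target set (Jaccard similarity). The function names, the candidate names, and all three inventories are invented placeholders for illustration. They are not real data for Voynichese or for any actual language, and set overlap is only one of many possible ways to measure a match.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of phonemes common to both inventories, from 0.0 to 1.0."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_candidates(target: set[str], candidates: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Return (name, score) pairs sorted from best match to worst."""
    scores = {name: jaccard(target, inv) for name, inv in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy inventories, invented purely for illustration (an assumption, not data).
target = {"p", "t", "k", "s", "r", "a", "o"}
candidates = {
    "candidate_A": {"p", "t", "k", "s", "r", "a", "o", "l"},
    "candidate_B": {"b", "d", "g", "z", "l", "i", "u"},
}

ranking = rank_candidates(target, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With real inventories in place of the toy ones, the same ranking would narrow the candidate list to the closest matches even if no perfect match exists.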

      Another way would be by matching some of the rest of the words in the text, the ones that form sentences in the author’s language, not just things that are likely to come from other languages like the plant & constellation names. This requires some sign of what some of the other words should mean first. For example, one of the plants in my list produces an oil that induces vomiting, so another word on its page that isn’t used on most other pages should be equivalent to “vomit”, and others that are somewhat more common would probably equate to “induce” and “oil/juice/extract”. The problem with this is that you can only try one language or language family at a time and there are a lot of candidates, many of which don’t appear in online translators.

      So again, the same issue applies: information on some of the candidate languages would need to be found somewhere else. We need a way to look up translations, or at least sound inventories, for lots of languages, even the obscure little ones for which such information is hard to get.

      • Darren Worley

        Derek – the recent flurry of proposed words of an Indian origin is fascinating.

        I’ve been thinking about how this could be reconciled with the (possible) European and Judeo-Christian influences observed in the VM.

        I have previously suggested a Yiddish influence; this seemed plausible given the suggested German-Latin text on f116v. The Yiddish language seems to have originated in Southern Germany and Central Europe.

        What surprised me was that Yiddish also contains words of an Indian origin. I don’t claim to be an expert on this topic, but it has been suggested that some of these early Central European Jewish settlers may have been descended from the Khazars. These were a semi-nomadic tribe of Turkic peoples living in Southern Russia and extending from Eastern Europe to Central Asia. It’s a fairly old and somewhat controversial theory.


        The linguistic example that I’ve read is the Yiddish word “goy” (plural: goyim) for gentiles (non-Jews), which is apparently cognate with “gora” for “white man” in Hindi and Urdu.

        It seems plausible that these Central Asian tribes that eventually settled in Eastern Europe could have introduced Indian words into Central Europe, and thus into Yiddish.

        What I find attractive about this idea is that the VM, which was first recorded in Prague, could be the product of local Jewish settlers from this region, possibly writing in an archaic tribal script.

        However, if there really is a Judeo-Christian influence in the VM, this is only one of many possible explanations to reconcile the Indian influences you’ve reported.

        There have been Jewish and Christian settlers in India since the 1st century CE, and perhaps as early as 500 BCE, and there are several Judeo-Indian languages too.

        eg. https://en.wikipedia.org/wiki/Jud%C3%A6o-Marathi

        Judeo-Marathi is spoken by the Bene Israel, a Jewish ethnic group that developed a unique identity in India and in Pakistan.

        Another possible scenario for the introduction of Indian-derived ideas into a European text would be from scholars visiting religious academies in Palestine. There are examples of Ashkenazi Jews from Southern Germany travelling to Palestine to attend religious schools during the medieval period, and presumably Jewish scholars from other distant lands would have travelled to these schools too, exchanging ideas and texts.

        • Darren Worley

          One other comment – I mentioned Judeo-Marathi previously, and there seemed to be one unusual feature I found in a Judeo-Marathi book that I’ve not seen elsewhere, but I have seen in the VM.

          I don’t think anyone has remarked on page f86v3. What seems unusual about this page is that the text is written in all 4 directions: horizontally across the top, down each side, and also along the bottom.

          Link to f86v3

          Presumably the VM was supposed to be rotated to be read, but this does seem to be quite unusual.

          Was this a common scribal tradition to write in all directions around a page? If so where else is it found?

          Compare that with an example from a Judeo-Marathi text here. The text is also written in all 4 directions. Is this a uniquely Jewish or Indian custom?

          The image is a page from a Haggada shel Pesah in Judaeo-Marathi which was printed in Mumbai in 1890.

          • Darren Worley

            The only vaguely similar example I can think of, for employing text in different directions within a single book, is the Mandaean tradition, found in the Ginza Rba whereby the book contains the Right Ginza on one side, and, when turned upside-down and back to front, contains the Left Ginza.


            • MarcoP

              Hello Darren,
              what about the Book of Curiosities, Diagram of the Encompassing Sphere [the heavens], f2b-3a?


              • Darren Worley

                Thanks Marco, but I’m not convinced that the diagram you’ve referred to has passages of text written in all four directions. There are possibly some (indistinct) words in the lower left running horizontally. Maybe it’s a label(?). What I had in mind was a diagram with blocks of text written in all 4 directions. I’ll keep looking...

        • Derek Vogt

          Most of the Voynichese words with Indian cognates don’t need to be of Indian origin. They could have come to both from a third source, or Voynichese could be the source from which India got some of them, or a mix of all of those possibilities in different cases could have happened. On top of that, if there’s a third source involved, it could be either Indo-Iranian or something else completely outside that family, as long as it’s in the geographic area. (Hopefully I will soon post a list of the languages or language families in that region.) I looked for cognates in Indian alphabets not to investigate an Indian influence on the book, but to simply collect cognates, just like before with the other alphabets.

          Sometimes words that are only found in a particular group can still illuminate past states of a language outside the group, because words can get replaced or shift meanings. For example, the word for “eagle” is “ørn” in Danish and Norwegian, and “örn” in Swedish and Icelandic, but nothing like that in English or German. So you could think that the “ørn/örn” root is purely Norse, not applicable to southern Germanic languages, and start trying to explain the specific Nordic input to the book. But the root is actually universal to Germanic languages and just got replaced in some; the English word for “eagle” was still “erne” in the fourteenth century. So a mysterious old book could have a word for “eagle” with modern cognates only in the Nordic languages, even if that book had actually gotten it from English/German, thus making those Norse cognates the only modern evidence of the book’s English/German connection.

          Another example, in which the original didn’t get completely dropped, would still create just as much trouble for an outsider using a translation dictionary like me. Other Germanic languages still use versions of the original Germanic word for “dog”, but Old English replaced it with “dog” (whose origin nobody knows), so the original shifted meanings and ended up as “hound” and “hunt” in our modern language, which won’t show up in a translation dictionary for an outsider looking for the word for “dog”. It would look as if our cognate for things like German “Hund” were just as absent as “erne” is. So again, a mysterious old book with a connection to English would only reveal that connection through cognates in languages that aren’t English. And because it happened so long ago, we couldn’t even blame France this time.

          It’s not just Germanic languages or just English; in fact, I ran into cases in other families just while checking on Voynich words. Another story just like those is why, for example, the Punjabi word for “wolf” (associated with star 45, constellation Lupus) isn’t like the others; its cognates mean “hyena” in Hindi and “tiger” in some other languages, but it shifted meaning in Punjabi and replaced the original word for “wolf”. Putting pieces together in a world where those kinds of things happen requires casting a wide net and avoiding over-interpreting the jumble of bits & pieces you pull up in it.

          So, let’s suppose for a moment that the Voynich language belongs somewhere in the Indo-Iranian family. I think both geographic and phonetic indicators point to its being more likely to come from the Iranian side of the family than the Aryan side. (This is neglecting a smaller subfamily or two and focusing on just the big two subfamilies.) Persian and Kurdish and such cover a larger and more western region, and, although their sound systems aren’t a great match for Voynichese’s, they’re not such bad mismatches as the Aryan ones. But the only member of that group that I can easily get translations for is modern standard Persian. So if & when anything funky happened to a Persian word such as with English “erne” and “hunt/hound”, the only cognates I could easily find would be in the only other Indo-Iranian languages I can use: the Aryan ones, even if they are rather out of place.

          And all of that ignores the possibility of words getting transferred between languages that aren’t even related but just live close enough to each other… meaning Voynichese could have gotten words from or given words to almost anything in the Indo-Iranian family or its neighbors anyway… which leads me to a subject inspired by Judeo-Marathi which will need to be its own separate post later…

          • Neticis

            Derek, Latvian seems to be an intermediate language not only in geography, but also for your cognates:
            en: eagle, lv: ērglis, no: ørn; en: dog, lv: suns, ge: Hund; en: wolf, lv: vilks…

      • D.N. O'Donovan

        Even if you decide not to consider some of the original workers in this area such as Fr. Petersen, or (much later) Edith Sherwood, or me – I think you really should consider the many years’ work done by Dana Scott, who was active in this field from – I think – the 1980s or ’90s, and put in a lot of very solid work.

        I understand that he agreed to hand over his id’s to someone called “SD” and I hope that “SD” did the decent thing and acknowledged when his identification was first. He first identified the Rose, for example, and I recall another of his was ‘hops’. But he did many years’ work, and within the assumed parameters (his were that the work was all-European), he did good hard work. I don’t agree with many, but that’s by the way. Do take his work into account – I’m sure Rene and Ellie won’t mind too much.

        • Derek Vogt

          The lists of identifications I’ve been using all along include Sherwood and Scott. I don’t know who “Fr Petersen” is, but if it’s Theodore Petersen (with “Fr” as a title of some kind rather than a first name), then he’s already in there, too.

          When I collected the suggested plant identifications, I didn’t keep track of which ones had come from which people. But I could still look that up, and I have thought of doing so to see whether any one source was more likely than others to have yielded my phonetic matches.

  2. Derek Vogt

    Here’s the updated version of the master list.

    Entirely new entries since the last version include plants 4v, 7r, 9r, 17v, 25v, and 43r, the blue text for 2v, and stars 16 and 21. One of those includes EVA-q, so that symbol has been added to the table of symbols and their sound values at the top. Plant 15r has been retracted.

    Some “old” entries have had just a few new cognates added, or had old cognates that were only available as transliterations added now in their native alphabets, including plants 5v, 6r, 14v, 16v, 20r, and 39r, 2v’s red text, and stars 7, 26, 37, 38, 45, and 53.

    Old notes 9 and 13, on the sounds /ɣ/ and /ʕ/, have been merged so that note 9 alone now addresses both sounds. A new note on the use of ^h^ after plosives has been added as #13, so later notes are not affected.

    • Derek Vogt

      I just spotted some errors in that. Some are minor: a misspelling that doesn’t confuse the meaning and a couple of dropped quotation marks… but there’s one I can’t let slide. Punjabi recently experienced a sound shift in which all voiced aspirated plosives became unvoiced plosives associated with a falling tone on the subsequent vowel. So the letters that once represented the former, and still do in sibling alphabets for other Aryan languages, now actually represent the latter, but are still shown as the former in the table I just put up. In one case, the change makes the word more similar to the way I read the Voynich word, and in another case, it makes it less similar, but both need to be fixed. Corrected table coming shortly…

  3. Maryam

    I am not sure what you think, but that guy knows a lot about the Arabic language, and in Arabic we write the same letter in different ways, so I am sure that he meant the same letter written in a different way.

  4. Maxim Burlakov

    Here is my short list of possible Sanskritisms in the manuscript:
    1). http://www.voynichese.com/#/exa:okor/1503.2 is probably ‘akara’ which means ‘tax-free’
    http://sanskritdictionary.com/?q=akara&lang=sans&iencoding=iast&action=Search
    ‘a’ is a Sanskrit prefix which means ‘without’.
    2). http://www.voynichese.com/#/f68v3/exa:chodo/0 is probably ‘dātā’ which means ‘charitable person, giver’
    http://sanskritdictionary.com/?q=d%C4%81t%C4%81&lang=sans&iencoding=iast&action=Search
    All these slashes: ‘i’, ‘ii’, ‘iii’ (in EVA) are in fact diacritics.
    I’m convinced that one slash (‘i’ in EVA) indicates the lengthening of a preceding vowel.
    3). http://www.voynichese.com/#/exa:char/0 is probably ‘dura’ which means ‘giver, granter’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=dura+&trans=Translate&direction=AU
    4). http://www.voynichese.com/#/exa:chair/0 is probably ‘dūra’ which means ‘distant, far; distance’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=dUra+&trans=Translate&direction=AU
    5). http://www.voynichese.com/#/exa:kor/0 is probably ‘kara’ which means ‘toll, tax, tribute’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=kara&trans=Translate&direction=AU
    6). http://www.voynichese.com/#/exa:ksor/106.4 is probably ‘kesara’ which means ‘saffron’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=kesara&trans=Translate&direction=AU
    7). http://www.voynichese.com/#/exa:kchor/0 is probably ‘kedāra’ which means ‘paddy-field’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=kedAra+&trans=Translate&direction=AU
    8). http://www.voynichese.com/#/exa:yor/0page is probably ‘nara’ which means ‘man, male, person’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=nara&trans=Translate&direction=AU
    9). http://www.voynichese.com/#/exa:far/0 is probably ‘pura’ which means ‘town, city’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=pura&trans=Translate&direction=AU
    10). http://www.voynichese.com/#/exa:sor/38.4 is probably ‘sara’ which means ‘going, moving’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=sAra+&trans=Translate&direction=AU
    11). http://www.voynichese.com/#/f69r/exa:soir/715.2 is probably ‘sāra’ which means ‘essence; chief ingredient’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=sAra+&trans=Translate&direction=AU
    12). http://www.voynichese.com/#/exa:sair/38.4 is probably ‘sūra’ which means ‘a wise man; sun’
    http://sanskritdictionary.com/?q=s%C5%ABra&iencoding=iast&lang=sans
    or ‘śūra’ which means ‘warrior, champion, brave man’
    http://sanskritdictionary.com/?q=%C5%9B%C5%ABra&lang=sans&iencoding=iast&action=Search
    13). http://www.voynichese.com/#/exa:dcho/38.4 is probably ‘tadā’ which means ‘at that time; then’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=tadA&trans=Translate&direction=AU
    14). http://www.voynichese.com/#/f99v/exa:dor/1183.2 is probably ‘tāra’ which means ‘star’
    http://spokensanskrit.de/index.php?script=HK&beginning=0+&tinput=+star&trans=Translate&direction=ES
    If the above is not sufficient evidence (or at least a strong possibility) of Sanskritisms in the manuscript, then I’m Chinese. (That’s a joke and not a hint!)

    P.S. I believe:
    1). Diane O’Donovan is right in assuming that the author of the VM was a native of (but only East) Asia who came to Europe (by sea?) and there wrote his cross-cultural enigma. He was a heathen, i.e. he didn’t adhere to Christianity, Judaism or Islam (at least for most of his life). Or, to be more precise, he was a person of the Hindu-Buddhist culture. An adventurer, but a learned one by the standards of his culture and time.
    2). The manuscript is NOT written in Sanskrit, but the author was fond of using words borrowed into his language from Sanskrit. (Maybe he was a high-caste brahmin, but I’m not sure.) Just as we Europeans resort to Latin and Ancient Greek words for expressing complex notions and ideas, he resorted to Sanskrit ones for the same purpose.

  5. Maxim Burlakov

    There is no name ‘Chiron’ in the VM at all! Just the transcription of (probably the Latin word) ‘Centaurea’ with unknown characters. I mean the next word after ‘Centaurea’ on page f2r (8th line) is NOT read like ‘Chiron’. And it doesn’t denote ‘Chiron the centaur’ either!
    But what is at least the rough reading of the first character of this enigmatic word? Someone’s granted you his brilliant insight about the name of a plant on page f16v (7th line), that it should be read ‘bedian’. It is natural to conclude that EVA characters “ch” stand for “d” and “c’h” for “ɖ”. See https://en.wikipedia.org/wiki/Voiced_retroflex_stop
    P.S. Btw as a Russian I’m used to reading “д” (cute little house, isn’t it?) as “d”.

    • Stephen Bax

      Thanks – this is interesting, but of course we would need a lot more evidence from other parts of the manuscript about your suggestions.

      Could you perhaps look for such evidence and report back? It would be an interesting addition to the discussion. Derek, anyone, any views?

    • Derek Vogt

      I agree that the word is not “Chiron”, for a different reason. That interpretation originally used the word’s last two or three symbols, EVA-iin, for the sounds /i/ and /r/, which mine does not, which is why it doesn’t appear in my tables on this page. Also, the second letter, EVA-a, gets in the way, and the lack of the apparent letter for /n/, EVA-y, which often appears at the ends of other Voynich words, is conspicuous.

      The plant’s identification and the interpretation of its name are not affected by whether the second word is “Chiron” or not. The “Chiron” reading was introduced as a possible explanation for the different endings for two versions of the word “kent…”, like our “centaur” and “centaurea”, but there are many other possible reasons for a root word to get different suffixes.

      I’m not going to argue about your sound assignments for these two letters now. Everything I could say about that is already in my master list here and its addendums on the letters ^h^ & ^x^ here and on various letters including the one I think accounts for the /dya/ in bedyan here. But I will throw in a few points of logic to use if you want to build a new phonetic system for the Voynich Manuscript independent of mine:

      1. If you’re already throwing out my way of reading “bedyan”, you might just as well consider that it isn’t necessarily “bedyan” at all. That’s an identification that a botanist came up with from the picture. That botanist didn’t write the book and could be wrong. Other botanists have thought it was Jacea nigra, Pimpinella anisum, or an unspecified member of the genus Eryngium or the family Fabaceae. I simply went with the one for which I could find a reasonable phonetic match using my phonetic system, which you would be ignoring & replacing anyway. Maybe your new system would find a better match with one of the plant’s other identifications.

      2. Letters that look similar sometimes sound similar but sometimes don’t. And when they do, there are various ways for sounds to be similar. For example, if you already have /d/ and are looking for something else similar to it, /ɖ/ is one option, but so are /g/, /ǧ/, /b/, /t/, /dʰ/, /dˤ/, /ᵈz/, /ᵈž/, /z/, /ž/, /n/, /L/, /r/, /y/, and a trill like Spanish [rr]. Narrowing down the list of options requires finding examples in words, which your connection from /d/ to /ɖ/ doesn’t have yet, even if your use of /d/ alone in “bedyan” is correct.

      • Derek Vogt

        (I left out /ð/…)

        • Maxim Burlakov

          First of all NOT letters! Characters, symbols, glyphs etc. but please, Derek, don’t call them letters! In studying any unknown script a very important issue is to ascertain to what writing system it belongs: logographic one, alphabet, abugida or abjad. And all the Voynich script researchers before the professor were under the delusion that they were dealing with an alphabet. Stephen Bax was the first who doubted it and surmised that the Voynich script was an abjad. As for me, after a few months of work I had noticed one amazing thing – the VM author could not imagine a stand-alone consonant, i.e. a consonant without any inherent vowel!
          Yes, I assert the Voynich script is a peculiar abugida, so “the consonant letters” are in fact SYLLABLES: a consonant + a vowel (very often ‘a’ but not always!). The best proof in itself of the author’s Oriental education or provenance.

          1. Let’s see. The name for ‘star anise’ is:
          in Persian, ‘badyan’;
          in Hindi, ‘badayan’;
          in French, ‘badiane’;
          in English, ‘bedian’.
          So why on Earth it was ‘bhajn’ for the VM author’s ears? How could the voiced consonant ‘d’ just vanish from the word? It’s impossible!
          If some plant looks like a bedian, smells like a bedian and even tastes like a bedian, then the plant probably is a bedian. 🙂 Or in our case probably ‘bedāy(a)n’.
          Derek, I like your erudition, youthful fervour and especially your desire to learn, so just admit ‘ch’ is ‘d(a)’ and let’s be done with it. Btw almost a year ago I began decoding the Voynich script after downloading the professor’s work and a neat simple chart of yours.
          As for some botanists coming up with their plants’ identifications for some reason I have a strong suspicion that most of the botanists are often simply blind. 🙂
          2. The problem is that NO ONE before Professor Stephen Bax could even roughly read a single word from the manuscript, let alone identify the underlying language. Hopefully I’ve managed to do a bit more: to determine (at least approximately) the geographic region in which to search for it. And thus, to use your expression, I have “narrowed down the list of options” for the sound assignments of the entire script (and not only that).
          I propose considering the presence of Sanskritisms in the VM’s text as a valid working hypothesis and accordingly basing all future sound assignments on the Sanskrit phonological system. See
          https://en.wikipedia.org/wiki/Sanskrit_grammar#Phonology
          Btw there is no “your” or “my” phonetic system of the VM script; there is only the phonetic system of the underlying language, so let’s attempt to reconstruct it.
          P.S. Derek, could you perhaps draw the new scheme of sound-character correspondences for us all?

          • Derek Vogt

            ‘badyan’… ‘badayan’… ‘badiane’… ‘bedian’… why on Earth it was ‘bhajn’ for the VM author’s ears? How could the voiced consonant ‘d’ just vanish from the word?

            This looks like a misunderstanding, which would be my fault, resulting from how I originally spelled the word, so let me clarify:

            You might be reading the letter [j] as it is used by many European languages and the International Phonetic Alphabet (IPA), which I believe is equivalent to Cyrillic [й]. In English, that sound is normally represented by [y], which would make “bhajn” equivalent to “bhayn”. But that is not the sound I have in mind; I originally used [j] in Voynichese words the same way we use it in English, which is not that sound.

            The sound I have in mind for that Voynichese letter is a voiced affricate, articulated in a place that has been called post-alveolar, palato-alveolar, and alveolo-palatal. It has no Cyrillic letter of its own, but would be a voiced counterpart for [ч], or like the result of running the sequence [дж] together very quickly (in the same way that [ч] could be compared with [тш]). Because that sound has evolved from the sound /g/ (Cyrillic [г]) more than once in unrelated languages, various languages using both the Latin alphabet and the Arabic alphabet often represent it with some version of the letter that originally only represented /g/ (ج in Arabic). Similarly, I also think of that Voynichese letter as a derivative of Voynichese letter ^g^, so I now avoid the problem with [j] by using ^ǧ^ for that sound in Voynichese words as well as ^j^: ^bhajn^=^bhaǧn^.

            Because that sound begins like a [d], it can also result from a /d/ merging with a /y,й/ that would follow immediately after it, like in English “educate”, where the letter [u] would have been pronounced /yu/, but the unwritten /y/ merges with the preceding /d/ into /ǧ/, so the word is pronounced “eǧucate/ejucate/edžucate” (roughly, turning from “эдйук-” or “эдюк-” to “эджук-”).

            So the way I spelled that word was not meant to seem as if I thought the /d/ had vanished. What it was meant to indicate was that a sequence like /dya,дйа,дя/ had become /ǧa,ja/, or at least was spelled like it in Voynichese because of a lack of a distinct letter for /d/ or /y,й/ or vowels that act like [ю] and [я].
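The merger described above can be written as a one-line rewrite rule. This is only a sketch: the function name `coalesce_dy` and the plain-ASCII spellings ("dy" for /dy/, "ǧ" for the affricate) are assumptions made for illustration, and the rule says nothing about the vowels or the ^h^ in the Voynichese spelling ^bhaǧn^.

```python
def coalesce_dy(word: str) -> str:
    """Merge every /d/ + /y/ sequence into the affricate written here as 'ǧ'."""
    return word.replace("dy", "ǧ")


# 'edyucate' spells out the unwritten /y/ of English "educate" (illustrative only).
for source in ("bedyan", "edyucate"):
    print(source, "->", coalesce_dy(source))
```

Words without the /dy/ sequence pass through unchanged, so the rule can be applied to a whole word list without special cases.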

  6. Derek Vogt

    I’ve been going through the latest version of my list of cognates, including both plants and stars & constellations, collecting examples that deviate in any way from the sounds I associate with each Voynichese letter, to see any general patterns/tendencies among those deviations, which could call for a change in my sound assignments. I’ve already posted the most complicated part of this, on the letters ^h^ and ^x^ and the range of sounds from /h/ to /x/.

    Now here is the rest, with no wordy explanations about them this time, just tables showing which letters get deviated the most and the most common types of deviation. I am sticking with the original sound assignments in each case, and you can see for yourselves what you think about that.

    Also, some little changes & corrections to the master list since the last time I commented on it: on my version at home, I’ve added the Arabic name for 11v in the Arabic alphabet (مازريون, “mazrywn”), fixed the misspelling of “languages” as “lanuages” in the line describing /ǧ/ near the top, and, most importantly, fixed the misidentification of the sounds /ǧ/ and /č/ near the top as “alveolar” instead of “post-alveolar”. Their alveolar counterparts would be /t/ and /d/, which have no connection to what I meant there. But I shall resist the urge to upload another new version of the image over so few changes.

  7. Hello, I notice that there is no character for an /l/ sound in the script, and that /m, n/ are represented by a single character (which may also have some kind of a vowel-like quality). I’m concerned that this is an unlikely arrangement.

    • Stephen Bax

      Hi – can I at the same time draw attention to your own site:


      which might be of interest to others?


      • Thank you. That is very kind of you.

        • Daniel White

          You have a very nice website, EMSmith! Thanks for posting it, Professor.

    • Derek Vogt

      With only about 20 symbols in regular use, whatever the real solution is, it must be something unlikely. Either the spoken language had an unusually small range of sounds, or the alphabet used an unusually large amount of compression and gimmickry to get multiple sounds onto few letters. No solutions that would seem likely are possible. All we can do is try to pick one unlikelihood over others.

      I attribute no more “vowel-like” quality to ^n,m^ than to any other consonant in an abjad (where vowels are often unwritten but some consonants may be more often associated with one vowel sound than with others). To save space in this post, I won’t write out a defense of the lack of a distinct /L/ consonant and the merging of /m/ and /n/ here, other than to say that my list of cognates shows most of what I would say if I did, and that whatever the cognates tell us must be right whether it seems odd to outsiders like us or not. But I will add what I think are two bigger, more challenging issues than those.

      Where foreign cognates have /r/, Voynichese seems to have at least five different written manifestations. Two are simply symbols that we can easily say represent /r/ on their own nearly every time they’re used and look like versions of a single letter. But in several cases, when foreign root words end with /r/ and their Voynichese cognate has a suffix attached, the Voynichese letter in its place is ^w^. And Voynichese ^oo^ and ^oa,oe^ repeatedly (perhaps even always) appear to correlate with sequences in foreign cognates involving not just vowels but /r/ plus vowels, and foreign /ar/ or /er/ often correlates with Voynichese ^a,e^ alone, as if the /r/ has vanished from those words or is sometimes implied. And that last one also applies to /L/, not just /r/ (even to the point that that letter is used in one case where cognates have only the /L/ and no vowel sound), but none of the others do. On top of that, there are even a few cases where the Voynichese letter ^r^ correlates with /i,y/ in other languages.

      There are several ways that “r” can manifest in different languages, or in the same language with multiple “r”-type sounds, or when getting transferred from one language to another. It could be a curling of the tip of the tongue, it could be a lifting of the tongue somewhere farther back, or it could be equivalent to other sounds including /d/, /L/, /w/, and /y/. And it can be lost from words that had it before, vanishing without a trace. So Voynichese’s five things in writing that correlate with /r/ in other languages could represent up to five different things (including nothing) in spoken Voynichese, with foreign /r/ meeting up to five different spoken fates when imported to Voynichese. That could be either a sign that it didn’t have /r/ at all, or that it had multiple sounds that could be equated with /r/ in another language.

      Even more perplexing than that is the lack of any sign of a letter for /i,y/. That seems like a pretty important sound for a language to have and for an alphabet to represent. In a few cases, the spot where that sound is found in foreign words has the Voynichese letter ^r^, and in a few other cases, it has ^a,e^. But there aren’t many of those, and more often than not, there’s just some other consonant or nothing at all. There’s no particular overall pattern, just a scatter of options. I can’t imagine a spoken language lacking this sound, but if one did, this is exactly what its alphabet and a sample of cognates with foreign words would look like. (There is an unused symbol, but if it’s /i/, then we need more explanations about why that sound would show up only in suffixes, and why it would have the only letter that often appears consecutively with itself or why there are still other letters that look like multiples of it and also only show up in suffixes.)

      I suspect /i/ could be the default vowel that’s always assumed unless another one is specified instead. There is precedent for something like that. In several alphabets of India, none of the consonants are really just consonants; they’re syllables consisting of the consonant sound and then an immediate /a/ by default. To indicate a syllable consisting of that same consonant sound followed by another vowel sound, you must add more vowel-specifying strokes to it. If you want to represent the consonant sound alone with no vowel sound after it, you must actually add marks to the letter in order to indicate the lack of an /a/. But there are two catches with this analogy. First, even those alphabets do have a separate letter for /a/ when it isn’t preceded by a consonant sound, so although they are examples of a universal default vowel, they aren’t examples of a universal default vowel without a letter at all. And second, they only illustrate /a/ as a default vowel, which is also the vowel sound that most often goes unwritten in Semitic languages and is the least-articulated vowel, so they do little or nothing to indicate the possibility of any other vowel sound as a default.
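To make the default-vowel mechanism concrete, here is a toy sketch in Python. It is not a model of any real Indian script or of Voynichese: the unit format, the `NO_VOWEL` marker, and the function name are all invented for this illustration. Each written unit is a consonant plus an optional vowel mark; no mark means the default vowel is read, and a special mark suppresses the vowel entirely.

```python
NO_VOWEL = "-"  # hypothetical stand-in for a vowel-cancelling mark


def read_units(units, default_vowel="a"):
    """Spell out a sequence of (consonant, mark) units.

    mark is None      -> the default vowel is read after the consonant
    mark is NO_VOWEL  -> the consonant is read alone
    any other mark    -> that vowel replaces the default
    """
    out = []
    for consonant, mark in units:
        if mark is None:
            out.append(consonant + default_vowel)
        elif mark == NO_VOWEL:
            out.append(consonant)
        else:
            out.append(consonant + mark)
    return "".join(out)


# A made-up three-unit word: bare consonant, consonant + "i", vowel cancelled.
word = [("k", None), ("t", "i"), ("b", NO_VOWEL)]
print(read_units(word))                     # prints katib
print(read_units(word, default_vowel="i"))  # prints kitib
```

The only thing that changes between the two readings is the `default_vowel` argument, which is the point of the analogy: the same written units could be read with /i/ filled in wherever no other vowel is marked.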

      • Darren Worley

        Hi Derek – have you considered that the VM might be written in some kind of shorthand? This could be an explanation for “compression” that you refer to. If the intended audience of the VM was small (perhaps the author alone), then I can imagine the omission of certain letters, or use of abbreviations, would be quite acceptable, whilst still retaining all relevant information for comprehension. I do this myself occasionally, it doesn’t need to be a “formal” shorthand system – just an informal ink-and-time saving method.

        I think it’s also quite plausible that the VM was copied by scribes, who might not have known what the underlying text meant. Perhaps the “underlying VM text” was in a script employing vowel points, and in the translation/copying process these have been omitted, or only partially re-instated?

        • Derek Vogt

          Well, if scribes made changes like that, I hope they didn’t get paid!

          Shorthand is like the idea of plain vowels versus nasalized vowels (like in French before a silent or nearly-silent [n]): I know of no indicator for it or against it, and no way to test whether it is accurate.

          • Darren Worley

            Derek – the VM does contain an example of careless or incomplete copying – there are the outlines of an unfinished T-O map on f86v3.

            I’ve come across a couple of examples where texts have been re-copied to the point of illegibility, so I think this is a genuine possibility in the case of the VM.

            The first is described by Drower in the introduction to the Mandaean “Book of the Zodiac” (p.2) where “copyists recopy ancient errors, with disaster to the clarity of the text, a not uncommon feature of ancient manuscripts”. There are frequent references in the footnotes of this book to corrupted manuscripts as a result of semi-literate copyists.

            The second example concerns the Kaifeng Jews (the Wikipedia entry is well worth a read). I think this could be an excellent example (archetype) of the kind of isolated community that could produce something like the VM.



            It seems that this isolated Jewish community lost the ability to read Hebrew (the last Rabbi died c. 1810), and it seems they developed their own unique pronunciation of Hebrew.

            Ref: “The Haggadah of the Kaifeng Jews of China” By Fook-Kong Wong, Dalia Yasharpour


            A passage (excerpted below) from the British Library page gave me an idea. I notice that there are several repairs in the VM (e.g. f37v). I wonder if these kinds of repairs (whether made with silk, thread or sinew) could give a clue to the location of the copying of the VM?

            Quote: The [Kaifeng Torah] scroll is made up of 94 strips of thick sheepskin sewn together with silk thread, rather than with the customary animal sinew. It has 239 columns of text copied in a Hebrew square script similar to that used by the Jews of Persia, without signs to show any vowel sounds. Of the 15 Torah scrolls that are said to have been held in the Kaifeng synagogue, only seven complete scrolls have survived.

          • Darren Worley

            An interesting article on a method of shorthand – used from 1st century BC through to the 16th century AD.


      • Darren Worley

        Derek – I find it quite puzzling that the VM script contains so few characters (only about 20 in regular use, as you mentioned).

        Is there a precedent for a written script having so few characters?

        I’ve seen that several early Persian scripts have few graphemes.

        For example:

        Psalter Pahlavi contained only 18 graphemes.

        Book Pahlavi contained only 12 or 13 graphemes, representing 24 sounds, and employed complicated ligatures.

        Do you consider these to be likely candidates (i.e related to Voynichese)?

        • Derek Vogt

          I don’t see graphical similarity between letters for the same or similar sounds there, and I don’t consider alphabet size a good basis for assessing how related two alphabets are. Letters get dropped/merged or added/split too easily and often for that.

  8. Derek Vogt

    Here are all of the words I have now according to this phonetic system, including star/constellation names alongside the original table of plant names. Footnotes 12-15 are new with the addition of the stars & constellations, but some also apply to some plant names and have been inserted there accordingly. Also, the original table of plant names has had a few updates in this version: it’s now in folio order, it includes English common names as well as scientific names, several changes in font faces & colors have been made just for slightly better presentation of the same original ideas, and more cognates/spellings (particularly in their original alphabets where only transliterations were available before) have been added for 3v, 4r, 5v, 6v, 8r, 15v, both items on 16v, 21r, and 38v.

    Aside from those little changes in how the ideas are presented, there are also a few changes to the ideas themselves:

    1. One more plant identification has been added, for folio 1v (first paragraph), as proposed here by Darren Worley.
    2. Alternative pronunciations have been added in the small tables at the top and in the specific words where they are found below, including {š:s}, {a:e}, and {n:m}. The latter two alternatives appear in clearly a minority of cases, but still multiple examples apiece.
    3. Error correction (and a change in spellings): For folio 1v, second paragraph, I had originally transliterated the proposed Arabic and Persian cognates of ^bagan^ as ending with [-gan], because I was used to equating the letter ج with its cousins, our [g] and Greek gamma. Although /g/ was the original pronunciation and is still used in some Arabic-speaking areas, it has evolved to the same sound as English [j] for most modern Arabic and Persian speakers, and is typically romanized as [j] for English-speaking audiences. Another common way to romanize this letter is [ǧ], to simultaneously acknowledge its origin, most common modern pronunciation, and minority pronunciation, all while also getting around the problem that only English uses [j] that way while other languages use some implementation of [g] (similar to what happened in Arabic and apparently also in Voynichese). I have corrected the error on folio 1v to [-ǧan] and now use this transliteration standard for any other words with ج. For the Voynichese letter I rather Englishly called ^j^ before, I believe ^ǧ^ is the more logically consistent representation (and even neatly parallels the use of [č] for its unvoiced counterpart), but I don’t want people thinking the sound /g/ is intended, so I still sometimes rewrite the same word using [j] just for clarification. (IPA uses combinations of two letters for affricates, but I’m avoiding turning one foreign letter into two Latin letters in the same word as much as I can.)
    4. Error correction (and a change in spellings): The Arabic letter ح was equated with [h] at my source for plant names, which is a common simplification (especially in settings where special characters cannot be produced) which I haven’t been aware of for very long and failed to catch and improve on at first. This made the word for “seed, bean, nut, grain”, حبة, come out as “h-b-ah” in our alphabet. But in Arabic, the letter for plain /h/ is ه, and ح is an “emphatic H”, pronounced with more constriction in the throat, for a sound represented in the IPA with a modified [h]: /ħ/ or /ʜ/. A second Arabic letter, خ, is constricted at a higher level, up in the top of the throat or back of the mouth: IPA /x/ or /χ/, normally romanized as [kh]. That renders those symbols unavailable for a distinct representation of ح, which is between that and a plain /h/. So, in my transliterations (the ones that appear in quotation marks right after foreign-alphabet spellings), ح is now rendered as [ħ], as in “ħ-b-ah”. (Any names I found in the Latin alphabet are unaltered from my sources’ spellings regardless of their origin.)
    5. On the subject of various sounds around /k/, /x/, and /h/ as I just mentioned above, footnote 5 below the tables of cognates is modified, and the subject will be pursued in more detail in its own separate post later.

    • Derek Vogt

      I’ve finally completed my recheck on two letters that seemed to fairly often be correlated with sounds a bit different from what I had thought (although recognizably close): ^h^ and ^x^.

      The bottom line is that I’ve concluded that the original assignments of /h/ to ^h^ and /x/ to ^x^ were valid, but with one additional rule needed in order to explain the exceptions: a sound shift series in spoken Voynichese, in which /k/ had sometimes become /x/, /x/ had sometimes become /h/, and the author(s) of the Voynich Manuscript spelled words according to their “new” sounds.

      The details of how I arrived at that are in the attached image.

      …We have two “Browse” buttons? The top one says “JPEG only” and the bottom one says it can also take other file types, so I’ll figure the bottom one is probably a non-functioning fluke and just try the top one first…

      • Derek Vogt

        Hmmm… it looks like I got so wrapped up in the k-x-h thing that I rushed the last paragraph on the remaining misfits and wrote it backward! (A sound appearing in Voynichese that’s absent from foreign cognates wasn’t dropped by Voynichese; it was either added by Voynichese or dropped by the others!) So here’s what I really meant on that bit at the end…

  9. Daniel White

    Using your phonetic scheme, I’ve managed to transliterate the possible pronunciation of the first paragraph of f1r:

    puhans nkush ur uguur xar jros n kar xashtn sarn chur arn kuir hguur xur uro jur jur tur snaur xokun ar nkaur xat jaurn jos turuur aaur agoon agoas rashag(a)n jur taur aguur ar akur tairn hour juur vur fuur nturuixn

  10. Derek Vogt


    a reminder on bracket symbols to start with:
    /letter/ = sound, regardless of how it’s spelled
    [letter] = written symbol, regardless of how it’s pronounced
    ^letter^ = reference to Voynichese letter that I believe represents that sound

    Plosives followed by ^h^: aspirated?

    After noticing that there seemed to be a lot of examples of ^bh^ and ^ph^ in the Voynich Manuscript, I got the idea that ^_h^ might be a way of representing an aspirated plosive, or that the sound sequences /b/+/h/ and /p/+/h/ were inordinately common because of having evolved from former aspirated plosives. (When a plosive is aspirated, it is released with a momentary unvoiced increase in airflow, sounding similar to /h/, but significantly quicker. In English, it isn’t written, or thought of as a separate sound at all, so if we want to represent it, we normally resort to adding a superscript [h], as in “pʰepper” and “cʰoconut” and “tattʰoo”.)

    So I did some checking at http://www.Voynichese.com to see exactly how common those and some other comparable letter-pairs really are. I compared the fraction of all letters in the manuscript that are ^h^ (EVA-ch) to the fractions of letters immediately after plosives that are ^h^. Reporting the results as ranges is necessary because the calculation requires knowing how many letters the manuscript has in total, and different methods of determining that yield different totals around 140000 to 170000. If the pair of letters together indicates something in particular, then ^h^ should be more likely right after a plosive than it is in general, as English [g] is more likely after [n] than it is in general and [h] is more likely after [t], [s], [c], or even [g] than it is in general. This is what I found in Voynichese:
    After ^b^, ^h^ is 7.1-8.6 times as common as it is overall.
    After ^p^, ^h^ is 6.0-7.3 times as common as it is overall.
    After ^g^, ^h^ is 2.2-2.7 times as common as it is overall.
    After ^k^, ^h^ is 1.4-1.8 times as common as it is overall.
    After ^t^, ^h^ is 0.8-1.0 times as common as it is overall.
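The comparison behind these figures can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names are invented here, and the counts in the usage example are hypothetical placeholders, not figures taken from the manuscript or from Voynichese.com. It assumes the ratio is the share of letters right after a given plosive that are ^h^, divided by the share of all letters that are ^h^, and it evaluates that ratio at both ends of the 140000 to 170000 range for the total letter count.

```python
def times_as_common(pair_count, first_count, second_count, total_letters):
    """Return how many times as common `second` is right after `first`
    as it is in the text overall.

    pair_count    -- occurrences of the sequence first+second
    first_count   -- occurrences of the first letter (e.g. a plosive)
    second_count  -- occurrences of the second letter (e.g. ^h^)
    total_letters -- total number of letters in the text
    """
    share_after_first = pair_count / first_count
    share_overall = second_count / total_letters
    return share_after_first / share_overall


def ratio_range(pair_count, first_count, second_count,
                low_total=140_000, high_total=170_000):
    """Evaluate the ratio at both ends of the uncertain total letter count."""
    low = times_as_common(pair_count, first_count, second_count, low_total)
    high = times_as_common(pair_count, first_count, second_count, high_total)
    return low, high


# Hypothetical counts, for illustration only (not real manuscript data).
low, high = ratio_range(pair_count=50, first_count=500, second_count=7_000)
print(f"{low:.1f}-{high:.1f} times as common")
```

Because the total letter count only enters through the overall share of the second letter, the ratio scales linearly with it. The upper end of each range is therefore always about 170000/140000 ≈ 1.21 times the lower end, which matches the spread of the ranges listed above (for example 7.1 to 8.6).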

    So, the likelihood of ^h^ increases a lot after a bilabial plosive, increases by a smaller margin after a velar one, is slightly higher after the voiced one than the unvoiced one in both pairs, and has no change or actually goes slightly down after ^t^. This is the kind of result we would see if the language had, or recently previously had had, many aspirated bilabials and fewer aspirated velars, with aspiration being slightly more likely for voiced plosives than unvoiced ones, and no aspirated /t/ but even a slight tendency for /h/ to seem to be avoided after /t/. A couple of cautions about this interpretation are warranted, but I’ll skip them and my responses to them for now because this already will be a long post anyway. 😀

    Plosives followed by ^x^: emphatic (aspirated/pharyngealized)?

    Curiosity about why a language with aspirated plosives would appear not to aspirate /t/ (especially when that’s also the only plosive that’s not in a voiced-&-unvoiced pair and not represented by a gallows letter), plus another reassessment I’m in the middle of concerning ^h^ and ^x^ and will have to make the subject of a separate post later, made me expand my scope beyond aspirated plosives. Aspirated consonants are one type within a larger category called “emphatic” consonants. Another kind of emphatic consonant is pharyngealized, which is similar to aspirated but with a pharyngeal fricative (ʕ; _ˤ) instead of glottal (h; _ʰ). And another way to make a consonant emphatic, if the plain original is a fricative, is to shorten it and start it with its plosive counterpart at the beginning, turning it into an affricate (s→ᵗs, z→ᵈz).

    Semitic languages have somewhat variable sets of emphatics based on an ancient pattern consisting of /hˤ/, /tˤ/, and an emphatic “s” that was probably originally emphatic in two ways together: /ᵗsˤ/. When transliterating Semitic emphatics into our alphabet, an option which avoids the need to add a letter (even in superscript) is to add a dot below the letter: ḥ, ṭ, ṣ. This also conveniently includes all derived forms found in different modern languages/dialects, such as either /ᵗs/ or /sˤ/ for /ṣ/, and either /ħ/, /ʜ/, /χ/, or /x/ for /ḥ/ (results of the /h/ and the /_ˤ/ merging).

    Those various recent outcomes of /ḥ/ in Semitic languages and the fact that /ḥ/ in any of those forms is still considered an emphatic version of /h/ remind me of the Voynichese letters I’ve labeled as ^h^ and ^x^. The latter is often depicted outside the Voynich Manuscript in a form that looks possibly as closely related to ^$^ as it is to ^h^, but in the actual manuscript, it’s often drawn more like just ^h^ alone with a diacritical mark, with no apparent connection to ^$^ or any other letter but ^h^. That makes it look like the Voynichese authors saw it as a version of ^h^, which fits with the comparison between their apparent sounds. That’s roughly the same situation for them as if we were to use symbols looking like [h] and [ȟ] to us, or [x] and [ẍ]. So, here’s the same number-check as before, but with ^x^ this time instead of ^h^:
    After ^b^, ^x^ is 1.6-2.0 times as common as it is overall.
    After ^p^, ^x^ is 1.5-1.8 times as common as it is overall.
    After ^g^, ^x^ is 1.0-1.2 times as common as it is overall.
    After ^t^, ^x^ is 0.9-1.1 times as common as it is overall.
    After ^k^, ^x^ is 0.7-0.8 times as common as it is overall.

    Except in one case (^kx^), they all fall in the same order as in the first list, in loosely similar proportions to each other, but rescaled so they’re closer to 1. (1 would mean no effect on the odds of ^x^ after the given letter at all, exactly the same odds as anywhere else in the manuscript.) That means the odds of ^x^ change after some of these letters, but not as much as they change for ^h^. The exception is ^x^ after ^k^, which has close to the same magnitude of shift away from 1 as ^h^ after ^k^, but in the opposite direction: ^h^ is more likely after ^k^ but ^x^ is less likely in the same position. I explain this as a result of the similarity between /x/ and /k/, both being unvoiced velar sounds; two similar sounds are less likely to be consecutive because they’re likely to merge. Other than that, the similarity in the effects of placement after plosives for both ^h^ and ^x^ looks like what we might expect if ^_h^ and ^_x^ after plosives had originated as one phenomenon sometime before and then split.

    Semitic emphatic /ḥ/?

    Considering Semitic languages & alphabets gives us more places to look in Voynichese for possible emphatics: not just trios of plosives in the Indo-European pattern, but also the Semitic ḥ-ṭ-ṣ pattern. So if Voynichese ^h^ and ^x^ are related to each other, and both represent something like /h/ or /x/ (or /ħ/, /ʜ/, or /χ/), and are candidates for forming emphatics when added to plosives, then it stands to reason that one of them could be the Semitic /ḥ/, or /ḥ/ could equate to a sequence consisting of one of them and then the other, as in [hˤ]. So, how common are those sequences?
    After ^h^, ^x^ is 0.04-0.04 times as common as it is overall.
    After ^x^, ^h^ is 0.05-0.06 times as common as it is overall.

    They practically don’t exist: a total of 29 occurrences going either way, for letters of which there are thousands of occurrences overall. I had to include another digit just to show that three of the numbers weren’t 0. It’s like they’re scared of each other. Simplest conclusion: they’re rarely consecutive because they’re so similar (like an emphatic-&-plain pair would be), and such a sequence isn’t needed to indicate /ḥ/ because one of them already is /ḥ/ on its own. I’ll save most of the reasons why I figure it’s ^x^, not the other way around, for later, but I will mention one here: following the idea that the Voynichese alphabet was derived from the Syriac one, it’s easy to see that, although ^x^ doesn’t resemble any Syriac letter and Syriac “ḥet” (ܚ) doesn’t resemble any Voynichese letter, Voynichese ^h^ and Syriac “he” (ܗ) do resemble each other (in fact that resemblance is how the sound /h/ for the letter ^h^ was originally predicted). So what we’ve seen here is exactly the result that would happen if they dropped ḥet, kept he, and used the latter plus a diacritical mark to replace the former.
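For anyone who wants to check counts like these against their own transliteration, here is a minimal sketch in Python. It assumes the text is available as a list of words in which every Voynichese letter is written as a single character, and that letter pairs are counted only within words; both are assumptions of this example, as are the function names. The sample words are made up for illustration and are not taken from the manuscript.

```python
from collections import Counter


def count_letters_and_pairs(words):
    """Count single letters and adjacent letter pairs within each word."""
    letters = Counter()
    pairs = Counter()
    for word in words:
        letters.update(word)
        # zip(word, word[1:]) yields each adjacent pair of characters
        pairs.update(zip(word, word[1:]))
    return letters, pairs


def times_as_common_after(words, first, second):
    """How many times as common `second` is right after `first` as overall.

    Returns None when either letter never occurs, since the ratio is
    then undefined.
    """
    letters, pairs = count_letters_and_pairs(words)
    total = sum(letters.values())
    if letters[first] == 0 or letters[second] == 0:
        return None
    share_after_first = pairs[(first, second)] / letters[first]
    share_overall = letters[second] / total
    return share_after_first / share_overall


# Made-up sample words (hypothetical, for illustration only).
sample = ["bhan", "xar", "hox", "bhx"]
print(times_as_common_after(sample, "h", "x"))
print(times_as_common_after(sample, "x", "h"))
```

One design choice to be aware of: the share after the first letter is taken over all occurrences of that letter, including word-final ones that have no following letter. Counting only non-final occurrences would be an equally defensible convention and would raise every ratio somewhat.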

    Semitic emphatic /ṣ/?

    In all Semitic alphabets, /s/ and some kind of /ṣ/ get two separate letters, regardless of whether the emphatic form is /sˤ/ or /ᵗs/. (It should also be pointed out that there’s a tendency for the letter for /š/, “šin”, to drift toward /s/ as well; both Hebrew and Arabic ended up needing to use dots above the main body of the letter to distinguish the two sounds and even calling the version representing /s/ “sin”.) In Voynichese, I’ve previously identified one plain fricative sibilant (^š^, which as far as I can tell could represent /s/ or /š/ or both) and one affricate sibilant (^$^, to which I’ve already assigned the sound /ᵗs/). If those sound assignments are correct, then there should be no sign of /ṣ/ being indicated by ^h^ or ^x^ after ^š^ or ^$^, or by ^t^ before either of them, because ^$^ should already be the emphatic version of ^š^. So what do the numbers tell us?
    After ^$^, ^h^ is 0.4-0.5 times as common as it is overall.
    After ^$^, ^x^ is 0.3-0.4 times as common as it is overall.
    After ^t^, ^$^ is 0.8-1.0 times as common as it is overall.
    After ^š^, ^h^ is 1.0-1.2 times as common as it is overall.
    After ^š^, ^x^ is 0.9-1.1 times as common as it is overall.
    After ^t^, ^š^ is 0.1-0.1 times as common as it is overall.

    …Exactly as predicted. There’s probably something else to be learned from which of these sequences are particularly uncommon instead of just being somewhere around 1, which could reveal more about the exact pronunciation of ^$^ as the Voynichese /ṣ/. (Why would /tṣ/, /sh/, and /sx/ be about as common as you would predict from pure randomness while /tš/, /ṣh/, and /ṣx/ are inordinately rare? It looks as if /ṣ/ were aspirated or pharyngealized, not an affricate, but the affricate is the sound I’ve identified from examples, so could it have retained both?…) But I’ll leave that for now, because just the fact that none of them is substantially higher than 1 is a good indicator that none of them are used to indicate any incarnation of /ṣ/, which leaves ^$^ free to be it on its own.

    This also falls perfectly in line with the original prediction from almost a year ago that both ^š^ and ^$^ would be some kinds of sibilants, based on their resemblances to Syriac šin (ܫ) and ṣaðe (ܨ). In fact, that prediction turns out to work at a more specific, precise level than even I anticipated, with exact 1:1 correlations in both cases. In Syriac, although neither of these exactly represents /s/ (there’s a third letter for that with no Voynichese counterpart), šin (ܫ) does represent /š/, which is not an emphatic form, and is related to letters that have ended up as /s/ in other Semitic languages anyway, whereas ṣaðe (ܨ) is unmistakably Syriac’s /ṣ/.

    I have been referring to EVA-L as ^š^ and not ^s^ just because I wasn’t sure whether it represented either /s/ or /š/ alone or both, and, with two letters as candidates to at least sometimes represent /s/, I didn’t want to associate the letter [s] with one of them and not the other. Now I think the signs are clear enough, including examples of ^š^ in words correlating with /s/ as well as /š/, that ^$^ is definitely /ṣ/, not /s/, and ^š^ is another example of a Semitic šin turning to sin and representing both sounds at different times, so that I can start dropping the caron, at least for plain /s/. So from now on, a plain ^s^ will start appearing in my Voynichese words where I would have used ^š^ until today, leaving the caron to use only for the sound /š/, specifying which of the letter’s two sounds it’s representing in a given example, like the optional tildes I already use in ^ã^, ^õa^, and ^õo^.

    So, what happened to /tˤ,tʰ/?

    Finally, with the Voynichese counterparts to Semitic /ḥ/ and /ṣ/ in mind, we can take another look at the numbers for potential emphatic plosives and explain the mystery of ^th^ and ^tx^. Why would ^h^ and ^x^ be only as common after ^t^ as elsewhere, or less, when they’re more common after other plosives (with just one exception that already has another explanation)? We’ve seen that when one letter alone is already emphatic, any two-letter combinations that might otherwise have been guessed to be the same emphatic are uncommon, as if suppressed by the presence of the single letter. So the uncommonness of ^th^ and ^tx^ is what you would expect if you thought ^t^ alone represented not a plain /t/ but its emphatic counterpart. And if you thought that, then you might recall that all Semitic alphabets and also the Greek one inherited two separate letters for plain and emphatic “t” from the Phoenicians, and you might expect the Voynichese letter ^t^ to resemble the emphatic one more than its plain counterpart.

    And you would be right: Syriac ṭet (ܛ), not taw (ܬ), is indeed the one that was already observed to resemble Voynichese ^t^ months ago (even more so in the modern eastern Maðnḥāyā and especially western Serṭā fonts than in the “classical” ʾEsṭrangēlā font that’s getting displayed by default). So evidently ^t^ isn’t often followed by ^h^ or ^x^ to form its emphatic because it’s already emphatic by itself. The only mystery that leaves is why the plain /t/ and /d/ are nowhere to be found in Voynichese.


    Nothing new, but this all does fit in perfectly with, and add support to, the idea that the Voynichese alphabet is a modified version of an eastern-Aramaic/Syriac alphabet, adopted and modified by speakers of a non-Semitic language, which I can now say probably had a set of emphatics more like those of some Indo-European languages than the usual Semitic set.

    • Darren Worley

      Derek – regarding your conclusion, what is your current thought on the identity of the non-Semitic language? Is it Greek?

      • Derek Vogt

        I expect it to be a language I’ve never heard of before. The number of languages in that region is in the hundreds, including members of three huge families (Indo-European, Semitic, Turkic), several less-huge major families, and dozens of either isolates or much smaller families of just a few languages apiece. Most are unofficial, minority languages in their own home countries and virtually unknown in other countries. I don’t even know most of their names, never mind any details about them that would be useful for comparison as candidates for Voynichese. So even if I spotted what seemed to me to be a striking similarity between Voynichese and something I do know of such as Greek, I would have no way to determine that the same or another equivalent similarity didn’t exist with one of the others.

        An alphabet that’s only found in one surviving book is one that was never particularly widespread or influential, and a language fitting that description would probably have escaped my attention so far. For example, Mandaic and each one of Malayalam’s multiple historic alphabets have all produced larger collections of writing than Voynichese did, and I was unaware of those before reading about them here as related to the Voynich Manuscript.

        Spoken Voynichese existed in a world where writing was already very common in other nearby communities, but adopted an alphabet that the neighbors weren’t using. So it presumably either lacked writing before, or was motivated to drop a previous alphabet by a strong urge to establish a unique identity separate from the more powerful entities around them. Either of those would be the mark of a small community I wouldn’t be very familiar with right now.

    • Derek Vogt

      A couple of little discoveries with implications for Voynichese emphatic consonants have emerged from the star/constellation names on f68r1 and f68r2.

      Star labels 35 and 40 include ^th^: ^thãs^ and ^ãnthn^ (^ãnthm^). These eventually turned out to equate to Arabic “Ṭhalīm” and “Al Niṭhām, Alnaṭhm”. It is interesting that when we do get ^th^ clearly equating with a sequence in a known language in which /h/ clearly comes immediately after some kind of “t”, it is not just a plain /t/ but an emphatic /ṭ/ (/tˤ/).

      Also, we now have something we didn’t have before: a clear example of a Voynichese word whose apparent cognate is unmistakably and explicitly spelled in a way that exclusively specifies an aspirated plosive, not anything else such as a plosive followed by a separate /h/. (The evidence for my idea that aspirated plosives came out as plosives plus ^h^ in the Voynich Manuscript was only indirect until now.) Star 7 on f68r1, ^nghãtn^, which is located in the right part of the sky-map to be Pegasus & Equuleus together, which have been named “the horses” in other languages, fits with the Urdu word for “horses”, گھوڑون , “gʰwṟwn”. The Urdu alphabet has a separate letter for a separate /h/ sound, and doesn’t use that letter here; it uses a symbol whose only function is to indicate aspiration of another letter, like our [ʰ] instead of [h].

  11. Stephen Bax

    Derek, I wanted to ask you about your thoughts on nasalised vowels. I have mused sometimes that EVA:y – which I have suggested elsewhere might represent /n/ as in the word for Centaur – might actually instead represent a nasalised vowel something like the French bon /bõ/, so that in places it is to be read as /n/ but elsewhere it is to be read more as a vowel.

    Do you have any ideas about this, or about the vowels in general?

    • Derek Vogt

      Well, it seems to me that the easiest vowels to nasalize, and most likely to be nasalized by default even if others aren’t, are /e/ and /i/, and those do happen to be the peculiar gap in our vowel collection so far that’s been bothering me, so EVA-Y being sometimes /n/ and sometimes a nasalized /e/ or /i/ would certainly solve my “Where’s the /i/” problem!

      To check the letter’s use in examples, I’ll start by eliminating the ones that use it in a suffix, appearing only after all sounds in the cognates are already accounted for, so we’re only dealing with examples that seem to have some role in the root word.

      First the ones that really only work if we just treat it as /n/ and nothing else… which seem to have an inordinate tendency to still be the last letter even though they aren’t in suffixes…
      16v ^bhajn^ (Hindi badayan)
      95v1 ^ãokxn^ (Arabic aldxan)
      1v ^bagan^ (various languages’ b_g_n)
      31r ^kõoton^ (Greek kroton, Arabic krwtwn, Latin croton)

      …but not every time…
      14v ^btnha—^ (Latin betonia)
      2r ^kntw–…^ (various languages’ kent_r_)
      21r ^ãpnhon… bhãpnhn^ (Persian perpehen)

      In the first two of those three cases, although there is an /n/ in the cognates, there’s also a need for an unwritten vowel (unless the /n/ is /ṇ/, which in some languages can be a syllable by itself with no vowel), so /n/ could be supplying its own vowel in some way, but so far only in the same way that any consonant can in an abjad. The third one is more interesting because it’s in a position where a vowel belongs but there’s no /n/ or any other consonant in that position in the cognates, which goes against the idea of syllabic /ṇ/ almost as soon as I brought it up. So, at least between the /p/ and the /h/ in “perpehen”, it could be acting as only a vowel, and it could even have been inserted there just for that because of lack of another letter to use for that vowel sound. But, with only one example like that to work with, I can’t say how likely that solution is compared to others, such as that Voynichese simply had an /n/ in there that Persian lacks.

      I also went through the f68r star names I have so far to look for examples of EVA-Y correlating with some part of a root word instead of tacked on after the end of it like a suffix again, and what I got there was an unhelpful mix…

      In both 11 ^vajn^ (Vega) and 22 ^avn^ (Auva) it’s in the same place as a Latin /-a/, but there’s really no way to say this isn’t just another suffix replacing the Latin suffix, instead of a phonetic representation of the same sound. And /a/ is neither a sound we lack a Voynichese letter for, nor the same thing as the /e/ we might have had going above.

      8 ^agxon^ (Ghamb, Gihon): just a consonant at the end of the word again like my first list in this post

      13 ^agnx/ãgn$/agnr^ (Regulus, Magha): If it’s related in some way to “Magha”, then there are two letters with no clearly correlating sounds in the cognate: EVA-Y and whatever that thing after it is. I think the best match is as ^ãgn$^, for “Regulus”, which would put the EVA-Y in the equivalent position to /-ulu-/ or /-L-/ or such, in the middle of the word. This could be taken as a sign of switching or equivalency between /L/ and /n/, but still doesn’t tell us anything about the vowels that couldn’t happen with any consonant in an abjad. If we are to really catch it acting as a vowel, we need to catch it in another “perpehen”, occupying a space where nothing else but vowels belongs, not vowels plus a similar consonant.

      12 ^ãgnkh$^… this one could be related to Greek “Argokeros”, for “Capricorn”, in which case the EVA-Y is trapped between the ^g^ and the ^k^, equivalent to the /o/ in “Argokeros” and no other consonants. Unfortunately, it also could be related to any of three other names for Capricorn which would apparently treat the letter as a consonantal /n/ or /L/: Alcantaurus, Ughlak, and Gnoum.

    • Derek Vogt

      Here’s something I’ve had in mind lately about EVA-O, ^a^, but hadn’t actually collected & listed the examples of it until you brought up vowels in general: although the majority of its occurrences do still correlate with /a/, the minority that correlate with something else around /e/ or /i/ or /ai/ are a pretty large minority…

      1vb ^bagan^: Urdu baigan; Bengali begun; Hindi baigan, baijani, baingan

      4r ^katwshn^: Turkish keten

      5v ^kahoar^: Persian khiru; gulkhair in India; Arabic khitmi

      11v ^basthatn^: Hindi bes, bis, bhainsala; Marathi bitsa, bithsa; Punjabi bisa; Urdu burg-baid-sada

      21r ^ãpnhon… bhãpnhn^: Persian perpehen

      38v ^xagas_agõas^: Greek ἁγριες “hagries” agries

      93v ^ba$xõatn^: Hindi bajguriya

      68r1-11 ^vajn^: Latin Vega

      68r1-13 ^ãgn$^: Latin Regulus

      68r1-14 ^asar^ or ^ašar^: Persian Ser, Shīr

      68r1-17 ^ãghtã^: (modified Arabic) Alchayr, Euphratean Erigu, Turkish Taush Augjil

      68r1-25 ^akoõatws^: Coptic Khoritos

      68r1-29 ^tãshotn^: Persian Terāzū

      68r2-42 ^haswr^: Greek Hestia, Heskhara

      68r2-49 ^ãbavar^: (modified Arabic) Albezze, Albizze

      (I left out a few other examples of the same kind of correlation that I’m pretty sure are too recent and too western to be relevant.) For contrast, I’m not aware of a single example correlating with /o/ or /u/. I think it would be a bit silly to conclude that the same Voynichese letter covered everything from /a/ to /e/ to /i/, but if it was pronounced /e/ (or sometimes /a/ and sometimes /e/), then it’s not too far off to think /i/ in foreign words might get converted to that /e/ when imported.

      • Derek Vogt

        Given the lack of a clear letter for /e/ that I knew of in any Aramaic-based alphabet, I did a bit more checking on those rare words I can see that get transliterated with an [e] in Syriac, Hebrew, and Arabic.

        In Arabic, transliteration as [e] seems to happen only with the letter ‘alif, which is more conventionally transliterated as [a] (when it’s used as a vowel at all). It looks like this is a difference between one dialect and another, not a sign of the two phonemes co-existing in any single dialect. Is that correct?

        In Hebrew, more often than not, the letter getting transliterated as [e] is yodh (otherwise mostly used like English [y], in both of its manifestations as a consonant and a vowel; related to our [i]). But it can also sometimes be alef (otherwise equivalent to our [a] as a vowel) or he (pronounced /h/ as a consonant, related to our [e] because the Greeks adopted it as a vowel). Each of these already has its own other sound that’s usually ascribed to it instead, so I can’t tell what this mess means about pronunciations in spoken Hebrew, except that a sound like /e/ does not have its own dedicated letter.

        In Syriac (as you can see in the list of letters I linked to above, where each letter’s name is spelled out in both our alphabet and the Syriac one), ‘alap and only ‘alap gets rendered as both [a] and [e] quite routinely (and which one the transliterator chooses might be influenced by the letter’s placement in the word; for example, it’s always [e] at the end, never [a]). So it looks like they are two distinct phonemes sharing a letter.

        So transliteration as [e] for the letter we would otherwise equate with [a] has precedents in all three, under different circumstances. Of the three, the one model that comes closest to the way I have EVA-o freely switching back & forth between /a/ and /e/ in the same dialect/language, like one letter representing two phonemes, is Syriac.

    • Daniel White

      Stephen, I agree that EVA:y should likely be a vowel. I ran into a lot of strange pronunciations when I was transliterating the first paragraph of f1r using Derek’s scheme (some of the strangest were “nkush” and “nkaur”). Here’s what I got:

      puhans nkush ur uguur xar jros n kar xashtn sarn chur arn kuir hguur xur uro jur jur tur snaur xokun ar nkaur xat jaurn jos turuur aaur agoon agoas rashagn jur taur aguur ar akur tairn hour juur vur fuur nturuixn

  12. orun rubacı

    some plant names contribution

    Turk : hıyar
    Pers: ħiyār

    Pers : panbuk

    Rum: kentauron

    Pers : şahterre

  13. Olga

    Hello, Mr. Bax! I would like to thank you for your optimistical work and that give hope and to us, that who is interested in the manuscript. It is valid excitingly and really wonderfully! Good luck to you in this work, allow this riddle! With love, from Russia.
    P.S. Sorry!my English is very baaaad((((

  14. Derek Vogt

    One thing I wish I could change about the original table: it shows “arar” as the word for “juniper” in both Hebrew and Arabic, but, soon after sending it to Professor Bax, I found out that the Arabic word is actually عرعر “ʕrʕr”, so a more thorough transliteration would be [`ar`ar], even if we normally pretend the /ʕ/ isn’t there. It makes a difference for my phonetic interpretations because it offers an explanation for the word that comes out as “garar” in my sound system, because /g/ is one of the plausible outcomes of /ʕ/ in a word imported to a language that doesn’t have /ʕ/. (I think I’ve seen that another time or two in star names as well.)

    • Derek Vogt

      Another addendum…

      I’ve found out a bit more about one particular example in this table that had been bugging me: 08r, ^agxwš^. (I’m using carets to mark proposed Voynichese transliterations from now on, rather than keep adding length & convolution to my sentences to indicate the same thing, or wrestle with the non-phonetic EVA, or unintentionally imply that my phonetics are completely standard & universally accepted by using the IPA slash.)

      The relevant words at my main plant name source were “faqqûs” and “faghoos”. It seemed reasonable that ^gx^ could be a digraph for /ɣ/ (voiced velar fricative; voiced counterpart to /x/), because that is normally the sound of [gh] in a Romanized Arabic word (from غ), and because just trying to pronounce /g/ + /x/ tends to slip into /ɣ/ anyway. But that idea could work or fail depending on exactly which Arabic letter had gotten rendered as [gh] and why, and whether [qq] had somehow come from the same letter or a different one.

      I now know that both came from one original word: فقّوس. The letter in the middle is ق, which is actually related to our [q] and normally transliterated as [q]. Its conventional pronunciation is /q/ (unvoiced uvular plosive). In this word, it has a diacritical mark above it to indicate the same thing as doubling. So “faqqûs” fits perfectly, but where did “faghoos” come from?

      I’ve discovered that most of the Arabic-speaking world has been losing /q/. In most areas it’s become voiced, yielding /ɢ/, and then drifted from uvular to velar, yielding /g/. Furthermore, in Persian (and subsequently in Arabic in places with much Persian influence), it’s even changed in a third way, becoming a fricative instead of a plosive, which brings us to /ɣ/ as a common pronunciation of ق. And [gh] is a common way to transliterate it, although some transliteration schemes are still stuck on using [q]. So the best letter-for-letter transliteration of فقّوس in much of the Arabic-speaking world, particularly areas near Persian-speaking places, is not “fqws” or “fqqws” but “fɣws” or “fɣɣws”.

      That gives us:
      ▶A straightforward connection between ^gx^ and /ɣ/, where the known language definitely has exactly that sound in the right place, so no more need to infer it indirectly
      ▶Another example of Arabic influences in the Voynich Manuscript being related to Persian in some way
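The chain of changes described above (voicing, then uvular to velar, then plosive to fricative) can be sketched as ordered string replacements. This is only an illustration: the rule list `SHIFTS` and the function `apply_shifts` are made-up names, and real sound change depends on context in ways that plain replacement ignores (for instance, a word that already contained /g/ would be swept along by the last rule).

```python
# Ordered sound shifts as described in the comment above (illustrative only).
SHIFTS = [
    ("voicing", "q", "ɢ"),               # unvoiced uvular plosive -> voiced uvular plosive
    ("uvular to velar", "ɢ", "g"),       # voiced uvular plosive -> voiced velar plosive
    ("plosive to fricative", "g", "ɣ"),  # voiced velar plosive -> voiced velar fricative
]


def apply_shifts(word, shifts=SHIFTS):
    """Return the word after each stage, starting with the input itself."""
    stages = [word]
    for _label, old, new in shifts:
        word = word.replace(old, new)
        stages.append(word)
    return stages


for stage in apply_shifts("fqqws"):
    print(stage)
```

Run on the doubled-letter transliteration “fqqws”, the stages come out as fqqws, fɢɢws, fggws, fɣɣws, which matches the last form given above.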

      • Neticis

        Just for wider audience here are links to described sounds (IMHO) in wikipedia:

      • Darren Worley

        Derek – thanks for your interesting update. You’ve no-doubt read my work on the T-O map suggesting that the VM is a Greek-influenced text that has been written in a language closely-related to Middle Persian (Pahlavi). There is not yet enough evidence to say whether it is entirely in Middle Persian, as the identified words could be isolated loan-words in a closely-related language.

        Middle Persian is an Iranian Aramaic-related language used between c300CE – c950CE. One of the peculiar features of this language is its use of logograms. I wanted to ask how your phonetic analysis accommodates the possible use of logograms?

        Let me explain – the use of logograms would result in multiple symbols generating the same phonetic pronunciations. There is already some evidence that suggests this is indeed the case in the VM. In Stephen’s original paper, he suggests 3 different symbols (or letter combinations) that are all pronounced as “R” or vowel+R.

        I notice that his analysis was criticized for this reason, but this is exactly what would be expected if the VM were in a language employing logograms.

        Quote from wikipedia: A peculiar system of logograms developed within the Pahlavi scripts (developed from the Aramaic abjad) used to write Middle Persian during much of the Sassanid period; the logograms were composed of letters that spelled out the word in Aramaic but were pronounced as in Persian (for instance, the combination “M-L-K” would be pronounced “shah”). These logograms, called hozwarishn (a form of heterograms), were dispensed with altogether after the Arab conquest of Persia and the adoption of a variant of the Arabic alphabet.

        Logograms are not often used in English, but an example would be the use of the ampersand symbol for “and”. It’s possible to create word constructions like “FishAndChips” or “Fish&Chips” that can be read and pronounced identically. Taking this idea further we can create constructions like “SeaAndSand”, “Sea&Sand” or an extreme example would be “Sea&S&”, all of which could be read identically. This illustrates how different symbols can produce the same sound.
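The ampersand example can be made concrete with a few lines of code. This is a toy sketch, assuming a one-entry logogram table and ignoring letter case; the names `LOGOGRAMS` and `reading_of` are invented for the illustration and say nothing about how Pahlavi heterograms or any Voynichese signs actually work.

```python
# Toy logogram table: one written sign that stands for a whole spoken word.
LOGOGRAMS = {"&": "and"}


def reading_of(written, logograms=LOGOGRAMS):
    """Expand every logogram into its spoken form and ignore letter case."""
    spoken = written.lower()
    for sign, word in logograms.items():
        spoken = spoken.replace(sign, word)
    return spoken


spellings = ["SeaAndSand", "Sea&Sand", "Sea&S&"]
readings = {reading_of(s) for s in spellings}
print(readings)  # a single shared reading for all three spellings
```

All three spellings collapse to the same reading, so a decipherer who assumed one symbol per sound would see three different “words” where a reader sees one.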

        • Darren Worley

          Furthermore – the identification of Middle Persian within the VM does seem to align with some existing evidence.

          Quote from Prof. Bax, elsewhere on this site: Hunayn ibn Ishaq (809-873) was a Christian Nestorian scientist who translated a number of Greek medical and scientific works into Arabic and Syriac while working in Baghdad. See my list of herbal manuscripts for an article about his translation of Galen, which gives many Arabic and Hebrew names for plants, including ‘kharbaq’ for hellebore, a word I have discussed in relation to the Voynich manuscript.


          Wikipedia further states that “He mastered four languages: Arabic, Syriac, Greek and Persian.”

          Given the time-period when ibn Ishaq was active, it seems likely that a Middle Persian influence would be found in writings of this time.

          I don’t think there is sufficient evidence to identify the VM as a Nestorian Christian text, as there were other groups translating Greek texts into Middle Persian at this time, (and other possible explanations), however, this attribution does align with existing evidence.

        • Derek Vogt

          I’d say there are three places logograms (or ideograms) could be hiding in my system:

          A. Rare symbols, especially those that appear alone instead of in words

          B. EVA-i and EVA-n (which I believe is a form of EVA-i; the more of them there are in a cluster, the more likely the writer is to put a tail on the final one)

          C. EVA-q

          …in other words, symbols for which I’ve identified no sound (including dropping previously identified sounds for B). It seems that logograms in general should be uncommon, which B and C aren’t, but their peculiar distributions help, with C occurring only as the first symbol in a word and B occurring only in suffixes. I don’t believe any language has sounds appearing only in such narrowly-defined settings. The closest you can get in English are (1) “ng”, which never begins a word or even a syllable or appears in a prefix but does appear in root words, not just a suffix, and (2) aspirated plosives, which mostly appear at the beginning of a word but can appear later, especially if the first syllable is not stressed (as in “tatʰoo”), and which, more conceptually importantly, are not separate phonemes from their non-aspirated counterparts which can appear anywhere in a word. This makes me think that EVA-q and EVA-i&n are meant not to represent sounds but to convey some other kind of information… in EVA-i&n’s case, likely a type of information which can be enumerated. Also, EVA-q being common on some pages but absent from others potentially written by different scribes indicates to me that whatever it indicated was probably optional, like conversion of “and” to “&”.

          If there are any logograms in Voynichese, I don’t think we’re in a position to find them yet, with so few words we can apply meaning to and not even the whole sound system finished. (Just imagine what the odds are that you would have discovered “&” at a similar stage in the deciphering of English after all knowledge of English had been lost.) But I do think that logograms are, in general, going in the opposite direction from what we need. Voynichese has so few distinct characters in regular use that, unless the spoken language just had an absurdly small collection of sounds, it needed ways that few symbols can represent many sounds (such as digraphs and dual-use letters), not ways that many symbols can represent few sounds!
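The distribution argument above (a symbol that turns up only word-initially, or only near the end of words) is the kind of thing that can be checked mechanically on a transcription. Below is a minimal sketch, assuming the text is already split into EVA-transcribed words; the function name `position_profile` and the six sample words are illustrative assumptions, not a real page of the manuscript.

```python
from collections import Counter


def position_profile(words, symbol):
    """Count where a one-character symbol occurs: initial, medial, or final."""
    counts = Counter()
    for word in words:
        for index, char in enumerate(word):
            if char != symbol:
                continue
            if index == 0:
                counts["initial"] += 1
            elif index == len(word) - 1:
                counts["final"] += 1
            else:
                counts["medial"] += 1
    return counts


# Small illustrative sample in EVA spelling (an assumption, not a real folio).
sample = ["qokeedy", "qokaiin", "daiin", "chedy", "okaiin", "qol"]

for symbol in "qin":
    print(symbol, dict(position_profile(sample, symbol)))
```

A symbol whose counts fall entirely into one bucket across a large transcription would fit the pattern described for EVA-q, while a tiny sample like this one can only show how the tally is made.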

      • Mrs. Sulaiman

        Fuqqoos: a cucumber, became fuggoos (“English g”) in rural or Bedouin areas. However, due to Ottoman occupation, the ق “qaaf” was replaced by a glottal sound or “a” in towns and cities.
        An Egyptian or a Lebanese would call the moon “Amar” instead of “Qamar”.
        Interestingly, a Qatari, a Kuwaiti or a Sudanese person is likely to pronounce qaaf as غ ghain.

  15. Derek Vogt

    I’ve been trying to collect the alternative sounds used in transliteration of Voynichese words by others, so they can be brought together here for comparison, not just with each other, but also with known languages’ sound systems. I’m just dumping the table here for now; any more detailed stuff I can say about individual letters or sounds will need to wait til later.

    In a few cases where the same person has used different sounds for the same letter in separate posts here on different dates, I excluded the earlier one and included only the most current one I know of, but if I missed any other changes or new sound suggestions, just say so.

    Because the Voynichese system uses so few letters, the more we fill it in, the more we’ll run into conspicuous gaps in it that should help with identification of the language. For example, if my row of this table is 100% correct, we’re missing any letters for /i,y/, /d/, /z/, or /m/. There can’t be many languages fitting that description. But if any of the letters do represent those sounds, then they don’t represent whatever else I think they do, so we’d be left with a different set of “missing” ones.
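The gap test described above amounts to a set comparison: a candidate language fits better the fewer of the “missing” sounds it actually uses. Here is a minimal sketch under loud assumptions: the gap set is copied from the paragraph above, while the two candidate inventories (`lang_A`, `lang_B`) are invented placeholders and not real languages.

```python
# Sounds with no letter in the scheme, as listed in the comment above.
GAPS = {"i", "y", "d", "z", "m"}


def gap_overlap(inventory, gaps=GAPS):
    """Return the gap sounds that a candidate inventory nevertheless contains."""
    return sorted(set(inventory) & set(gaps))


def rank_candidates(candidates, gaps=GAPS):
    """Order candidate names from fewest to most conflicts with the gap set."""
    return sorted(candidates, key=lambda name: len(gap_overlap(candidates[name], gaps)))


# Placeholder inventories, invented purely to show the comparison.
candidates = {
    "lang_A": {"a", "b", "k", "t", "n", "s"},
    "lang_B": {"a", "i", "m", "d", "k"},
}

for name in rank_candidates(candidates):
    print(name, gap_overlap(candidates[name]))
```

If any letter is later reassigned to one of the gap sounds, only `GAPS` needs to change and the ranking can be rerun, which mirrors the caveat about ending up with a different set of “missing” sounds.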

    • MarcoP

      Thank you for presenting this information so clearly, Derek! I want to point out that the equivalence of EVA:a and EVA:y was proposed and extensively discussed by T H Ing.

  16. SanaullahM

    Hello – I agree that it is IMPOSSIBLE to accept Sukhwant’s ideas at the moment because simply they are not clear! I am from Sindh and I have studied Landa and Khojki scripts, but even for me his explanation is not clear or convincing.

    Yes, you must give us a full table showing how each Voynich symbol matches each sound you claim, and then tell us a full paragraph showing how each word is made up.

    If you just write out a paragraph from your head without this detail, then it might be from your imagination, brother. Make it clear so we can see what you mean.

  17. Alan Hughes

    Mr. Sukhwant.

    I can’t understand why you are so rude to Mr. Bax when he set up this website to discuss and explore the Voynich manuscript! You say he is on his ‘high horse’ but that is unfair, when he publishes even your attacks on him! I think it is wrong to attack people personally – surely we are all trying to solve this thing together?

    Also, I agree with him that your explanation and ‘translation’ are very unclear. Why don’t you publish a complete table, like Derek Vogt has done, showing how each sound in your system maps onto each Voynich sign? Then tell us the first ten words of the first folio, word-by-word, showing us what each word means, its grammar and where we can find it in published dictionaries in Sindhi or the language you think it is in? Then we could believe you.

    But no insults please!

  18. If Sukhwant will permit me – rendered less literally and into more colloquial forms of English, I think it might have been expressed:
    ‘Many 100’s of years desire tradition and as requested by the cultivator from his pouring knowledge in under increasing guidance To accomplish it this promise of the interrogation of field subjects ……. ‘ ”

    Many generations have wanted [to maintain] traditions, and the same has been demanded by the cultivator… To accomplish this, and fulfil a promise to enquire about agricultural subjects /?of agriculturalists?..

    It has a tone which strongly reminds me of that informing Ibn Wahshiyya’s “Nabataean Agriculture” – translated by him into Arabic from older texts written, as we believe, in a Babylonian dialect of Aramaic.

    However, while the script is admittedly rather similar to some ancient scripts used in the Arabian peninsula (I’m thinking more of Sabaic), I found little in the manuscript’s imagery that appeared to connect with Ibn Wahshiyya’s text (Kitab al-falaha al-nabatiya c. 904 AD).

    Then again almost any English treatise of the sort, from the 17thC to the end of the 19th is likely to start with an apologia of the same sort. You know the kind of thing.. “sorry as I am to inflict yet another work on agricultural practice on the public, yet it has become necessary due to the endless enquiries offered by students anxious to understand traditional practices, as are workers in the field concerned to gain greater knowledge which they can pour into their work, even as the public reaps the benefit’…

    Could be.

    • Sukhwant Singh


      With due respect – Do you have the character set to support your statements. I am 100% you don’t have it because what I have explained is, this is what it is. There is no guess work. You probably are familiar with ancient vedic written documents, if not many of these thick books have been written for 1000’s of years in that part of the world.

      I presented each and every word by word meaning from the paragraphs I presented for everyone to see, There is no guess work here.
      Landa language has one character meanings and was the basis of many northern Indus valley civilization languages.

      This is what it is, nothing more than this. These kinds of books were existing with rich merchants. Even still in many parts of that region some book/s like this one ( not exactly with that much details ) exist. People rely on moon cycles and astrological aspects more than doctors.

      The false propaganda associated with the book about being from Europe is baseless and astonishing claims and guess work just to get some fame.
      Book after book, researchers ( On the wrong path ) using some software program is laughable.

      Look Diane, The documentation I have presented is with all the proof.
      It is not in Arabic, If you see my explanation. I show each and every word , one by one from the existing character set of Khojki / Landa Khudabadi languages along with Multani, Gurmukhi. There is no guess work.

      I know some people have made this their lifelong goal to somehow come up with guesses on a picture or some word, But you know there is not any concrete proof, you know why because that proof never exists, because it is not what people are made to believe and put on wrong path.

      Lots of folks in that region are busy with their daily lives as it goes along in villages, They don’t even know that the books which their ancestors talk about and were familiar with is sitting under some false name in a developed nation.
      But, one day that will be open and there is definite documentation as proofed by me that will show exactly what I mean about this book.
      For the last 1000 years urdu language was made main stream and the regions languages were forced to be converted into urdu language which the Mughals understood and by which they could control and administrate. The holy men were killed and their books which these men treated as sacred because they were copied from generation to generation and kept in the family. All this has been explained in detail by me.

      Again, I say if someone comes to me with a chinese or some other language book whose characters as well as phonetics I am not aware of then I wouldn’t make a guess.

      You get my point,

      • Dear Sukhwant,
        My point was that the criticisms were less about your translating the Voynich text than about the fluidity of your expression in English.

        I was reminded of the introduction to another work, simply because similar thoughts were expressed by that author, too.

        • Sukhwant Singh

          I am not going to argue on my typed English, sometimes it doesn’t come out perfect.
          For many years NSA scientist have held a view that this is some eastern Language.

          I haven’t seen any one close to even one word correct, beside my explanation.
          The irony is that English speakers who are doing research on this cannot easily come to this conclusion, but again, that’s only a matter of time.
          As I keep saying, it is not a guess work.

          • Stephen Bax

            Hello Sukhwant. This page is really devoted to Derek Vogt’s ideas.

            I’m sorry, but in future please write your ideas and comments on your own website. I look forward to following your work there. Good luck with it.

  19. MarcoP

    Hello Derek, thank you very much for this very clear presentation of your ideas! I find your interpretation of the gallows letters particularly fascinating for its elegance and symmetry. I wonder if there are known alphabets that present something similar to the “ligaturization scheme” you describe. In Italian, c+i reads ‘ch’, g+i reads ‘j’, but this certainly does not compare to the much richer variants of your Voynichese hypothesis. Do you know of alphabets in which the ‘b’ and ‘v’ sounds are written in a way similar to the one you propose?

    On another subject: I think that T H Ing’s proposal of the equivalence of EVA ‘a’ and ‘y’ could be helpful. I think it could improve a couple of the star names you have proposed: #11 as ‘vaju’ instead of ‘vajn’ (the ancient Arabic name is usually ‘waqi’). #15 ‘avu’ instead of ‘avn’ (the ancient Arabic name is usually ‘awwa’). But of course this equivalence would create difficulties for a few plant names, so I don’t know.

    I hope that star diagrams, wind diagrams or other similar structures will help us validate the different phonetic variants that are being discussed on these pages!

    • Derek Vogt

      I wonder if there are known alphabets that present something similar to the “ligaturization scheme” you describe… Do you know of alphabets in which the ‘b’ and ‘v’ sounds are written in a way similar to the one you propose?

      In Hebrew, most of the letters that were originally just plosives (cousins of Greek beta, gamma, delta, kappa, pi, and tau, and thus also of our [b], [g], [d], [k], [p], and [t]) are now both a plosive and a fricative. They’ll get pronounced one way in one situation and the other way in another situation. They’ve come up with a system for marking in writing which sound is intended when the writer wants to be able to use both and eliminate all doubt or variation: a dot added inside the space occupied by the letter means it is definitely the fricative, so, in a writing sample which uses those dots, one of these letters with no dot is the original plosive.
      ב =b; בּ =v
      ג =g; גּ =ɣ
      ד =d; דּ =ð
      כ =k; כּ =x
      פ =p; פּ =f
      ת =t; תּ =s
      The only original plosives which have escaped this fate are ‘alef, which moonlights as a vowel and has no fricative counterpart, and qof & teth, which started out different in sound from kaf & tav (with no fricative counterparts) and have shifted toward them as if to cover the plosive sounds they were wandering away from.

      The modern Hebrew alphabet is a version of the older Aramaic alphabet, in which this process of fricatizing plosives apparently began sometime between 1000 and 1600 years ago. So the same sound duality in these letters is shared by other alphabets derived from Aramaic in the same time frame or later, including Syriac, which calls the plosives those letters’ “hard” sounds and the fricatives their “soft” sounds. In these other alphabets, I don’t know of any way to specify in writing which of the two sounds you want at any particular time like Hebrew has, nor do I know whether it would matter in those languages if you could. But I know it would be likely to be important to speakers of a non-Semitic language importing such an alphabet. The Syriac alphabet was the model I had in mind when it occurred to me that ligaturizing plosives with a letter for /h/ could be a method of doing the same job done by Hebrew’s centrally-placed dots: indicating the soft sound instead of the hard one, when dealing with a letter that already had both.

      Also, given that one of the letters that this happens with is tav/tau going from /t/ to /s/, you can see part of why I would think EVA-dch, /t/ + /h/ if we take my phonetics here as simply as possible, might be a digraph for a sibilant sound; they might have just needed to do it as a digraph instead of a ligature because the forms of those two letters don’t look very ligaturizable with each other.

      I’ve noticed that Professor Bax uses [i] or [y] for EVA-ch, the letter I equate with [h]. I don’t recall seeing what his basis was for that, but if that’s what that letter was, then its effects in ligaturization would need to be something else, like iotation/yotification or palatalization.

      Before those six letters started evolving that way, when they (or at least five of them) still just represented plosives, another branch of the Aramaic alphabet family called Nabataean broke off. It would later evolve on its own into the modern Arabic alphabet, so five of the same six letters lack those fricative sounds in Arabic. The one exception is that /p/ has been replaced by /f/ so completely that there are no occurrences of /p/ in the language at all anymore and its letter is called “fa” instead of “pa”. But other kinds of border-jumping between plosives and fricatives or affricates still happened in other ways. Some of the original 22 letters, in their modern Arabic forms, are distinguished from each other by dots, and it means nothing phonetic among them, but another six letters were added to the original 22 by adding a dot above an original letter. And if you strictly compare the six added letters to their six individual originals, you can see that the sounds of each pair are related, so the added dot is effectively a phonetic marker in those cases. And four of those six pairs consist of a plosive and a fricative. (The others are fricative & fricative.)
      د =d; ذ =ð
      ط =ṭ; ظ =ð̣/ẓ
      ص =ṣ; ض =ḍ
      ت =t; ث =θ
      (Dots below European letters here indicate pharyngealization, a form of emphasis comparable to aspiration but with more constriction in the throat; for example, “ṭ” is pharyngealized/emphatic “t”.)

      Also, the Arabic letter ج, related to Greek gamma and our [g], similarly once simply represented /g/ but is now /j/ in most Arabic-speaking places (although there are some left still using the original pronunciation, and others using other variations).

      Meanwhile, outside the Semitic family…

      In Greek, all three letters that represented voiced plosives in ancient times represent nearby voiced fricatives now: beta is /v/, gamma is /ɣ/, and delta is /ð/. So, before that transition was complete, there was a time (or there were separate times) when each of those letters also could be pronounced either way. The Cyrillic alphabet captured a snapshot of that time for beta, when a distinction arose between two ways to write the letter to specify which of its two sounds was intended; those slightly different versions of beta got frozen and treated as separate letters now called “be” and “ve”:
      [Б,б] = /b/
      [В,в] = /v/

      For ancient Greek’s unvoiced plosives, there were originally separate letters for the aspirated and non-aspirated ones: tʰeta & tau, pʰi & pi, and cʰi & kappa. The Romans heard each aspirated plosive as a plosive followed by /h/ and transliterated them that way. In all three cases, aspiration evolved into fricatization in Greek, which is how [th] and [ph] became digraphs representing what they do to us now, and how chi ended up as the basis for a new letter the Romans added to their own alphabet, [x], which we now use as the IPA symbol for chi’s modern fricative sound.

      There are also cases of phonetic evolution between fricatives or affricates and plosives which didn’t affect alphabets but do show how easy it is for language evolution to link them in ways that could have affected alphabets. Of course you’re familiar with what happened to [c] and [g] and [z] in Europe. In Spanish, there’s no phonetic difference between [b] and [v]; their pronunciation can be different in different regions of the world, but not different from each other in the same region. One of the defining features of Proto-Germanic was the conversion of /p/ to /f/, especially at the beginning of a word, and then, once early English was separated from the others, /b/ in the middle of a word became /v/, giving us words like “seven” and “over” where the other Germanic languages still had something like German’s “sieben” and “über”.

      On another subject: I think that T H Ing’s proposal of the equivalence of EVA ‘a’ and ‘y’ could be helpful. I think it could improve a couple of the star names you have proposed: #11 as ‘vaju’ instead of ‘vajn’ (the ancient Arabic name is usually ‘waqi’). #15 ‘avu’ instead of ‘avn’ (the ancient Arabic name is usually ‘awwa’). But of course this equivalence would create difficulties for a few plant names, so I don’t know.

      I say we need to stick with the sounds we get for each letter in root words, and whatever sounds we end up with in suffixes that way, those are just what the suffixes sound like.

      • MarcoP

        Thank you Derek! I find your examples very helpful!

      • Jason Willson

        Hi Derek!

        Thank you very much for your post, I found it incredibly fascinating. It’s garnering my excitement for further decoding of the VMs. If I may suggest one very minor correction on the Hebrew chart (it looks like you have the sounds in reverse, see http://en.wikipedia.org/wiki/Begadkefat):

        ב =b; בּ =v
        ג =g; גּ =ɣ
        ד =d; דּ =ð
        כ =k; כּ =x
        פ =p; פּ =f
        ת =t; תּ =s

        Should be this:
        ב =v; בּ =b
        ג =ɣ; גּ =g
        ד =ð; דּ =d
        כ =x; כּ =k
        פ =f; פּ =p
        ת =θ*; תּ =t
        *s in Ashkenazi Hebrew

  20. Derek, I note that you have two of the gallows becoming fricatives with the addition of a pedestal, but the other two turning into affricates. Why do you think that the addition of a pedestal works differently for the two pairs?

    Also, you have EVA ‘l’ becoming a /sh/ sound and EVA ‘r’ becoming an /r/ sound. Do you think this can be reconciled with the two characters having a very similar distribution in the text?

    • Derek Vogt

      I note that you have two of the gallows becoming fricatives with the addition of a pedestal, but the other two turning into affricates. Why do you think that the addition of a pedestal works differently for the two pairs?

      My original prediction was that, when ligaturized, all four would have the same soft sounds as in the Syriac and Hebrew alphabets (f, v, x, ɣ).

      That created a complication for the fricatized /k/ because we already had another letter that appeared to be /x/, but I could see a way to solve it with two similar sounds, one more forward-articulated and one more back-articulated, like the two pronunciations of [ch] in German depending on the preceding vowel (represented as /ç/ and /X/ or /χ/ in IPA). Another way out of that would be if EVA-sh was not actually any kind of /x/-like sound at all, but I didn’t and still don’t yet know of a better alternative for it.

      It also, in combination with other letters for /h/ and /ɣ/ and /š/ and probably another sibilant or two, created a complication in the overall form of the language’s already-small sound inventory, concentrating a lot on roof-of-the-mouth fricatives and having severe shortages in other sound categories. When I read lines/pages from the manuscript to myself that way, it sounded less like a real language and more like just the same thing over and over again, even worse than it is among cultural & geographic words in Hawai’ian, the world’s most notoriously repetitive language with the world’s most famously puny sound inventory.

      But neither of those issues was enough to break me out of the pattern I saw in the Semitic dual-sound consonants until I saw EVA-cth, the letter I was expecting to be /ɣ/, in a plant name (second one on 16v, identified from the picture as Illicium verum) that would make sense only if it could somehow equate to “dya”, for which /ɣ/ just couldn’t work. To work in that name, the letter’s sound needed to be something that could be derived from /g/ and sounded reasonably similar to “dya”. And /j/ not only fit that bill perfectly, but also, together with its unvoiced counterpart /č/, cleaned up the other two issues in the previous paragraphs here.

      Since then, I believe I’ve also seen EVA-ckh (my predicted /č/) fitting in a word well enough, although not completely precisely: a group of stars together called `ash in Hebrew, with a Voynichese label I read as /ačn/. It’s not included in the table on this page because I spotted it too recently, it’s not a plant, and it’s not where the idea of EVA-ckh as /č/ came from.

      Also, you have EVA ‘l’ becoming a /sh/ sound and EVA ‘r’ becoming an /r/ sound. Do you think this can be reconciled with the two characters having a very similar distribution in the text?

      I’m not aware of any reason why their distribution would go against their roles in words whose foreign apparent cognates have those sounds. And if I were, I’d be pretty likely to conclude that the latter trump the former. It’s easier for me to picture sounds having unexpected distributions, than to picture apparent cognates seeming to have the right letters in the right places to match each other’s sounds without those letters actually representing those sounds.

  21. Hans Flatoy

    Please check out :

    The Voynich script is decoded by Sukhwant Singh. It is written in different languages as follows:
    1. Landa
    2. Brahmi
    3. Multani
    4. Mahajani
    5. Khojki
    6. Gurmukhi.

    Please contact him and give him credit for his good work in Pakistani and Hindi languages.

    Hans Flatoy

    • Stephen Bax

      Thanks – I had a long correspondence with Mr Singh (see here for some of it). Although I do congratulate him for his work, he was not able to send any translations of any sentences, sections or pages, so it is impossible to say whether his approach is successful or not.

      A danger in saying that the manuscript is written in many languages and scripts is that the analyst can pick and choose which language to use to interpret each word. If this is not done with any clear system, then it can just result in wishful thinking, unfortunately.

      • Sukhwant Singh

        Hi Prof Bax,

        This is Sukhwant. I have 2 whole paragraphs word by word translated to their correct meaning. It is on the website I put up for this as well as the details I sent you. I even explained the 2 words you asked me to decipher from page 2r I think.
        So, the thing is my explanation is not any guess work. It is backed by extensive proof and is available on the web. I know you have spent a lot of time on this thing.
        Sir, with due respect, you send me what else you want me to explain. I will do that.
        It is like if you ask me to explain a Chinese script book, I would not be able to.
        Same way, for you Sir, it is next to impossible to guess or phonetically pronounce correct sounds from the region.
        2 Whole paragraphs word by word – I didn’t make a 1 word guess.
        The lesson sometimes in life is to accept the truth, being a respected Prof. I commend your abilities, but at the same time you are completely off track on this book.
        The alphabet is nothing of guess here but I gave concrete proof, If you want to learn the truth is another story.
        I am not going by a guess work, Dr.

        With due respect,

        Sukhwant Singh

        • Stephen Bax

          Thanks Sukhwant. I have looked at the website closely. Maybe I am stupid or ignorant, but I cannot understand the system you use to interpret the language. For example your interpretation of f1r does not make any sense to me:
          “First paragraph from 1r goes like this. ‘Many 100’s of years desire tradition and as requested by the cultivator from his pouring knowledge in under increasing guidance To accomplish it this promise of the interrogation of field subjects ……. ‘ ”

          I’m sorry, but to me this is meaningless. It would help if you could tell us word by word what the language, grammar and sense of each element is. You concentrate on the phonetics, but we need to know more about the language and grammar.

          • Sukhwant Singh

            Sir, You will never be able to understand. I repeat with due respect, don’t take me wrong. This book or phonetics are not for a person who speaks English. How can you even translate / guess a word. It’s something like this, Prof, are you familiar with Chinese characters? I don’t think so.
            Are you able to translate a Japanese book? I don’t think so.
            You asked me 2 words, I translated it for you and put it on your website.
            You, Sir are on completely wrong path. I know you don’t want to climb down from your high horse once you have claimed that you had some breakthrough. That was complete guess work. You know that. Just because you wanted to throw something out to the world to see what you have been researching on, was very much just what you know very well, that it’s completely fabricated what you came up with.
            Sir, I respect your work, but as I have been saying English mother tongue person cannot produce those sounds from the region of Sindh and on top of that I gave you full character set of Khojki, Khudabadi, Landa, Mahajani’s scripts widely available on the net.
            If you don’t know the language How can you even claim to understand a word.
            It only serves your purpose. You Sir wanted some fame on your false research. You had it.
            Here is thing – If you think my explanation out in open and with proof along with the character set. let us set a time and debate out in front of the world, Word by word.
            This is not a code system. Please stop claiming this thing because lots of unknowing younger generation look towards professors and rely on them.
            Please do not put people on wrong path, just because you have been researching, when from the get go you were on a wrong path.

            My explanation translated 2 whole paragraphs word by word with exact meaning and I am open to discuss on that with people who speak or understand Sindhi/ landa/ Khojki language.

            Sooner or later people in that region will realize my explanation and I am confident that there are millions living in that region which will vouch for this explanation as per my statements.

            Prof. with due respect, This book is in Landa Khojki language and script including Khwaja khojki vowels on top or Brahmi script.

            Please, stop the guess work. My explanation is not any guess work. It’s just because of people like you that real credits are snatched from real people.

            Thanks Prof, Please don’t mind my words at all. It doesn’t reflect on your achievements in other fields.

            Warm regards,

            • Daniel White

              Mr. Sukhwant,

              I would like to ask you to explain how you found this.
              “The book is divided into 4 parts as mentioned by the author( details below ) written in early 15th century as that’s the time period when Khojki was more prominent. The book was taken by the “Holy” man from town to town and based on the knowledge he had( He was the go to guy and first person to approach in case of issues, either injury or some depression, bad dreams, marriage and business, Hex etc. ) , and the facts he collected from the inhabitants/customer. This man would then recommend to-do things. The book also deals with what kind of women she is based on the type of hair she has, what type of clothes she wears, what to expect from the second wife of the husband etc. What to do if someone has Hex on you and how to figure it out and recommendations for getting rid of the Hex. The book is not written for others to read and is usually passed within the family from Father to Son or someone more capable whom the Mahajan has taught and guided himself.”

  22. Hi Derek, your idea that the gallows are a set of featural characters representing stops, with the pedestal variants being fricatives, is something I’ve been thinking over for some time too. I don’t know if it is right, but it at least fits with the facts that I have at the moment. It’s good to see that we’ve both come to this conclusion (at least we can be wrong together!).

    I’ll consider the whole a bit more and comment again later.
