Parsing VHDL is [very] hard

May 19th, 2009 at 6:53 pm

Ever since I started writing lots of VHDL code at work, I’ve been toying with the idea of writing a parser for the language. It would provide me with a simple means for writing useful small tools for organizing and analyzing large bodies of VHDL code.

A few years ago I even started looking into this seriously, but that project got flushed down the tubes once I realized that it’s hard to find suitable libraries for this in Perl, which I was using at the time. In addition, VHDL turned out to be a very hairy language to parse.

Lately, after the success of the PLY-based pycparser, I came back to VHDL. PLY is powerful and fast, I thought; perhaps it’s feasible to parse VHDL with it?

Turns out the task is much harder than I expected. Attempting to translate the VHDL BNF definition into PLY Yacc runs into problems very quickly. The BNF is not suitable for LALR, and is full of reduce/reduce conflicts. At first I rewrote the rules to make them more general (and hence accept a bit more invalid code, which wasn’t too important for me), but more and more conflicts keep turning up. Yesterday I read a paper claiming that the full translation of the BNF into Yacc results in 576 reduce/reduce conflicts! Umph…

No problem, I can just rewrite it using a hand-tailored recursive-descent (RD) parser (which I suspect most commercial VHDL tools are using) that’s more powerful than LALR and hence won’t be troubled by conflicts in the BNF, right?

It’s more difficult than that.

VHDL is context-sensitive in a mean way. Consider this statement inside a process:

jinx := foo(1);

Well, depending on the objects defined in the scope of the process (and its enclosing scopes), this can be either:

  • A function call
  • Indexing an array
  • Indexing an array returned by a parameter-less function call

To parse this correctly, a parser has to carry a hierarchical symbol table (with enclosing scopes), and even the current file isn’t enough: foo can be a function defined in a package. So the parser should first analyze the packages imported by the file it’s parsing, and figure out the symbols defined in them.
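Here’s a minimal sketch of what such a scoped lookup could look like in Python. The symbol kinds, make_symbol, classify and the package contents are all made up for illustration; a real VHDL front end would also have to handle overloading, libraries, visibility rules and much more.

```python
from collections import ChainMap

# Hypothetical symbol kinds, invented for this sketch.
FUNCTION, ARRAY = "function", "array"


def make_symbol(kind, params=0, returns_array=False):
    """Build a tiny symbol record (a real table would hold full type info)."""
    return {"kind": kind, "params": params, "returns_array": returns_array}


def classify(name, scopes):
    """Decide what `name(...)` means, given the declarations visible in scopes."""
    sym = scopes.get(name)
    if sym is None:
        return "unknown"
    if sym["kind"] == ARRAY:
        return "array index"
    if sym["kind"] == FUNCTION:
        if sym["params"] == 0 and sym["returns_array"]:
            return "index into result of parameterless call"
        return "function call"
    return "unknown"


# Symbols assumed to be made visible by a `use` clause on some package.
package_scope = {"foo": make_symbol(FUNCTION, params=1)}
architecture_scope = {}
process_scope = {}

# ChainMap searches the innermost scope first, then falls back outward.
scopes = ChainMap(process_scope, architecture_scope, package_scope)
print(classify("foo", scopes))  # function call

# A local declaration in the process hides the package function.
process_scope["foo"] = make_symbol(ARRAY)
print(classify("foo", scopes))  # array index
```

The same text, foo(1), gets a different meaning depending only on what was declared earlier, which is exactly what a context-free grammar can’t express.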

This is just an example. The VHDL type/subtype system is a similarly context-sensitive mess that’s very difficult to parse.

After some Googling, today I’ve encountered an old newsgroup thread on comp.lang.vhdl from 1993, in which a bunch of seemingly knowledgeable people discuss this issue. The verdict: yes, it’s context-sensitive, and very hard to parse. But with (lots of) effort it’s doable.

I’m kind of bummed by this at the moment. I’ll either find something online to adapt, give the RD parser a shot and try to minimize the damage of context-sensitivity, or drop the idea altogether. We’ll see…
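One way the damage might be limited (a sketch of a possible approach, not something I’ve settled on) is to have the RD parser emit a single ambiguous node for every name followed by parentheses, and leave the call-versus-index decision to a later pass that has symbol information. The tiny grammar, the tokenizer and the node names below are assumptions made up for illustration; they cover only the one statement from the example above.

```python
import re

# Tokens: assignment operator, identifiers, integers, punctuation.
TOKEN = re.compile(r"\s*(:=|[A-Za-z]\w*|\d+|[();,])")


def tokenize(text):
    """Split a statement into tokens, raising on anything unrecognized."""
    text = text.rstrip()
    pos, out = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out


class Parser:
    """Recursive-descent parser for: ident := expr ;"""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expect(self, tok):
        if self.next() != tok:
            raise SyntaxError(f"expected {tok!r}")

    def ident(self):
        tok = self.next()
        if not tok[0].isalpha():
            raise SyntaxError(f"expected identifier, got {tok!r}")
        return tok

    def assignment(self):
        target = self.ident()
        self.expect(":=")
        value = self.expr()
        self.expect(";")
        return ("assign", target, value)

    def expr(self):
        if self.peek() is not None and self.peek().isdigit():
            return ("int", int(self.next()))
        name = self.ident()
        if self.peek() != "(":
            return ("name", name)
        self.next()  # consume "("
        args = [self.expr()]
        while self.peek() == ",":
            self.next()
            args.append(self.expr())
        self.expect(")")
        # Deliberately ambiguous: call or index is decided in a later pass.
        return ("name_with_parens", name, args)


tree = Parser(tokenize("jinx := foo(1);")).assignment()
print(tree)  # ('assign', 'jinx', ('name_with_parens', 'foo', [('int', 1)]))
```

This keeps the parser itself context-free, at the price of accepting some invalid code and pushing the hard part into a separate resolution pass.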

6 Responses to “Parsing VHDL is [very] hard”

  1. Claude BIDEAU Says:

    Have you checked the possibility of building a VHDL parser with a tool such as ANTLR (http://antlr.org/), which can be interfaced with Python?

    PS: Like you, I’m very interested in parsing VHDL for code analysis.
    I was looking into your pycparser to understand your “lex/yacc” scheme before I saw your blog and your interest in VHDL.

    BR

  2. eliben Says:

    ANTLR won’t help here – the problems are different. There’s too much context sensitivity and inter-dependency between files. For pure parsing power PLY is sufficient here.

  3. pedro Says:

    Eli,

    I’m looking for a tool to convert structural VHDL into a netlist file for printed circuit design. Currently, standard practice is still to use graphical schematic editors for this, but I want to bring the task into the 21st century and use text-based entry.

    I am willing to do some work to achieve this, but I was hoping ANTLR would get me part of the way there. Do you still believe ANTLR is not a worthwhile approach?

    Pedro

  4. eliben Says:

    pedro, I don’t believe ANTLR is a silver bullet for VHDL parsing. First you have to plan how to address the cross-file context sensitivities I pointed to in the post, and perhaps others I haven’t run into yet.

  5. RinatZakirov Says:

    Try vMAGIC. It is a Java-based VHDL parser and generator, and very easy to work with. It takes a bit of running around to get a Java class to run in Python, but I managed to do it in two ways: 1) with JPype, which is not the best approach since it uses dynamic binding; 2) by using IKVM from the Mono project to convert the Java classes into a .NET library and running the Python code in IronPython, after which the rest works beautifully.

  6. Jesus Says:

    This post is pretty old, but anyway: what about using DCGs in Prolog? You can handle pretty much any complex context sensitivity with ease through assertions and retractions. Could you point out one of the issues you have run into, or even some you have already solved? I would try to find a solution for it.