We need a simple C pre-processor (cpp) at work. Our in-house language uses a subset of cpp as a trivial macro language, and we want its behavior to be well defined. Generally, people are against writing our own, but I feel we have no choice!

The features I want to implement are:

  • #include
  • #define (simple, w/o arguments)
  • #if(n)def ... #else #endif
  • \ (line continuation)
  • C comments

I decided to roll our own anyway, and I'm now working on a prototype in Perl. So far, the last two features (line continuation and C comments) are implemented.

For removing C comments, there's a cryptic, hairy and scary regexp circulating on the web, but I can't use that. I must generate sensible error messages, with correct line numbers. Nested comments are disallowed and comments inside strings are also disallowed.
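
To make the requirement concrete, here is a minimal sketch of a character-by-character comment stripper, written in Python rather than the Perl of the prototype. The names strip_comments and CommentError are hypothetical. It rests on several assumptions that go beyond what is stated above: only /* ... */ comments exist, newlines inside a comment are kept so later line numbers do not shift, a comment delimiter inside a double-quoted string is reported as an error, and a string never spans lines.

```python
class CommentError(Exception):
    """Hypothetical error type that records the offending line number."""

    def __init__(self, line, message):
        super().__init__(f"line {line}: {message}")
        self.line = line


def strip_comments(text):
    """Remove /* ... */ comments, keeping every newline in place."""
    out = []
    line = 1               # current 1-based line number
    i, n = 0, len(text)
    in_string = False
    comment_start = None   # line where the open comment began, else None
    while i < n:
        ch, two = text[i], text[i:i + 2]
        if ch == "\n":
            out.append(ch)       # newlines always survive, even in comments
            line += 1
            in_string = False    # assumption: strings never span lines
            i += 1
        elif comment_start is not None:
            if two == "/*":
                raise CommentError(line, "nested comment")
            if two == "*/":
                comment_start = None
                i += 2
            else:
                i += 1           # drop the comment text
        elif in_string:
            if two in ("/*", "*/"):
                raise CommentError(line, "comment delimiter inside string")
            if ch == "\\" and i + 1 < n and text[i + 1] != "\n":
                out.append(two)  # escaped character such as \" stays as-is
                i += 2
            else:
                if ch == '"':
                    in_string = False
                out.append(ch)
                i += 1
        else:
            if two == "/*":
                comment_start = line
                i += 2
            elif two == "*/":
                raise CommentError(line, "'*/' without a matching '/*'")
            else:
                if ch == '"':
                    in_string = True
                out.append(ch)
                i += 1
    if comment_start is not None:
        # report where the comment was opened, not where the file ended
        raise CommentError(comment_start, "unterminated comment")
    return "".join(out)
```

An unterminated comment is reported at the line where it was opened, which is usually the line the user needs to look at.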

Dealing with \ continuation seemed easy at first, but turned out to be a tad more complicated. The line numbers must be preserved (a user who gets an error message must be able to look at the code editor and find the correct line), so when two lines are concatenated, I must leave an empty line in place of the second one (the \n must stay). The same holds for any number of continuations. For instance, if 3 consecutive lines end with \, there are 4 lines to merge into one, and 3 empty lines must be inserted after it (in place of the pasted lines).
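
The padding rule can be sketched in a few lines of Python (again, not the Perl prototype itself). The function name join_continuations is hypothetical, and the sketch assumes the backslash is the very last character on the line, with no trailing whitespace.

```python
def join_continuations(text):
    """Merge backslash-continued lines while keeping the total line count."""
    out = []       # finished output lines
    pending = []   # pieces of the logical line currently being assembled

    def flush():
        # Emit the merged line, then one empty line per absorbed line.
        out.append("".join(pending))
        out.extend([""] * (len(pending) - 1))
        pending.clear()

    for line in text.split("\n"):
        if line.endswith("\\"):
            pending.append(line[:-1])   # drop the backslash, keep collecting
        elif pending:
            pending.append(line)        # last piece of the logical line
            flush()
        else:
            out.append(line)
    if pending:                         # input ended on a backslash
        flush()
    return "\n".join(out)
```

With the input "a \", "b \", "c \", "d", "e" (five lines), the output is "a b c d", three empty lines, then "e", so "e" is still reported as line 5.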

I think #include is the most difficult to implement (especially because it should leave line marks when it "delves into" each file and "comes back" from there). We'll see :-)