Faking standard C header files for pycparser
May 22nd, 2009 at 9:47 am
My Python-based parser and AST generator for ANSI C, pycparser, has been downloaded more than 500 times since January, when version 1.03 was released.
From time to time I even get fan mail with feedback or complaints. Though there are far fewer complaints and bug reports than I’d expect, the most common issue that comes up is standard C include headers that pycparser has trouble parsing.
I’ve written before about the context sensitivity of C, which means that all the headers a C file includes must be parsed before it, in order to find out which identifiers are types. Since most C code uses at least some standard headers (stdio, string and stdlib are probably the most popular), pycparser needs to be able to parse those.
But this is often a problem, since each compiler tool-suite creates its own standard headers, with its own idiosyncrasies, compiler-extensions and weird definitions. pycparser successfully parses the headers supplied with the MinGW GCC port, but it’s a problem for me to make sure it can handle all the varieties of standard headers out there.
So, the other day I had this idea: why not create "fake" standard C header files, just for pycparser? After all, it doesn’t need much out of them, only to know which identifiers are types. It doesn’t care, for example, about declarations of functions, because in C function calls are unambiguous and can be tagged as such without seeing the function declaration (verifying the number and types of arguments is another matter, but pycparser doesn’t do that anyway).
So, using pycparser itself I’ve parsed the standard header files from MinGW, and detected all typedef statements. Then, I generated fake typedef statements out of them into a single header file, and added an include of this file into empty .h files named exactly like all the standard headers.
The same was done with all the #define constants, since cpp needs those to operate correctly.
Note that I didn’t have to keep the full typedef for each definition, just a fake:
typedef int FILE;
This is because pycparser really doesn’t care about the type FILE was defined to be, it only needs to know that FILE is a type.
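The idea can be sketched in a few lines of Python. This is only an illustration of the approach, not the script actually used to produce the released files: the function name, the shared header name and its include guard are assumptions made for the example, and the lists of types and constants are supplied by the caller rather than extracted from MinGW.

```python
import os


def write_fake_headers(out_dir, header_names, type_names, defines):
    """Write stub standard headers that all include one shared fake header.

    All names here are hypothetical and chosen for illustration only.
    """
    os.makedirs(out_dir, exist_ok=True)
    shared = "_fake_typedefs.h"  # assumed name for the shared header

    with open(os.path.join(out_dir, shared), "w") as f:
        # Include guard, so including several stub headers is harmless.
        f.write("#ifndef FAKE_TYPEDEFS_H\n#define FAKE_TYPEDEFS_H\n")
        # Constants that the preprocessor needs to see.
        for name, value in sorted(defines.items()):
            f.write("#define %s %s\n" % (name, value))
        # Every type becomes a plain int: only the name matters to the parser.
        for name in type_names:
            f.write("typedef int %s;\n" % name)
        f.write("#endif\n")

    # Each "standard" header is otherwise empty and just pulls in the shared file.
    for header in header_names:
        with open(os.path.join(out_dir, header), "w") as f:
            f.write('#include "%s"\n' % shared)
```

Calling it with, say, the headers stdio.h and stdlib.h and the types FILE and size_t would produce a directory that can be handed to cpp with -I in place of the real system headers.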
The directory with the fake include files was released in utils/fake_libc_include with pycparser version 1.04, and can also be accessed directly from the pycparser SVN. With it in place, pycparser no longer depends on real standard C header files, and also runs faster because the fake includes are much smaller and simpler.
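For readers who want to try it, here is a minimal sketch of pointing pycparser at the fake headers. It assumes that cpp is on the PATH and that pycparser_root is wherever the distribution was unpacked; the two helper functions are made up for this example, while parse_file and its use_cpp, cpp_path and cpp_args parameters are part of pycparser.

```python
import os


def fake_libc_cpp_args(pycparser_root):
    """Build the -I argument pointing cpp at the fake headers.

    pycparser_root is an assumed location of the unpacked distribution.
    """
    fake_dir = os.path.join(pycparser_root, "utils", "fake_libc_include")
    return "-I" + fake_dir


def parse_with_fake_headers(c_file, pycparser_root):
    """Preprocess c_file against the fake headers and return its AST."""
    # Imported here so the helper above works even without pycparser installed.
    from pycparser import parse_file

    return parse_file(
        c_file,
        use_cpp=True,
        cpp_path="cpp",  # assumes cpp is available on the PATH
        cpp_args=fake_libc_cpp_args(pycparser_root),
    )
```

The -I directive is passed as one string with no space after -I, so the whole thing reaches cpp as a single argument.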
June 10th, 2009 at 18:43
Hi Eli,
I downloaded version 1.04 and the fake headers are not in the util directory.
You might want to re-release to fix that.
Otherwise I got a simple test running which is nice.
Another issue I have is that cpp_args expects a string and I can’t provide a list of -I directives. If I do, the strings are passed to cpp as a single argument, so it does not work.
Thanks a lot for writing this cool parser, I like the simple visitors and the fact that it provides real C parsing in pure Python.
June 12th, 2009 at 06:48
@Pierre,
Thanks for letting me know about the missing headers. I’ve fixed it now.
Regarding cpp_args, can’t you just join your -I strings?
October 15th, 2009 at 00:27
@eliben, you can’t just join the -I strings. Look at parse_file, and how it takes cpp_args as a single string, and adds it as one element to the path_list list. If you give cpp_args multiple args, you end up with:
rather than
You might want to either call split() on cpp_args, or have cpp_args be passed in as a list.
Does that make sense?
Oh, and thanks for the awesome pycparser!
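The difference described in this comment can be sketched as follows. This is not pycparser's code and not the patch mentioned in the thread; both functions are hypothetical stand-ins that only show how a command list ends up looking when cpp_args is appended whole versus split first.

```python
import shlex


def build_path_list_naive(cpp_path, cpp_args, filename):
    """Append cpp_args as one list element (the behaviour being reported)."""
    return [cpp_path, cpp_args, filename]


def build_path_list_split(cpp_path, cpp_args, filename):
    """Accept either a string or a list, giving one element per argument."""
    path_list = [cpp_path]
    if isinstance(cpp_args, str):
        # shlex.split honours quoting, so a quoted path with spaces survives.
        path_list += shlex.split(cpp_args)
    else:
        path_list += list(cpp_args)
    path_list.append(filename)
    return path_list
```

With cpp_args set to '-Ifoo -Ibar', the naive version hands cpp a single argument containing a space, while the split version hands it -Ifoo and -Ibar separately.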
October 15th, 2009 at 00:36
@eliben, here’s a patch. I hacked in the solution, and then tried from memory to get the file back to the original, so it may be a little off, but this is the basic idea:
With this, I can pass cpp_args='-Ifoo -Ibar' and it works fine now.
Of course, you might rather do it another way; whatever works!
And thanks again. I really enjoy pycparser.
October 16th, 2009 at 15:01
Sam,
I’ve uploaded version 1.05 where this is fixed. Now you can pass a list to that function and it will work as expected. Thanks for raising the issue.
October 16th, 2009 at 22:14
@eliben — thanks!
- Sam
December 14th, 2012 at 17:55
I have used pycparser to parse my C code, which is written to verify HW functions on FPGA, to generate test reports. This program is really awesome!! Thanks!