Making code compatible with Python 2 and 3

Update: Thanks for the great comments! To new readers of this post - make sure to skim the comments after you finish reading. There is some great advice there for making the change simpler - especially when you need to be compatible only with 2.6 and not the earlier versions (2.6 was especially designed to make future transition to 3K simpler).

Python 3 has been available for a long time already, but the migration of modules to it is going slower than many Python aficionados would have hoped. Once code is ported to Py3k in the usual way, it typically no longer runs on 2.x. This is the reason many library authors are afraid to take the step and port their code - they rightfully refuse to maintain two code bases. So we have a "lack of critical mass" problem.

In my opinion, to make the migration easier, it makes sense to write code that can run on both Python 2 and 3, at least for some time. Yes, this can make some parts of the code a bit ugly (although most of it can be hidden), but it allows porting without actually having to maintain two code bases. Once the critical mass assembles, compatibility with 2.x can be dropped.

To contribute my share to the effort, I've successfully transformed two of my major code-bases to run on both Python 2.6 and 3.1:

pycparser - the ANSI C parser in pure Python: the new version (1.07) can run on both versions of Python (other than that, it isn't different from 1.06)
Luz - the assembler/linker/CPU simulator suite has also been ported.

This porting was easier than I had hoped. Since this was the first time I touched Python 3, I had to use a few resources for help in the transition. Some of the best ones: Dive into Python, Mark Lutz's site and Ned's post.

Here's a list of some tricks I had to use, in no particular order. First and foremost, I created a portability.py file to encapsulate the differences as much as possible. Sometimes I had to use the following check to differentiate between Python versions:

if sys.hexversion > 0x03000000:

Luckily, all such checks could be confined to portability.py.

Here's an example of a couple of functions from portability.py:

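import sys


def printme(s):
    # Write via sys.stdout so the same call works on Python 2 and 3.
    # Unlike print, this does not append a newline.
    sys.stdout.write(str(s))


def get_input(prompt):
    # raw_input was renamed to input in Python 3.
    if sys.hexversion > 0x03000000:
        return input(prompt)
    else:
        return raw_input(prompt)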

Python 3 made print a function, so the print statement doesn't even parse there. printme is a function that both versions of Python can call. It's not as versatile as print itself (it takes a single argument and adds no newline), but that's a minor inconvenience since I mostly used print for debugging, testing and some trivial output.

get_input encapsulates the lack of raw_input in Python 3.

Another problem I commonly had to tackle is catching exceptions. Since the syntax was changed in Python 3, I had to resort to this for portability:

except TypeError:
    err = sys.exc_info()[1]

This code runs in both versions and places the exception object in err; str(err) gives the message.
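To show the idiom in context, here is a minimal complete example. The function parse_int is a hypothetical helper invented for the illustration; only the sys.exc_info()[1] line is the technique described above.

```python
import sys


def parse_int(text):
    """Hypothetical helper: return (value, error) instead of raising."""
    try:
        return int(text), None
    except ValueError:
        # No "except ValueError, err" or "except ValueError as err" here:
        # fetch the exception currently being handled instead.
        err = sys.exc_info()[1]
        return None, err


value, err = parse_int("12x")
print(type(err).__name__)  # ValueError
print(str(err))            # the human-readable message
```

sys.exc_info() returns a (type, value, traceback) tuple, so index 1 is the exception instance itself.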

Some differences were very easy to handle. For example, Python 3 removed xrange, so I've just used list(range(...)). Had performance really mattered, I would have had to use something more complex. Also, itertools.imap was removed, so I replaced it with iter(map(...)). Dictionaries lost their has_key method, but key in dict works well on both versions of Python, so this is another easy change.
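The replacements above look roughly like this in practice. The lazy_range alias is one possible form of the "something more complex" for performance-sensitive loops; it is an assumption of this sketch, not code taken from pycparser or Luz, and the sample data is invented.

```python
# xrange -> range. On Python 2 this builds a full list, on Python 3 it is lazy.
total = 0
for i in list(range(5)):
    total += i

# Possible alternative when the list would be too large: pick the lazy
# variant by feature detection (lazy_range is a hypothetical name).
try:
    lazy_range = xrange  # Python 2
except NameError:
    lazy_range = range   # Python 3

# itertools.imap -> iter(map(...)): an iterator on both versions.
lengths = iter(map(len, ["ab", "cde", "f"]))
first = next(lengths)

# d.has_key(k) -> k in d
registers = {"r0": 0, "r1": 1}
has_r1 = "r1" in registers
```

Note that on Python 2, iter(map(...)) still builds the whole list first, so it matches imap in interface only, not in memory use.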

Luz is a relatively large project, sub-divided into packages and many modules, so relative vs. absolute imports gave me some trouble. Luckily, the 2.x version I wanted to be compatible with is 2.6, so I could just use relative imports everywhere and it works well on both versions.

The full-test running capabilities in Luz gave me some trouble because I'm using dynamic Python code loading there. The module named new was removed in Python 3, but happily imp.new_module replaces it and works in 2.6 as well. Also, I had to use a trick borrowed from Ned to replace exec with this monstrosity:
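For readers who want to see what dynamic module creation looks like, here is a small sketch. It deliberately swaps in types.ModuleType for imp.new_module, because the imp module was later deprecated and then removed in Python 3.12; the helper name and the sample source are invented for the example.

```python
import types


def load_module_from_source(name, source):
    """Hypothetical helper: build a module object from a source string."""
    module = types.ModuleType(name)
    code = compile(source, "<%s>" % name, "exec")
    # Python 3 form of exec; 2.x code would go through a wrapper
    # around the exec statement instead.
    exec(code, module.__dict__)
    return module


demo = load_module_from_source(
    "demo",
    "x = 2\n"
    "def f():\n"
    "    return x + 1\n",
)
print(demo.f())  # 3
```

The functions defined by the source use the new module's dictionary as their globals, which is why f can see x.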

# Borrowed from Ned Batchelder
if sys.hexversion > 0x03000000:
    def exec_function(source, filename, global_map):
        exec(compile(source, filename, "exec"), global_map)
else:
    eval(compile("""\
def exec_function(source, filename, global_map):
    exec compile(source, filename, "exec") in global_map
""",
    "<exec_function>", "exec"))

Just as with catching exceptions, exec is syntax, so you can't nicely hide it behind a version check: the parser chokes on the 2.x form even if that code section never gets executed. Therefore, a brute-force approach using eval(compile(...)) is called for, since the source string is only compiled at runtime, and only by the interpreter that understands it.
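The difference between parse time and run time can be seen directly with a small probe. In this sketch, parses_here is a hypothetical helper and the source strings are made up; the point is only that version-specific syntax kept inside a string is harmless until compile() is actually called on it.

```python
# Python 2-only statement syntax, kept as data rather than as code.
PY2_ONLY_SOURCE = 'exec "x = 1" in namespace\n'


def parses_here(source):
    """Hypothetical helper: True if this interpreter can parse `source`."""
    try:
        compile(source, "<probe>", "exec")
        return True
    except SyntaxError:
        return False


# This file itself parses on both versions, because the offending
# statement only exists as a string until compile() is called.
print(parses_here(PY2_ONLY_SOURCE))
print(parses_here("x = 1"))
```

On Python 3 the first call returns False, since the exec statement is no longer valid syntax there; on Python 2 it returns True. The second call returns True on both.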

That's about it. From now on I plan to keep both pycparser and Luz working on both versions of Python - it shouldn't be too hard. In the future, when I feel the time is right to make the switch to Py3k, it will be trivial - I'll just clean up all the ugly portability code.

P.S.: To complete such a task you really need good unit tests. I can't imagine making it and staying sane without the extensive tests both code-bases have.