Contributing to Python

July 23rd, 2010 at 5:07 pm

Update (13.01.2011): Last week I was granted commit rights to the Python repository. Now I’m officially a core Python developer :-)

I’ve been involved in open-source projects almost since the first days of my "serious" programming (back in 1998), but these were always projects I started myself. I’ve long been thinking about joining one of the big and established open-source projects, both to make a contribution and to improve my own skills by working with some great people on interesting things.

Once I started tinkering with Python around two years ago, it became the leading candidate for my contribution, both because working on Python can really make a difference for a huge number of users, and because Python’s inner development circles include some of the brightest programmers I’ve ever run into. Joining this clique, even as a humble minor contributor, is very appealing.

So, a few weeks ago, inspired by a couple of articles, I finally took the plunge.

http://eli.thegreenplace.net/wp-content/uploads/2010/07/smilingpython.gif

For now, my contributions are very minor: I’ve been involved in a few issues, and made several patches. A few were even committed into Python: one documentation patch and two patches fixing bugs in the trace.py module in Python 3.x.

I’m also "in progress" on several other issues: work on the trace.py module (improving its documentation, adding unit tests and debugging some issues with 3.x), documentation fixes for some standard library modules, and a bug fix for difflib. Once you make the first step, finding more things to work on is quite easy. Python’s code and documentation are of relatively high quality, but as in any major software project, there’s room for improvement almost everywhere you look, even if the improvements are very minor (making the documentation clearer or more consistently formatted).

A few words on how I work on Python.

Although Python is well supported on Windows and can be built on it without much trouble, Linux is the most convenient platform to use for development IMO. I’m using an Ubuntu VM running on VirtualBox on top of my Windows XP machine.

Python’s code is kept in a Subversion repository, to which you get only read-only access unless you’re a core committer. This means you can’t commit anything to the repository, and if you want to save your temporary work, you’re on your own.

Luckily, Python is in the process of moving to Mercurial, and already has a functional mirror set up. Mercurial is a much better SCM tool for this purpose, because it allows you to work locally with your repository, only pulling changes from the official one when necessary.

Here’s my workflow with the Mercurial mirror of Python:

http://eli.thegreenplace.net/wp-content/uploads/2010/07/pythonrepos.png

My local Mercurial repo is where I do all my hacking, occasionally backing up to my personal clone at code.google.com. This lets me explore various ideas and create temporary fixes, all with full version control. From time to time, I pull a fresh snapshot from Python’s official Mercurial mirror to get back on track, but I will always be able to get back to my own changes, because everything is safely stored in the history of my repo.
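The pull and backup halves of this workflow can be sketched as a small Python helper. This is only an illustration: the two URLs are placeholders (the post does not give the real addresses), and the function names are made up for the example. Only the `hg pull` and `hg push` commands themselves are real.

```python
import subprocess

# Placeholder URLs (assumptions): substitute the real mirror and your own clone.
MIRROR_URL = "https://example.org/python-mirror"
BACKUP_URL = "https://example.org/my-python-clone"


def pull_command(source=MIRROR_URL):
    """Command that fetches new changesets from the mirror into the local repo."""
    return ["hg", "pull", source]


def backup_command(dest=BACKUP_URL):
    """Command that pushes local changesets to the personal clone."""
    return ["hg", "push", dest]


def run(command, repo_dir):
    """Run one hg command inside the local repository (requires hg on PATH)."""
    subprocess.run(command, cwd=repo_dir, check=True)
```

Note that `hg pull` only brings changesets into the repository; the working directory changes only after a separate `hg update` (or `hg merge`), which is what makes it safe to pull without losing local work.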

However, I still keep the SVN checkouts around, because:

  1. I want to make sure my changes work on a clean check-out from Python’s official repository, which is still SVN.
  2. I create patches against the SVN repo (with svn diff), because Mercurial creates slightly different diffs. Since committers actually commit into the SVN repo, this makes their lives easier.
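Producing such a patch amounts to capturing the output of `svn diff` in a file. A minimal sketch, assuming `svn` is on PATH; the helper names and the example paths are hypothetical:

```python
import subprocess


def svn_diff_command(paths=()):
    """Build the svn diff command, optionally limited to specific paths."""
    return ["svn", "diff", *paths]


def write_patch(checkout_dir, patch_path, paths=()):
    """Run svn diff inside the SVN checkout and save its output as a patch file."""
    result = subprocess.run(
        svn_diff_command(paths),
        cwd=checkout_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    with open(patch_path, "w") as patch_file:
        patch_file.write(result.stdout)
```

For example, `write_patch("py3k-svn", "trace-fix.patch", ["Lib/trace.py"])` would write a patch covering only that one file, where `py3k-svn` and `trace-fix.patch` are made-up names.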

It’s easy to keep several versions of Python around. For example, I have the repositories for the 3.x development branch (both Mercurial for hacking and SVN for patches), plus the 2.7 and 2.6 maintenance branches. To get a new version/branch all one needs is:

  1. Check it out from SVN or clone from Mercurial
  2. configure and then make
  3. Create a link somewhere on PATH to the relevant executable (for example I have in ~/bin a link named py27 for the 2.7 version, py3d for the debug build of the latest 3.x, and so on). The Python interpreter, once executed, knows where to find its own libraries, making it very simple to work with several versions of Python simultaneously.
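Steps 2 and 3 can be sketched in Python as follows. The function names are invented for this example, and the sketch assumes the build leaves an executable named `python` in the root of the checkout (on case-insensitive filesystems the build names it `python.exe` instead). The checkout step is left out because it depends on repository URLs not given here.

```python
import os
import subprocess
from pathlib import Path


def build_commands(debug=False):
    """Return the build commands to run inside a checkout, in order."""
    configure = ["./configure"]
    if debug:
        configure.append("--with-pydebug")  # configure option for a debug build
    return [configure, ["make"]]


def build(checkout, debug=False):
    """Run configure and then make inside the checkout directory."""
    for command in build_commands(debug):
        subprocess.run(command, cwd=checkout, check=True)


def link_interpreter(checkout, name, bin_dir):
    """Create bin_dir/name as a symlink to the interpreter built in checkout."""
    target = Path(checkout).resolve() / "python"  # assumed executable name
    link = Path(bin_dir) / name
    if link.is_symlink():
        link.unlink()  # replace a stale link from an earlier build
    os.symlink(target, link)
    return link
```

With this, something like `link_interpreter("python27", "py27", Path.home() / "bin")` would recreate the `py27` link described above, where `python27` stands in for wherever the 2.7 checkout lives.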

To conclude, now you know what’s been keeping me busy in the past month or so. Contributing to Python is something I’ve long wanted to do, and I’m happy that I finally started. It turned out to be much less difficult than I originally expected, and I now firmly believe that any competent developer with the desire to help and some free time on his hands can become a contributor.

P.S. I had the privilege of receiving useful guidance from Terry Reedy, and I’d like to thank him for that. We still cooperate on several issues, and I hope we’ll continue working together. "Pair-contribution" seems like an interesting model the Python community may want to look into. I also want to thank Alexander Belopolsky for getting my fixes for trace.py quickly committed.
