Two years ago I decided to move my main Python development from version 2.5 to 2.6, and wrote about the new features of 2.6 which I found most useful.

Now I think it's a good time for me to move to 2.7, so this post has a similar goal.

Python 2.7 is not an ordinary version. It's the last major release of the 2.x series: no new language features or libraries will be added to Python 2.x, and all new development is focused on 3.x. However, since the Python core development team recognizes that Python 2.x will be in use for a long time yet (even if just for legacy systems), the 2.7 release will stay in maintenance mode for longer than usual, and there's a real effort to fix any bugs found in it.

Python 2.6 inherited some features from 3.x to ease a future transition (for example, it's not very hard to write code that runs on both Python 2.6 and 3.x). Version 2.7 takes this a step further by adding even more features from 3.x, which both make the transition easier and improve the language overall. Here are some examples:

set literals and comprehensions

Thanks to the new set literal syntax, the following two statements are equivalent:

>>> myset = set([2, 3])
>>> myset = {2, 3}

Now sets are as "built-in" in Python as lists and dictionaries are, and this is a good thing, because sets are a very convenient data structure for many different purposes. It's not that sets weren't very usable in previous versions - but their own syntax gives them a "first-class citizen" status.

Additionally, set and dict comprehensions are now possible:

>>> {i*2 for i in range(5)}
set([0, 8, 2, 4, 6])
>>> {i: i*2 for i in range(5)}
{0: 0, 1: 2, 2: 4, 3: 6, 4: 8}

List comprehensions make a lot of problems easy to express succinctly. Dict and set comprehensions further improve Python in this respect.
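
As a small illustration of where these comprehensions help, here is a sketch that collects distinct values and builds or inverts a mapping in one expression each. The sample data and variable names are made up for the example.

```python
# Hypothetical sample data, used only for illustration.
words = ["spam", "eggs", "spam", "ham", "eggs"]

# Set comprehension: the distinct word lengths.
lengths = {len(w) for w in words}

# Dict comprehension: map each distinct word to its length.
length_by_word = {w: len(w) for w in words}

# Dict comprehension: invert a mapping (assumes the values are unique
# and hashable, otherwise later entries overwrite earlier ones).
codes = {"a": 1, "b": 2, "c": 3}
letters_by_code = {code: letter for letter, code in codes.items()}
```

Before 2.7 the usual spelling was set(...) or dict(...) wrapped around a generator expression, which works but reads less directly.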

unittest enhancements

The unittest module received some very welcome additions. Unlike most parts of the standard library, unittest is something I use in virtually every project, so any improvement to it matters a lot. Some highlights:

  • unittest now has simple test discovery capabilities. This means you can run it on a directory (by invoking python -m unittest discover) and it will automatically find and execute all the test files in it. This is a feature other unit-testing libraries (like nose) boast, and it's great to have it built-in.
  • assertRaises can now work as a context manager, which makes it much more pleasant to use.
  • Resource allocation and cleanup for tests should now be much easier to implement, thanks to the new class-level and module-level fixtures (setUpClass/tearDownClass and setUpModule/tearDownModule) and the addCleanup method for registering additional custom cleanup functions.
  • A lot of useful new assertion methods have been added, for example assertIs, assertIn, assertIsInstance and assertRaisesRegexp. The first three also have negative variants (assertIsNot, assertNotIn, assertNotIsInstance).
  • assertEqual became much smarter about reporting failures when comparing lists, dicts, tuples and sets. For example, if two lists aren't equal, only the mismatch is reported instead of dumping them wholly as the old assertEqual would do.
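
Several of the additions listed above can be seen together in a short sketch. The test class and its contents are invented for the example. It runs the tests programmatically instead of calling unittest.main(), and it leaves out assertRaisesRegexp, which was later renamed in 3.x.

```python
import io
import unittest


class NewFeaturesTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, not once per test.
        cls.shared = [1, 2, 3]

    def setUp(self):
        self.log = io.StringIO()
        # Registered cleanups run after the test, even if it fails.
        self.addCleanup(self.log.close)

    def test_pop_from_empty_list_raises(self):
        # assertRaises used as a context manager.
        with self.assertRaises(IndexError):
            [].pop()

    def test_new_assertions(self):
        self.assertIn(2, self.shared)
        self.assertNotIn(7, self.shared)
        self.assertIsInstance(self.shared, list)
        self.assertIs(self.shared, NewFeaturesTest.shared)


# Load and run the tests without exiting the interpreter.
suite = unittest.TestLoader().loadTestsFromTestCase(NewFeaturesTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real project this class would live in a file such as test_features.py (a hypothetical name), where python -m unittest discover would pick it up.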

No more cStringIO

StringIO is a great tool, but it's written in pure Python and is therefore slow, which can limit its usefulness when performance is critical. So for a long time we've been used to importing it from the alternative C-based module cStringIO.

Python 2.6 backported the io module from 3.x, but with an old and inefficient pure-Python implementation of StringIO. Python 2.7 finally brings the C-optimized io module from 3.1, and cStringIO is no longer needed (it still exists, of course). Just:

>>> from io import StringIO

And you're good to go. This will work in Python 3.x too, of course, so one less compatibility issue to worry about.

Note that in Python 3.x, I/O handling has been considerably redesigned, in part to accommodate the stricter distinction between textual and binary data that 3.x makes. The io module backported to 2.x lets you use the new design, although it's not the default for file and stream handling (i.e. the built-in open function won't use it). One consequence is that io.StringIO in 2.7 works with unicode objects rather than plain str, with io.BytesIO as its counterpart for binary data. When applicable (such as with StringIO), it's recommended to use the io module in 2.7, for an easier future transition to 3.x.
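
Here is a minimal sketch of the text/binary split in practice. It is written to run unchanged on 2.7 and on 3.3 or later, and the strings and variable names are only illustrative.

```python
import io

# io.StringIO holds text, so on Python 2.7 it expects unicode objects.
# The u"" prefix keeps this snippet valid on 2.7 and on 3.3+.
buf = io.StringIO()
buf.write(u"first line\n")
buf.write(u"second line\n")
text = buf.getvalue()

# Reading back works like any other file object.
lines = io.StringIO(text).readlines()

# Binary data goes through io.BytesIO instead.
raw = io.BytesIO(b"\x00\x01\x02")
first_two = raw.read(2)
```

Code that passes plain str objects around on 2.7 will need either u"" literals (or unicode_literals from __future__) or BytesIO, depending on whether the data is really text.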

OrderedDict

Python prides itself on being a "batteries included" language, but naturally you can't have all possible batteries included from the start, so the standard library slowly grows over time, adding more and more useful tools.

One tool that has been on the "wanted" list for a long time is an ordered dictionary - a dictionary that remembers the order in which items were inserted into it. As a result, this data structure has appeared "unofficially" in many different recipes and forms. Python 2.7 adds the collections.OrderedDict class for this purpose, so that's another battery now included in the standard library.
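
A brief sketch of the behavior follows. The keys and values are made up for the example.

```python
from collections import OrderedDict

settings = OrderedDict()
settings["host"] = "localhost"
settings["port"] = 8080
settings["debug"] = True

# Iteration follows insertion order.
keys_in_order = list(settings.keys())

# Re-assigning an existing key keeps its original position.
settings["host"] = "example.org"
keys_after_update = list(settings.keys())

# popitem() removes the most recently inserted item;
# popitem(last=False) removes the oldest one.
newest = settings.popitem()
oldest = settings.popitem(last=False)
```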

argparse

Not satisfied with the two existing tools for parsing command-line arguments (getopt and optparse), Python 3.2 added a third - argparse, a more capable successor to optparse (which is also why optparse is now deprecated). Python 2.7 includes this module as well. So if you feel optparse is too limiting for your needs, you will be happy with this change.
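
For a taste of the API, here is a minimal sketch of a parser with one positional argument and two options. The tool, its argument names and the sample command line are all hypothetical.

```python
import argparse

parser = argparse.ArgumentParser(description="Hypothetical file-copy tool.")
# Positional arguments are declared the same way as options.
parser.add_argument("source", help="file to read")
parser.add_argument("-n", "--count", type=int, default=1,
                    help="number of copies to make")
parser.add_argument("-v", "--verbose", action="store_true",
                    help="print progress messages")

# Passing an explicit list avoids touching sys.argv; a real script
# would normally call parser.parse_args() with no arguments.
args = parser.parse_args(["data.txt", "--count", "3", "-v"])
```

Positional arguments and type conversion come declared in one place here, and argparse generates the usage and help text from the same declarations.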

There are many more changes in 2.7; the above is just a sample of what I found interesting. For the full list, see the What's New in Python 2.7 documentation page. Overall, it's a pretty good update, making Python coding even more pleasant.

I want to stress again the importance of the 2.7 release as the last in the 2.x series. That's it, the last incremental upgrade. Consider that while 2.7 borrows features from 3.1 and 3.2, when 3.3 is released there will be no corresponding 2.x release to backport its features to. IMHO this is when Python 3.x will finally start to really diverge from 2.x (in terms of features), and this will hopefully encourage more people to make the switch. For now, my recommendation is to make sure your code runs on both 3.x and 2.x. With 2.7, that isn't very hard to achieve.