I've been waiting to upgrade Python to 2.6 for quite some time now. Python 2.6 is a good starting point for a future transition to Py3K, plus it has some nice new features over 2.5 that I was eager to use.

However, when you're writing serious applications in Python (especially for work) that use 3rd party libraries, upgrading isn't simple: you have to wait until every library you depend on supports the new version.

Well, this week the last library I've been waiting for has finally announced Python 2.6 support - PyQwt [1]. So I've cleaned up my Python 2.5 installation [2], installed 2.6 [3] with all the modules I routinely use, and now I'm running Python 2.6!

Python 2.6 has quite a few new features. Here are some that I find most interesting:

The documentation

The documentation was revamped and is now generated with Sphinx, a tool that produces nice HTML from reStructuredText. The difference is noticeable when browsing: the formatting is definitely friendlier, and the documentation of some modules appears to have been improved with more examples.

multiprocessing

A lot is being written about Python threads and their inability to really use multi-core CPUs because of the GIL. Well, the multiprocessing module addresses this by providing an API for spawning child processes that closely mirrors the threading API. It also supports Queue and Pipe for convenient, synchronized communication between processes.

It looks like multiprocessing is the answer people needed to make Python programs faster by utilizing multiple CPUs and cores. Because its API follows that of threading, it's easy to use in safe and powerful ways, which is really great.
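
To give a feel for the API, here is a minimal sketch of child processes reporting results back through a Queue. The names square_worker and parallel_squares are made up for this illustration, and the code assumes it is run as a script, so that the child processes can find the worker function (hence the __main__ guard).

```python
from multiprocessing import Process, Queue


def square_worker(numbers, results):
    """Runs in a child process: put (n, n*n) pairs on the shared queue."""
    for n in numbers:
        results.put((n, n * n))


def parallel_squares(chunks):
    """Square every number in every chunk, one child process per chunk."""
    results = Queue()
    workers = [Process(target=square_worker, args=(chunk, results))
               for chunk in chunks]
    for w in workers:
        w.start()
    # Collect the results before joining, so no child is left blocked
    # on a queue that nobody is reading.
    expected = sum(len(chunk) for chunk in chunks)
    squares = dict(results.get() for _ in range(expected))
    for w in workers:
        w.join()
    return squares


if __name__ == '__main__':
    print(parallel_squares([[1, 2, 3], [4, 5, 6]]))
```

Apart from the import line, this reads almost exactly like the equivalent threading code using Thread and Queue.Queue.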

'with' statement in the present, not the future

I really like the context managers and with statement features introduced in Python 2.5. The only problem is that you had to put from __future__ import with_statement in every file using them. Well, in 2.6 you no longer have to.
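
As a small illustration, here is a sketch of a context manager built with contextlib.contextmanager and used without any __future__ import. The tag helper and the lines list are hypothetical names chosen for the example.

```python
from contextlib import contextmanager


@contextmanager
def tag(name, out):
    """Append an opening tag on entry and the closing tag on exit."""
    out.append('<%s>' % name)
    try:
        yield
    finally:
        # Runs even if the body of the with block raises.
        out.append('</%s>' % name)


lines = []
with tag('b', lines):
    lines.append('hello')
```

After the block runs, lines holds the opening tag, the text and the closing tag, in that order.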

ABCs

I know, I know, Python has duck typing and enforcing interfaces is needlessly restrictive. However, I still find Abstract Base Classes useful from time to time, if only to document an interface that user implementations should adhere to. I'll quote Doug Hellmann:

This capability is especially useful in situations where a third-party is going to provide implementations, such as with plugins to an application, but can also aid you when working on a large team or with a large code-base where keeping all classes in your head at the same time is difficult or not possible.

This is from PyMOTW: ABC which seems like a nice tutorial.
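
Here is a minimal sketch of the plugin scenario from the quote. Plugin and UpperPlugin are invented names. In 2.6 you would normally write __metaclass__ = ABCMeta in the class body; the sketch calls the metaclass directly only so that the same code also runs unchanged on Python 3.

```python
from abc import ABCMeta, abstractmethod

# An empty base class whose metaclass is ABCMeta. Subclasses inherit
# the metaclass, so @abstractmethod is enforced for them.
AbstractBase = ABCMeta('AbstractBase', (object,), {})


class Plugin(AbstractBase):
    """The interface every plugin is expected to implement."""

    @abstractmethod
    def load(self, data):
        """Process data and return the result."""


class UpperPlugin(Plugin):
    def load(self, data):
        return data.upper()


plugin = UpperPlugin()          # fine: load() is implemented
try:
    Plugin()                    # abstract: cannot be instantiated
except TypeError as err:
    error_message = str(err)
```

A subclass that forgets to implement load fails at instantiation time with a TypeError, rather than much later when the missing method is first called.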

bin

This is a small feature, but a nice one nevertheless. The new built-in bin function provides a simple and fast way to represent numbers as binary strings:

>>> bin(42)
'0b101010'

Due to the kind of work I mostly do with Python (embedded system communication, binary parsing, etc.) I always had to implement the functionality of bin on my own. Now I don't have to, and the built-in is faster, which is great.
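
For fixed-width fields, which is what I usually need, bin combines nicely with string methods. The following is a small sketch; to_bits is a hypothetical helper and it assumes non-negative values.

```python
def to_bits(value, width=8):
    """Return a non-negative integer as a zero-padded binary string."""
    # bin() prefixes its result with '0b'; strip it before padding.
    return bin(value)[2:].zfill(width)


header = 0x2A
bits = to_bits(header)      # '00101010'
restored = int(bits, 2)     # back to 42
```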

fractions

The new fractions module adds a Fraction type for exact rational arithmetic:

>>> from fractions import Fraction
>>> Fraction(16, -10)
Fraction(-8, 5)
>>> Fraction(123)
Fraction(123, 1)
>>> Fraction()
Fraction(0, 1)
>>> Fraction('3/7')
Fraction(3, 7)
>>> Fraction(' -3/7 ')
Fraction(-3, 7)
>>> Fraction('1.414213 \t\n')
Fraction(1414213, 1000000)
>>> Fraction('-.125')
Fraction(-1, 8)

So far I've only needed fractions (rational numbers) for solving Project Euler problems. I downloaded and used SymPy especially for its Rational class. Now there's a built-in.
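
The session above only constructs fractions, so here is a short sketch of the arithmetic, which is where the module pays off. The variable names are just for illustration.

```python
from fractions import Fraction

# Exact arithmetic: no floating-point rounding along the way.
harmonic = sum(Fraction(1, n) for n in range(1, 5))   # 1 + 1/2 + 1/3 + 1/4
half = Fraction(1, 3) + Fraction(1, 6)

# limit_denominator() finds the closest fraction whose denominator
# does not exceed the given bound.
approx_pi = Fraction('3.1415926535897932').limit_denominator(1000)
```

Here harmonic comes out as Fraction(25, 12), half as Fraction(1, 2), and approx_pi as the classic Fraction(355, 113).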

namedtuple

namedtuple in the collections module is a useful idiom I borrowed into my 2.5 code a long time ago. It's great to have it built-in at last.

>>> from collections import namedtuple
>>> MessageType = namedtuple('MessageType', 'id src dest data')
>>> new_msg = MessageType(id=12, src=0x1123, dest=0x1255, data='sdasdfsdf')
>>> new_msg
MessageType(id=12, src=4387, dest=4693, data='sdasdfsdf')
>>> new_msg.id
12
>>> for f in new_msg:
...   print f
...
12
4387
4693
sdasdfsdf
>>>

namedtuple is immediately useful, but it's still a bit unpolished, IMHO. Hopefully now that it's in the standard library, it will be worked on and improved even more.
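
A few more things a namedtuple can do, sketched with the same MessageType as in the session above (the payload string and the new dest value are arbitrary):

```python
from collections import namedtuple

MessageType = namedtuple('MessageType', 'id src dest data')
msg = MessageType(id=12, src=0x1123, dest=0x1255, data='payload')

# Fields unpack like an ordinary tuple.
msg_id, src, dest, data = msg

# Instances are immutable; _replace() returns a modified copy instead.
rerouted = msg._replace(dest=0x2000)

# The field names are available for introspection.
field_names = MessageType._fields
```

The leading underscores in _replace and _fields do not mean "private" here; they are there to avoid clashing with user-chosen field names.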

new itertools

Several new generators have been added to the itertools module: product, permutations and combinations. These are very handy when dealing with combinatoric problems.

For example:

>>> import itertools
>>> list(itertools.combinations('XYZ', 2))
[('X', 'Y'), ('X', 'Z'), ('Y', 'Z')]
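
The other two functions work along the same lines. A short sketch, with variable names chosen only for the example:

```python
import itertools

# Ordered selections of 2 items, without repetition.
perms = list(itertools.permutations('XYZ', 2))

# Cartesian product of two iterables.
pairs = list(itertools.product('AB', [0, 1]))

# The repeat argument multiplies an iterable with itself:
# here, all 3-bit patterns.
bit_patterns = [''.join(bits) for bits in itertools.product('01', repeat=3)]
```

perms holds the six ordered pairs starting with ('X', 'Y'), pairs is [('A', 0), ('A', 1), ('B', 0), ('B', 1)], and bit_patterns runs from '000' to '111'.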

os.path.relpath

A new, useful function that was added to the os.path module:

>>> import os
>>> os.path.relpath(r'c:\data\utils\temp', r'c:\data')
'utils\\temp'
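
relpath can also climb out of the start directory. The sketch below uses posixpath, the POSIX flavor of os.path, so that it gives the same result on any platform; the directory names are made up.

```python
import posixpath

# Target below the start directory.
inside = posixpath.relpath('/data/utils/temp', '/data')

# Target in a sibling branch: the result climbs up with '..' first.
sibling = posixpath.relpath('/data/logs', '/data/utils/temp')
```

inside is 'utils/temp' and sibling is '../../logs'.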

Queue.PriorityQueue and Queue.LifoQueue

I just love the Queue module, and use it almost every time I have to write threaded (and soon, multiprocessing) code. It's simply the best way to communicate between threads in Python.

2.6 has added two new types of synchronized queues: a priority queue and a LIFO queue (stack). I still don't know what I'm going to use these for, but it's great to have them in the toolbox.
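
A quick sketch of how the two behave. The task names and priorities are invented, and the try/except around the import is there only because the module is called Queue in 2.6 and queue in Python 3.

```python
try:
    import queue                # Python 3 name
except ImportError:
    import Queue as queue       # Python 2.6 name

# PriorityQueue hands back the smallest entry first, so
# (priority, item) tuples come out in priority order.
tasks = queue.PriorityQueue()
for entry in [(2, 'write log'), (0, 'handle interrupt'), (1, 'parse packet')]:
    tasks.put(entry)
urgent_first = [tasks.get()[1] for _ in range(3)]

# LifoQueue is a synchronized stack: last in, first out.
stack = queue.LifoQueue()
for n in [1, 2, 3]:
    stack.put(n)
newest_first = [stack.get() for _ in range(3)]
```

urgent_first is ['handle interrupt', 'parse packet', 'write log'] and newest_first is [3, 2, 1].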

json

This is more of an anti-feature, as I see it. I would really, really prefer YAML to be included as a built-in. Not only is YAML more useful than JSON [4], but implementing JSON once you have a YAML parser is trivial.

Yes, there's PyYAML, it's great and all. But I wish I had one less module to install.
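
For completeness, this is roughly what using the new module looks like. It's a minimal round-trip sketch, and the message dictionary is made up.

```python
import json

message = {'id': 12, 'src': 4387, 'data': [1, 2, 3]}

# sort_keys makes the output order deterministic.
encoded = json.dumps(message, sort_keys=True)
decoded = json.loads(encoded)
```

encoded is the string '{"data": [1, 2, 3], "id": 12, "src": 4387}', and decoded compares equal to the original dictionary.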

[1]Well, actually PyQwt has supported 2.6 for some time now, but I was waiting for the Windows binary installer.
[2]Which isn't very convenient when you have dozens of modules installed with Windows binary installers. I couldn't find a better way than uninstalling them one by one.
[3]ActivePython 2.6.2
[4]YAML can do everything JSON can, and much more. For example, it's quite convenient as a configuration file format and as a replacement for XML in general.