Welcome


... to my place on the web - a personal website where I keep articles, freeware programs and code snippets I've written, and a weblog where I share my ideas and insights on programming, technology, books, and life in general.

To browse the website, use the site map on the right-hand side of the screen. Below, you will find the ten most recent posts.

New toys

May 19th, 2012 at 1:51 pm

I got a couple of new gadgets/toys recently.

The new iPad – white, 16GB wifi version. It’s really sleek – lighter and slimmer than my old iPad (first generation). The branding is confusing, why didn’t they just call it iPad 3? The screen is great, but I wonder what would be the killer application to really let it shine? HD movies? Because when just viewing photographs and browsing, there isn’t too much difference from the old one’s screen.

Kindle touch. I already have a “normal” kindle (also from the new generation), but it’s so useful in our family that I wanted another one. This time I bought a touch screen version thinking that it should make dictionary translations easier (for reading in Spanish, etc). I like its controls, but it’s somewhat fatter and heavier than the normal Kindle which makes it less convenient for holding. I hope it’s not a big difference when I’m actually used to it.

Coursera/Stanford online algorithms I course – a retrospective

May 8th, 2012 at 1:46 pm

The recent influx of free courses offered by some of the leading technical universities of the USA online was too hard to resist and a couple of months ago I enrolled into the Design and Analysis of Algorithms I course, given by the Stanford prof Tim Roughgargen.

The course is now over and I want to write a short review. First the good things:

  • The course website and the whole web app driving it is well executed, especially for something at such an early stage. Whatever small glitches appeared, were quickly fixed without becoming an annoyance.
  • The teacher is pretty good – I actually enjoyed watching the lectures and weren’t bored, which is unusual with me when it comes to lectures.
  • I really liked the "optional" videos (watched them all). Not sure whether they would have been taught in the real course, but they add a lot.

As for the less-good things, my single real gripe is that the course homework and exam were way too easy. Just a few examples of what I mean:

  • It took me, on average, 15-30 minutes per week for the "dry" (problem sets) part, and another 15-30 minutes per week for the "wet" (programming questions) part. This is way, way too little. I’m used to much more challenging homework from the Technion. Particularly, the wet exercises usually take days to complete. I must admit that one of the reasons of taking the course was an anticipation for some interesting algorithmic coding. It was interesting, but way too short, so I’m a bit disappointed here.
  • The exam was extremely easy. It took me less than an hour to solve (out of 3 allotted hours), and the handful of points I lost are due to pure sloppiness (mis-reading one of the questions). The exam included questions from the lectures, for crying out loud! In the Technion’s EE dept, we had a joke that every course teaches you 4 different materials: one in the lectures, another in the practice with the TAs, another in homework and yet another in the exam.

Yes, I know that the lecturer specifically mentioned that the level of the exercises and exam will be less challenging than in a real Stanford course, and I can see the reasons for that. However, I still wish that the course would try to cater for the stronger students as well, so that completing it with a high grade would give a real sense of accomplishment. To the course staff’s credit it should be added that they did provide optional theoretical questions to ponder, which were much closer to what I’d expect from Algo homework to be. Nevertheless, as we all know, ungraded course work only goes so far with students. It’s hard to find motivation to dig deep into it.

To conclude, I really enjoyed taking the course, and big thanks to Tim Roughgarden and the course staff for offering it and investing effort into making it enjoyable. I will be seriously considering enrolling in the follow-up Part II in late summer.

http://eli.thegreenplace.net/wp-content/uploads/2012/05/acc.png

Automating boring testing activities with tox

May 7th, 2012 at 4:27 am

My pyelftools project has a comprehensive set of tests, which I make sure to run before every release. This turned out to be a somewhat tiresome manual task, because of the following factors:

  1. There are three sets of tests in pyelftools: unit tests, full system tests (comparing the output of my pyelftools-based readelf clone with the real readelf), and example tests (I hate out-of-date examples, so mine are self-testable).
  2. I currently promise to keep pyelftools working on Python versions 2.6, 2.7 and 3.2; the tests should be run on all three.
  3. I want to test that my setup.py script is alright too, so it would be preferable to build a source distribution, install it in a virtualenv and run the tests there. This also helps me test that the package is correctly pip-installable, and that no artifact of my local setup makes the tests pass by chance.

To a programmer, this list just screams "repetition", and hence some form of automation has to be conjured. Before you rush to roll your own, be sure the check out tox.

tox was designed to solve exactly the problem I described above. Its description is "virtualenv-based automation of test activities" – spot on.

All I have to do in order to automate the testing routine described above is install tox, create a configuration file for it, and make sure to execute tox -rv in the root of my project before every release. Done.

Here’s my tox.ini for pyelftools, in all its glory:

[tox]
envlist = py27,py26,py32

[testenv]
commands =
    python test/run_all_unittests.py
    python test/run_examples_test.py
    python test/run_readelf_tests.py

[testenv:py26]
deps =
    unittest2

Basically, tox runs all the commands listed in [testenv] for each "environment" listed in envlist. It creates a virtualenv for each such environment, which is exactly what I wanted.

Moreover, as you can see, tox helps to handle another problem I would have to deal with manually: I’m using the excellent unittest2 package for the pyelftools unit tests. While this package is in the standard library of Python 2.7 and 3.2+, it has to be installed separately in 2.6; tox handles this by providing a deps feature – I can specify the dependencies that need to be installed into the virtualenv for the given environment.

Reading on a Kindle vs. a paper book

April 26th, 2012 at 5:23 am

Note: "Kindle" here can be probably replaced with any good electronic book reader. I only happen to own a Kindle, so my personal opinion is confined to it.

Here are some thoughts on the relative merits of "real" paper books and books read on a Kindle:

Why a Kindle is better (in no particular order):

  • Smaller, slimmer and lighter than most books, while still being big enough to provide a good holding and reading experience.
  • Much easier to put down and come back to the correct spot later. Not only you don’t need a physical marker, but the pages themselves are significantly smaller so it’s easy to stop at the first paragraph and know this is where you should resume reading. I appreciate this a lot when reading while watching my daughter, which means I should put the book down each minute or two and come back to it later.
  • Easier and cleaner to highlight interesting sections and get back to them later – no need to keep a pen around.
  • A blessing when reading several books simultaneously (which is frequently the case for me). This becomes quite important when traveling.
  • Practically every book from Amazon available for you 15 seconds after you decide you want to read it (and you can read a sample before paying).
  • Has a built-in clock.
  • Has built-in dictionaries and you can install more.
  • Adjustable font size is a blessing for people with poor eyesight (although this is thankfully not my problem yet, I can see how important this is).

Why a paper book is better:

  • Its battery lasts longer.
  • Illustration quality is higher, especially in non-fiction books where this is important, and where high-quality color illustrations may be crucial. I think this is temporary, though – it’s hard to believe it will still be an issue 5 years from now when e-readers have better screens.
  • More reliable. In normal day-to-day conditions this isn’t a big issue, though.
  • Costs much less so you don’t worry too much about it getting damaged, lost or stolen.
  • Has the good-old feeling and smell of paper. I must admit, however, that when I read on a Kindle, I miss this feeling much less than initially imagined.
  • Gives you the opportunity to fill shelves with books, which is a terrific adornment for every home, especially the "office room".

Python object creation sequence

April 16th, 2012 at 7:03 am

[The Python version described in this article is 3.x]

This article aims to explore the process of creating new objects in Python. As I explained in a previous article, object creation is just a special case of calling a callable. Consider this Python code:

class Joe:
    pass

j = Joe()

What happens when j = Joe() is executed? Python sees it as a call to the callable Joe, and routes it to the internal function PyObject_Call, with Joe passed as the first argument. PyObject_Call looks at the type of its first argument to extract its tp_call attribute.

Now, what is the type of Joe? Whenever we define a new Python class, unless we explicitly specify a metaclass for it, its type is type. Therefore, when PyObject_Call attempts to look at the type of Joe, it finds type and picks its tp_call attribute. In other words, the function type_call in Objects/typeobject.c is invoked [1].

This is an interesting function, and it’s short, so I’ll paste it wholly here:

static PyObject *
type_call(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PyObject *obj;

    if (type->tp_new == NULL) {
        PyErr_Format(PyExc_TypeError,
                     "cannot create '%.100s' instances",
                     type->tp_name);
        return NULL;
    }

    obj = type->tp_new(type, args, kwds);
    if (obj != NULL) {
        /* Ugly exception: when the call was type(something),
           don't call tp_init on the result. */
        if (type == &PyType_Type &&
            PyTuple_Check(args) && PyTuple_GET_SIZE(args) == 1 &&
            (kwds == NULL ||
             (PyDict_Check(kwds) && PyDict_Size(kwds) == 0)))
            return obj;
        /* If the returned object is not an instance of type,
           it won't be initialized. */
        if (!PyType_IsSubtype(Py_TYPE(obj), type))
            return obj;
        type = Py_TYPE(obj);
        if (type->tp_init != NULL &&
            type->tp_init(obj, args, kwds) < 0) {
            Py_DECREF(obj);
            obj = NULL;
        }
    }
    return obj;
}

So what arguments is type_call being passed in our case? The first one is Joe itself – but how is it represented? Well, Joe is a class, so it’s a type (all classes are types in Python 3). Types are represented inside the CPython VM by PyTypeObject objects [2].

What type_call does is first call the tp_new attribute of the given type. Then, it checks for a special case we can ignore for simplicity, makes sure tp_new returned an object of the expected type, and then calls tp_init. If an object of a different type was returned, it is not being initialized.

Translated to Python, what happens is this: if your class defines the __new__ special method, it gets called first when a new instance of the class is created. This method has to return some object. Usually, this will be of the required type, but this doesn’t have to be the case. Objects of the required type get __init__ invoked on them. Here’s an example:

class Joe:
    def __new__(cls, *args, **kwargs):
        obj = super(Joe, cls).__new__(cls)
        print('__new__ called. got new obj id=0x%x' % id(obj))
        return obj

    def __init__(self, arg):
        print('__init__ called (self=0x%x) with arg=%s' % (id(self), arg))
        self.arg = arg

j = Joe(12)
print(type(j))

This prints:

__new__ called. got new obj id=0x7f88e7218290
__init__ called (self=0x7f88e7218290) with arg=12
<class '__main__.Joe'>

Customizing the sequence

As we saw above, since the type of Joe is type, the type_call function is invoked to define the creation sequence for Joe instances. This sequence can be changed by specifying a custom type for Joe – in other words, a metaclass. Let’s modify the previous example to specify a custom metaclass for Joe:

class MetaJoe(type):
    def __call__(cls, *args, **kwargs):
        print('MetaJoe.__call__')
        return None

class Joe(metaclass=MetaJoe):
    def __new__(cls, *args, **kwargs):
        obj = super(Joe, cls).__new__(cls)
        print('__new__ called. got new obj id=0x%x' % id(obj))
        return obj

    def __init__(self, arg):
        print('__init__ called (self=0x%x) with arg=%s' % (id(self), arg))
        self.arg = arg

j = Joe(12)
print(type(j))

So now the type of Joe is not type, but MetaJoe. Consequently, when PyObject_Call picks the call function to execute for j = Joe(12), it takes MetaJoe.__call__. The latter prints a notice about itself and returns None, so we don’t expect the __new__ and __init__ methods of Joe to be called at all. Indeed, this is the outcome:

MetaJoe.__call__
<class 'NoneType'>

Digging deeper – tp_new

Alright, so now we have a better understanding of the object creation sequence. One crucial piece of the puzzle is still missing, though. While we almost always define __init__ for our classes, defining __new__ is rather rare [3]. Moreover, from a quick look at the code it’s obvious that __new__ is more fundamental in a way. This method is used to create a new object. It is called once and only once per instantiation. __init__, on the other hand, already gets a constructed object and may not be called at all; it can also be called multiple times.

Since the type parameter passed to type_call in our case is Joe, and Joe does not define a custom __new__ method, then type->tp_new defers to the tp_new slot of the base type. The base type of Joe (and all other Python objects, except object itself) is object. The object.tp_new slot is implemented in CPython by the object_new function in Objects/typeobject.c.

object_new is actually very simple. It does some argument checking, verifies that the type we’re trying to instantiate is not abstract, and then does this:

return type->tp_alloc(type, 0);

tp_alloc is a low-level slot of the type object in CPython. It’s not directly accessible from Python code, but should be familiar to C extension developers. A custom type defined in a C extension may override this slot to supply a custom memory allocation scheme for instances of itself. Most C extension types will, however, defer this allocation to the function PyType_GenericAlloc.

This function is part of the public C API of CPython, and it also happens to be assigned to the tp_alloc slot of object (defined in Objects/typeobject.c). It figures out how much memory the new object needs [4], allocates a memory chunk from CPython’s memory allocator and initializes it all to zeros. It then initializes the bare essential PyObject fields (type and reference count), does some GC bookkeeping and returns. The result is a freshly allocated instance.

Conclusion

Lest we lose the forest for the trees, let’s revisit the question this article began with. What happens when CPython executes j = Joe()?

  • Since Joe has no explicit metaclass, type is its type. So the tp_call slot of type, which is type_call, is called.
  • type_call starts by calling the tp_new slot of Joe:
    • Since Joe has no explicit base clase, its base is object. Therefore, object_new is called.
    • Since Joe is a Python-defined class, it has no custom tp_alloc slot. Therefore, object_new calls PyType_GenericAlloc.
    • PyType_GenericAlloc allocates and initializes a chunk of memory big enough to contain Joe.
  • type_call then goes on and calls Joe.__init__ on the newly created object.
    • Since Joe does not define __init__, its base’s __init__ is called, which is object_init.
    • object_init does nothing.
  • The new object is returned from type_call and is bound to the name j.

This is the vanilla flow for an object of a class that doesn’t have a custom metaclass, doesn’t have an explicit base class, and doesn’t define its own __new__ and __init__ methods. However, this article should have made it quite clear where these custom capabilities plug in to modify the object creation sequence. As you can see, Python is amazingly flexible. Practically every single step of the process described above can be customized, even for user-defined types implemented in Python. Types implemented in a C extension can customize even more, such as the exact memory allocation strategy used to create instances of the type.

http://eli.thegreenplace.net/wp-content/uploads/hline.jpg

[1] The PyTypeObject structure definition for type is PyType_Type in Objects/typeobject.c. You can see that type_call is being assigned to its tp_call slot.
[2] A future article will show how this comes to be when a new class is created.
[3] Even when we do explicitly override __new__ in our classes, we almost certainly defer the actual object creation to the base’s __new__.
[4] This information is available in the PyObject header of any type.

Implementing a generator/yield in a Python C extension

April 5th, 2012 at 6:35 am

In Python, a generator is a function that returns an iterator object. There are a couple of ways to do this, but the most elegant and common one is to use the yield statement.

For example, here is a simple synthetic example:

def pyrevgen(seq):
    for i, elem in enumerate(reversed(seq)):
        yield i, elem

The pyrevgen function is a generator. Given any sequence, it returns an iterator that yields the sequences’s elements in reversed order, and also enumerates them. For example:

>>> for i, e in pyrevgen(['a', 'b', 'c']):
...   print(i, e)
...
0 c
1 b
2 a

The goal of this post is to demonstrate how to achieve the same effect using the Python C API; in other words, in a C extension module. The focus is on Python 3 – for Python 2 the principle is the same although there may be some differences in the details.

yield is a very powerful construct in Python. In C there is no equivalent ability (unless you use some co-routine macro-fu, but this is outside our scope here). Therefore, we have to explicitly return an iterator object and let it handle the details of the iteration.

To write an iterator in Python we have to create a class that implements the __iter__ and __next__ special methods. The equivalent methods in the C API are tp_iter and tp_iternext, respectively.

We’ll create a new extension module, named spam. It will export a single object – the revgen type which will be callable similarly to the Python code above. In other words, the client Python code will be able to do:

import spam
for i, e in spam.revgen(['a', 'b', 'c']):
  print(i, e)

Let’s begin (a link to the full source code is available at the end of this post):

PyTypeObject PyRevgen_Type = {
    PyVarObject_HEAD_INIT(&PyType_Type, 0)
    "revgen",                       /* tp_name */
    sizeof(RevgenState),            /* tp_basicsize */
    0,                              /* tp_itemsize */
    (destructor)revgen_dealloc,     /* tp_dealloc */
    0,                              /* tp_print */
    0,                              /* tp_getattr */
    0,                              /* tp_setattr */
    0,                              /* tp_reserved */
    0,                              /* tp_repr */
    0,                              /* tp_as_number */
    0,                              /* tp_as_sequence */
    0,                              /* tp_as_mapping */
    0,                              /* tp_hash */
    0,                              /* tp_call */
    0,                              /* tp_str */
    0,                              /* tp_getattro */
    0,                              /* tp_setattro */
    0,                              /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT,             /* tp_flags */
    0,                              /* tp_doc */
    0,                              /* tp_traverse */
    0,                              /* tp_clear */
    0,                              /* tp_richcompare */
    0,                              /* tp_weaklistoffset */
    PyObject_SelfIter,              /* tp_iter */
    (iternextfunc)revgen_next,      /* tp_iternext */
    0,                              /* tp_methods */
    0,                              /* tp_members */
    0,                              /* tp_getset */
    0,                              /* tp_base */
    0,                              /* tp_dict */
    0,                              /* tp_descr_get */
    0,                              /* tp_descr_set */
    0,                              /* tp_dictoffset */
    0,                              /* tp_init */
    PyType_GenericAlloc,            /* tp_alloc */
    revgen_new,                     /* tp_new */
};

static struct PyModuleDef spammodule = {
   PyModuleDef_HEAD_INIT,
   "spam",                  /* m_name */
   "",                      /* m_doc */
   -1,                      /* m_size */
};

PyMODINIT_FUNC
PyInit_spam(void)
{
    PyObject *module = PyModule_Create(&spammodule);
    if (!module)
        return NULL;

    if (PyType_Ready(&PyRevgen_Type) < 0)
        return NULL;
    Py_INCREF((PyObject *)&PyRevgen_Type);
    PyModule_AddObject(module, "revgen", (PyObject *)&PyRevgen_Type);

    return module;
}

This is standard code for creating a new module and a new type in it. The module initialization function (PyInit_spam) adds a single object to the module, named revgen. This object is the PyRevgen_Type type. By "calling" this object, the user is able to create new instances of the type.

The following structure ("subclass" of PyObject) is going to represent an instance of revgen:

/* RevgenState - reverse generator instance.
 *
 * sequence: ref to the sequence that's being iterated
 * seq_index: index of the next element in the sequence to yield
 * enum_index: next enumeration index to yield
 *
 * In pseudo-notation, the yielded tuple at each step is:
 *  enum_index, sequence[seq_index]
 *
*/
typedef struct {
    PyObject_HEAD
    Py_ssize_t seq_index, enum_index;
    PyObject *sequence;
} RevgenState;

The most interesting thing to note here is the reference to the sequence we’re iterating on. The iterator needs this instance to be able to access the sequence each time next is called on it.

And here is the function responsible for creating new instances. It is assigned to the tp_new slot of the type. Note that we don’t assign tp_init, so the default "do-nothing" initializer will be used:

static PyObject *
revgen_new(PyTypeObject *type, PyObject *args, PyObject *kwargs)
{
    PyObject *sequence;

    if (!PyArg_UnpackTuple(args, "revgen", 1, 1, &sequence))
        return NULL;

    /* We expect an argument that supports the sequence protocol */
    if (!PySequence_Check(sequence)) {
        PyErr_SetString(PyExc_TypeError, "revgen() expects a sequence");
        return NULL;
    }

    Py_ssize_t len = PySequence_Length(sequence);
    if (len == -1)
        return NULL;

    /* Create a new RevgenState and initialize its state - pointing to the last
     * index in the sequence.
    */
    RevgenState *rgstate = (RevgenState *)type->tp_alloc(type, 0);
    if (!rgstate)
        return NULL;

    Py_INCREF(sequence);
    rgstate->sequence = sequence;
    rgstate->seq_index = len - 1;
    rgstate->enum_index = 0;

    return (PyObject *)rgstate;
}

This is a straightforward tp_new implementation. It makes sure the object it’s expected to iterate on is a sequence, and initializes the state to prepare returning the last element in the first next call. The corresponding de-allocation function is also unsurprising:

static void
revgen_dealloc(RevgenState *rgstate)
{
    /* We need XDECREF here because when the generator is exhausted,
     * rgstate->sequence is cleared with Py_CLEAR which sets it to NULL.
    */
    Py_XDECREF(rgstate->sequence);
    Py_TYPE(rgstate)->tp_free(rgstate);
}

Now what’s left to see is the implementations of the tp_iter and tp_iternext slots. Since our type is an iterator, it can just assign PyObject_SelfIter function for the tp_iter slot. tp_iternext is where the interesting work is happening. This is what gets invoked when the next built-in is called on the iterator, and by extension also when the iterator is consumed by a for loop:

static PyObject *
revgen_next(RevgenState *rgstate)
{
    /* seq_index < 0 means that the generator is exhausted.
     * Returning NULL in this case is enough. The next() builtin will raise the
     * StopIteration error for us.
    */
    if (rgstate->seq_index >= 0) {
        PyObject *elem = PySequence_GetItem(rgstate->sequence,
                                            rgstate->seq_index);
        /* Exceptions from PySequence_GetItem are propagated to the caller
         * (elem will be NULL so we also return NULL).
        */
        if (elem) {
            PyObject *result = Py_BuildValue("lO", rgstate->enum_index, elem);
            rgstate->seq_index--;
            rgstate->enum_index++;
            return result;
        }
    }

    /* The reference to the sequence is cleared in the first generator call
     * after its exhaustion (after the call that returned the last element).
     * Py_CLEAR will be harmless for subsequent calls since it's idempotent
     * on NULL.
    */
    rgstate->seq_index = -1;
    Py_CLEAR(rgstate->sequence);
    return NULL;
}

The most important point to remember here is this – the state of the iteration should be wholly kept in the iterator object. Comparing to the Python implementation, this means quite a bit more work. The Python yield statement allows us to use the Python interpreter itself to keep the state of execution for us. This is what makes co-routines in Python so powerful – very little explicit state has to be kept manually. As I mentioned in the beginning of the post, in C extensions we don’t have this luxury, so we have to go the manual way. Since the current example is very simple and linear, this is relatively easy. In more complex examples, some ingenuity is often required to design the state object and tp_iternext function correctly.

The full code of this post, together with a simple Python test script and a setup.py for building the extension with distutils is available for download here.

Summary of reading: January – March 2012

April 5th, 2012 at 4:30 am
  • "Do not raise your hand against the boy" by Rabbi Israel Meir Lau – Mr. Lau was the chief Rabbi in Israel in 1993-2003, and this book is his autobiography, divided into two parts. The first part describes his early childhood which was mostly spent in the Treblinka extermination camp. The second part is about him reaching Israel, studying in several Yeshivas and moving higher and higher in the rung of religious leadership. Most of the last chapters of the book are spent on various events he recalls from his life, meeting political and religious leaders around the world. Although I can certainly admire Mr. Lau’s achievements and obvious intelligence, I couldn’t shake the feeling while reading the book that we just live in different ideological worlds. I suppose he would think the same, since he explicitly lists atheism as one of the humanity’s biggest enemies, together with AIDS, cancer, nuclear weapons and crime. The book itself is reasonably readable, although it’s tiring to read too much of it in a single sitting, so I smeared its reading over a couple of months.
  • "Being Geek" by Michael Lopp – based on the blog "Rands in repose", this book claims to be "the software developer’s career handbook". In my opinion this mostly applies to developers who plan to become managers, and even more so to fresh development managers who just recently stopped being engineers. The book is essentially a loose collection of blog posts, and thus the chapters wildly vary in quality. Some chapters are insightful, and some are just a waste of time. In addition, the book is written in a very specific style that may be entertaining and fun for some people, while being unbearable for others. I’m closer to the latter in this spectrum :) To conclude, depending on your style preferences and career goals you may either like or hate this book. Personally, I don’t feel I gained very much by reading it.
  • "Crucial Conversations" by K. Patterson, J. Grenny, R. McMillan and A. Switzler – self improvement books usually have a very clear pattern. There’s an idea or two, that would perhaps take a 10-15 page article to describe. Then, for the sake of book-format-publish-ability, that idea is smeared over 200 pages in generous font with generous margins and a few meaningless diagrams. The absolute key factor, however, is whether the original idea is really worth knowing about. If it is, then wasting an extra few hours on such a book usually still pays off. If it isn’t, well then you get pissed. Anyway, "Crucial Conversations" is basically a caricature of the smeared-to-a-book idea I was talking about. On the other hand, the basic ideas it tries to get across are pretty good. So, I do recommend to read it, especially to those who don’t find conversations (with other human beings) easy.
  • "The Last Song" by Nicholas Sparks (read in Spanish) – a typical Sparks novel. Not bad as far as these books go, although the third quarter is a drag. The end was worth it, however. I read it for the Spanish, of course ;-)
  • "Genes, Peoples and Languages" by Luigi Luca Cavalli-Sforza – I picked this one up because several books I enjoyed in the past referred to Cavalli-Sforza as a father of a lot of ground-breaking research on genetics, population migrations and linguistics in the second half of the 20th century. So this was an attempt to read some material "straight from the horse’s mouth". I must admit I didn’t enjoy it, though. Cavalli-Sforza writes with confidence and ambition, but not in a very readable way. This book tries to hit somewhere between popular science and a textbook, and misses both targets.
  • Re-reads:

    • "Exodus" by Leon Uris

    The fundamental types of Python – a diagram

    April 3rd, 2012 at 8:33 pm

    The aim of this post is to present a succinct diagram that correlates some basic properties of all Python objects with the fundamental types type and object. This is not a tutorial – it’s more of a reference snapshot that puts things in order. To properly understand why things are the way they are, check out the existing and future writings in the Python internals category of this blog, as well as other resources available online.

    In Python, every object has a type. Types are also objects – rather special objects. A type object, like any other object, has a type of its own. It also has a sequence of "base types" – in other words, types from which it inherits. This is unlike non-type objects, which don’t have base types.

    Consider this exemplary piece of code (Python 3):

    # Some types
    class Base:
        pass
    
    class Klass(Base):
        pass
    
    class Meta(type):
        pass
    
    class KlassWithMeta(metaclass=Meta):
        pass
    
    # Non-types
    kwm = KlassWithMeta()
    mylist = []
    

    The following diagram describes the types and bases of all the objects created in this code. Non-type objects only have types and no bases:

    http://eli.thegreenplace.net/wp-content/uploads/2012/04/typediagram.png

    Some interesting things to note:

    • The default type of all new types is type. This can be overridden by explicitly specifying the metaclass for a type.
    • Built-in types like list and user-defined types like Base are equivalent as far as Python is concerned.
    • The special type type is the default type of all objects – including itself. It is an object, and as such, inherits from object.
    • The special type object is the pinnacle of every inheritance hierarchy – it’s the ultimate base type of all Python types.
    • type and object are the only types in Python that really stand out from other types (and hence they are colored differently). type is its own type. object has no base type.

    Python objects, types, classes, and instances – a glossary

    March 30th, 2012 at 7:35 am

    While writing the article on the internals of Python callables, it occurred to me that some things in Python have more than one name. At the same time, some names are sometimes used to refer to more than one entity, and which one is implied has to be understood from context. Therefore, I think it’s a good idea to collect this nomenclature in a single place for the sake of my future writings. This way I’ll just be able to point here every time I discuss these topics, instead of explaining them over and over again.

    Specifically, I want to define what I mean by types, objects, classes and instances. Note that this refers to Python 3.x, but is mostly applicable for 2.x as well [1].

    Objects

    It’s easiest to start with objects. The Python data model reference has a pretty good definition:

    Objects are Python’s abstraction for data. All data in a Python program is represented by objects or by relations between objects. (In a sense, and in conformance to Von Neumann’s model of a “stored program computer,” code is also represented by objects.)

    Every object has an identity, a type and a value.

    So, everything in Python is an object. Lists are objects. 42 is an object. Modules are objects. Functions are objects. Python bytecode is also kept in an object. All of these have types and unique IDs:

    >>> def foo(): pass
    ...
    >>> type(foo), id(foo)
    (<class 'function'>, 38110760)
    >>> type(foo.__code__), id(foo.__code__)
    (<class 'code'>, 38111680)
    

    This "everything is an object" model is backed by the CPython implementation. Indeed, if you look into the code of CPython, you’ll notice that every entity mentioned above can be manipulated via a pointer to the PyObject base struct.

    Types

    The data model reference is useful here too:

    [...] An object’s type determines the operations that the object supports (e.g., “does it have a length?”) and also defines the possible values for objects of that type.

    So, every object in Python has a type. Its type can be discovered by calling the type builtin function [2]. The type is an object too, so it has a type of its own, which is called type. This last fact may not be very exciting or useful when you’re just writing Python code, but it’s hugely important if you want to understand the internals of CPython:

    >>> type(42)
    <class 'int'>
    >>> type(type(42))
    <class 'type'>
    >>> type(type(type(42)))
    <class 'type'>
    

    Yep, it’s turtles all the way down.

    Classes

    In the olden days, there was a difference between user-defined classes and built in types. But since 2.2, as long as you’re using "new-style" classes (classes that inherit from object in 2.x, and are default in 3.x), there is no real difference. Essentially, a class is a mechanism Python gives us to create new user-defined types from Python code.

    >>> class Joe: pass
    ...
    >>> j = Joe()
    >>> type(j)
    <class '__main__.Joe'>
    

    Using the class mechanism, we’ve created Joe – a user-defined type. j is an instance of the class Joe. In other words, it’s an object and its type is Joe.

    As any other type, Joe is an object itself, and it has a type too. This type is type:

    >>> type(type(j))
    <class 'type'>
    

    The terms "class" and "type" are an example of two names referring to the same concept. To avoid this confusion, I will always try to say "type" when I mean a type, and "user-defined class" (or "user-defined type") when referring to a new type created using the class construct. Note that when we create new types using the C API of CPython, there’s no "class" mentioned – we create a new "type", not a new "class".

    Instances

    Not unlike the ambiguity between "class" and "type", "instance" is synonymous to "object". Think of it this way: objects are instances of types. So, "42 is an instance of the type int" is equivalent to "42 is an int object". I usually use "instance" and "object" interchangeably. In some cases when I want to specifically refer to objects as artifacts of the CPython implementation, I will try to use "instance" to refer to actual instances of classes. Another place where the term "instance" is explicitly used by Python is in built-ins like isinstance and the special __instancecheck__ attribute.

    Conclusion

    As we’ve seen, there are two pairs of roughly synonymous terms in Python nomenclature. Types and classes are interchangeable concepts. I prefer to say "type" wherever possible, leaving the term "class" for user-defined types created with the "class" construct. IMHO "type" is a better term, and Python wouldn’t be worse if the "class" concept was wiped out completely.

    Similarly, objects and instances are terms that mean the same thing, but perhaps from slightly different angles. Sometimes it’s more convenient to use "instance" (i.e. when specifically talking about specific objects being instances of specific types – as in "j is an instance of Joe"), and sometimes it’s better to use "object" (i.e. when discussing the guts of the CPython implementation).

    I sincerely hope this post is more helpful than confusing! For me, it’s an aid that serves as a simple glossary when my usage of these terms in some article may be unclear or ambiguous.

    http://eli.thegreenplace.net/wp-content/uploads/hline.jpg

    [1] As long as you forget about the existence of classic 2.x classes and take it as a fact that all user-defined classes inherit from object.
    [2] An alternative is the __class__ attribute.

    Python internals: how callables work

    March 23rd, 2012 at 10:53 am

    [The Python version described in this article is 3.x, more specifically - the 3.3 alpha release of CPython.]

    The concept of a callable is fundamental in Python. When thinking about what can be "called", the immediately obvious answer is functions. Whether it’s user defined functions (written by you), or builtin functions (most probably implemented in C inside the CPython interpreter), functions were meant to be called, right?

    Well, there are also methods, but they’re not very interesting because they’re just special functions that are bound to objects. What else can be called? You may, or may not be familiar with the ability to call objects, as long as they belong to classes that define the __call__ magic method. So objects can act as functions. And thinking about this a bit further, classes are callable too. After all, here’s how we create new objects:

    class Joe:
      ... [contents of class]
    
    joe = Joe()
    

    Here we "call" Joe to create a new instance. So classes can act as functions as well!

    It turns out that all these concepts are nicely united in the CPython implementation. Everything in Python is an object, and that includes every entity described in the previous paragraphs (user & builtin functions, methods, objects, classes). All these calls are served by a single mechanism. This mechanism is elegant and not that difficult to understand, so it’s worth knowing about. But let’s start at the beginning.

    Compiling calls

    CPython executes our program in two major steps:

    1. The Python source code is compiled to bytecode.
    2. A VM executes that bytecode, using a toolbox of built-in objects and modules to help it do its job.

    In this section I’ll provide a quick overview of how the first step applies to making calls. I won’t get too deep since these details are not the really interesting part I want to focus on in the article. If you want to learn more about the flow Python source undergoes in the compiler, read this.

    Briefly, the Python compiler identifies everything followed by (arguments...) inside an expression as a call [1]. The AST node for this is Call. The compiler emits code for Call in the compiler_call function in Python/compile.c. In most cases, the CALL_FUNCTION bytecode instruction is going to be emitted. There are some variations I’m going to ignore for the purpose of the article. For example, if the call has "star args" – func(a, b, *args), there’s a special instruction for handling that – CALL_FUNCTION_VAR. It and other special instructions are just variations on the same theme.

    CALL_FUNCTION

    So CALL_FUNCTION is the instruction we’re going to focus on here. This is what it does:

    CALL_FUNCTION(argc)

    Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. On the stack, the opcode finds the keyword parameters first. For each keyword argument, the value is on top of the key. Below the keyword parameters, the positional parameters are on the stack, with the right-most parameter on top. Below the parameters, the function object to call is on the stack. Pops all function arguments, and the function itself off the stack, and pushes the return value.

    CPython bytecode is evaluated by the the mammoth function PyEval_EvalFrameEx in Python/ceval.c. The function is scary but it’s nothing more than a fancy dispatcher of opcodes. It reads instructions from the code object of the given frame and executes them. Here, for example, is the handler for CALL_FUNCTION (cleaned up a bit to remove tracing and timing macros):

    TARGET(CALL_FUNCTION)
    {
        PyObject **sp;
        sp = stack_pointer;
        x = call_function(&sp, oparg);
        stack_pointer = sp;
        PUSH(x);
        if (x != NULL)
            DISPATCH();
        break;
    }
    

    Not too bad – it’s actually very readable. call_function does the actual call (we’ll examine it in a bit), oparg is the numeric argument of the instruction, and stack_pointer points to the top of the stack [2]. The value returned by call_function is pushed back to the stack, and DISPATCH is just some macro magic to invoke the next instruction.

    call_function is also in Python/ceval.c. It implements the actual functionality of the instruction. At 80 lines it’s not very long, but long enough so I won’t paste it wholly here. Instead I’ll explain the flow in general and paste small snippets where relevant; you’re welcome to follow along with the code open in your favorite editor.

    Any call is just an object call

    The most important first step in understanding how calls work in Python is to ignore most of what call_function does. Yes, I mean it. The vast majority of the code in this function deals with optimizations for various common cases. It can be removed without hurting the correctness of the interpreter, only its performance. If we ignore all optimizations for the time being, all call_function does is decode the amount of arguments and amount of keyword arguments from the single argument of CALL_FUNCTION and forwards it to do_call. We’ll get back to the optimizations later since they are interesting, but for the time being, let’s see what the core flow is.

    do_call loads the arguments from the stack into PyObject objects (a tuple for the positional arguments, a dict for the keyword arguments), does a bit of tracing and optimization of its own, but eventually calls PyObject_Call.

    PyObject_Call is a super-important function. It’s also available to extensions in the Python C API. Here it is, in all its glory:

    PyObject *
    PyObject_Call(PyObject *func, PyObject *arg, PyObject *kw)
    {
        ternaryfunc call;
    
        if ((call = func->ob_type->tp_call) != NULL) {
            PyObject *result;
            if (Py_EnterRecursiveCall(" while calling a Python object"))
                return NULL;
            result = (*call)(func, arg, kw);
            Py_LeaveRecursiveCall();
            if (result == NULL && !PyErr_Occurred())
                PyErr_SetString(
                    PyExc_SystemError,
                    "NULL result without error in PyObject_Call");
            return result;
        }
        PyErr_Format(PyExc_TypeError, "'%.200s' object is not callable",
                     func->ob_type->tp_name);
        return NULL;
    }
    

    Deep recursion protection and error handling aside [3], PyObject_Call extracts the tp_call attribute [4] of the object’s type and calls it. This is possible since tp_call holds a function pointer.

    Let it sink for a moment. This is it. Ignoring all kinds of wonderful optimizations, this is what all calls in Python boil down to:

    • Everything in Python is an object [5].
    • Every object has a type; the type of an object dictates the stuff that can be done to/with the object.
    • When an object is called, its type’s tp_call attribute is called.

    As a user of Python, your only direct interaction with tp_call is when you want your objects to be callable. If you define your class in Python, you have to implement the __call__ method for this purpose. This method gets directly mapped to tp_call by CPython. If you define your class as a C extension, you have to assign tp_call in the type object of your class manually.

    But recall that classes themselves are "called" to create new objects, so tp_call plays a role here as well. Even more fundamentally, when you define a class there is also a call involved – on the class’s metaclass. This is an interesting topic and I’ll cover it in a future article.

    Extra credit: Optimizations in CALL_FUNCTION

    This part is optional, since the main point of the article was delivered in the previous section. That said, I think this material is interesting, since it provides examples of how some things you wouldn’t usually think of as objects, actually are objects in Python.

    As I mentioned earlier, we could just use PyObject_Call for every CALL_FUNCTION and be done with it. In reality, it makes sense to do some optimizations to cover common cases where that may be an overkill. PyObject_Call is a very generic function that needs all its arguments in special tuple and dictionary objects (for positional and keyword arguments, respectively). These arguments need to be taken from the stack and arranged in the containers PyObject_Call expects. In some common cases we can avoid a lot of this overhead, and this is what the optimizations in call_function are about.

    The first special case call_function addresses is:

    /* Always dispatch PyCFunction first, because these are
       presumed to be the most frequent callable object.
    */
    if (PyCFunction_Check(func) && nk == 0) {
    

    This handles objects of type builtin_function_or_method (represented by the PyCFunction type in the C implementation). There are a lot of those in Python, as the comment above notes. All functions and methods implemented in C, whether in the CPython interpreter, or in C extensions, fall into this category. For example:

    >>> type(chr)
    <class 'builtin_function_or_method'>
    >>> type("".split)
    <class 'builtin_function_or_method'>
    >>> from pickle import dump
    >>> type(dump)
    <class 'builtin_function_or_method'>
    

    There’s an additional condition in that if – that the amount of keyword arguments passed to the function is zero. This allows some important optimizations. If the function in question accepts no arguments (marked by the METH_NOARGS flag when the function is created) or just a single object argument (METH_0 flag), call_function doesn’t go through the usual argument packing and can call the underlying function pointer directly. To understand how this is possible, reading about PyCFunction and the METH_ flags in this part of the documentation is highly recommended.

    Next, there’s some special handling for methods of classes written in Python:

    else {
      if (PyMethod_Check(func) && PyMethod_GET_SELF(func) != NULL) {
    

    PyMethod is the internal object used to represent bound methods. The special thing about methods is that they carry around a reference to the object they’re bound to. call_function extracts this object and places it on the stack, in preparation for what comes next.

    Here’s the rest of the call code (after it in call_object there’s only some stack cleanup):

    if (PyFunction_Check(func))
        x = fast_function(func, pp_stack, n, na, nk);
    else
        x = do_call(func, pp_stack, na, nk);
    

    do_call we’ve already met – it implements the most generic form of calling. However, there’s one more optimization – if func is a PyFunction (an object used internally to represent functions defined in Python code), a separate path is taken – fast_function.

    To understand what fast_function does, it’s important to first consider what happens when a Python function is executed. Simply put, its code object is evaluated (with PyEval_EvalCodeEx itself). This code expects its arguments to be on the stack. Therefore, in most cases there’s no point packing the arguments into containers and unpacking them again. With some care, they can just be left on the stack and a lot of precious CPU cycles can be spared.

    Everything else falls back to do_call. This, by the way, includes PyCFunction objects that do have keyword arguments. A curious aspect of this fact is that it’s somewhat more efficient to not pass keyword arguments to C functions that either accept them or are fine with just positional arguments. For example [6]:

    $ ~/test/python_src/33/python -m timeit -s's="a;b;c;d;e"' 's.split(";")'
    1000000 loops, best of 3: 0.3 usec per loop
    $ ~/test/python_src/33/python -m timeit -s's="a;b;c;d;e"' 's.split(sep=";")'
    1000000 loops, best of 3: 0.469 usec per loop
    

    This is a big difference, but the input is very small. For larger strings the difference is almost invisible:

    $ ~/test/python_src/33/python -m timeit -s's="a;b;c;d;e"*1000' 's.split(";")'
    10000 loops, best of 3: 98.4 usec per loop
    $ ~/test/python_src/33/python -m timeit -s's="a;b;c;d;e"*1000' 's.split(sep=";")'
    10000 loops, best of 3: 98.7 usec per loop
    

    Summary

    The aim of this article was to discuss what it means to be callable in Python, approaching this concept from the lowest possible level – the implementation details of the CPython virtual machine. Personally, I find this implementation very elegant, since it unifies several concepts into a single one. As the extra credit section showed, Python entities we don’t usually think of as objects – functions and methods – actually are objects and can also be handled in the same uniform manner. As I promised, future article(s) will dive deeper into the meaning of tp_call for creating new Python objects and classes.

    http://eli.thegreenplace.net/wp-content/uploads/hline.jpg

    [1] This is an intentional simplification – () serve other roles like class definitions (for listing base classes), function definitions (for listing arguments), decorators, etc – these are not in expressions. I’m also ignoring generator expressions on purpose.
    [2] The CPython VM is a stack machine.
    [3] Py_EnterRecursiveCall is needed where C code may end up calling Python code, to allow CPython keep track of its recursion level and bail out when it’s too deep. Note that functions written in C don’t have to abide by this recursion limit. This is why do_call special-cases PyCFunction before calling PyObject_Call.
    [4] By "attribute" here I mean a structure field (sometimes also called "slot" in the documentation). If you’re completely unfamiliar with the way Python C extensions are defined, go over this page.
    [5] When I say that everything is an object – I mean it. You may think of objects as instances of classes you defined. However, deep down on the C level, CPython creates and juggles a lot of objects on your behalf. Types (classes), builtins, functions, modules – all these are represented by objects.
    [6] This example will only run on Python 3.3, since the sep keyword argument to split is new in this version. In prior versions of Python split only accepted positional arguments.