<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Eli Bendersky's website - Python internals</title><link href="https://eli.thegreenplace.net/" rel="alternate"></link><link href="https://eli.thegreenplace.net/feeds/python-internals.atom.xml" rel="self"></link><id>https://eli.thegreenplace.net/</id><updated>2023-06-30T23:16:27-07:00</updated><entry><title>The scope of index variables in Python's for loops</title><link href="https://eli.thegreenplace.net/2015/the-scope-of-index-variables-in-pythons-for-loops/" rel="alternate"></link><published>2015-01-17T06:24:00-08:00</published><updated>2023-02-04T13:41:52-08:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2015-01-17:/2015/the-scope-of-index-variables-in-pythons-for-loops/</id><summary type="html">&lt;p&gt;I'll start with a quiz. What does this function do?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you think &amp;quot;computes the sum and product of the items in &lt;tt class="docutils literal"&gt;lst&lt;/tt&gt;&amp;quot;, don't feel
too bad about …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I'll start with a quiz. What does this function do?&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;*=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you think &amp;quot;computes the sum and product of the items in &lt;tt class="docutils literal"&gt;lst&lt;/tt&gt;&amp;quot;, don't feel
too bad about yourself. The bug here is often tricky to spot. If you did see it,
well done - but buried in mountains of real code, and when you don't &lt;em&gt;know&lt;/em&gt; it's
a quiz, discovering the bug is significantly more difficult.&lt;/p&gt;
&lt;p&gt;The bug here is due to using &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; instead of &lt;tt class="docutils literal"&gt;t&lt;/tt&gt; in the body of the second
&lt;tt class="docutils literal"&gt;for&lt;/tt&gt; loop. But wait, how does this even work? Shouldn't &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; be invisible
outside of the first loop? &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt; Well, no. In fact, Python formally acknowledges
that the names defined as &lt;tt class="docutils literal"&gt;for&lt;/tt&gt; loop targets (a more formally rigorous name
for &amp;quot;index variables&amp;quot;) leak into the enclosing function scope. So this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Is valid and prints 3, by design. In this writeup I want to explore why this is
so, why it's unlikely to change, and also use it as a tracer bullet to dig into
some interesting parts of the CPython compiler.&lt;/p&gt;
&lt;p&gt;And by the way, if you're not convinced this behavior can cause real problems,
consider this snippet:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;lst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you'd expect this to print &lt;tt class="docutils literal"&gt;[0, 1, 2, 3]&lt;/tt&gt;, no such luck. This code will,
instead, emit &lt;tt class="docutils literal"&gt;[3, 3, 3, 3]&lt;/tt&gt;, because there's just a single &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; in the scope
of &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt;, and this is what all the &lt;tt class="docutils literal"&gt;lambda&lt;/tt&gt;s capture.&lt;/p&gt;
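&lt;p&gt;A common workaround is to bind the current value at the moment each &lt;tt class="docutils literal"&gt;lambda&lt;/tt&gt; is created, for instance through a default argument. The sketch below is only an illustration; the helper names &lt;tt class="docutils literal"&gt;make_getters_late&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;make_getters_bound&lt;/tt&gt; are made up for this example:&lt;/p&gt;

```python
def make_getters_late():
    # All lambdas share the single `i` that lives in this function's scope,
    # so they all see whatever value `i` holds when they are finally called.
    getters = []
    for i in range(4):
        getters.append(lambda: i)
    return [f() for f in getters]


def make_getters_bound():
    # The default argument is evaluated when each lambda is created,
    # so every lambda keeps its own copy of the value `i` had at that time.
    getters = []
    for i in range(4):
        getters.append(lambda i=i: i)
    return [f() for f in getters]


print(make_getters_late())   # [3, 3, 3, 3]
print(make_getters_bound())  # [0, 1, 2, 3]
```

&lt;p&gt;The default-argument trick does not change the scoping rules in any way; it just copies the value into a new name that is local to each &lt;tt class="docutils literal"&gt;lambda&lt;/tt&gt;.&lt;/p&gt;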
&lt;div class="section" id="the-official-word"&gt;
&lt;h2&gt;The official word&lt;/h2&gt;
&lt;p&gt;The Python reference documentation explicitly documents this behavior in the
&lt;a class="reference external" href="https://docs.python.org/dev/reference/compound_stmts.html#for"&gt;section on for loops&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
The for-loop makes assignments to the variables(s) in the target list. [...]
Names in the target list are not deleted when the loop is finished, but if
the sequence is empty, they will not have been assigned to at all by the
loop.&lt;/blockquote&gt;
&lt;p&gt;Note the last sentence - let's try it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[]:&lt;/span&gt;
    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Indeed, a &lt;tt class="docutils literal"&gt;NameError&lt;/tt&gt; is raised. Later on, we'll see that this is a natural
outcome of the way the Python VM executes its bytecode.&lt;/p&gt;
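&lt;p&gt;In code that relies on the loop target after the loop, one way to guard against the empty case is to bind the name before the loop starts. Here is a minimal sketch with two hypothetical helpers, &lt;tt class="docutils literal"&gt;last_item&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;last_item_or_default&lt;/tt&gt;. Note that inside a function the error shows up as &lt;tt class="docutils literal"&gt;UnboundLocalError&lt;/tt&gt;, which is a subclass of &lt;tt class="docutils literal"&gt;NameError&lt;/tt&gt;:&lt;/p&gt;

```python
def last_item(items):
    for item in items:
        pass
    # If `items` was empty, `item` was never assigned by the loop.
    return item


def last_item_or_default(items, default=None):
    item = default  # bind the name before the loop runs
    for item in items:
        pass
    return item


try:
    last_item([])
except NameError as exc:
    # Inside a function this is UnboundLocalError, a subclass of NameError.
    print(type(exc).__name__)  # UnboundLocalError

print(last_item_or_default([]))         # None
print(last_item_or_default([1, 2, 3]))  # 3
```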
&lt;/div&gt;
&lt;div class="section" id="why-this-is-so"&gt;
&lt;h2&gt;Why this is so&lt;/h2&gt;
&lt;p&gt;I actually asked Guido van Rossum about this behavior and he was gracious enough
to reply with some historical background (thanks Guido!). The motivation is
keeping Python's simple approach to names and scopes without resorting to hacks
(such as deleting all the values defined in the loop after it's done - think
about the complications with exceptions, etc.) or more complex scoping rules.&lt;/p&gt;
&lt;p&gt;In Python, the scoping rules are fairly simple and elegant: a block is either a
module, a function body or a class body. Within a function body, names are
visible from the point of their definition to the end of the block (including
nested blocks such as nested functions). That's for local names, of course;
global names (and other &lt;em&gt;nonlocal&lt;/em&gt; names) have slightly different rules, but
that's not pertinent to our discussion.&lt;/p&gt;
&lt;p&gt;The important point here is: the innermost possible scope is a function body.
Not a &lt;tt class="docutils literal"&gt;for&lt;/tt&gt; loop body. Not a &lt;tt class="docutils literal"&gt;with&lt;/tt&gt; block body. Python does not have nested
lexical scopes below the level of a function, unlike some other languages (C and
its progeny, for example).&lt;/p&gt;
&lt;p&gt;So if you just go about implementing Python, this behavior is what you'll likely
end up with. Here's another enlightening snippet:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Would it surprise you to find out that &lt;tt class="docutils literal"&gt;d&lt;/tt&gt; is visible and accessible after the
&lt;tt class="docutils literal"&gt;for&lt;/tt&gt; loop is finished? No, this is just the way Python works. So why would
the index variable be treated any differently?&lt;/p&gt;
&lt;p&gt;By the way, the index variables of list comprehensions are also leaked to the
enclosing scope. Or, to be precise, &lt;em&gt;were&lt;/em&gt; leaked, before Python 3 came along.&lt;/p&gt;
&lt;p&gt;Python 3 fixed the leakage from list comprehensions, along with other breaking
changes. Make no mistake, changing such behavior is a major breakage in
backwards compatibility. This is why I think the current behavior stuck and
won't be changed.&lt;/p&gt;
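&lt;p&gt;The difference between the two constructs in Python 3 is easy to check. This is a small sketch, with function names invented for the example and a deliberately unusual target name so that it cannot collide with a global:&lt;/p&gt;

```python
def comprehension_target_visible():
    squares = [idx_demo * idx_demo for idx_demo in range(4)]
    try:
        # The comprehension target is not a name of this function in Python 3,
        # so this lookup fails.
        idx_demo
    except NameError:
        return False
    return True


def loop_target_visible():
    for idx_demo in range(4):
        pass
    try:
        # The loop target is an ordinary local and is still bound here.
        idx_demo
    except NameError:
        return False
    return True


print(comprehension_target_visible())  # False
print(loop_target_visible())           # True
```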
&lt;p&gt;Moreover, many folks still find this a useful feature of Python. Consider:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;somegenerator&lt;/span&gt;&lt;span class="p"&gt;()):&lt;/span&gt;
    &lt;span class="n"&gt;dostuffwith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;The loop executed &lt;/span&gt;&lt;span class="si"&gt;{0}&lt;/span&gt;&lt;span class="s1"&gt; times!&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;If you have no idea how many items &lt;tt class="docutils literal"&gt;somegenerator&lt;/tt&gt; actually returned, this is
a pretty succinct way to know. Otherwise you'd have to keep a separate counter.&lt;/p&gt;
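&lt;p&gt;One caveat follows directly from the documentation quoted earlier: if the generator yields nothing, &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; is never assigned. A slightly more defensive variant, sketched here with a hypothetical &lt;tt class="docutils literal"&gt;count_items&lt;/tt&gt; helper, initializes the name first and lets &lt;tt class="docutils literal"&gt;enumerate&lt;/tt&gt; start counting at 1:&lt;/p&gt;

```python
def count_items(iterable):
    count = 0  # stays 0 if the loop body never runs
    for count, _item in enumerate(iterable, start=1):
        pass
    # After the loop, `count` still holds the last value assigned to it.
    return count


print(count_items(iter([])))             # 0
print(count_items(ch for ch in "abc"))   # 3
```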
&lt;p&gt;Here's another example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;somegenerator&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;isinteresing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt;
&lt;span class="n"&gt;dostuffwith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Which is a useful pattern for finding things in a loop and using them afterwards
&lt;a class="footnote-reference" href="#footnote-2" id="footnote-reference-2"&gt;[2]&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There are other uses people came up with over the years that justify keeping
this behavior in place. It's hard enough to introduce breaking changes for
features the core developers deem detrimental and harmful. When the feature is
argued by many to be useful, and moreover is used in a huge bunch of code in the
real world, the chances of removing it are zero.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="under-the-hood"&gt;
&lt;h2&gt;Under the hood&lt;/h2&gt;
&lt;p&gt;Now the fun part. Let's see how the Python compiler and VM conspire to make this
behavior possible. In this particular case, I think the most lucid way to
present things is going backwards from the bytecode. I hope this may also serve
as an interesting example of how to go about digging in Python's internals &lt;a class="footnote-reference" href="#footnote-3" id="footnote-reference-3"&gt;[3]&lt;/a&gt;
in order to find stuff out (it's so much fun, seriously!).&lt;/p&gt;
&lt;p&gt;Let's take a part of the function presented at the start of this article and
disassemble it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;lst&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The resulting bytecode is:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt; 0 LOAD_CONST               1 (0)
 3 STORE_FAST               1 (a)

 6 SETUP_LOOP              24 (to 33)
 9 LOAD_FAST                0 (lst)
12 GET_ITER
13 FOR_ITER                16 (to 32)
16 STORE_FAST               2 (i)

19 LOAD_FAST                1 (a)
22 LOAD_FAST                2 (i)
25 INPLACE_ADD
26 STORE_FAST               1 (a)
29 JUMP_ABSOLUTE           13
32 POP_BLOCK

33 LOAD_FAST                1 (a)
36 RETURN_VALUE
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;As a reminder, &lt;tt class="docutils literal"&gt;LOAD_FAST&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;STORE_FAST&lt;/tt&gt; are the opcodes Python uses to
access names that are only used within a function. Since the Python compiler
knows statically (at compile-time) how many such names exist in each function,
they can be accessed with static array offsets as opposed to a hash table, which
makes access significantly faster (hence the &lt;tt class="docutils literal"&gt;_FAST&lt;/tt&gt; suffix). But I digress.
What's really important here is that &lt;tt class="docutils literal"&gt;a&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; are treated identically.
They are both fetched with &lt;tt class="docutils literal"&gt;LOAD_FAST&lt;/tt&gt; and modified with &lt;tt class="docutils literal"&gt;STORE_FAST&lt;/tt&gt;.
There is absolutely no reason to assume that their visibility is in any way
different &lt;a class="footnote-reference" href="#footnote-4" id="footnote-reference-4"&gt;[4]&lt;/a&gt;.&lt;/p&gt;
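&lt;p&gt;This is easy to verify with the standard &lt;tt class="docutils literal"&gt;dis&lt;/tt&gt; module. The exact opcode names and offsets vary between CPython versions (the listing above comes from an older release), so the sketch below only checks two things that should hold regardless: both names occupy slots in the function's local variable array, and every store opcode in the function is one of the &lt;tt class="docutils literal"&gt;_FAST&lt;/tt&gt; variants:&lt;/p&gt;

```python
import dis


def foo(lst):
    a = 0
    for i in lst:
        a += i
    return a


# Arguments come first, then other locals in order of first assignment.
# The positions match the slot numbers in the disassembly: lst=0, a=1, i=2.
print(foo.__code__.co_varnames)  # ('lst', 'a', 'i')

# Collect the names of all store opcodes used by the function.
store_ops = sorted({ins.opname for ins in dis.get_instructions(foo)
                    if ins.opname.startswith("STORE")})
print(store_ops)

# For the full listing on your own interpreter:
dis.dis(foo)
```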
&lt;p&gt;So how did this come to be? Somehow, the compiler figured that &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; is just
another local name within &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt;. This logic lives in the symbol table code,
when the compiler walks over the AST to create a control-flow graph from which
bytecode is later emitted; there are more details about this process in
&lt;a class="reference external" href="https://eli.thegreenplace.net/2010/09/18/python-internals-symbol-tables-part-1"&gt;my article about symbol tables&lt;/a&gt;
- so I'll just stick to the essentials here.&lt;/p&gt;
&lt;p&gt;The symtable code doesn't treat &lt;tt class="docutils literal"&gt;for&lt;/tt&gt; statements very specially. In
&lt;tt class="docutils literal"&gt;symtable_visit_stmt&lt;/tt&gt; we have:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;For_kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;VISIT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;For&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;VISIT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;expr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;For&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iter&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;VISIT_SEQ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;For&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;For&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;orelse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;VISIT_SEQ&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;stmt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;For&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;orelse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The loop target is visited as any other expression. Since this code visits the
AST, it's worthwhile to dump it to see how the node for the &lt;tt class="docutils literal"&gt;for&lt;/tt&gt; statement
looks:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;For(target=Name(id=&amp;#39;i&amp;#39;, ctx=Store()),
    iter=Name(id=&amp;#39;lst&amp;#39;, ctx=Load()),
    body=[AugAssign(target=Name(id=&amp;#39;a&amp;#39;, ctx=Store()),
                    op=Add(),
                    value=Name(id=&amp;#39;i&amp;#39;, ctx=Load()))],
    orelse=[])
&lt;/pre&gt;&lt;/div&gt;
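&lt;p&gt;A dump like the one above can be reproduced from Python itself with the &lt;tt class="docutils literal"&gt;ast&lt;/tt&gt; module. This is just a sketch; the exact text printed by &lt;tt class="docutils literal"&gt;ast.dump&lt;/tt&gt; differs a little between Python versions:&lt;/p&gt;

```python
import ast

source = """
def foo(lst):
    a = 0
    for i in lst:
        a += i
    return a
"""

func = ast.parse(source).body[0]   # the FunctionDef node for foo
for_node = func.body[1]            # second statement in foo: the for loop

print(ast.dump(for_node))
print(type(for_node.target.ctx).__name__)  # Store
print(type(for_node.iter.ctx).__name__)    # Load
```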
&lt;p&gt;So &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; lives in a &lt;tt class="docutils literal"&gt;Name&lt;/tt&gt; node. These are handled in the symbol table code by
the following clause in &lt;tt class="docutils literal"&gt;symtable_visit_expr&lt;/tt&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="no"&gt;Name_kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;symtable_add_def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;                          &lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Load&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;USE&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;DEF_LOCAL&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;VISIT_QUIT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="cm"&gt;/* ... */&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The name &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; is clearly tagged with &lt;tt class="docutils literal"&gt;DEF_LOCAL&lt;/tt&gt; (we can tell from the
&lt;tt class="docutils literal"&gt;*_FAST&lt;/tt&gt; opcodes emitted to access it, and it is also easy to observe if the
symbol table is dumped using the &lt;tt class="docutils literal"&gt;symtable&lt;/tt&gt; module), so the code above evidently
calls &lt;tt class="docutils literal"&gt;symtable_add_def&lt;/tt&gt; with &lt;tt class="docutils literal"&gt;DEF_LOCAL&lt;/tt&gt; as the third argument. This is
the right time to glance at the AST above and notice the &lt;tt class="docutils literal"&gt;ctx=Store&lt;/tt&gt; part
of the &lt;tt class="docutils literal"&gt;Name&lt;/tt&gt; node of &lt;tt class="docutils literal"&gt;i&lt;/tt&gt;. So it's the AST that already comes in carrying
the information that &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; is stored to in the &lt;tt class="docutils literal"&gt;target&lt;/tt&gt; part of the &lt;tt class="docutils literal"&gt;For&lt;/tt&gt;
node. Let's see how that comes to be.&lt;/p&gt;
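&lt;p&gt;For the curious, here is a rough sketch of dumping the symbol table with the &lt;tt class="docutils literal"&gt;symtable&lt;/tt&gt; module mentioned above. The file name &lt;tt class="docutils literal"&gt;example.py&lt;/tt&gt; is just a placeholder label passed to the API; no such file needs to exist:&lt;/p&gt;

```python
import symtable

source = """
def foo(lst):
    a = 0
    for i in lst:
        a += i
    return a
"""

top = symtable.symtable(source, "example.py", "exec")
foo_table = top.get_children()[0]  # the symbol table of the function foo

for name in ("lst", "a", "i"):
    sym = foo_table.lookup(name)
    print(name,
          "local:", sym.is_local(),
          "assigned:", sym.is_assigned(),
          "parameter:", sym.is_parameter())
```

&lt;p&gt;Both &lt;tt class="docutils literal"&gt;a&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; are reported as local and assigned; the symbol table makes no distinction between a plain assignment target and a loop target.&lt;/p&gt;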
&lt;p&gt;The AST-building part of the compiler goes over the parse tree (which is a
fairly low-level hierarchical representation of the source code - some
background is available &lt;a class="reference external" href="https://eli.thegreenplace.net/2009/02/16/abstract-vs-concrete-syntax-trees"&gt;here&lt;/a&gt;)
and, among other things, sets the &lt;tt class="docutils literal"&gt;expr_context&lt;/tt&gt; attributes on some nodes,
most notably &lt;tt class="docutils literal"&gt;Name&lt;/tt&gt; nodes. Think about it this way, in the following
statement:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bar&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Both &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;bar&lt;/tt&gt; are going to end up in &lt;tt class="docutils literal"&gt;Name&lt;/tt&gt; nodes. But while
&lt;tt class="docutils literal"&gt;bar&lt;/tt&gt; is only being loaded from, &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; is actually being stored into in this
code. The &lt;tt class="docutils literal"&gt;expr_context&lt;/tt&gt; attribute distinguishes between these two kinds of use for
later consumption by the symbol table code &lt;a class="footnote-reference" href="#footnote-5" id="footnote-reference-5"&gt;[5]&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Back to our &lt;tt class="docutils literal"&gt;for&lt;/tt&gt; loop targets, though. These are handled in the function that
creates an AST for &lt;tt class="docutils literal"&gt;for&lt;/tt&gt; statements - &lt;tt class="docutils literal"&gt;ast_for_for_stmt&lt;/tt&gt;. Here are the
relevant parts of this function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;static&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;stmt_ty&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="nf"&gt;ast_for_for_stmt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;compiling&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;asdl_seq&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;_target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;suite_seq&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;expr_ty&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;expr_ty&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="cm"&gt;/* ... */&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;node_target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;CHILD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;_target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ast_for_exprlist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;node_target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;_target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="cm"&gt;/* Check the # of children rather than the length of _target, since&lt;/span&gt;
&lt;span class="cm"&gt;       for x, in ... has 1 element in _target, but still requires a Tuple. */&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;expr_ty&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;asdl_seq_GET&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NCH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node_target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;lineno&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;col_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;c_arena&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="cm"&gt;/* ... */&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;For&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;expression&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;suite_seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;LINENO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;n_col_offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;               &lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;c_arena&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;tt class="docutils literal"&gt;Store&lt;/tt&gt; context is created in the call to &lt;tt class="docutils literal"&gt;ast_for_exprlist&lt;/tt&gt;, which
creates the node for the target (recall that the &lt;tt class="docutils literal"&gt;for&lt;/tt&gt; loop target may be a
sequence of names for tuple unpacking, not just a single name).&lt;/p&gt;
&lt;p&gt;This function is probably the most important part in the process of explaining
why &lt;tt class="docutils literal"&gt;for&lt;/tt&gt; loop targets are treated similarly to other names &lt;em&gt;within&lt;/em&gt; the loop.
After this tagging happens in the AST, the code for handling such names in the
symbol table and VM is no different from other names.&lt;/p&gt;
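&lt;p&gt;The tagging can also be observed from Python itself. Here's a minimal sketch, assuming a reasonably recent Python 3 where the standard &lt;tt class="docutils literal"&gt;ast&lt;/tt&gt; module exposes the tree; the variable names &lt;tt class="docutils literal"&gt;single&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;unpack&lt;/tt&gt; are made up for illustration:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;
```python
import ast

# Parse two loops: one with a single-name target, one with a tuple target.
single = ast.parse("for i in lst:\n    pass").body[0]
unpack = ast.parse("for x, in lst:\n    pass").body[0]

# A single name becomes a Name node tagged with the Store context.
print(type(single.target).__name__, type(single.target.ctx).__name__)

# "for x, in ..." has one element but is still wrapped in a Tuple;
# both the Tuple and its element carry the Store context.
print(type(unpack.target).__name__, type(unpack.target.ctx).__name__)
print(type(unpack.target.elts[0].ctx).__name__)
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The same &lt;tt class="docutils literal"&gt;Store&lt;/tt&gt; context shows up on the left-hand side of an ordinary assignment, which is the point of this section: once tagged, a loop target is just another name being stored to.&lt;/p&gt;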
&lt;/div&gt;
&lt;div class="section" id="wrapping-up"&gt;
&lt;h2&gt;Wrapping up&lt;/h2&gt;
&lt;p&gt;This article discusses a particular behavior of Python that may be considered a
&amp;quot;gotcha&amp;quot; by some. I hope the article does a decent job of explaining how this
behavior flows naturally from the naming and scoping semantics of Python, why
it can be useful and hence is unlikely to ever change, and how the internals of
the Python compiler make it work under the hood. Thanks for reading!&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Here I'm tempted to make a Microsoft Visual C++ 6 joke, but the fact that
most readers of this blog in 2015 won't get it is somewhat disturbing
(because it reflects my age, not the abilities of my readers).&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;You could argue that &lt;tt class="docutils literal"&gt;dowithstuff(i)&lt;/tt&gt; could go into the &lt;tt class="docutils literal"&gt;if&lt;/tt&gt; right
before the &lt;tt class="docutils literal"&gt;break&lt;/tt&gt; here. But this isn't always convenient. Besides,
according to Guido there's a nice separation of concerns here - the loop
is used for searching, and only that. What happens with the value after
the search is done is not the loop's concern. I think this is a very good
point.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-3" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-3"&gt;[3]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;As usual for my articles on Python's internals, this is about Python 3.
Specifically, I'm looking at the &lt;tt class="docutils literal"&gt;default&lt;/tt&gt; branch of the Python
repository, where work on the next release (3.5) is being done. But for
this particular topic, the source code of any release in the 3.x series
should do.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-4" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-4"&gt;[4]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Another thing clear from the disassembly is why &lt;tt class="docutils literal"&gt;i&lt;/tt&gt; remains invisible
if the loop doesn't execute. The &lt;tt class="docutils literal"&gt;GET_ITER&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;FOR_ITER&lt;/tt&gt; pair of
opcodes treat the thing we loop over as an iterator and then call its
&lt;tt class="docutils literal"&gt;__next__&lt;/tt&gt; method. If that call ends up raising &lt;tt class="docutils literal"&gt;StopIteration&lt;/tt&gt;, the
VM catches it and exits the loop. Only if an actual value is returned
does the VM proceed to execute &lt;tt class="docutils literal"&gt;STORE_FAST&lt;/tt&gt; to &lt;tt class="docutils literal"&gt;i&lt;/tt&gt;, thus bringing it
into existence for subsequent code to refer to.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-5" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-5"&gt;[5]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;It's a curious design, which I suspect stems from the desire for
relatively clean recursive visitation code in AST consumers such as the
symbol table code and CFG generation.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
</content><category term="misc"></category><category term="Python internals"></category><category term="Python"></category></entry><entry><title>Using ASDL to describe ASTs in compilers</title><link href="https://eli.thegreenplace.net/2014/06/04/using-asdl-to-describe-asts-in-compilers" rel="alternate"></link><published>2014-06-04T06:25:55-07:00</published><updated>2023-06-30T23:16:27-07:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2014-06-04:/2014/06/04/using-asdl-to-describe-asts-in-compilers</id><summary type="html">
        &lt;p&gt;ASTs (Abstract Syntax Trees) are an &lt;a class="reference external" href="https://eli.thegreenplace.net/2009/02/16/abstract-vs-concrete-syntax-trees/"&gt;important data structure&lt;/a&gt; in compiler front-ends. If you've written a few parsers, you almost definitely ran into the need to describe the result of the parsing in terms of an AST. While the kinds of nodes such ASTs have and their structure is very …&lt;/p&gt;</summary><content type="html">
        &lt;p&gt;ASTs (Abstract Syntax Trees) are an &lt;a class="reference external" href="https://eli.thegreenplace.net/2009/02/16/abstract-vs-concrete-syntax-trees/"&gt;important data structure&lt;/a&gt; in compiler front-ends. If you've written a few parsers, you almost definitely ran into the need to describe the result of the parsing in terms of an AST. While the kinds of nodes such ASTs have and their structure is very specific to the source language, many commonalities come up. In other words, coding &amp;quot;yet another AST&amp;quot; gets really old after you've done it a few times.&lt;/p&gt;
&lt;p&gt;Worry not, as you'd expect from the programmer crowd, this problem was &amp;quot;solved&amp;quot; by adding another level of abstraction. Yes, an &lt;strong&gt;abstraction&lt;/strong&gt; over &lt;strong&gt;Abstract&lt;/strong&gt; Syntax Trees, oh my! The abstraction here is some textual format (let's call it a DSL to sound smart) that describes what the AST looks like, along with machinery to auto-generate the code that implements this AST.&lt;/p&gt;
&lt;p&gt;Most solutions in this domain are ad-hoc, but one that I've seen used more than once is &lt;a class="reference external" href="http://asdl.sourceforge.net/"&gt;ASDL&lt;/a&gt; - Abstract Syntax Definition Language. The self-description from the website sounds about right:&lt;/p&gt;
&lt;blockquote&gt;
The Zephyr Abstract Syntax Description Language (ASDL) is a language designed to describe the tree-like data structures in compilers. Its main goal is to provide a method for compiler components written in different languages to interoperate. ASDL makes it easier for applications written in a variety of programming languages to communicate complex recursive data structures.&lt;/blockquote&gt;
&lt;p&gt;To give an example, here's a short snippet from an ASDL definition of a simple programming language:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;program = Program(class* classes)

class = Class(identifier name, identifier? parent, feature* features)

[...]

expression = Assign(identifier name, expression expr)
           | StaticDispatch(expression expr, identifier type_name,
                            identifier name, expression* actual)
           | Dispatch(expression expr, identifier name, expression* actual)

[...]
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The way to read this is: a &lt;em&gt;program&lt;/em&gt; node consists of zero or more &lt;em&gt;classes&lt;/em&gt;. Each &lt;em&gt;class&lt;/em&gt; has these children: a &lt;em&gt;name&lt;/em&gt; which is an identifier, an optional &lt;em&gt;parent&lt;/em&gt; which is also an identifier, and a (potentially empty) list of &lt;em&gt;features&lt;/em&gt;, each of which is a feature node. And so on.&lt;/p&gt;
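&lt;p&gt;To make this reading concrete, here's a rough sketch of the Python classes such a description might map to. This is only an illustration, not the output of any real generator: identifiers are modeled as plain strings, and &lt;tt class="docutils literal"&gt;Feature&lt;/tt&gt; is a placeholder because its definition is elided in the snippet above.&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;
```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Feature:
    # Placeholder: the real feature definition is elided in the snippet.
    name: str


@dataclass
class Class:
    name: str                                   # identifier name
    parent: Optional[str] = None                # identifier? parent
    features: List[Feature] = field(default_factory=list)  # feature* features


@dataclass
class Program:
    classes: List[Class] = field(default_factory=list)     # class* classes


prog = Program([Class("Main", features=[Feature("main")])])
print(prog)
```
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;tt class="docutils literal"&gt;?&lt;/tt&gt; qualifier maps to an optional field and &lt;tt class="docutils literal"&gt;*&lt;/tt&gt; maps to a list that may be empty.&lt;/p&gt;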
&lt;p&gt;The full details are available in the paper &amp;quot;The Zephyr Abstract Syntax Definition Language&amp;quot; by Wang et al. Unfortunately, a link to this paper isn't always trivial to find, so I have a PDF copy in the &lt;tt class="docutils literal"&gt;docs&lt;/tt&gt; directory of my &lt;a class="reference external" href="https://github.com/eliben/asdl_parser"&gt;asdl_parser project&lt;/a&gt;, which I'm going to discuss soon.&lt;/p&gt;
&lt;div class="section" id="type-safety-in-asdl"&gt;
&lt;h3&gt;Type safety in ASDL&lt;/h3&gt;
&lt;p&gt;In addition to providing a concise description of nodes from which code (in many languages) can be generated automatically, I like ASDL for another reason. It provides some type safety when constructing the AST in the parser.&lt;/p&gt;
&lt;p&gt;Take the snippet above, for example. A &lt;em&gt;program&lt;/em&gt; has the &lt;em&gt;classes&lt;/em&gt; attribute, which is a (potentially empty) sequence of &lt;em&gt;class&lt;/em&gt; nodes. Each such class has to be a &lt;em&gt;Class&lt;/em&gt;, which is precisely defined. It can be nothing else. The &lt;em&gt;expression&lt;/em&gt; below that shows it differently - an expression can be either an &lt;em&gt;Assign&lt;/em&gt;, &lt;em&gt;StaticDispatch&lt;/em&gt;, etc.&lt;/p&gt;
&lt;p&gt;The set of possibilities is statically defined. This makes it possible to insert some degree of static checking into the (auto-generated) AST construction code. So the constructed AST can't be completely bogus even before semantic analysis is applied. Even though I love Python, I do appreciate a bit of static type checking in the right places. Key data structures like ASTs are, I believe, one of the places where such type checking makes sense.&lt;/p&gt;
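&lt;p&gt;As a rough run-time analogue of such checks, node constructors can refuse children that aren't one of the declared alternatives. The sketch below is hand-written, not generated from ASDL; it models identifiers as strings, and &lt;tt class="docutils literal"&gt;Var&lt;/tt&gt; is a hypothetical leaf node added only so there is something to build expressions from.&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;
```python
class Expression:
    """Base type for every alternative of the 'expression' sum type."""


class Var(Expression):
    # Hypothetical leaf node; not part of the ASDL snippet above.
    def __init__(self, name):
        if not isinstance(name, str):
            raise TypeError("name must be an identifier (str)")
        self.name = name


class Assign(Expression):
    # Assign(identifier name, expression expr)
    def __init__(self, name, expr):
        if not isinstance(name, str):
            raise TypeError("name must be an identifier (str)")
        if not isinstance(expr, Expression):
            raise TypeError("expr must be an expression node")
        self.name = name
        self.expr = expr


ok = Assign("x", Var("y"))

try:
    Assign("x", 42)  # 42 is not an expression node
except TypeError as exc:
    print("rejected:", exc)
```
&lt;/pre&gt;&lt;/div&gt;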
&lt;/div&gt;
&lt;div class="section" id="asdl-in-upstream-cpython"&gt;
&lt;h3&gt;ASDL in upstream CPython&lt;/h3&gt;
&lt;p&gt;Starting with Python 2.5, the CPython compiler (the part responsible for emitting bytecode from Python source) uses an ASDL description to create an AST for Python source. The AST is created by the parser (from the parse tree - more details in &lt;a class="reference external" href="http://legacy.python.org/dev/peps/pep-0339/"&gt;PEP 339&lt;/a&gt;), and is then used to create the control-flow graph, from which bytecode is emitted.&lt;/p&gt;
&lt;p&gt;The ASDL description lives in &lt;tt class="docutils literal"&gt;Parser/Python.asdl&lt;/tt&gt; in the CPython source tree. &lt;tt class="docutils literal"&gt;Parser/asdl_c.py&lt;/tt&gt; is a script that runs whenever someone modifies this ASDL description. It uses the &lt;tt class="docutils literal"&gt;Parser/asdl.py&lt;/tt&gt; module to parse the ASDL file into an internal form and then emits C code that describes the ASTs. This C code lives in &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Include/Python-ast.h&lt;/span&gt;&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;Python/Python-ast.c&lt;/span&gt;&lt;/tt&gt; &lt;a class="footnote-reference" href="#id2" id="id1"&gt;[1]&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This may be more detail than you wanted to hear :-) The gist of it, however, is - CPython's ASTs are described in ASDL, and if you want a quick glance at how these ASTs look, &lt;tt class="docutils literal"&gt;Parser/Python.asdl&lt;/tt&gt; is the file to look at.&lt;/p&gt;
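&lt;p&gt;The node classes described there are also visible from Python through the standard &lt;tt class="docutils literal"&gt;ast&lt;/tt&gt; module. A small sketch (the exact set of fields varies somewhat between Python versions, so no output is shown):&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;
```python
import ast

# Each node class lists its children in _fields, mirroring the
# constructor arguments declared in the ASDL description.
print(ast.For._fields)
print(ast.Assign._fields)

# Dump the AST built for a small piece of source.
tree = ast.parse("for i in lst:\n    total += i")
print(ast.dump(tree))
```
&lt;/pre&gt;&lt;/div&gt;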
&lt;/div&gt;
&lt;div class="section" id="my-rewrite-of-the-asdl-parser"&gt;
&lt;h3&gt;My rewrite of the ASDL parser&lt;/h3&gt;
&lt;p&gt;Up until very recently, the ASDL description of the CPython AST was parsed by a tool that relied on the &lt;a class="reference external" href="http://pages.cpsc.ucalgary.ca/~aycock/spark/"&gt;SPARK parsing toolkit&lt;/a&gt;. In fact, &lt;tt class="docutils literal"&gt;Parser/spark.py&lt;/tt&gt; was carried around in the distribution just for this purpose.&lt;/p&gt;
&lt;p&gt;A few months ago I was looking for something to conveniently implement the AST for a toy compiler I was hacking on. Being a CPython developer, ASDL immediately sprang to mind, but I was reluctant to carry the SPARK dependency and/or learn how to use it. The ASDL language seemed simple enough to not require such machinery. Surely a simple recursive-descent parser would do. So I implemented my own stand-alone parser for ASDL, using modern Python 3.x - and it's available in a &lt;a class="reference external" href="https://github.com/eliben/asdl_parser"&gt;public GitHub repository right here&lt;/a&gt;. Feel free to use it, and let me know how it goes!&lt;/p&gt;
&lt;p&gt;Since my parser turned out to be much simpler and easier to grok than upstream CPython's SPARK-based parser, I proposed to replace it in &lt;a class="reference external" href="http://bugs.python.org/issue19655"&gt;Issue 19655&lt;/a&gt;. After some delays (caused mainly by waiting for the 3.4 release and then getting distracted by other stuff), the change &lt;a class="reference external" href="http://hg.python.org/cpython/rev/b769352e2922"&gt;landed in the default branch&lt;/a&gt; (on its way to 3.5) about a month ago. The result is pleasing - the new parser is shorter, doesn't require the SPARK dependency (which has now been dropped), has tests and is much more maintainable.&lt;/p&gt;
&lt;p&gt;In the interest of not changing too much at once, I left the interface to the C code generator (&lt;tt class="docutils literal"&gt;Parser/asdl_c.py&lt;/tt&gt;) the same, so there is absolutely no difference in the produced C code. Some time in the future it may make sense to revise this decision. The C generator is also fairly old code that could use some modernization and tests.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="historical-note-ast-description-in-pycparser"&gt;
&lt;h3&gt;Historical note - AST description in pycparser&lt;/h3&gt;
&lt;p&gt;I first ran into this problem (high-level description of ASTs) when I was working on &lt;a class="reference external" href="https://github.com/eliben/pycparser"&gt;pycparser&lt;/a&gt; (which is &lt;a class="reference external" href="https://eli.thegreenplace.net/2008/11/15/pycparser-v10-is-out/"&gt;quite an old project&lt;/a&gt; by now).&lt;/p&gt;
&lt;p&gt;Back then, I looked at the &lt;tt class="docutils literal"&gt;compiler&lt;/tt&gt; module of Python 2.x and liked its approach: a simple textual description of the AST is parsed, and the code for AST nodes is emitted from it. The &lt;tt class="docutils literal"&gt;compiler&lt;/tt&gt; module was a maintenance headache (because it duplicated a lot of the AST logic from the actual compiler) and is gone in Python 3.x, replaced by the &lt;tt class="docutils literal"&gt;ast&lt;/tt&gt; module which provides access to the same C-based AST generated by &lt;tt class="docutils literal"&gt;Parser/asdl_c.py&lt;/tt&gt; as is used by the CPython compiler.&lt;/p&gt;
&lt;p&gt;pycparser's AST description is a simple textual file that's very similar in spirit to ASDL. If I were to do this today, I'd probably also pick ASDL since it's more &amp;quot;standard&amp;quot;, as well as for the extra type safety guarantees it provides.&lt;/p&gt;
&lt;img class="align-center" src="https://eli.thegreenplace.net/images/hline.jpg" style="width: 320px; height: 5px;" /&gt;
&lt;table class="docutils footnote" frame="void" id="id2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Even though these files are auto-generated, they are also checked into the CPython Mercurial repository. This is because we don't want people building Python from source to depend on the tools required to generate such files. Only core CPython developers who want to play with the internals need them.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;

    </content><category term="misc"></category><category term="Compilation"></category><category term="Python"></category><category term="Python internals"></category></entry><entry><title>Faster XML iteration with ElementTree</title><link href="https://eli.thegreenplace.net/2012/06/17/faster-xml-iteration-with-elementtree" rel="alternate"></link><published>2012-06-17T05:28:44-07:00</published><updated>2023-02-04T13:41:52-08:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2012-06-17:/2012/06/17/faster-xml-iteration-with-elementtree</id><summary type="html">
        &lt;p&gt;As I've &lt;a class="reference external" href="https://eli.thegreenplace.net/2012/03/02/python-development-improving-elementtree-for-3-3/"&gt;mentioned previously&lt;/a&gt;, starting with Python 3.3 the C accelerator of the &lt;tt class="docutils literal"&gt;xml.etree.ElementTree&lt;/tt&gt; module is going to be imported by default. This should make quite a bit of code faster for those who were not aware of the existence of the accelerator, and reduce the amount …&lt;/p&gt;</summary><content type="html">
        &lt;p&gt;As I've &lt;a class="reference external" href="https://eli.thegreenplace.net/2012/03/02/python-development-improving-elementtree-for-3-3/"&gt;mentioned previously&lt;/a&gt;, starting with Python 3.3 the C accelerator of the &lt;tt class="docutils literal"&gt;xml.etree.ElementTree&lt;/tt&gt; module is going to be imported by default. This should make quite a bit of code faster for those who were not aware of the existence of the accelerator, and reduce the amount of boilerplate importing for everyone.&lt;/p&gt;
&lt;p&gt;As Python 3.3 is nearing its first beta, more work was done in the past few weeks; mostly fixing all kinds of problems that arose from the aforementioned transition. But in this post I want to focus on one feature that was added this weekend - much faster iteration over the parsed XML tree.&lt;/p&gt;
&lt;p&gt;&lt;tt class="docutils literal"&gt;ElementTree&lt;/tt&gt; offers a few tools for iterating over the tree and for finding interesting elements in it, but the basis for them all is the &lt;tt class="docutils literal"&gt;iter&lt;/tt&gt; method:&lt;/p&gt;
&lt;blockquote&gt;
Creates a tree iterator with the current element as the root. The iterator iterates over this element and all elements below it, in document (depth first) order. If tag is not None or '*', only elements whose tag equals tag are returned from the iterator.&lt;/blockquote&gt;
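&lt;p&gt;A minimal sketch of what that description means in practice. The element names are made up for illustration, and the tree is built programmatically rather than parsed from a document:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;
```python
import xml.etree.ElementTree as ET

# Build a small tree: library -> (shelf -> book), book
root = ET.Element("library")
shelf = ET.SubElement(root, "shelf")
ET.SubElement(shelf, "book").text = "first"
ET.SubElement(root, "book").text = "second"

# No tag: every element, the root included, in document order.
all_tags = [elem.tag for elem in root.iter()]

# With a tag: only matching elements, at any depth.
book_titles = [elem.text for elem in root.iter("book")]

print(all_tags)     # ['library', 'shelf', 'book', 'book']
print(book_titles)  # ['first', 'second']
```
&lt;/pre&gt;&lt;/div&gt;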
&lt;p&gt;And until very recently, this &lt;tt class="docutils literal"&gt;iter&lt;/tt&gt; was implemented in Python, even when the C accelerator was loaded. This was achieved by calling &lt;tt class="docutils literal"&gt;PyRun_String&lt;/tt&gt; on a &amp;quot;bootstrap&amp;quot; string defining the method (as well as a bunch of other Python code), when the C extension module was being initialized. In the past few months I've been slowly and surely decimating this bootstrap code, trying to move as much functionality as possible into the C code and replacing stuff with actual C API calls. The last bastion was &lt;tt class="docutils literal"&gt;iter&lt;/tt&gt; (and its cousin &lt;tt class="docutils literal"&gt;itertext&lt;/tt&gt;) because its implementation in C is not trivial.&lt;/p&gt;
&lt;p&gt;Well, that last bastion has now fallen and the C accelerator of &lt;tt class="docutils literal"&gt;ElementTree&lt;/tt&gt; no longer has any Python bootstrap code - &lt;tt class="docutils literal"&gt;iter&lt;/tt&gt; is actually implemented in C. And the great &amp;quot;side effect&amp;quot; of this is that the &lt;tt class="docutils literal"&gt;iter&lt;/tt&gt; method (and all the other methods that rely on it, like &lt;tt class="docutils literal"&gt;find&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;iterfind&lt;/tt&gt; and others) is now much faster. On a relatively large XML document I timed a &lt;strong&gt;10x speed boost&lt;/strong&gt; for simple iteration looking for a specific tag. I hope that this will make a lot of XML processing code in Python much faster out-of-the-box.&lt;/p&gt;
&lt;p&gt;This change is already in Python trunk and will be part of the 3.3 release. I must admit that I didn't spend much time optimizing the C code implementing &lt;tt class="docutils literal"&gt;iter&lt;/tt&gt;, so there may still be room for improvement. I have a hunch that it can be made a few tens of percent faster with a bit of effort. If you're interested in helping, drop me a line and I will be happy to discuss it.&lt;/p&gt;

    </content><category term="misc"></category><category term="C &amp; C++"></category><category term="Python"></category><category term="Python internals"></category></entry><entry><title>Under the hood of Python class definitions</title><link href="https://eli.thegreenplace.net/2012/06/15/under-the-hood-of-python-class-definitions" rel="alternate"></link><published>2012-06-15T05:51:41-07:00</published><updated>2023-02-04T15:35:51-08:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2012-06-15:/2012/06/15/under-the-hood-of-python-class-definitions</id><summary type="html">
        &lt;p&gt;This is a fast-paced walk-through of the internals of defining new classes in Python. It shows what actually happens inside the Python interpreter when a new class definition is encountered and processed. Beware, this is advanced material. If the prospect of pondering the metaclass of the metaclass of your class …&lt;/p&gt;</summary><content type="html">
        &lt;p&gt;This is a fast-paced walk-through of the internals of defining new classes in Python. It shows what actually happens inside the Python interpreter when a new class definition is encountered and processed. Beware, this is advanced material. If the prospect of pondering the metaclass of the metaclass of your class makes you feel nauseated, you'd better stop now.&lt;/p&gt;
&lt;p&gt;The focus is on the official (CPython) implementation of Python 3. For modern releases of Python 2 the concepts are similar, although there will be some slight differences in the details.&lt;/p&gt;
&lt;div class="section" id="on-the-bytecode-level"&gt;
&lt;h3&gt;On the bytecode level&lt;/h3&gt;
&lt;p&gt;I'll start right with the bytecode, ignoring all the good work done by the Python compiler &lt;a class="footnote-reference" href="#id11" id="id1"&gt;[1]&lt;/a&gt;. For simplicity, this function will be used to demonstrate the bytecode generated by a class definition, since it's easy to disassemble functions:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;def&lt;/span&gt; &lt;span style="color: #00007f"&gt;myfunc&lt;/span&gt;():
    &lt;span style="color: #00007f; font-weight: bold"&gt;class&lt;/span&gt; &lt;span style="color: #00007f"&gt;Joe&lt;/span&gt;:
        attr = &lt;span style="color: #007f7f"&gt;100.02&lt;/span&gt;
        &lt;span style="color: #00007f; font-weight: bold"&gt;def&lt;/span&gt; &lt;span style="color: #00007f"&gt;foo&lt;/span&gt;(&lt;span style="color: #00007f"&gt;self&lt;/span&gt;):
            &lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; &lt;span style="color: #007f7f"&gt;2&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Disassembling &lt;tt class="docutils literal"&gt;myfunc&lt;/tt&gt; will show us the steps needed to define a new class:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&amp;gt;&amp;gt;&amp;gt; dis.disassemble(myfunc.__code__)
 &lt;span style="color: #007f7f"&gt;14&lt;/span&gt;           &lt;span style="color: #007f7f"&gt;0&lt;/span&gt; LOAD_BUILD_CLASS
              &lt;span style="color: #007f7f"&gt;1&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;1&lt;/span&gt; (&amp;lt;code &lt;span style="color: #00007f"&gt;object&lt;/span&gt; Joe at &lt;span style="color: #007f7f"&gt;0x7fe226335b80&lt;/span&gt;, &lt;span style="color: #00007f"&gt;file&lt;/span&gt; &lt;span style="color: #7f007f"&gt;&amp;quot;disassemble.py&amp;quot;&lt;/span&gt;, line &lt;span style="color: #007f7f"&gt;14&lt;/span&gt;&amp;gt;)
              &lt;span style="color: #007f7f"&gt;4&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;2&lt;/span&gt; (&lt;span style="color: #7f007f"&gt;&amp;#39;Joe&amp;#39;&lt;/span&gt;)
              &lt;span style="color: #007f7f"&gt;7&lt;/span&gt; MAKE_FUNCTION            &lt;span style="color: #007f7f"&gt;0&lt;/span&gt;
             &lt;span style="color: #007f7f"&gt;10&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;2&lt;/span&gt; (&lt;span style="color: #7f007f"&gt;&amp;#39;Joe&amp;#39;&lt;/span&gt;)
             &lt;span style="color: #007f7f"&gt;13&lt;/span&gt; CALL_FUNCTION            &lt;span style="color: #007f7f"&gt;2&lt;/span&gt;
             &lt;span style="color: #007f7f"&gt;16&lt;/span&gt; STORE_FAST               &lt;span style="color: #007f7f"&gt;0&lt;/span&gt; (Joe)
             &lt;span style="color: #007f7f"&gt;19&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;0&lt;/span&gt; (&lt;span style="color: #00007f"&gt;None&lt;/span&gt;)
             &lt;span style="color: #007f7f"&gt;22&lt;/span&gt; RETURN_VALUE
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The number immediately preceding the instruction name is its offset in the binary representation of the code object. All the instructions up to and including the one at offset 16 are for defining the class. The last two instructions are for &lt;tt class="docutils literal"&gt;myfunc&lt;/tt&gt; to return &lt;tt class="docutils literal"&gt;None&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;Let's go through them, step by step. Documentation of the Python bytecode instructions is available in the &lt;a class="reference external" href="http://docs.python.org/dev/library/dis.html"&gt;dis module&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;tt class="docutils literal"&gt;LOAD_BUILD_CLASS&lt;/tt&gt; is a special instruction used for creating classes. It pushes the function &lt;tt class="docutils literal"&gt;builtins.__build_class__&lt;/tt&gt; onto the stack. We'll examine this function in much detail later.&lt;/p&gt;
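&lt;p&gt;One way to see that a &lt;tt class="docutils literal"&gt;class&lt;/tt&gt; statement really goes through this function is to temporarily wrap it and record the calls. This is only a sketch for experimentation; &lt;tt class="docutils literal"&gt;tracing_build_class&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;calls&lt;/tt&gt; are names I made up, and replacing a builtin like this is not something to do in real code:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;
```python
import builtins

original_build_class = builtins.__build_class__
calls = []


def tracing_build_class(func, name, *bases, **kwargs):
    # Record the class name and bases, then defer to the real builtin.
    calls.append((name, bases))
    return original_build_class(func, name, *bases, **kwargs)


builtins.__build_class__ = tracing_build_class
try:
    class Joe:
        attr = 100.02

        def foo(self):
            return 2
finally:
    # Always restore the original builtin.
    builtins.__build_class__ = original_build_class

print(calls)        # [('Joe', ())]
print(Joe().foo())  # 2
```
&lt;/pre&gt;&lt;/div&gt;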
&lt;p&gt;Next, a code object and then a name (&lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;) are pushed onto the stack as well. The code object is interesting, so let's peek inside:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&amp;gt;&amp;gt;&amp;gt; dis.disassemble(myfunc.__code__.co_consts[&lt;span style="color: #007f7f"&gt;1&lt;/span&gt;])
 &lt;span style="color: #007f7f"&gt;14&lt;/span&gt;           &lt;span style="color: #007f7f"&gt;0&lt;/span&gt; LOAD_FAST                &lt;span style="color: #007f7f"&gt;0&lt;/span&gt; (__locals__)
              &lt;span style="color: #007f7f"&gt;3&lt;/span&gt; STORE_LOCALS
              &lt;span style="color: #007f7f"&gt;4&lt;/span&gt; LOAD_NAME                &lt;span style="color: #007f7f"&gt;0&lt;/span&gt; (__name__)
              &lt;span style="color: #007f7f"&gt;7&lt;/span&gt; STORE_NAME               &lt;span style="color: #007f7f"&gt;1&lt;/span&gt; (__module__)
             &lt;span style="color: #007f7f"&gt;10&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;0&lt;/span&gt; (&lt;span style="color: #7f007f"&gt;&amp;#39;myfunc.&amp;lt;locals&amp;gt;.Joe&amp;#39;&lt;/span&gt;)
             &lt;span style="color: #007f7f"&gt;13&lt;/span&gt; STORE_NAME               &lt;span style="color: #007f7f"&gt;2&lt;/span&gt; (__qualname__)

 &lt;span style="color: #007f7f"&gt;15&lt;/span&gt;          &lt;span style="color: #007f7f"&gt;16&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;1&lt;/span&gt; (&lt;span style="color: #007f7f"&gt;100.02&lt;/span&gt;)
             &lt;span style="color: #007f7f"&gt;19&lt;/span&gt; STORE_NAME               &lt;span style="color: #007f7f"&gt;3&lt;/span&gt; (attr)

 &lt;span style="color: #007f7f"&gt;16&lt;/span&gt;          &lt;span style="color: #007f7f"&gt;22&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;2&lt;/span&gt; (&amp;lt;code &lt;span style="color: #00007f"&gt;object&lt;/span&gt; foo at &lt;span style="color: #007f7f"&gt;0x7fe226335c40&lt;/span&gt;, &lt;span style="color: #00007f"&gt;file&lt;/span&gt; &lt;span style="color: #7f007f"&gt;&amp;quot;disassemble.py&amp;quot;&lt;/span&gt;, line &lt;span style="color: #007f7f"&gt;16&lt;/span&gt;&amp;gt;)
             &lt;span style="color: #007f7f"&gt;25&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;3&lt;/span&gt; (&lt;span style="color: #7f007f"&gt;&amp;#39;myfunc.&amp;lt;locals&amp;gt;.Joe.foo&amp;#39;&lt;/span&gt;)
             &lt;span style="color: #007f7f"&gt;28&lt;/span&gt; MAKE_FUNCTION            &lt;span style="color: #007f7f"&gt;0&lt;/span&gt;
             &lt;span style="color: #007f7f"&gt;31&lt;/span&gt; STORE_NAME               &lt;span style="color: #007f7f"&gt;4&lt;/span&gt; (foo)
             &lt;span style="color: #007f7f"&gt;34&lt;/span&gt; LOAD_CONST               &lt;span style="color: #007f7f"&gt;4&lt;/span&gt; (&lt;span style="color: #00007f"&gt;None&lt;/span&gt;)
             &lt;span style="color: #007f7f"&gt;37&lt;/span&gt; RETURN_VALUE
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This code defines the innards of the class. Some generic bookkeeping, followed by definitions for the &lt;tt class="docutils literal"&gt;attr&lt;/tt&gt; attribute and &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; method.&lt;/p&gt;
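&lt;p&gt;The same code object can be found programmatically. This is only a sketch: the index of the code object inside &lt;tt class="docutils literal"&gt;co_consts&lt;/tt&gt; is an implementation detail, so I search for it by type instead of hard-coding the index, and the body of &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; is again a stand-in:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;import types


def myfunc():
    class Joe:
        attr = 100.02

        def foo(self):  # stand-in body, assumed for illustration
            return self.attr


# The class body is stored as a code object among the constants of myfunc.
bodies = [c for c in myfunc.__code__.co_consts if isinstance(c, types.CodeType)]
body = bodies[0]

print(body.co_name)
# The names assigned in the class body show up in co_names.
print("attr" in body.co_names, "foo" in body.co_names)
&lt;/pre&gt;&lt;/div&gt;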
&lt;p&gt;Now let's get back to the first disassembly. The next instruction (at offset 7) is &lt;tt class="docutils literal"&gt;MAKE_FUNCTION&lt;/tt&gt; &lt;a class="footnote-reference" href="#id12" id="id2"&gt;[2]&lt;/a&gt;. This instruction pulls two things from the stack - a name and a code object. So in our case, it gets the name &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; and the code object we saw disassembled above. It creates a function with the given name and the code object as its code and pushes it back to the stack.&lt;/p&gt;
&lt;p&gt;This is followed by once again pushing the name &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; onto the stack. Here's what the stack looks like now (TOS means &amp;quot;top of stack&amp;quot;):&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;TOS&amp;gt; name &amp;quot;Joe&amp;quot;
     function &amp;quot;Joe&amp;quot; with code for defining the class
     function builtins.__build_class__
     -----------------------------------------------
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;At this point (offset 13), &lt;tt class="docutils literal"&gt;CALL_FUNCTION 2&lt;/tt&gt; is executed. The 2 simply means that the function was passed two positional arguments (and no keyword arguments). &lt;tt class="docutils literal"&gt;CALL_FUNCTION&lt;/tt&gt; first takes the arguments from the stack (the rightmost on top), and then the function itself. So the call is equivalent to:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;builtins.__build_class__(function defining &amp;quot;Joe&amp;quot;, &amp;quot;Joe&amp;quot;)
&lt;/pre&gt;&lt;/div&gt;
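&lt;p&gt;One way to watch this call happen is to temporarily swap &lt;tt class="docutils literal"&gt;builtins.__build_class__&lt;/tt&gt; for a wrapper. This is a sketch for experimentation only; &lt;tt class="docutils literal"&gt;logging_build_class&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;calls&lt;/tt&gt; are names I made up, and the wrapper forwards any extra positional and keyword arguments untouched:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;import builtins

calls = []
original = builtins.__build_class__


def logging_build_class(func, name, *bases, **kwds):
    # Record what the interpreter passed, then defer to the real builtin.
    calls.append((name, bases, sorted(kwds)))
    return original(func, name, *bases, **kwds)


builtins.__build_class__ = logging_build_class
try:
    class Joe:
        attr = 100.02
finally:
    # Always restore the real builtin.
    builtins.__build_class__ = original

print(calls)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; the log holds a single entry with the name, an empty tuple of extra positional arguments and no keyword arguments.&lt;/p&gt;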
&lt;/div&gt;
&lt;div class="section" id="build-me-a-class-please"&gt;
&lt;h3&gt;Build me a class, please&lt;/h3&gt;
&lt;p&gt;A quick peek into the &lt;tt class="docutils literal"&gt;builtins&lt;/tt&gt; module in &lt;tt class="docutils literal"&gt;Python/bltinmodule.c&lt;/tt&gt; reveals that &lt;tt class="docutils literal"&gt;__build_class__&lt;/tt&gt; is implemented by the function &lt;tt class="docutils literal"&gt;builtin___build_class__&lt;/tt&gt; (I'll call it BBC for simplicity) in the same file.&lt;/p&gt;
&lt;p&gt;Like any Python function, BBC accepts both positional and keyword arguments. The positional arguments are:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;func, name, base1, base2, ... baseN
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So we see only the function and name were passed for &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;, since it has no base classes. The only keyword argument BBC understands is &lt;tt class="docutils literal"&gt;metaclass&lt;/tt&gt; &lt;a class="footnote-reference" href="#id13" id="id3"&gt;[3]&lt;/a&gt;, allowing the Python 3 way of defining &lt;a class="reference external" href="https://eli.thegreenplace.net/2011/08/14/python-metaclasses-by-example/"&gt;metaclasses&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;class&lt;/span&gt; &lt;span style="color: #00007f"&gt;SomeOtherJoe&lt;/span&gt;(metaclass=JoeMeta):
  [...]
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So back to BBC, here's what it does &lt;a class="footnote-reference" href="#id14" id="id4"&gt;[4]&lt;/a&gt;:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;The first chunk of code deals with extracting the arguments and setting defaults.&lt;/li&gt;
&lt;li&gt;Next, if no metaclass is supplied, BBC looks at the base classes and takes the metaclass of the first base class. If there are no base classes, the default metaclass &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; is used.&lt;/li&gt;
&lt;li&gt;If the metaclass is really a class (note that in Python any callable can be given as a metaclass), look at the bases again to determine &amp;quot;the most derived&amp;quot; metaclass.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The last point deserves a bit of elaboration. If our class has bases, then some rules apply for the metaclasses that are allowed. The metaclasses of its bases must be either subclasses or superclasses of our class's metaclass. Any other arrangement will result in this &lt;tt class="docutils literal"&gt;TypeError&lt;/tt&gt;:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;metaclass conflict: the metaclass of a derived class must be a (non-strict)
                    subclass of the metaclasses of all its bases
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Eventually, given that there are no conflicts, the most derived metaclass will be chosen. The most derived metaclass is the one which is a subtype of the explicitly specified metaclass and the metaclasses of all the base classes. In other words, if our class's metaclass is &lt;tt class="docutils literal"&gt;Meta1&lt;/tt&gt;, only one of the bases has a metaclass and that's &lt;tt class="docutils literal"&gt;Meta2&lt;/tt&gt;, and &lt;tt class="docutils literal"&gt;Meta2&lt;/tt&gt; is a subclass of &lt;tt class="docutils literal"&gt;Meta1&lt;/tt&gt;, it is &lt;tt class="docutils literal"&gt;Meta2&lt;/tt&gt; that will be picked to serve as the eventual metaclass of our class.&lt;/p&gt;
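&lt;p&gt;Both outcomes are easy to reproduce. In this sketch &lt;tt class="docutils literal"&gt;Meta1&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;Meta2&lt;/tt&gt; follow the naming used above, while &lt;tt class="docutils literal"&gt;Base&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;Derived&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;Unrelated&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;Broken&lt;/tt&gt; are illustrative names of mine:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;class Meta1(type):
    pass


class Meta2(Meta1):
    pass


class Base(metaclass=Meta2):
    pass


# Meta1 is requested explicitly, but the metaclass of Base is more derived.
class Derived(Base, metaclass=Meta1):
    pass


print(type(Derived).__name__)


class Unrelated(type):
    pass


# Unrelated is neither a subclass nor a superclass of Meta2: a conflict.
conflict = None
try:
    class Broken(Base, metaclass=Unrelated):
        pass
except TypeError as exc:
    conflict = str(exc)

print(conflict)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The first print shows &lt;tt class="docutils literal"&gt;Meta2&lt;/tt&gt;, and the second shows the metaclass conflict message quoted above.&lt;/p&gt;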
&lt;ol class="arabic simple" start="4"&gt;
&lt;li&gt;At this point BBC has a metaclass &lt;a class="footnote-reference" href="#id15" id="id5"&gt;[5]&lt;/a&gt;, so it starts by calling its &lt;tt class="docutils literal"&gt;__prepare__&lt;/tt&gt; method to create a namespace dictionary for the class. If there's no such method, an empty dict is used.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;As documented in the &lt;a class="reference external" href="http://docs.python.org/dev/reference/datamodel.html"&gt;data model reference&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
If the metaclass has a __prepare__() attribute (usually implemented as a class or static method), it is called before the class body is evaluated with the name of the class and a tuple of its bases for arguments. It should return an object that supports the mapping interface that will be used to store the namespace of the class. The default is a plain dictionary. This could be used, for example, to keep track of the order that class attributes are declared in by returning an ordered dictionary.&lt;/blockquote&gt;
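&lt;p&gt;Here's a minimal sketch of a metaclass with &lt;tt class="docutils literal"&gt;__prepare__&lt;/tt&gt;. The names &lt;tt class="docutils literal"&gt;RecordingDict&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;Meta&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;assigned&lt;/tt&gt; are mine. Instead of an ordered dictionary it returns a dict subclass that logs every assignment made by the class body:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;class RecordingDict(dict):
    """A namespace that remembers the order of assignments."""

    def __init__(self):
        super().__init__()
        self.assigned = []

    def __setitem__(self, key, value):
        self.assigned.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # Called before the class body runs; returns its namespace.
        return RecordingDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Skip the bookkeeping names the interpreter stores itself.
        cls.assigned = [k for k in namespace.assigned if not k.startswith("__")]
        return cls


class Joe(metaclass=Meta):
    attr = 100.02

    def foo(self):
        return self.attr


print(Joe.assigned)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The dunder filter is there because the class body also stores names such as &lt;tt class="docutils literal"&gt;__module__&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;__qualname__&lt;/tt&gt;, as the disassembly showed.&lt;/p&gt;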
&lt;ol class="arabic simple" start="5"&gt;
&lt;li&gt;The function argument is invoked, passing the namespace dict as the only argument. If we look back at the disassembly of this function (the second one), we see that the first argument is placed into the &lt;tt class="docutils literal"&gt;f_locals&lt;/tt&gt; attribute of the frame (with the &lt;tt class="docutils literal"&gt;STORE_LOCALS&lt;/tt&gt; instruction). In other words, this dictionary is then used to populate the class attributes. The function itself returns &lt;tt class="docutils literal"&gt;None&lt;/tt&gt; - its outcome is modifying the namespace dictionary.&lt;/li&gt;
&lt;li&gt;Finally, the metaclass is called with the name, tuple of bases and namespace dictionary as arguments.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The last step defers to the metaclass to actually create a new class with the given definition. &lt;a class="reference external" href="https://eli.thegreenplace.net/2011/08/14/python-metaclasses-by-example/"&gt;Recall that&lt;/a&gt; when some class &lt;tt class="docutils literal"&gt;MyKlass&lt;/tt&gt; has a metaclass &lt;tt class="docutils literal"&gt;MyMeta&lt;/tt&gt;, then the class definition of &lt;tt class="docutils literal"&gt;MyKlass&lt;/tt&gt; is equivalent to &lt;a class="footnote-reference" href="#id16" id="id6"&gt;[6]&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;MyKlass = MyMeta(name, bases, namespace_dict)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The flow of BBC outlined above directly embodies this equivalence.&lt;/p&gt;
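&lt;p&gt;To make the flow concrete, here's a rough pure-Python approximation of the steps above. It's a sketch under assumptions, not a translation of the C code: it leans on &lt;tt class="docutils literal"&gt;types.prepare_class&lt;/tt&gt; from the standard library to pick the metaclass and call &lt;tt class="docutils literal"&gt;__prepare__&lt;/tt&gt;, and it models the class body as a plain function that fills in a namespace. &lt;tt class="docutils literal"&gt;build_class_sketch&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;joe_body&lt;/tt&gt; are made-up names:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;import types


def build_class_sketch(body, name, *bases, **kwds):
    # Steps 1-4: extract the metaclass (if any), compute the most derived
    # one and ask it for a namespace via __prepare__.
    meta, namespace, kwds = types.prepare_class(name, bases, kwds)
    # Step 5: run the class body; its only effect is filling the namespace.
    body(namespace)
    # Step 6: call the metaclass to create the class.
    return meta(name, bases, namespace, **kwds)


def joe_body(ns):
    # Hypothetical stand-in for the compiled body of class Joe.
    ns["attr"] = 100.02
    ns["foo"] = lambda self: self.attr


Joe = build_class_sketch(joe_body, "Joe")
print(Joe.__name__, type(Joe).__name__, Joe().foo())
&lt;/pre&gt;&lt;/div&gt;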
&lt;p&gt;So what happens next? Well, the metaclass &lt;tt class="docutils literal"&gt;MyMeta&lt;/tt&gt; is a class, right? And what happens when a class is &amp;quot;called&amp;quot;? &lt;a class="reference external" href="https://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence/"&gt;It's instantiated&lt;/a&gt;. How is a class's instantiation done? By invoking its metaclass's &lt;tt class="docutils literal"&gt;__call__&lt;/tt&gt;. So wait, this is the metaclass's metaclass we're talking about here, right? Yes! A metaclass is just a class, after all &lt;a class="footnote-reference" href="#id17" id="id7"&gt;[7]&lt;/a&gt;, and has a metaclass of its own - so Python has to keep the meta-flow going.&lt;/p&gt;
&lt;p&gt;Realistically, what probably happens is this:&lt;/p&gt;
&lt;p&gt;Most likely, your class has no metaclass specified explicitly. Its metaclass then defaults to &lt;tt class="docutils literal"&gt;type&lt;/tt&gt;, so the call above is actually:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;MyKlass = &lt;span style="color: #00007f"&gt;type&lt;/span&gt;(name, bases, namespace_dict)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The metaclass of &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; happens to be &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; itself, so here &lt;tt class="docutils literal"&gt;type.__call__&lt;/tt&gt; is called.&lt;/p&gt;
&lt;p&gt;In the more complex case that your class &lt;em&gt;does&lt;/em&gt; have a metaclass, the metaclass itself most likely has no explicit metaclass of its own &lt;a class="footnote-reference" href="#id18" id="id8"&gt;[8]&lt;/a&gt;, so &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; is used for it. Therefore, the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;MyMeta(...)&lt;/span&gt;&lt;/tt&gt; call is also served by &lt;tt class="docutils literal"&gt;type.__call__&lt;/tt&gt;.&lt;/p&gt;
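&lt;p&gt;The rarer case can be observed by giving the metaclass a metaclass of its own. This is a contrived sketch; &lt;tt class="docutils literal"&gt;MetaMeta&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;calls&lt;/tt&gt; are names I invented to show whose &lt;tt class="docutils literal"&gt;__call__&lt;/tt&gt; runs when &lt;tt class="docutils literal"&gt;MyKlass&lt;/tt&gt; is created:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;calls = []


class MetaMeta(type):
    def __call__(cls, *args, **kwds):
        # Runs whenever an instance of MetaMeta (a metaclass) is called.
        calls.append(cls.__name__)
        return super().__call__(*args, **kwds)


class MyMeta(type, metaclass=MetaMeta):
    pass


# Creating MyKlass calls MyMeta(...), which is routed through
# MetaMeta.__call__ before reaching type.__call__.
class MyKlass(metaclass=MyMeta):
    pass


print(calls)
print(type(MyKlass).__name__, type(MyMeta).__name__, type(type).__name__)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Creating &lt;tt class="docutils literal"&gt;MyMeta&lt;/tt&gt; itself doesn't add to the log, since that call is &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;MetaMeta(...)&lt;/span&gt;&lt;/tt&gt; and is served directly by &lt;tt class="docutils literal"&gt;type.__call__&lt;/tt&gt;.&lt;/p&gt;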
&lt;/div&gt;
&lt;div class="section" id="type-call"&gt;
&lt;h3&gt;type_call&lt;/h3&gt;
&lt;p&gt;In &lt;tt class="docutils literal"&gt;Objects/typeobject.c&lt;/tt&gt;, the &lt;tt class="docutils literal"&gt;type.__call__&lt;/tt&gt; slot is mapped to the function &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt;. I've already spent some time explaining &lt;a class="reference external" href="https://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence/"&gt;how it works&lt;/a&gt;, so it's important to review that article at this point.&lt;/p&gt;
&lt;p&gt;Things are a bit different here, however. The &lt;a class="reference external" href="https://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence/"&gt;object creation sequence article&lt;/a&gt; explained how instances are created, so the &lt;tt class="docutils literal"&gt;tp_new&lt;/tt&gt; slot called from &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; went to &lt;tt class="docutils literal"&gt;object&lt;/tt&gt;. Here, since &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; will actually call &lt;tt class="docutils literal"&gt;tp_new&lt;/tt&gt; on a metaclass, and the metaclass's base is &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; (see &lt;a class="reference external" href="https://eli.thegreenplace.net/2012/04/03/the-fundamental-types-of-python-a-diagram/"&gt;this diagram&lt;/a&gt;), we'll have to study how the &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; function (also from &lt;tt class="docutils literal"&gt;Objects/typeobject.c&lt;/tt&gt;) works.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="a-brief-recap"&gt;
&lt;h3&gt;A brief recap&lt;/h3&gt;
&lt;p&gt;I feel that the flow here is relatively convoluted, so lest we lose focus, let's have a brief recap of how we got thus far. The following is a much simplified version of the flow described so far in this article:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;When a new class &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; is defined...&lt;/li&gt;
&lt;li&gt;The Python interpreter arranges the builtin function &lt;tt class="docutils literal"&gt;builtin___build_class__&lt;/tt&gt; (BBC) to be called, giving it the class name and its innards compiled into a code object.&lt;/li&gt;
&lt;li&gt;BBC finds the metaclass of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; and calls it to create the new class.&lt;/li&gt;
&lt;li&gt;When any class in Python is called, it means that its metaclass's &lt;tt class="docutils literal"&gt;tp_call&lt;/tt&gt; slot is invoked. So to create &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;, this is the &lt;tt class="docutils literal"&gt;tp_call&lt;/tt&gt; of its metaclass's metaclass. In most cases this is the &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; function (since the metaclass's metaclass is almost always &lt;tt class="docutils literal"&gt;type&lt;/tt&gt;, or something that eventually delegates to it).&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; creates a new instance of the type it's bound to by calling its &lt;tt class="docutils literal"&gt;tp_new&lt;/tt&gt; slot.&lt;/li&gt;
&lt;li&gt;In our case, that is served by the &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; function.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The next section picks up from step 6.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="type-new"&gt;
&lt;h3&gt;type_new&lt;/h3&gt;
&lt;p&gt;The &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; function is a complex beast - it's over 400 lines long. There's a good reason for this, however, since it plays a very fundamental role in the Python object system. It's literally responsible for creating all Python types. I'll go over its functionality in major blocks, pasting short snippets of code where relevant.&lt;/p&gt;
&lt;p&gt;Let's start at the beginning. The signature of &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; is:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;static&lt;/span&gt; PyObject *
type_new(PyTypeObject *metatype, PyObject *args, PyObject *kwds)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;When called to create our class &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;, the arguments will be:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;metatype&lt;/tt&gt; - the metaclass, so it's &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; itself.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;args&lt;/tt&gt; - we saw in the description of BBC above that this is the class name, tuple of base classes and a namespace dict.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;kwds&lt;/tt&gt; - since &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; has no metaclass, this will be empty.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At this point, it may be useful &lt;a class="reference external" href="https://eli.thegreenplace.net/2011/08/14/python-metaclasses-by-example/"&gt;to recall that&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;class&lt;/span&gt; &lt;span style="color: #00007f"&gt;Joe&lt;/span&gt;:
  ... contents
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Is equivalent to:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;Joe = &lt;span style="color: #00007f"&gt;type&lt;/span&gt;(&lt;span style="color: #7f007f"&gt;&amp;#39;Joe&amp;#39;&lt;/span&gt;, (), &lt;span style="color: #00007f"&gt;dict&lt;/span&gt; of contents)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; serves both approaches, of course.&lt;/p&gt;
&lt;p&gt;It starts by handling the special 1-argument call of the &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; function, which returns the type of its argument. Then, it tries to see if the requested type has a metaclass that's more suitable than the one passed in. This is necessary to handle a direct call to &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; as shown above - if one of the bases has a metaclass, that metaclass should be used for the creation &lt;a class="footnote-reference" href="#id19" id="id9"&gt;[9]&lt;/a&gt;.&lt;/p&gt;
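&lt;p&gt;Both behaviors are visible from Python. In this small sketch &lt;tt class="docutils literal"&gt;Meta&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;Base&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;Derived&lt;/tt&gt; are illustrative names:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;class Meta(type):
    pass


class Base(metaclass=Meta):
    pass


# The 1-argument form just reports the type of its argument.
print(type(100.02).__name__)

# The 3-argument form is asked to use type as the metaclass, but one of
# the bases has a more derived metaclass, so that one is used instead.
Derived = type("Derived", (Base,), {"attr": 100.02})
print(type(Derived).__name__)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The second print shows &lt;tt class="docutils literal"&gt;Meta&lt;/tt&gt;, not &lt;tt class="docutils literal"&gt;type&lt;/tt&gt;.&lt;/p&gt;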
&lt;p&gt;Next, &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; handles some special class attributes (for example &lt;tt class="docutils literal"&gt;__slots__&lt;/tt&gt;).&lt;/p&gt;
&lt;p&gt;Finally, the type object itself is allocated and initialized. Since the &lt;a class="reference external" href="http://www.python.org/download/releases/2.2/descrintro/"&gt;unification of types and classes&lt;/a&gt; in Python, user-defined classes are represented similarly to built-in types inside the CPython VM. However, there's still a difference. Unlike built-in types (and new types exported by C extensions) which are statically allocated and are essentially &amp;quot;singletons&amp;quot;, user-defined classes have to be implemented by dynamically allocated type objects on the heap &lt;a class="footnote-reference" href="#id20" id="id10"&gt;[10]&lt;/a&gt;. For this purpose, &lt;tt class="docutils literal"&gt;Include/object.h&lt;/tt&gt; defines an &amp;quot;extended type object&amp;quot;, &lt;tt class="docutils literal"&gt;PyHeapTypeObject&lt;/tt&gt;. This struct starts with a &lt;tt class="docutils literal"&gt;PyTypeObject&lt;/tt&gt; member, so it can be passed around to Python C code expecting any normal type. The extra information it carries is used mainly for book-keeping in the type-handling code (&lt;tt class="docutils literal"&gt;Objects/typeobject.c&lt;/tt&gt;). &lt;tt class="docutils literal"&gt;PyHeapTypeObject&lt;/tt&gt; is an interesting type to discuss but would deserve an article of its own, so I'll stop right here.&lt;/p&gt;
&lt;p&gt;Just as an example of one of the special cases handled by &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; for members of new classes, let's look at &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt;.  The data model reference &lt;a class="reference external" href="http://docs.python.org/dev/reference/datamodel.html#object.__new__"&gt;says&lt;/a&gt; about it:&lt;/p&gt;
&lt;blockquote&gt;
Called to create a new instance of class cls. __new__() is a static method (special-cased so you need not declare it as such) that takes the class of which an instance was requested as its first argument.&lt;/blockquote&gt;
&lt;p&gt;It's interesting to see how this statement is embodied in the code of &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt;:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #007f00"&gt;/* Special-case __new__: if it&amp;#39;s a plain function,&lt;/span&gt;
&lt;span style="color: #007f00"&gt;   make it a static function */&lt;/span&gt;
tmp = _PyDict_GetItemId(dict, &amp;amp;PyId___new__);
&lt;span style="color: #00007f; font-weight: bold"&gt;if&lt;/span&gt; (tmp != &lt;span style="color: #00007f"&gt;NULL&lt;/span&gt; &amp;amp;&amp;amp; PyFunction_Check(tmp)) {
    tmp = PyStaticMethod_New(tmp);
    &lt;span style="color: #00007f; font-weight: bold"&gt;if&lt;/span&gt; (tmp == &lt;span style="color: #00007f"&gt;NULL&lt;/span&gt;)
        &lt;span style="color: #00007f; font-weight: bold"&gt;goto&lt;/span&gt; error;
    &lt;span style="color: #00007f; font-weight: bold"&gt;if&lt;/span&gt; (_PyDict_SetItemId(dict, &amp;amp;PyId___new__, tmp) &amp;lt; &lt;span style="color: #007f7f"&gt;0&lt;/span&gt;)
        &lt;span style="color: #00007f; font-weight: bold"&gt;goto&lt;/span&gt; error;
    Py_DECREF(tmp);
}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So when the dict of the new class has a &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; method, it's automatically replaced with a corresponding static method.&lt;/p&gt;
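&lt;p&gt;The effect is easy to see from Python by looking at the raw class dict, which bypasses the usual attribute lookup. In this short sketch the &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; method is only there for contrast:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;class Joe:
    def __new__(cls):
        # Written as a plain function, with no staticmethod decorator.
        return super().__new__(cls)

    def foo(self):
        return 100.02


# __new__ was wrapped when the class was created; foo was left alone.
print(type(Joe.__dict__["__new__"]).__name__)
print(type(Joe.__dict__["foo"]).__name__)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This prints &lt;tt class="docutils literal"&gt;staticmethod&lt;/tt&gt; for &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;function&lt;/tt&gt; for &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt;.&lt;/p&gt;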
&lt;p&gt;After some more handling of special cases, &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; returns the object representing the newly created type.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusion"&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;This has been a relatively dense article. If you got lost, don't despair. The important part to remember is the flow described in &amp;quot;A brief recap&amp;quot; - the rest of the article just explains the items in that list in more detail.&lt;/p&gt;
&lt;p&gt;The Python type system is very powerful, dynamic and flexible. Since this all has to be implemented in the low-level and type-rigid C, and at the same time be relatively efficient, the implementation is almost inevitably complex. If you're just writing Python code, you almost certainly don't have to be aware of all these details. However, if you're writing non-trivial C extensions, and/or hacking on CPython itself, understanding the contents of this article (at least on an approximate level) can be useful and educational.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Many thanks to Nick Coghlan for reviewing this article.&lt;/em&gt;&lt;/p&gt;
&lt;img class="align-center" src="https://eli.thegreenplace.net/images/hline.jpg" style="width: 320px; height: 5px;" /&gt;
&lt;table class="docutils footnote" frame="void" id="id11" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;If you're interested in the compilation part, &lt;a class="reference external" href="https://eli.thegreenplace.net/2010/06/30/python-internals-adding-a-new-statement-to-python/"&gt;this article&lt;/a&gt; provides a good overview.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id12" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;In the distant past, &lt;tt class="docutils literal"&gt;MAKE_FUNCTION&lt;/tt&gt; was the only instruction for creating function objects. When lexical scoping was added to Python, a second instruction was added for functions that capture variables from an enclosing scope - &lt;tt class="docutils literal"&gt;MAKE_CLOSURE&lt;/tt&gt;. &lt;tt class="docutils literal"&gt;MAKE_FUNCTION&lt;/tt&gt; is still used in all other cases, including plain functions like &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; in the class body disassembled above and, as here, the function that wraps the code of a &lt;em&gt;class&lt;/em&gt; body.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id13" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id3"&gt;[3]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;The other keyword arguments, if they exist, are passed to the metaclass when it's getting called.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id14" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id4"&gt;[4]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;You may find it educational to open the file &lt;tt class="docutils literal"&gt;Python/bltinmodule.c&lt;/tt&gt; from the Python source distribution and follow along.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id15" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id5"&gt;[5]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;There always is &lt;em&gt;some&lt;/em&gt; metaclass, because all classes eventually derive from &lt;tt class="docutils literal"&gt;object&lt;/tt&gt; whose metaclass is &lt;tt class="docutils literal"&gt;type&lt;/tt&gt;.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id16" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id6"&gt;[6]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;With the caveat that BBC also calls &lt;tt class="docutils literal"&gt;__prepare__&lt;/tt&gt;. For a closer equivalent, take a look at &lt;a class="reference external" href="http://docs.python.org/dev/library/types.html?highlight=new_class#types.new_class"&gt;types.new_class&lt;/a&gt;.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id17" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id7"&gt;[7]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;As I mentioned earlier, any callable can be specified as a metaclass. If the callable is a function and not a class, it's simply called as the last step of BBC - the rest of the discussion doesn't apply.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id18" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id8"&gt;[8]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;I've never encountered real-world Python code where a metaclass has a metaclass of its own. If you have, please let me know - I'm genuinely curious about the use cases for such a construct.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id19" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id9"&gt;[9]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;If you've noticed that this is a duplication of effort, you're right. BBC also computes the metaclass, but to handle the &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;type(...)&lt;/span&gt;&lt;/tt&gt; call, &lt;tt class="docutils literal"&gt;type_new&lt;/tt&gt; has to do this again. I think that creating new classes is a rare enough occurrence that the extra work done here doesn't count for much.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id20" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id10"&gt;[10]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Since they have to be garbage collected and fully deleted when no longer needed.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;

    </content><category term="misc"></category><category term="Python internals"></category></entry><entry><title>Python object creation sequence</title><link href="https://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence" rel="alternate"></link><published>2012-04-16T07:03:41-07:00</published><updated>2023-06-30T23:16:27-07:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2012-04-16:/2012/04/16/python-object-creation-sequence</id><summary type="html">
        &lt;p&gt;&lt;em&gt;[The Python version described in this article is 3.x]&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;This article aims to explore the process of creating new objects in Python. As I explained in &lt;a class="reference external" href="https://eli.thegreenplace.net/2012/03/23/python-internals-how-callables-work/"&gt;a previous article&lt;/a&gt;, object creation is just a special case of calling a callable. Consider this Python code:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;class&lt;/span&gt; &lt;span style="color: #00007f"&gt;Joe&lt;/span&gt;:
    &lt;span style="color: #00007f; font-weight: bold"&gt;pass&lt;/span&gt;

j = Joe …&lt;/pre&gt;&lt;/div&gt;</summary><content type="html">
        &lt;p&gt;&lt;em&gt;[The Python version described in this article is 3.x]&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;This article aims to explore the process of creating new objects in Python. As I explained in &lt;a class="reference external" href="https://eli.thegreenplace.net/2012/03/23/python-internals-how-callables-work/"&gt;a previous article&lt;/a&gt;, object creation is just a special case of calling a callable. Consider this Python code:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;class&lt;/span&gt; &lt;span style="color: #00007f"&gt;Joe&lt;/span&gt;:
    &lt;span style="color: #00007f; font-weight: bold"&gt;pass&lt;/span&gt;

j = Joe()
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;What happens when &lt;tt class="docutils literal"&gt;j = Joe()&lt;/tt&gt; is executed? Python sees it as a call to the callable &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;, and routes it to the internal function &lt;tt class="docutils literal"&gt;PyObject_Call&lt;/tt&gt;, with &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; passed as the first argument. &lt;tt class="docutils literal"&gt;PyObject_Call&lt;/tt&gt; looks at the type of its first argument to extract its &lt;tt class="docutils literal"&gt;tp_call&lt;/tt&gt; attribute.&lt;/p&gt;
&lt;p&gt;Now, what is the type of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;? Whenever we define a new Python class, unless we explicitly specify &lt;a class="reference external" href="https://eli.thegreenplace.net/2011/08/14/python-metaclasses-by-example/"&gt;a metaclass&lt;/a&gt; for it, its type is &lt;tt class="docutils literal"&gt;type&lt;/tt&gt;. Therefore, when &lt;tt class="docutils literal"&gt;PyObject_Call&lt;/tt&gt; attempts to look at the type of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;, it finds &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; and picks its &lt;tt class="docutils literal"&gt;tp_call&lt;/tt&gt; attribute. In other words, the function &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; in &lt;tt class="docutils literal"&gt;Objects/typeobject.c&lt;/tt&gt; is invoked &lt;a class="footnote-reference" href="#id5" id="id1"&gt;[1]&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This is an interesting function, and it's short, so I'll paste it here in full:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;static&lt;/span&gt; PyObject *
&lt;span style="color: #00007f"&gt;type_call&lt;/span&gt;(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    PyObject *obj;

    &lt;span style="color: #00007f; font-weight: bold"&gt;if&lt;/span&gt; (type-&amp;gt;tp_new == &lt;span style="color: #00007f"&gt;NULL&lt;/span&gt;) {
        PyErr_Format(PyExc_TypeError,
                     &lt;span style="color: #7f007f"&gt;&amp;quot;cannot create &amp;#39;%.100s&amp;#39; instances&amp;quot;&lt;/span&gt;,
                     type-&amp;gt;tp_name);
        &lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; &lt;span style="color: #00007f"&gt;NULL&lt;/span&gt;;
    }

    obj = type-&amp;gt;tp_new(type, args, kwds);
    &lt;span style="color: #00007f; font-weight: bold"&gt;if&lt;/span&gt; (obj != &lt;span style="color: #00007f"&gt;NULL&lt;/span&gt;) {
        &lt;span style="color: #007f00"&gt;/* Ugly exception: when the call was type(something),&lt;/span&gt;
&lt;span style="color: #007f00"&gt;           don&amp;#39;t call tp_init on the result. */&lt;/span&gt;
        &lt;span style="color: #00007f; font-weight: bold"&gt;if&lt;/span&gt; (type == &amp;amp;PyType_Type &amp;amp;&amp;amp;
            PyTuple_Check(args) &amp;amp;&amp;amp; PyTuple_GET_SIZE(args) == &lt;span style="color: #007f7f"&gt;1&lt;/span&gt; &amp;amp;&amp;amp;
            (kwds == &lt;span style="color: #00007f"&gt;NULL&lt;/span&gt; ||
             (PyDict_Check(kwds) &amp;amp;&amp;amp; PyDict_Size(kwds) == &lt;span style="color: #007f7f"&gt;0&lt;/span&gt;)))
            &lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; obj;
        &lt;span style="color: #007f00"&gt;/* If the returned object is not an instance of type,&lt;/span&gt;
&lt;span style="color: #007f00"&gt;           it won&amp;#39;t be initialized. */&lt;/span&gt;
        &lt;span style="color: #00007f; font-weight: bold"&gt;if&lt;/span&gt; (!PyType_IsSubtype(Py_TYPE(obj), type))
            &lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; obj;
        type = Py_TYPE(obj);
        &lt;span style="color: #00007f; font-weight: bold"&gt;if&lt;/span&gt; (type-&amp;gt;tp_init != &lt;span style="color: #00007f"&gt;NULL&lt;/span&gt; &amp;amp;&amp;amp;
            type-&amp;gt;tp_init(obj, args, kwds) &amp;lt; &lt;span style="color: #007f7f"&gt;0&lt;/span&gt;) {
            Py_DECREF(obj);
            obj = &lt;span style="color: #00007f"&gt;NULL&lt;/span&gt;;
        }
    }
    &lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; obj;
}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So what arguments is &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; being passed in our case? The first one is &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; itself - but how is it represented? Well, &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; is a class, so it's a &lt;em&gt;type&lt;/em&gt; (&lt;a class="reference external" href="https://eli.thegreenplace.net/2012/03/30/python-objects-types-classes-and-instances-a-glossary/"&gt;all classes are types in Python 3&lt;/a&gt;). Types are represented inside the CPython VM by &lt;tt class="docutils literal"&gt;PyTypeObject&lt;/tt&gt; objects &lt;a class="footnote-reference" href="#id6" id="id2"&gt;[2]&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;What &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; does is first call the &lt;tt class="docutils literal"&gt;tp_new&lt;/tt&gt; attribute of the given type. Then, it checks for a special case we can ignore for simplicity, makes sure &lt;tt class="docutils literal"&gt;tp_new&lt;/tt&gt; returned an object of the expected type, and then calls &lt;tt class="docutils literal"&gt;tp_init&lt;/tt&gt; on it. If an object of a different type was returned, it is returned as-is, without being initialized.&lt;/p&gt;
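&lt;p&gt;To make the control flow easier to follow, here is a rough Python rendering of the same logic. It is only a sketch: the name &lt;tt class="docutils literal"&gt;type_call_sketch&lt;/tt&gt; and the two helper classes are made up for illustration, the &lt;tt class="docutils literal"&gt;type(something)&lt;/tt&gt; special case is left out, and the error handling of the C code is not reproduced.&lt;/p&gt;

```python
def type_call_sketch(cls, *args, **kwargs):
    """Approximate Python version of type_call (illustrative only)."""
    # Step 1: ask the class to create the object.
    obj = cls.__new__(cls, *args, **kwargs)
    # Step 2: mirrors PyType_IsSubtype(Py_TYPE(obj), type);
    # objects of an unrelated type are returned without initialization.
    if not issubclass(type(obj), cls):
        return obj
    # Step 3: mirrors type = Py_TYPE(obj); type->tp_init(obj, args, kwds).
    type(obj).__init__(obj, *args, **kwargs)
    return obj


class Point:
    def __init__(self, x):
        self.x = x


class AlwaysInt:
    def __new__(cls, *args, **kwargs):
        return 42  # deliberately not an AlwaysInt instance

    def __init__(self):
        raise AssertionError('never reached')


p = type_call_sketch(Point, 3)
n = type_call_sketch(AlwaysInt)
print(p.x, n)  # 3 42
```

&lt;p&gt;&lt;tt class="docutils literal"&gt;AlwaysInt.__init__&lt;/tt&gt; never runs, because the object returned from &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; is an &lt;tt class="docutils literal"&gt;int&lt;/tt&gt; and fails the subtype check.&lt;/p&gt;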
&lt;p&gt;Translated to Python, what happens is this: if your class defines the &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; special method, it gets called first when a new instance of the class is created. This method has to return some object. Usually, this will be of the required type, but this doesn't have to be the case. Objects of the required type get &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt; invoked on them. Here's an example:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;class&lt;/span&gt; &lt;span style="color: #00007f"&gt;Joe&lt;/span&gt;:
    &lt;span style="color: #00007f; font-weight: bold"&gt;def&lt;/span&gt; &lt;span style="color: #00007f"&gt;__new__&lt;/span&gt;(cls, *args, **kwargs):
        obj = &lt;span style="color: #00007f"&gt;super&lt;/span&gt;(Joe, cls).__new__(cls)
        &lt;span style="color: #00007f; font-weight: bold"&gt;print&lt;/span&gt;(&lt;span style="color: #7f007f"&gt;&amp;#39;__new__ called. got new obj id=0x%x&amp;#39;&lt;/span&gt; % &lt;span style="color: #00007f"&gt;id&lt;/span&gt;(obj))
        &lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; obj

    &lt;span style="color: #00007f; font-weight: bold"&gt;def&lt;/span&gt; &lt;span style="color: #00007f"&gt;__init__&lt;/span&gt;(&lt;span style="color: #00007f"&gt;self&lt;/span&gt;, arg):
        &lt;span style="color: #00007f; font-weight: bold"&gt;print&lt;/span&gt;(&lt;span style="color: #7f007f"&gt;&amp;#39;__init__ called (self=0x%x) with arg=%s&amp;#39;&lt;/span&gt; % (&lt;span style="color: #00007f"&gt;id&lt;/span&gt;(&lt;span style="color: #00007f"&gt;self&lt;/span&gt;), arg))
        &lt;span style="color: #00007f"&gt;self&lt;/span&gt;.arg = arg

j = Joe(&lt;span style="color: #007f7f"&gt;12&lt;/span&gt;)
&lt;span style="color: #00007f; font-weight: bold"&gt;print&lt;/span&gt;(&lt;span style="color: #00007f"&gt;type&lt;/span&gt;(j))
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This prints:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;__new__ called. got new obj id=0x7f88e7218290
__init__ called (self=0x7f88e7218290) with arg=12
&amp;lt;class &amp;#39;__main__.Joe&amp;#39;&amp;gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;div class="section" id="customizing-the-sequence"&gt;
&lt;h3&gt;Customizing the sequence&lt;/h3&gt;
&lt;p&gt;As we saw above, since the type of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; is &lt;tt class="docutils literal"&gt;type&lt;/tt&gt;, the &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; function is invoked to define the creation sequence for &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; instances. This sequence can be changed by specifying a custom type for &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; - in other words, a metaclass. Let's modify the previous example to specify a custom metaclass for &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;class&lt;/span&gt; &lt;span style="color: #00007f"&gt;MetaJoe&lt;/span&gt;(&lt;span style="color: #00007f"&gt;type&lt;/span&gt;):
    &lt;span style="color: #00007f; font-weight: bold"&gt;def&lt;/span&gt; &lt;span style="color: #00007f"&gt;__call__&lt;/span&gt;(cls, *args, **kwargs):
        &lt;span style="color: #00007f; font-weight: bold"&gt;print&lt;/span&gt;(&lt;span style="color: #7f007f"&gt;&amp;#39;MetaJoe.__call__&amp;#39;&lt;/span&gt;)
        &lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; &lt;span style="color: #00007f"&gt;None&lt;/span&gt;

&lt;span style="color: #00007f; font-weight: bold"&gt;class&lt;/span&gt; &lt;span style="color: #00007f"&gt;Joe&lt;/span&gt;(metaclass=MetaJoe):
    &lt;span style="color: #00007f; font-weight: bold"&gt;def&lt;/span&gt; &lt;span style="color: #00007f"&gt;__new__&lt;/span&gt;(cls, *args, **kwargs):
        obj = &lt;span style="color: #00007f"&gt;super&lt;/span&gt;(Joe, cls).__new__(cls)
        &lt;span style="color: #00007f; font-weight: bold"&gt;print&lt;/span&gt;(&lt;span style="color: #7f007f"&gt;&amp;#39;__new__ called. got new obj id=0x%x&amp;#39;&lt;/span&gt; % &lt;span style="color: #00007f"&gt;id&lt;/span&gt;(obj))
        &lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; obj

    &lt;span style="color: #00007f; font-weight: bold"&gt;def&lt;/span&gt; &lt;span style="color: #00007f"&gt;__init__&lt;/span&gt;(&lt;span style="color: #00007f"&gt;self&lt;/span&gt;, arg):
        &lt;span style="color: #00007f; font-weight: bold"&gt;print&lt;/span&gt;(&lt;span style="color: #7f007f"&gt;&amp;#39;__init__ called (self=0x%x) with arg=%s&amp;#39;&lt;/span&gt; % (&lt;span style="color: #00007f"&gt;id&lt;/span&gt;(&lt;span style="color: #00007f"&gt;self&lt;/span&gt;), arg))
        &lt;span style="color: #00007f"&gt;self&lt;/span&gt;.arg = arg

j = Joe(&lt;span style="color: #007f7f"&gt;12&lt;/span&gt;)
&lt;span style="color: #00007f; font-weight: bold"&gt;print&lt;/span&gt;(&lt;span style="color: #00007f"&gt;type&lt;/span&gt;(j))
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So now the type of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; is not &lt;tt class="docutils literal"&gt;type&lt;/tt&gt;, but &lt;tt class="docutils literal"&gt;MetaJoe&lt;/tt&gt;. Consequently, when &lt;tt class="docutils literal"&gt;PyObject_Call&lt;/tt&gt; picks the call function to execute for &lt;tt class="docutils literal"&gt;j = Joe(12)&lt;/tt&gt;, it takes &lt;tt class="docutils literal"&gt;MetaJoe.__call__&lt;/tt&gt;. The latter prints a notice about itself and returns &lt;tt class="docutils literal"&gt;None&lt;/tt&gt;, so we don't expect the &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt; methods of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; to be called at all. Indeed, this is the outcome:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;MetaJoe.__call__
&amp;lt;class &amp;#39;NoneType&amp;#39;&amp;gt;
&lt;/pre&gt;&lt;/div&gt;
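&lt;p&gt;Returning &lt;tt class="docutils literal"&gt;None&lt;/tt&gt; is an extreme case. A more typical metaclass, sketched below, does some work of its own and then delegates to &lt;tt class="docutils literal"&gt;type.__call__&lt;/tt&gt; through &lt;tt class="docutils literal"&gt;super()&lt;/tt&gt;, so the usual &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt; sequence still runs. The name &lt;tt class="docutils literal"&gt;CountingMeta&lt;/tt&gt; and the &lt;tt class="docutils literal"&gt;created&lt;/tt&gt; attribute are hypothetical, chosen just for this example.&lt;/p&gt;

```python
class CountingMeta(type):
    def __call__(cls, *args, **kwargs):
        # Custom step: count how many instances were requested.
        cls.created = getattr(cls, 'created', 0) + 1
        # Delegate to type.__call__, which runs __new__ and then __init__.
        return super().__call__(*args, **kwargs)


class Joe(metaclass=CountingMeta):
    def __init__(self, arg):
        self.arg = arg


a = Joe(1)
b = Joe(2)
print(Joe.created, a.arg, b.arg)  # 2 1 2
```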
&lt;/div&gt;
&lt;div class="section" id="digging-deeper-tp-new"&gt;
&lt;h3&gt;Digging deeper - tp_new&lt;/h3&gt;
&lt;p&gt;Alright, so now we have a better understanding of the object creation sequence. One crucial piece of the puzzle is still missing, though. While we almost always define &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt; for our classes, defining &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; is rather rare &lt;a class="footnote-reference" href="#id7" id="id3"&gt;[3]&lt;/a&gt;. Yet a quick look at the code shows that &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; is the more fundamental of the two. This method is used to create a new object. It is called once and only once per instantiation. &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt;, on the other hand, receives an already constructed object; it may not be called at all, and it can also be called multiple times.&lt;/p&gt;
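&lt;p&gt;Both claims about &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt; are easy to check from Python. The following small sketch creates one object without ever initializing it, and initializes another one twice:&lt;/p&gt;

```python
class Joe:
    def __init__(self, arg):
        self.arg = arg


# Call __new__ directly: the object exists, but __init__ never ran.
raw = Joe.__new__(Joe)
print(hasattr(raw, 'arg'))  # False

# Call __init__ again on an existing object: no new object is created.
j = Joe(1)
same_id = id(j)
j.__init__(2)
print(j.arg, id(j) == same_id)  # 2 True
```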
&lt;p&gt;Consider again the original &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; from the beginning of the article, which does not define a custom &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; method. The &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; parameter passed to &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; is &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;, so &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;type-&amp;gt;tp_new&lt;/span&gt;&lt;/tt&gt; defers to the &lt;tt class="docutils literal"&gt;tp_new&lt;/tt&gt; slot of the base type. The base type of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; (&lt;a class="reference external" href="https://eli.thegreenplace.net/2012/04/03/the-fundamental-types-of-python-a-diagram/"&gt;and, ultimately, of all other Python types&lt;/a&gt;, except &lt;tt class="docutils literal"&gt;object&lt;/tt&gt; itself) is &lt;tt class="docutils literal"&gt;object&lt;/tt&gt;. The &lt;tt class="docutils literal"&gt;object.tp_new&lt;/tt&gt; slot is implemented in CPython by the &lt;tt class="docutils literal"&gt;object_new&lt;/tt&gt; function in &lt;tt class="docutils literal"&gt;Objects/typeobject.c&lt;/tt&gt;.&lt;/p&gt;
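&lt;p&gt;The slots themselves are not visible from Python, but the inheritance they reflect is. As a quick sanity check (a sketch, relying on the fact that looking up &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; on a class that doesn't define it finds the one stored on &lt;tt class="docutils literal"&gt;object&lt;/tt&gt;):&lt;/p&gt;

```python
class Joe:
    pass


print(Joe.__bases__ == (object,))     # True: object is the implicit base
print('__new__' in vars(Joe))         # False: Joe defines no __new__ itself
print(Joe.__new__ is object.__new__)  # True: the lookup lands on object
```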
&lt;p&gt;&lt;tt class="docutils literal"&gt;object_new&lt;/tt&gt; is actually very simple. It does some argument checking, verifies that the type we're trying to instantiate is not &lt;a class="reference external" href="http://docs.python.org/dev/library/abc.html"&gt;abstract&lt;/a&gt;, and then does this:&lt;/p&gt;
&lt;div class="highlight" style="background: #ffffff"&gt;&lt;pre style="line-height: 125%"&gt;&lt;span style="color: #00007f; font-weight: bold"&gt;return&lt;/span&gt; type-&amp;gt;tp_alloc(type, &lt;span style="color: #007f7f"&gt;0&lt;/span&gt;);
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;tt class="docutils literal"&gt;tp_alloc&lt;/tt&gt; is a low-level slot of the type object in CPython. It's not directly accessible from Python code, but should be familiar to C extension developers. A custom type defined in a C extension may override this slot to supply a custom memory allocation scheme for instances of itself. Most C extension types will, however, defer this allocation to the function &lt;tt class="docutils literal"&gt;PyType_GenericAlloc&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;This function is part of the public C API of CPython, and it also happens to be assigned to the &lt;tt class="docutils literal"&gt;tp_alloc&lt;/tt&gt; slot of &lt;tt class="docutils literal"&gt;object&lt;/tt&gt; (defined in &lt;tt class="docutils literal"&gt;Objects/typeobject.c&lt;/tt&gt;). It figures out how much memory the new object needs &lt;a class="footnote-reference" href="#id8" id="id4"&gt;[4]&lt;/a&gt;, allocates a memory chunk from CPython's memory allocator and initializes it all to zeros. It then initializes the bare essential &lt;tt class="docutils literal"&gt;PyObject&lt;/tt&gt; fields (type and reference count), does some GC bookkeeping and returns. The result is a freshly allocated instance.&lt;/p&gt;
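&lt;p&gt;The allocation itself can't be observed from Python, but the size information it works from can, to my knowledge, be inspected through the &lt;tt class="docutils literal"&gt;__basicsize__&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;__itemsize__&lt;/tt&gt; attributes of a type. The sketch below only prints them; the exact numbers depend on the platform and the CPython version, so no particular output is claimed. &lt;tt class="docutils literal"&gt;SlottedJoe&lt;/tt&gt; is a hypothetical class added for contrast.&lt;/p&gt;

```python
class Joe:
    pass


class SlottedJoe:
    # Each slot reserves room for one pointer inside every instance.
    __slots__ = ('a', 'b')


# Fixed part of an instance, in bytes, for three different types.
print(object.__basicsize__, Joe.__basicsize__, SlottedJoe.__basicsize__)

# Per-item size; 0 for types whose instances are not variable-sized.
print(Joe.__itemsize__)
```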
&lt;/div&gt;
&lt;div class="section" id="conclusion"&gt;
&lt;h3&gt;Conclusion&lt;/h3&gt;
&lt;p&gt;Lest we lose the forest for the trees, let's revisit the question this article began with. What happens when CPython executes &lt;tt class="docutils literal"&gt;j = Joe()&lt;/tt&gt;?&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Since &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; has no explicit metaclass, &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; is its type. So the &lt;tt class="docutils literal"&gt;tp_call&lt;/tt&gt; slot of &lt;tt class="docutils literal"&gt;type&lt;/tt&gt;, which is &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt;, is called.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; starts by calling the &lt;tt class="docutils literal"&gt;tp_new&lt;/tt&gt; slot of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;:&lt;ul&gt;
&lt;li&gt;Since &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; has no explicit base class, its base is &lt;tt class="docutils literal"&gt;object&lt;/tt&gt;. Therefore, &lt;tt class="docutils literal"&gt;object_new&lt;/tt&gt; is called.&lt;/li&gt;
&lt;li&gt;Since &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; is a Python-defined class, it has no custom &lt;tt class="docutils literal"&gt;tp_alloc&lt;/tt&gt; slot. Therefore, &lt;tt class="docutils literal"&gt;object_new&lt;/tt&gt; calls &lt;tt class="docutils literal"&gt;PyType_GenericAlloc&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;PyType_GenericAlloc&lt;/tt&gt; allocates and initializes a chunk of memory big enough to contain an instance of &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; then goes on and calls &lt;tt class="docutils literal"&gt;Joe.__init__&lt;/tt&gt; on the newly created object.&lt;ul&gt;
&lt;li&gt;Since &lt;tt class="docutils literal"&gt;Joe&lt;/tt&gt; does not define &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt;, its base's &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt; is called, which is &lt;tt class="docutils literal"&gt;object_init&lt;/tt&gt;.&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;object_init&lt;/tt&gt; does nothing.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The new object is returned from &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; and is bound to the name &lt;tt class="docutils literal"&gt;j&lt;/tt&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the vanilla flow for an object of a class that doesn't have a custom metaclass, doesn't have an explicit base class, and doesn't define its own &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;__init__&lt;/tt&gt; methods. However, this article should have made it quite clear where these custom capabilities plug in to modify the object creation sequence. As you can see, Python is amazingly flexible. Practically every single step of the process described above can be customized, even for user-defined types implemented in Python. Types implemented in a C extension can customize even more, such as the exact memory allocation strategy used to create instances of the type.&lt;/p&gt;
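&lt;p&gt;As a final sanity check, the vanilla flow can be replayed by hand from Python. This is only a sketch of the steps listed above; the allocation inside &lt;tt class="docutils literal"&gt;object.__new__&lt;/tt&gt; remains hidden in C.&lt;/p&gt;

```python
class Joe:
    pass


# The normal spelling.
j1 = Joe()

# What the call is routed to: the __call__ of Joe's type.
j2 = type.__call__(Joe)

# The two steps type_call performs, done manually.
j3 = object.__new__(Joe)   # creates the instance
object.__init__(j3)        # does nothing

print(type(j1) is type(j2) is type(j3) is Joe)  # True
```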
&lt;img class="align-center" src="https://eli.thegreenplace.net/images/hline.jpg" style="width: 320px; height: 5px;" /&gt;
&lt;table class="docutils footnote" frame="void" id="id5" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;The &lt;tt class="docutils literal"&gt;PyTypeObject&lt;/tt&gt; structure definition for &lt;tt class="docutils literal"&gt;type&lt;/tt&gt; is &lt;tt class="docutils literal"&gt;PyType_Type&lt;/tt&gt; in &lt;tt class="docutils literal"&gt;Objects/typeobject.c&lt;/tt&gt;. You can see there that &lt;tt class="docutils literal"&gt;type_call&lt;/tt&gt; is assigned to its &lt;tt class="docutils literal"&gt;tp_call&lt;/tt&gt; slot.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id6" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;A future article will show how this comes to be when a new class is created.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id7" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id3"&gt;[3]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Even when we do explicitly override &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt; in our classes, we almost certainly defer the actual object creation to the base's &lt;tt class="docutils literal"&gt;__new__&lt;/tt&gt;.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="id8" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
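&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#id4"&gt;[4]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;This information is stored in the type object itself, in the &lt;tt class="docutils literal"&gt;tp_basicsize&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;tp_itemsize&lt;/tt&gt; fields of its &lt;tt class="docutils literal"&gt;PyTypeObject&lt;/tt&gt;.&lt;/td&gt;&lt;/tr&gt;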
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;

    </content><category term="misc"></category><category term="Python internals"></category></entry></feed>