
Introduction

We would all like to write nice and clean code, with cute little algorithms and crystal-clear structure, without giving much thought to the often ugly topic of error handling.

But unfortunately in programming, perhaps more than in any other kind of engineering, the devil is in the details. The handling of errors and of irregular inputs and data usually requires more code than the straight-line algorithm for solving the problem itself. This is a regrettable but unavoidable artifact of our craft.

But wait, there's more. As difficult as error handling is on its own, coupling it with resource allocation and the need for robust deallocation makes it nothing short of a huge headache. Fortunately, in newer high-level languages this is less of a problem, at least for memory, because of automatic garbage collection. C++ also provides a tolerably robust solution in the form of RAII. But as the title states, here I'm concerned with C, which has neither exceptions nor destructors, so the issue is much more difficult.

In this article I will argue that the much-hated goto statement is a valuable tool for simplifying error-handling code in C.

A simple case

Here's a quote from the Wikipedia article on RAII:

C requires significant administrative code since it doesn't support exceptions, try-finally blocks, or RAII at all. A typical approach is to separate releasing of resources at the end of the function and jump there with gotos in the case of error. This way the cleanup code need not be duplicated.

The code sample the article shows is this:

int c_example()
{
    int ret = 0; // return value 0 is success
    FILE *f = fopen("logfile.txt", "w+");

    if (!f)
        return -1;

    if (fputs("hello logfile!", f) == EOF)
    {
        ret = -2;
        goto out;
    }

    // continue using the file resource
    // ...

    // Releasing resources (in reverse order)
out:
    if (fclose(f) == EOF)
        ret = -3;

    return ret;
}

Sure, by inverting the logical comparison, this can be rewritten without a goto as follows:

int c_example()
{
    int ret = 0; // return value 0 is success
    FILE *f = fopen("logfile.txt", "w+");

    if (!f)
        return -1;

    if (fputs("hello logfile!", f) != EOF)
    {
        // continue using the file resource
    }
    else
    {
        ret = -2;
    }

    if (fclose(f) == EOF)
        ret = -3;

    return ret;
}

Although we've gotten rid of the goto, IMHO this code isn't much cleaner. Note that all we've done is move the mainline code into a condition. Will we do that for every error condition the function encounters?

A thornier case

Now consider this snippet:

int foo(int bar)
{
    int return_value = 0;

    allocate_resources_1();

    if (!do_something(bar))
        goto error_1;

    allocate_resources_2();

    if (!init_stuff(bar))
        goto error_2;

    allocate_resources_3();

    if (!prepare_stuff(bar))
        goto error_3;

    return_value = do_the_thing(bar);

error_3:
    cleanup_3();
error_2:
    cleanup_2();
error_1:
    cleanup_1();
    return return_value;
}

How would you get rid of the goto here, without duplicating the cleanup code or complicating it considerably? Following the logic of our previous goto hunt, we could use nested conditions:

int foo(int bar)
{
    int return_value = 0;

    allocate_resources_1();

    if (do_something(bar))
    {
        allocate_resources_2();

        if (init_stuff(bar))
        {
            allocate_resources_3();

            if (prepare_stuff(bar))
            {
                return_value = do_the_thing(bar);
            }

            cleanup_3();
        }

        cleanup_2();
    }

    cleanup_1();

    return return_value;
}

But look where our mainline code is now - deep inside the nested conditions. And keep in mind this is still a simplified example - each of the allocations, checks and code chunks could be significantly larger. Does that really help readability?

No, goto is better here. It results in more readable code, because the operations the function performs are laid out in a logical order - error paths jump away to the cleanup section, while the mainline code goes on where it belongs. In the nested conditionals version, it's outright hard to find the main code, buried as it is inside the error checks.
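
To make this concrete, here's a minimal sketch of the same pattern with real resources (a file and a heap buffer). The function name read_first_line, the LINE_BUF_SIZE constant and the negative error codes are my own assumptions for illustration; they don't come from any library. Note how the labels appear in reverse order of acquisition, so each jump releases exactly what has been acquired so far:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_BUF_SIZE 256   /* hypothetical limit on the line length */

/* Hypothetical helper: stores a heap-allocated copy of the first line
 * of `path` in *out_line. Returns 0 on success, a negative code on
 * failure. On success the caller owns *out_line and must free() it. */
int read_first_line(const char *path, char **out_line)
{
    int ret = -1;
    FILE *f = NULL;
    char *buf = NULL;
    char *copy = NULL;

    f = fopen(path, "r");
    if (!f)
        goto out;               /* nothing acquired yet */

    buf = malloc(LINE_BUF_SIZE);
    if (!buf) {
        ret = -2;
        goto close_file;        /* only the file is open */
    }

    if (!fgets(buf, LINE_BUF_SIZE, f)) {
        ret = -3;
        goto free_buf;          /* file and buffer are both live */
    }

    copy = malloc(strlen(buf) + 1);
    if (!copy) {
        ret = -4;
        goto free_buf;
    }
    strcpy(copy, buf);
    *out_line = copy;
    ret = 0;                    /* success falls through the cleanup */

    /* Releasing resources (in reverse order of acquisition) */
free_buf:
    free(buf);
close_file:
    fclose(f);
out:
    return ret;
}
```

The success path and every failure path share the same cleanup code, and the mainline reads top to bottom without any extra nesting.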

By the way, there's an even more complex case with various resources presented here. Using goto for error handling is also a common idiom in the source code of the Linux kernel, so that code base contains lots of examples as well.

Additional valid uses

Besides the point made above, goto is also sometimes (though much less frequently) useful for breaking out of deeply nested loops. If your algorithm requires a nested for loop (say, 4 levels deep), and in the innermost loop you sometimes encounter a special case that should make you break out of all the loop levels, do use a goto. The alternative, creating an exit flag and checking it at every level on each iteration, requires more code, is uglier and harder to maintain, and can be slower too (nested loops tend to appear in tight algorithmic code that needs speed).
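
Here's a small sketch of that loop-exit use, with three levels of nesting to keep it short. The function find_in_grid and the grid dimensions NX, NY and NZ are hypothetical names I made up for the example:

```c
#define NX 3
#define NY 4
#define NZ 5

/* Hypothetical helper: searches a 3-D grid for `target`. Returns 1 and
 * stores the indices of the first match in *x, *y, *z if it is found;
 * returns 0 and leaves them untouched otherwise. */
int find_in_grid(int grid[NX][NY][NZ], int target, int *x, int *y, int *z)
{
    int found = 0;
    int i, j, k;

    for (i = 0; i < NX; i++) {
        for (j = 0; j < NY; j++) {
            for (k = 0; k < NZ; k++) {
                if (grid[i][j][k] == target) {
                    found = 1;
                    goto done;  /* leave all three loops at once */
                }
            }
        }
    }

done:
    if (found) {
        /* i, j, k still hold the position of the match */
        *x = i;
        *y = j;
        *z = k;
    }
    return found;
}
```

Without the goto, each of the two outer loops would need an extra test of found in its condition (or an extra break after the inner loop), and that grows with every added level.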

A note on C++

In C++ you don't need goto for clean error handling. Exceptions and adherence to RAII are much better for that.

Sources

Here are some interesting sources on this topic:

  1. A newsgroup discussion from comp.lang.c
  2. "Structured programming with go to statements" - an article by Donald Knuth (google it)
  3. This stackoverflow discussion.
  4. Proggit thread
  5. Chapter 2 of the Linux Device Drivers book
  6. Linux kernel mailing list discussion
  7. RAII in C
  8. Wikipedia entry on RAII