Rust extension traits, greppability and IDEs

Traits are a central feature of Rust, critical for its implementation of polymorphism; traits are used for both static (by serving as bounds for generic parameters) and dynamic (by having trait objects to serve as interfaces) polymorphism.

This post assumes some familiarity with traits and discusses only a specific aspect of them - how extension traits affect code readability. To learn the basics of traits in Rust, the official book is a good starting point.

Extension traits

This Rust RFC provides a good, short definition of extension traits:

Extension traits are a programming pattern that makes it possible to add methods to an existing type outside of the crate defining that type.

For example, here's a trait with a single method:

trait Magic {
    fn magic_num(&self) -> usize;
}

We can now implement the Magic trait for our types:

struct Foobar {
    name: String,
}

impl Magic for Foobar {
    fn magic_num(&self) -> usize {
        if self.name.is_empty() {
            2
        } else {
            33
        }
    }
}

Now a Foobar can be passed wherever a Magic is expected. Foobar is a custom type, but what's really interesting is that we can also implement Magic for any other type, including types that we did not define. Let's implement it for bool:

impl Magic for bool {
    fn magic_num(&self) -> usize {
        if *self {
            3
        } else {
            54
        }
    }
}

We can now write code like true.magic_num() and it will work! We've added a method to a built-in Rust type. Obviously, we can also implement this trait for types in the standard library; e.g.:

impl<T> Magic for Vec<T> {
    fn magic_num(&self) -> usize {
        if self.is_empty() {
            10
        } else {
            5
        }
    }
}
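To see the pieces working together, here is a minimal self-contained sketch that repeats the trait and the three impls from above. The describe helper and the main function are additions for illustration only; they show the same trait used for static dispatch (through a generic bound) and for dynamic dispatch (through trait objects).

```rust
trait Magic {
    fn magic_num(&self) -> usize;
}

struct Foobar {
    name: String,
}

impl Magic for Foobar {
    fn magic_num(&self) -> usize {
        if self.name.is_empty() { 2 } else { 33 }
    }
}

impl Magic for bool {
    fn magic_num(&self) -> usize {
        if *self { 3 } else { 54 }
    }
}

impl<T> Magic for Vec<T> {
    fn magic_num(&self) -> usize {
        if self.is_empty() { 10 } else { 5 }
    }
}

// Hypothetical helper (not part of the original example): accepts anything
// that implements Magic, resolved at compile time (static dispatch).
fn describe(m: &impl Magic) -> String {
    format!("magic number is {}", m.magic_num())
}

fn main() {
    let fb = Foobar { name: String::from("hello") };
    println!("{}", describe(&fb)); // magic number is 33
    println!("{}", describe(&true)); // magic number is 3
    println!("{}", describe(&Vec::<u8>::new())); // magic number is 10

    // Dynamic dispatch: unrelated types stored behind the same trait object.
    let items: Vec<Box<dyn Magic>> = vec![Box::new(false), Box::new(vec![1, 2, 3])];
    for item in &items {
        println!("{}", item.magic_num()); // 54, then 5
    }
}
```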

Extension traits in the wild

Extension traits aren't just a fringe feature; they are widely used in the Rust ecosystem.

One example is the popular serde crate, which includes code that serializes and deserializes data structures in multiple formats. One of the traits serde provides is serde::Serialize; once we import this trait and one of the concrete serializers serde provides, we can do stuff like [1]:

use serde::Serialize; // brings the serialize method into scope
let mut serializer = serde_json::Serializer::new(std::io::stdout());
185.serialize(&mut serializer).unwrap();

Importing serde::Serialize is critical for this code to work, even though we don't refer to Serialize anywhere in our code explicitly. Rust requires traits to be explicitly imported to imbue their methods onto existing types; otherwise it's hard to avoid naming collisions in case multiple traits from different crates provide the same methods.
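The scoping rule can be reproduced without any external crates. In the sketch below, the magic module is a hypothetical stand-in for a third-party crate, and the two functions differ only in whether the trait has been imported. Without the import, method-call syntax is unavailable, although naming the trait explicitly in the call still works.

```rust
// Hypothetical module standing in for an external crate.
mod magic {
    pub trait Magic {
        fn magic_num(&self) -> usize;
    }

    impl Magic for bool {
        fn magic_num(&self) -> usize {
            if *self { 3 } else { 54 }
        }
    }
}

fn without_import() -> usize {
    // Writing `true.magic_num()` here would not compile: the trait is
    // implemented for bool, but it is not in scope in this function.
    // Naming the trait explicitly in the call works regardless.
    magic::Magic::magic_num(&true)
}

fn with_import() -> usize {
    // Once the trait is imported, method-call syntax becomes available.
    use magic::Magic;
    true.magic_num()
}

fn main() {
    println!("{}", without_import()); // 3
    println!("{}", with_import()); // 3
}
```

Note that the second function never mentions Magic after the use line, which is exactly the situation in the serde snippet.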

Another example is the byteorder crate, which helps encode numbers into buffers with explicit length and endianness. To write some numbers into a vector byte-by-byte, we have to import the relevant trait and enum first, and then we can call the newly-added methods directly on a vector:

use byteorder::{LittleEndian, WriteBytesExt};

// ...

let mut wv = vec![];
wv.write_u16::<LittleEndian>(259).unwrap();
wv.write_u16::<LittleEndian>(517).unwrap();

The write_u16 method is part of the WriteBytesExt trait, and it's implemented on a Vec by the byteorder crate. To be more precise, it's automatically implemented on any type that implements the standard library's std::io::Write trait, which Vec<u8> does.
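This kind of "implemented for everything that implements Write" arrangement is usually done with a blanket impl. The following is a rough sketch of the pattern using only the standard library; WriteU16Ext and write_u16_le are made-up names for illustration, and this is not byteorder's actual source.

```rust
use std::io::{self, Write};

// Hypothetical extension trait; the method has a default body that relies
// only on what std::io::Write already provides.
trait WriteU16Ext: Write {
    /// Writes `n` as two bytes, least significant byte first.
    fn write_u16_le(&mut self, n: u16) -> io::Result<()> {
        self.write_all(&n.to_le_bytes())
    }
}

// Blanket impl: every type that implements Write gets the new method.
impl<W: Write + ?Sized> WriteU16Ext for W {}

fn main() {
    let mut wv: Vec<u8> = Vec::new();
    wv.write_u16_le(259).unwrap(); // 259 == 0x0103
    wv.write_u16_le(517).unwrap(); // 517 == 0x0205
    println!("{:?}", wv); // [3, 1, 5, 2]

    // The same method is available on any other writer, e.g. a sink.
    io::sink().write_u16_le(1).unwrap();
}
```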

Finally, let's look at rayon - a library for simplified data-parallelism. It provides magical iterators that have the same functionality as iter but compute their results in parallel, leveraging multiple CPU cores. The rayon documentation recommends importing the traits the crate injects as follows:

It is recommended that you import all of these traits at once by adding use rayon::prelude::* at the top of each module that uses Rayon methods.

Having imported it thus, we can proceed to use Rayon as follows:

let exps = vec![2, 4, 6, 12, 24];
let pows_of_two: Vec<_> = exps.par_iter().map(|n| 2_u64.pow(*n)).collect();

Note the par_iter, which replaces a regular iter. It's been magically implemented on a vector, as well as a bunch of other types that support iteration.

On greppability and code readability

All these uses of extension traits are pretty cool and useful, no doubt. But that's not the main point of my post. What I really want to discuss is how the general approach relates to code readability, which is in my mind one of the most important aspects of programming we should all be thinking about.

This Rust technique fails the greppability test; it's not a word I made up - google it! If it's not immediately apparent, greppability means the ability to explore a code base using textual search tools like grep, git grep, ripgrep, pss or what have you.

Suppose you encounter this piece of code in a project you're exploring:

let mut wv = vec![];
wv.write_u16::<LittleEndian>(259).unwrap();

"Interesting", you think, "I didn't know that Vec has a write_u16 method". You quickly check the documentation - indeed, it doesn't! So where is it coming from? You grep the project... nothing. It's nowhere in the imports. You examine the imports one by one, and notice the:

use byteorder::{LittleEndian, WriteBytesExt};

"Aha!", you say, "this imports LittleEndian, so maybe this has to do with the byteorder crate". You check the documentation of that crate and indeed, you find the write_u16 method there; phew.

With par_iter you're less lucky. Nothing in imports will catch your eye, unless you're already familiar with the rayon crate. If you're not, then use rayon::prelude::* won't ring much of a bell in relation to par_iter.
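The effect of a glob-imported prelude is easy to reproduce in miniature. In this sketch the fancy module, its prelude, and the DoubledExt trait are all hypothetical stand-ins (they do not come from rayon); the point is only that the name of the trait providing the method never appears in the code that calls it.

```rust
// Hypothetical module standing in for an external crate with a prelude.
mod fancy {
    pub trait DoubledExt {
        fn doubled(&self) -> Vec<i64>;
    }

    // Implemented on slices, so vectors pick it up through auto-deref.
    impl DoubledExt for [i64] {
        fn doubled(&self) -> Vec<i64> {
            self.iter().map(|n| n * 2).collect()
        }
    }

    pub mod prelude {
        pub use super::DoubledExt;
    }
}

// This glob is the only textual link between the call below and the trait.
use fancy::prelude::*;

fn main() {
    let exps: Vec<i64> = vec![2, 4, 6];
    // Grepping the calling code for "DoubledExt" finds nothing, and the
    // import line does not mention "doubled" either.
    println!("{:?}", exps.doubled()); // [4, 8, 12]
}
```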

Of course, you can just google the symbol and you'll find it. Or maybe you don't even understand what the problem is, because your IDE is perfectly familiar with these symbols and will gladly pop up their documentation when you hover over them.

IDEs and language servers

These days we have free, powerful and fast IDEs that make all of this a non-issue (looking at Visual Studio Code, of course). Coupled with smart language servers, these IDEs are as familiar with your code as the compiler; the language servers typically run a full front-end sequence on the code, ending up with type-checked ASTs cross-referenced with symbol tables that let them understand where each symbol is coming from, its type and so on. For Rust the language server is RLS (since superseded by rust-analyzer), for Go it's gopls; all popular languages have them these days [2].

It's entirely possible that using a language like Rust without a sophisticated IDE is madness, and I'm somewhat stuck in the past. But I have to say, I do lament the loss of greppability. There's something very universal about being able to understand a project using only grep and the official documentation.

In fact, for some languages it's likely that this has been the case for a long while already. Who in their right mind has the courage to tackle a Java project without an IDE? It's just that this wasn't always the case for systems programming languages, and Rust going this way makes me slightly sad. Or maybe I'm just too indoctrinated in Go at this point, where all symbol access happens as package.Symbol, packages are imported explicitly and there is no magic name injection anywhere (almost certainly by design).

I can't exactly put my finger on why this is bothering me; perhaps I'm just yelling at clouds here. While I'm at it, I should finally write that post about printf-based debugging...

[1]	Note that it could be simpler to use `serde_json`'s `to_string` function here, but I opted for the explicit serializer because I wanted to show how we invoke a new method on an integer literal.

[2]	Apparently, not all tooling has access to sophisticated language servers; for example, as far as I can tell GitHub source analysis won't be able to find where `write_u16` is coming from, and the same is true of Sourcegraph's default configuration (though I've been told this is in the works).