Boost.Asio with Protocol Buffers code sample

Update (2016-03-12): I updated this sample by using the newly released gRPC library. Check out the new post.

I recently implemented a mini-project in C++ to get acquainted with both the Boost.Asio networking library and Google's Protocol Buffers (protobuf) serialization library. I've placed the code online.

The project implements a simple server that receives and answers GET/SET/COUNT queries with string keys and values. In other words, it's an in-memory data-store mapping strings to strings, available to multiple clients simultaneously. Below are some of my impressions of the libraries.

Boost.Asio

The networking part of the project is implemented with Boost.Asio as an asynchronous server capable of serving many clients simultaneously. No threads are involved - only asynchronous callbacks. Asio is probably the most popular networking library for C++ and information about it is easy to come by online. Besides the pretty good official documentation, there's this free book which I found very informative, as well as tons of tutorials and discussions of specific issues in mailing lists and StackOverflow, ready for your Google-fu when you need them.

Asio was relatively easy to learn and use. It comes with a ton of examples, and once you wrap your head around the main concept of asynchronous callbacks, it's quite easy to find everything you need. It helped me to have a background in asynchronous processing, but I guess it's not a must. After all, this model of programming is all the rage lately (Node.js, Redis and others) and a lot of information about it exists.

Protobuf

The serialization part is implemented with Protocol Buffers. Both requests and responses to the server are serialized into binary protobuf messages and sent over a socket. Some tweaking was required here, because protobuf is very low-level. The library only specifies how data is serialized - it doesn't help with transmitting this data over the wire. In particular, the two main challenges were (1) being able to send multiple message types, and (2) encoding the messages to allow sending them on the socket.

Multiple message types

The problem, in brief, is: if you want to send different messages with different data to the server and have it know which message was sent, how is this achieved in protobuf?

The solution I used is from the Techniques documentation page: using "union types". My .proto file looks like this:

// The request has a type and then the relevant optional field is
// filled.
//
message Request {
    enum RequestType {
        GET_VALUE = 1;
        SET_VALUE = 2;
        COUNT_VALUES = 3;
    }

    required RequestType type = 1;

    message RequestGetValue {
        required string key = 1;
    }
    optional RequestGetValue request_get_value = 21;

    message RequestSetValue {
        required string key = 1;
        required string value = 2;
    }
    optional RequestSetValue request_set_value = 22;

    message RequestCountValues {

    }
    optional RequestCountValues request_count_values = 23;
}

The type field tells the recipient which one of the optional request_* fields to look at. Only those fields that were filled in actually take space in the serialized message, so this is an efficient way to encode multiple message types in a single message format.
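To make the dispatch on the type field concrete, here is a minimal sketch of how a server might handle such a request. It does not use the protobuf-generated classes: Request, RequestGetValue and RequestSetValue below are hand-written stand-ins that only mirror the shape of the .proto above, and Reply, Store and handle_request are hypothetical names invented for this illustration, not taken from the project code.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hand-written stand-ins that mirror the .proto above. These are NOT the
// classes protoc generates; they only illustrate the "union type" idea.
enum RequestType { GET_VALUE = 1, SET_VALUE = 2, COUNT_VALUES = 3 };

struct RequestGetValue { std::string key; };
struct RequestSetValue { std::string key; std::string value; };

struct Request {
    RequestType type;
    bool has_request_get_value;   // true if the optional field was filled
    RequestGetValue request_get_value;
    bool has_request_set_value;   // true if the optional field was filled
    RequestSetValue request_set_value;

    explicit Request(RequestType t)
        : type(t), has_request_get_value(false), has_request_set_value(false) {}
};

// Hypothetical reply type, assumed for this sketch only.
struct Reply {
    bool ok;
    std::string value;
    std::size_t count;
};

typedef std::map<std::string, std::string> Store;

// Look at the type tag first, then read only the matching optional field.
Reply handle_request(Store& store, const Request& req) {
    Reply reply = {false, "", 0};
    switch (req.type) {
    case GET_VALUE: {
        if (!req.has_request_get_value) break;  // malformed request
        Store::const_iterator it = store.find(req.request_get_value.key);
        if (it != store.end()) {
            reply.ok = true;
            reply.value = it->second;
        }
        break;
    }
    case SET_VALUE:
        if (!req.has_request_set_value) break;  // malformed request
        store[req.request_set_value.key] = req.request_set_value.value;
        reply.ok = true;
        break;
    case COUNT_VALUES:
        reply.count = store.size();
        reply.ok = true;
        break;
    }
    return reply;
}
```

The handler checks that the field matching the type tag is present before touching it, because nothing in the message format itself forces the sender to fill the field that the tag announces.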

Sending messages over a socket

A while back, I presented the issue of Framing in serial communications. With sockets it's not much different - you still have to "frame" your message on the socket to allow the recipient to know where it starts and where it ends.

In this project I used the "character count" (or "length prefix") technique. I take the message buffer produced by protobuf and prepend a fixed 4-byte big-endian integer to it, which specifies its length. When the server waits for a message it first expects to receive 4 bytes, decodes the length of the rest of the message from them, and then expects to receive exactly that number of bytes, which make up the message itself. This technique works very well and is quite commonly used.
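The length-prefix scheme can be sketched in a few lines of standard C++. This is an illustration of the technique rather than the project's actual code: frame_message, decode_header and FrameDecoder are names made up for this example, and the payload is treated as an opaque byte string (in the real server it would be the serialized protobuf message, and the bytes would arrive through Asio reads).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Size of the length header: a fixed 4-byte big-endian integer.
const std::size_t kHeaderSize = 4;

// Prepend the 4-byte big-endian length to an already serialized message.
// Assumes the payload is shorter than 2^32 bytes.
std::string frame_message(const std::string& payload) {
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::string framed;
    framed.reserve(kHeaderSize + payload.size());
    framed.push_back(static_cast<char>((n >> 24) & 0xFF));
    framed.push_back(static_cast<char>((n >> 16) & 0xFF));
    framed.push_back(static_cast<char>((n >> 8) & 0xFF));
    framed.push_back(static_cast<char>(n & 0xFF));
    framed += payload;
    return framed;
}

// Decode the payload length from the first 4 bytes of `header`.
// The caller must make sure at least kHeaderSize bytes are available.
std::uint32_t decode_header(const std::string& header) {
    std::uint32_t n = 0;
    for (std::size_t i = 0; i < kHeaderSize; ++i) {
        n = (n << 8) | static_cast<unsigned char>(header[i]);
    }
    return n;
}

// Accumulates bytes as they arrive and yields complete payloads.
// A socket may deliver a message in pieces or several messages at once.
class FrameDecoder {
public:
    // Feed a chunk of received bytes; returns the payloads it completed.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> messages;
        while (buffer_.size() >= kHeaderSize) {
            const std::size_t len = decode_header(buffer_);
            if (buffer_.size() < kHeaderSize + len) break;  // body incomplete
            messages.push_back(buffer_.substr(kHeaderSize, len));
            buffer_.erase(0, kHeaderSize + len);
        }
        return messages;
    }

private:
    std::string buffer_;  // bytes received but not yet consumed
};
```

A sender would write the result of frame_message to the socket; a receiver would pass whatever bytes it reads to FrameDecoder::feed and parse each returned payload as one message. A production server would probably also reject lengths above some sane maximum before buffering the body.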

In general, protobuf is easy to use. It's a shame the official documentation comes with very few examples, but all in all you can find the information you need - the docs are quite comprehensive. I really like the idea of code generation that protobuf employs - it's the best way to enforce DRY and avoid writing repetitive code, especially when changes in the protocol are required. Additionally, protobuf has backends for multiple languages - I used this fact to implement a simple Python client that exercises the server (it's part of the project code bundle). Only a couple of lines were required to pack and unpack the message in it; the rest is handled by protobuf-generated code.

So, here's the link to the code once again. If you have any questions / comments / insights about it, please let me know.