Boost.Asio with Protocol Buffers code sample

March 20th, 2011 at 7:35 am

I recently implemented a mini-project in C++ to get acquainted with both the networking library Boost.Asio and Google’s serialization library, Protocol Buffers (protobuf). I’ve placed the code online.

The project implements a simple server that receives and answers GET/SET/COUNT queries with string keys and values. In other words, it’s an in-memory data-store mapping strings to strings, available to multiple clients simultaneously. Below are some of my impressions of the libraries.

Boost.Asio

The networking part of the project is implemented with Boost.Asio as an asynchronous server capable of serving many clients simultaneously. No threads are involved, only asynchronous callbacks. Asio is probably the most popular networking library for C++, and information about it is easy to come by online. Besides the pretty good official documentation, there’s this free book which I found very informative, as well as tons of tutorials and discussions of specific issues in mailing lists and StackOverflow, ready for your Google-fu when you need them.

Asio was relatively easy to learn and use. It comes with a ton of examples, and once you wrap your head around the main concept of asynchronous callbacks, it’s quite easy to find everything you need. It helped me to have a background in asynchronous processing, but I guess it’s not a must. After all, this model of programming is all the rage lately (Node.js, Redis and others), and a lot of information about it exists.

Protobuf

The serialization part is implemented with Protocol Buffers. Both requests and responses to the server are serialized into binary protobuf messages and sent over a socket. Some tweaking was required here, because protobuf is very low-level. The library only specifies how data is serialized; it doesn’t help with transmitting this data over the wire. In particular, the two main challenges were (1) being able to send multiple message types, and (2) encoding the messages to allow sending them on the socket.

Multiple message types

The problem, in brief, is: if you want to send different messages with different data to the server and have it know which message was sent, how is this achieved in protobuf?

The solution I used is from the Techniques documentation page: using "union types". My .proto file looks like this:

// The request has a type and then the relevant optional field is
// filled.
//
message Request {
    enum RequestType {
        GET_VALUE = 1;
        SET_VALUE = 2;
        COUNT_VALUES = 3;
    }

    required RequestType type = 1;

    message RequestGetValue {
        required string key = 1;
    }
    optional RequestGetValue request_get_value = 21;

    message RequestSetValue {
        required string key = 1;
        required string value = 2;
    }
    optional RequestSetValue request_set_value = 22;

    message RequestCountValues {

    }
    optional RequestCountValues request_count_values = 23;
}

The type field tells the recipient which one of the optional request_* fields to look at. Only those fields that were filled in actually take space in the serialized message, so this is an efficient way to encode multiple message types in a single message format.
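
To make the dispatch concrete, here is a minimal C++ sketch of how a recipient might act on the type field. It deliberately does not use the real protoc-generated classes: the structs, the has_* flags, the make_* helpers, is_consistent and handle_request are all hypothetical stand-ins invented for illustration, and the reply strings (including "ERROR") are assumptions rather than the project’s actual response format. The sketch also checks that the filled-in sub-message matches type, since nothing in the message format itself enforces that.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hand-written stand-ins for what protoc would generate from the .proto
// above (hypothetical; the real generated API uses accessor methods).
enum RequestType { GET_VALUE = 1, SET_VALUE = 2, COUNT_VALUES = 3 };

struct RequestGetValue { std::string key; };
struct RequestSetValue { std::string key; std::string value; };

struct Request {
    RequestType type;
    bool has_request_get_value;     // mimics optional-field presence
    RequestGetValue request_get_value;
    bool has_request_set_value;
    RequestSetValue request_set_value;
    bool has_request_count_values;  // the count sub-message has no fields
};

// Helpers that fill in `type` and the matching sub-message together.
Request make_get(const std::string& key) {
    Request r = Request();  // value-initialized: all flags false
    r.type = GET_VALUE;
    r.has_request_get_value = true;
    r.request_get_value.key = key;
    return r;
}

Request make_set(const std::string& key, const std::string& value) {
    Request r = Request();
    r.type = SET_VALUE;
    r.has_request_set_value = true;
    r.request_set_value.key = key;
    r.request_set_value.value = value;
    return r;
}

Request make_count() {
    Request r = Request();
    r.type = COUNT_VALUES;
    r.has_request_count_values = true;
    return r;
}

// True only if exactly one sub-message is present and it is the one
// that `type` names.
bool is_consistent(const Request& r) {
    const int filled = r.has_request_get_value + r.has_request_set_value +
                       r.has_request_count_values;
    if (filled != 1) return false;
    switch (r.type) {
        case GET_VALUE:    return r.has_request_get_value;
        case SET_VALUE:    return r.has_request_set_value;
        case COUNT_VALUES: return r.has_request_count_values;
    }
    return false;  // unknown type value
}

// Dispatch on `type` against an in-memory string-to-string store.
std::string handle_request(const Request& r,
                           std::map<std::string, std::string>& db) {
    if (!is_consistent(r)) return "ERROR";
    switch (r.type) {
        case GET_VALUE: {
            auto it = db.find(r.request_get_value.key);
            return it == db.end() ? std::string() : it->second;
        }
        case SET_VALUE:
            db[r.request_set_value.key] = r.request_set_value.value;
            return r.request_set_value.value;
        case COUNT_VALUES:
            return std::to_string(db.size());
    }
    assert(false && "unreachable: is_consistent rejects unknown types");
    return "ERROR";
}
```

With an empty store, handle_request(make_set("k", "v"), db) stores the pair and a later handle_request(make_get("k"), db) returns "v". A request built with make_get whose type is then overwritten with SET_VALUE is rejected by is_consistent.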

Sending messages over a socket

A while back, I presented the issue of Framing in serial communications. With sockets it’s not much different – you still have to "frame" your message on the socket to allow the recipient to know where it starts and where it ends.

In this project I used the "character count" (or "length prefix") technique. I take the message buffer produced by protobuf and prepend a fixed 4-byte big-endian integer to it, which specifies its length. When the server waits for a message, it first expects to receive 4 bytes, decodes the length of the rest of the message from them, and then expects to receive exactly that many bytes, which make up the message itself. This technique works very well and is quite commonly used.
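
The length-prefix scheme described above can be sketched in standard C++ roughly as follows. This is not the project’s actual code: frame_message, decode_header, extract_message and HEADER_SIZE are hypothetical names chosen for illustration, and the payload is assumed to be an already-serialized protobuf message held in a std::string. The receiving side is modeled as a byte buffer that accumulates whatever the socket delivers, since a single read may return only part of a frame.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

const std::size_t HEADER_SIZE = 4;  // fixed 4-byte big-endian length

// Prepend the payload length as a 4-byte big-endian integer.
// Assumes the payload is smaller than 4 GiB.
std::vector<std::uint8_t> frame_message(const std::string& payload) {
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::vector<std::uint8_t> out;
    out.reserve(HEADER_SIZE + payload.size());
    out.push_back(static_cast<std::uint8_t>((n >> 24) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((n >> 16) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((n >> 8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>(n & 0xFF));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Decode the payload length from the first 4 bytes of a buffer.
std::uint32_t decode_header(const std::vector<std::uint8_t>& buf) {
    assert(buf.size() >= HEADER_SIZE);
    return (static_cast<std::uint32_t>(buf[0]) << 24) |
           (static_cast<std::uint32_t>(buf[1]) << 16) |
           (static_cast<std::uint32_t>(buf[2]) << 8) |
            static_cast<std::uint32_t>(buf[3]);
}

// If `buf` holds at least one complete frame, move its payload into
// `payload`, drop the frame from `buf`, and return true. Otherwise
// leave both untouched and return false so the caller can read more.
bool extract_message(std::vector<std::uint8_t>& buf, std::string& payload) {
    if (buf.size() < HEADER_SIZE) return false;
    const std::size_t total = HEADER_SIZE + decode_header(buf);
    if (buf.size() < total) return false;
    payload.assign(buf.begin() + HEADER_SIZE, buf.begin() + total);
    buf.erase(buf.begin(), buf.begin() + total);
    return true;
}
```

For a 5-byte payload such as "hello", frame_message produces 9 bytes whose header is 00 00 00 05. Feeding the receiver only the first 6 of those bytes makes extract_message return false; once the remaining 3 arrive, it returns true and hands back the original payload.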

In general, protobuf is easy to use. It’s a shame the official documentation comes with very few examples, but all in all one can find the information one needs, since the docs are quite comprehensive. I really like the idea of code generation that protobuf employs: it’s the best way to enforce DRY and avoid writing repetitive code, especially when changes in the protocol are required. Additionally, protobuf has backends for multiple languages. I used this fact to implement a simple Python client that exercises the server (it’s part of the project code bundle). Only a couple of lines were required to pack and unpack the message in it; the rest is handled by protobuf-generated code.

So, here’s the link to the code once again (it’s also available from the Code page). If you have any questions / comments / insights about it, please let me know.

17 Responses to “Boost.Asio with Protocol Buffers code sample”

  1. Gerg Says:

    Your example actually highlights PB’s general issue, in that it supports streaming exceptionally poorly. IMHO, PB is designed for symmetrical, half-duplex, RPC-like protocols, and other uses suffer. One of its biggest faults is that the design leaves room for human error in otherwise automatically generated code. For example, without yet more work on your part, it’s possible to send a RequestGetValue but to specify the type as a RequestSetValue. Obviously it’s a bug, but it’s not reasonable that such a bug is even possible given that this is automatically generated code. It’s a bug which should not reasonably be allowed to happen, and regardless, it places a needlessly heavy burden on coders.

    A second huge failing of PB for generalized messaging is that it dictates your message structures. In order to avoid this silliness, I found that a generalized prefix header is ideal. So rather than just a length prefix, you have length plus msg type. This also has advantages if you require more complex networking and/or dispatch mechanisms. In doing so, you no longer require massive message unions for complex protocols and, even more powerfully, you can dispatch before parsing. The latter can be especially important for scalability and load-distribution schemes.

    Generally speaking, PB is a nice project for simple, half-duplex, fully symmetrical (request/reply) messaging, but it is less than ideal for any project which doesn’t fit Google’s myopic view of protocols.

  2. CY Says:

    I tried to compile the C++ code in the example, but got this error:

    db_server.cpp: In member function ‘void DbServer::DbServerImpl::start_accept()’:
    db_server.cpp:183: error: ‘class boost::asio::basic_socket_acceptor<boost::asio::ip::tcp, boost::asio::socket_acceptor_service >’ has no member named ‘io_service’

    Is there some specific flag I need to pass in or a specific version of boost I need?

  3. eliben Says:

    CY,

    I have Boost 1.40 installed (with the corresponding Boost.Asio library).

  4. CY Says:

    Hmm. Looks like something may have changed between 1.40 and 1.47… I’ll see if I can dig in and figure it out.

  5. CY Says:

    Ah hah – this might be it:

    From the Boost 1.47 ChangeLog:
    Removed the deprecated member functions named io_service(). The get_io_service() member functions should be used instead.

    db_server.cpp:183 changes from

    DbConnection::create(acceptor.io_service(), db);

    to

    DbConnection::create(acceptor.get_io_service(), db);

    and it builds :-) .

    Now to see if it runs… May need to cook up a C++ client.

  6. eliben Says:

    CY,

    Thanks for sharing the solution!

    The source code linked in this article comes with a simple Python client you can use to test that the thing works. Python is great for protocol buffer hacking.

  7. CY Says:

    Unfortunately, I don’t have the python bits of protobuf available at the moment – and anyway, I’ll need to be able to do a C++ client regardless, so it’s something I’ll have to figure out.

    I don’t suppose a C++ client already exists for this example? (e.g. has someone already done one?)

    Cheers, and thanks for a very helpful bit of code!

  8. eliben Says:

    CY,

    I’m not aware of a C++ client, since this is just a made-up protocol for the sake of a sample, although cooking one up using bits and pieces from the server shouldn’t be hard.

    That said, I must add that I find Python very useful when working on protobuf-based protocols. Adding Python support is simply a matter of installing the protobuf Python libs and passing an extra option to the protoc compiler. Then, it’s very easy to use Python for quick-n-dirty clients, testing, simulations, and so on. Much more rapid development cycle than with C++.

  9. Cau Says:

    hi,
    I have protobuf and asio on my Ubuntu system.
    A sample Boost program also runs successfully.
    When I try to build your code, I get the following error:

    >/Desktop/asio_protobuf_sample$ make
    protoc --cpp_out=. --python_out=. stringdb.proto
    g++ -c stringdb.pb.cc `pkg-config --cflags protobuf`
    g++ -c server_main.cpp
    g++ -c db_server.cpp
    g++ -o server_main server_main.o db_server.o stringdb.pb.o \
    `pkg-config --libs protobuf` -lboost_system
    /usr/bin/ld: cannot find -lboost_system
    collect2: ld returned 1 exit status
    make: *** [server_main] Error 1

    Can you please tell me what is wrong?

  10. eliben Says:

    Cau,

    I don’t know how Boost is installed on your system. The linker should know about the directory in which Boost’s libs reside – you can help it with -L if it doesn’t know.

    Another issue could be versions. This example works with Boost 1.40, where the system library has to be linked as a compiled lib (a lot of Boost code just needs headers). This may be different in earlier or later versions of Boost.

  11. Cau Says:

    hi,
    thanks for your reply. I came across a post online saying that I should be able to find the installed Boost libraries in the path /usr/include/boost/stage/lib. I can find the boost folder in /usr/include, but cannot find stage or lib anywhere in boost.

    The version of Boost I have is 1.48. So do you think the system library won’t need a lib in this version?

  12. eliben Says:

    Cau,

    I honestly don’t know. The Boost documentation should cover it, though.

  13. Cau Says:

    Thank you.. :)

  14. Maninder Says:

    What is the reason behind leaving 20 empty tags? For example:

    required RequestType type = 1;
    optional RequestGetValue request_get_value = 21;

    Why can you not write RequestGetValue as
    optional RequestGetValue request_get_value = 2; //as opposed to 21

  15. eliben Says:

    Maninder,

    It’s just a habit I have when allocating fixed numbering, to accommodate future changes and additions. If in the future I want to add a field that logically belongs to all requests, it will have numbering sequential to type and not after the optional request_ values. This is purely a matter of personal preference.

  16. hasbean Says:

    Thank you sir, this was a total life saver! It’s even more helpful since I’ve finally seen a proper example of how to use Boost’s shared pointers.

  17. Razvan Says:

    Maybe implicit parameter in constructor should be
    PackedMessage(MessagePointer msg=boost::make_shared())
    to allow it to parse from array into something which exists
