The problem

I ran into an interesting problem at work. I'm given a string (it's all in C), that contains a number (for example "12345"), which may be arbitrarily long. I must emit a string which is the same number in binary ("11000000111001" for my example).

At first it seems trivial - turning a number into a binary is a simple trick. But quickly the constraint that the number may be arbitrarily long blocks the trivial method. After all, we can't represent the number numerically in any type available in C (int, long, long long, all won't do, the number may be thousands of digits long).

There are BigInt libraries around, which allow taking care of huge numbers, but I feel that linking such a library to my project just for the sake of my seemingly simple task is an overkill. I was looking for as simple a solution as possible.

The solution

I got a hint from the "C Unleashed" book. There, they develop a BigInt library, and somewhere a comment notices that a long division by a single digit number is simple... Hmm - it got me thinking, well this is indeed simple. Consider the following paper and pen method:

You must divide a long number by a single digit number (say 2). You start from the left of the long number and go digit-by-digit. Write the division (integral part, e.g. 5/2 = 2), and if there's a carry (e.g. in 5/2 the carry is 1), add 10 to the next digit and go on. This way, we will eventually get the result. Here's an example:

I want to divide 12345 by 2. 1/2 = 0, carry = 1. So I write down "0" and go on to the next digit of 12345. It's 2, but there's a carry so make it 12. 12/2 = 6, no carry. Write down "6" (so far "06"), and go on. 3/2 = 1, carry 1. OK, write down "1". Go on = 14/2 = 7, no carry. Write down "7". 5/2 = 2, carry 1. Write down "2". All in all, we got "06172", and indeed 12345 / 2 = 6172 (with carry 1). The method works !

And now, when we have a way to divide the number by 2, we can employ the basic method of finding a binary representation:

Get N % 2, it's the LSB (luckily, mod 2 is trivial on numbers of any size, it depends only on the least significant digit - whether it's even or odd). Then N = N / 2, and repeat while N != 0. This will eventually give us the binary representation of N.

The implementation

As I mentioned earlier, this must be implemented in C. I haven't done hard-core C pointer hacking, so this seemed like a lot of fun. First, here is the long division function:

/* Note: in and out may be the same string,
   it will still work OK
*/
void longdiv2(const char* in, char* out)
{
    int carry = 0;

    while (*in)
    {
        int numerator = *in++ - '0';
        numerator += carry;

        carry = (numerator % 2) == 0 ? 0 : 10;
        *out++ = '0' + (numerator / 2);
    }

    *out = '\0';
}