How many times have you written this code in C++:

char line[BUF_LEN];
vector<string> file_lines;

while (fgets(line, BUF_LEN, filehandle))
{
   file_lines.push_back(line);
}

My bet - a lot, if you work with C++. I know I have (note the use of fgets() - in my experience the C stdio library is noticeably faster than C++'s fstream).

Well, let me tell you something: this code is flawed. "No way, I've used it zillions of times and it works like a dream," you think... Sorry to disappoint you.

I'm now integrating the solution to a problem, and on the way I had to fight a big, scary, hairy bug. A bug having to do with the code above.

Now imagine you're not in the world of all-good, but in the world of hairy multi-process boundary cases.

Your loop reads the last line of the file, which doesn't end with '\n'. It's pushed into the vector. The loop comes back to read the next line, and you'd expect it to stop, right? Not always! What if another process wrote to the file while you were pushing the line onto the vector? The other process added a couple of characters, so you read those and push them onto the vector too. But that's wrong: in the file it's all one line (the last line wasn't terminated with '\n', recall), yet you pushed it in two parts, into two different vector elements!

Yeah, I know it's not something people usually think of, but it happens, and robust code must handle it. I had this problem in two places in my code, and each one needed a different solution, because of the way the line-reading loop was called and used:

  1. In one place, I don't care about the added characters, so I just check whether the line contains a '\n', and if it doesn't, I break out of the loop after pushing it into the vector.
  2. In another place, I do care about the added characters, so I keep a flag that records whether the last line was "partial" (no '\n'). If it was, I concatenate the new contents to the last line rather than pushing them into a separate location.
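
Here is a minimal sketch of how the two fixes might look. It is not the actual code from my project: the function names, the ends_with_newline() helper and the BUF_LEN value are made up for illustration, and both functions assume the caller passes in a FILE* that is already open for reading.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical buffer size; pick whatever suits your files.
constexpr int BUF_LEN = 4096;

// Hypothetical helper: true if a chunk returned by fgets() ends with '\n'.
static bool ends_with_newline(const std::string& chunk)
{
    return !chunk.empty() && chunk.back() == '\n';
}

// Fix 1: push the chunk, then stop reading if it had no '\n'.
// Anything appended to the file after that point is ignored.
std::vector<std::string> read_lines_stop_at_partial(FILE* filehandle)
{
    assert(filehandle != nullptr);
    char line[BUF_LEN];
    std::vector<std::string> file_lines;

    while (std::fgets(line, BUF_LEN, filehandle))
    {
        file_lines.push_back(line);
        if (!ends_with_newline(file_lines.back()))
            break;
    }
    return file_lines;
}

// Fix 2: remember whether the previous chunk was partial (no '\n').
// If it was, append the new chunk to it instead of starting a new entry.
std::vector<std::string> read_lines_join_partial(FILE* filehandle)
{
    assert(filehandle != nullptr);
    char line[BUF_LEN];
    std::vector<std::string> file_lines;
    bool last_was_partial = false;

    while (std::fgets(line, BUF_LEN, filehandle))
    {
        if (last_was_partial)
            file_lines.back() += line;   // same line in the file: join it
        else
            file_lines.push_back(line);  // a genuinely new line

        last_was_partial = !ends_with_newline(file_lines.back());
    }
    return file_lines;
}
```

One thing to keep in mind with this sketch: fgets() also returns a chunk without '\n' when a line is longer than BUF_LEN - 1 characters. With the first fix the loop would then stop early on such a line, while the second fix simply glues the pieces back together, so choose the buffer size accordingly.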