I decided to start with the simulator itself, the "Virtual Machine" that reads raw MIX byte code and executes it. I will do the MIXAL compiler later; its syntax is not as simple as it first seemed, so I'll probably use a real parser to process it.
So far, I've implemented reading a memory image from a file, the registers, some of the framework (utility functions, conversions between representations, etc.), and all LOAD and STORE operations. It's already about 500 lines of code.
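To give an idea of what a LOAD operation involves, here is a minimal Python sketch. It is not the simulator's actual code. It assumes a word is stored as a `(sign, bytes)` tuple with five bytes, and the names `decode_field` and `load_field` are made up for illustration. The field rules follow the MIX definition: the field specification is packed as F = 8L + R, the selected bytes are right-aligned in the register, and the sign is copied only when the field includes position 0.

```python
# Hypothetical representation: a MIX word is (sign, [b1, b2, b3, b4, b5]),
# where sign is +1 or -1. This layout is an assumption, not the author's.
WORD_BYTES = 5


def decode_field(f):
    """Split a packed field specification F = 8*L + R into (L, R)."""
    return divmod(f, 8)


def load_field(word, f=5):
    """Return the register contents a LOAD of `word` with field `f` would give.

    The default f=5 corresponds to the full field (0:5).
    """
    sign, data = word
    left, right = decode_field(f)
    if not 0 <= left <= right <= WORD_BYTES:
        raise ValueError(f"invalid field specification {f}")
    # The sign is part of the field only when L is 0; otherwise it is '+'.
    out_sign = sign if left == 0 else +1
    # Byte positions are 1-based; position 0 is the sign.
    first = max(left, 1)
    selected = list(data[first - 1:right])
    # Right-align the selected bytes, padding with zeros on the left.
    padded = [0] * (WORD_BYTES - len(selected)) + selected
    return (out_sign, padded)
```

For example, with the word `(-1, [1, 2, 3, 4, 5])`, the field (3:5) is packed as 8*3 + 5 = 29, and `load_field(word, 29)` yields `(1, [0, 0, 3, 4, 5])`. A STORE would do the reverse, overwriting only the selected positions of the memory word.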
About the input: I assume all memory is initialized to +0 and only update the words I read from the file. So there can be at most 4000 lines to read (one for each memory word). Each line is about 20-25 characters long, so the file can get pretty large (up to about 100 KB). This is the "readable" input form. For those who prefer compact data, I'll add another mode: reading from a tightly packed binary file.
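The readable form could be loaded along the lines of the following Python sketch. The line layout is an assumption made purely for illustration (`address sign b1 b2 b3 b4 b5`, for example `2000 - 01 02 03 04 05`), as are the function name `load_image` and the 64-value byte size. The actual file format is not specified here.

```python
MEMORY_SIZE = 4000  # MIX has 4000 memory words
BYTE_SIZE = 64      # assumption: binary MIX with 6-bit bytes


def load_image(lines):
    """Build a memory image from text lines in the assumed layout.

    Every word starts as +0; only the addresses named in the input change.
    """
    memory = [(+1, [0] * 5) for _ in range(MEMORY_SIZE)]
    for lineno, line in enumerate(lines, start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 7 or parts[1] not in ("+", "-"):
            raise ValueError(f"line {lineno}: malformed word: {line!r}")
        address = int(parts[0])
        data = [int(p) for p in parts[2:]]
        if not 0 <= address < MEMORY_SIZE:
            raise ValueError(f"line {lineno}: address {address} out of range")
        if any(not 0 <= b < BYTE_SIZE for b in data):
            raise ValueError(f"line {lineno}: byte value out of range")
        memory[address] = (+1 if parts[1] == "+" else -1, data)
    return memory
```

Since `load_image` accepts any iterable of strings, an open file object can be passed to it directly, e.g. `with open(path) as f: memory = load_image(f)`.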