Today I reworked the generation of linemarkers. The output
of CPP has lines in it that look like
# 12 "foobar.h"
that tell the compiler where each chunk of the
preprocessed file came from. If you don't intend to
generate a preprocessed file, these are useless - you can
grab the info straight from CPP's data structures. But they
are generated deep down in the guts of the preprocessor and
you can't get rid of them...
...well, now you can. The "reader" library interface
doesn't generate them anymore, and there's a new "printer"
interface that sticks them in right before output.
Structurally, this is a good deal cleaner than what we
had. It works great too, except that it gets all the line
numbers wrong. This is not really the fault of the
"printer", but a bad interaction with a whole different area
of the code.
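
To make the split concrete, here is a minimal sketch of the idea, not the real library interface. It assumes the "reader" hands back plain (file, line, text) records and the "printer" adds a linemarker whenever the location jumps. The names print_with_linemarkers and resolve_origins are made up for illustration. The second function goes the other way, which is what a compiler reading the preprocessed file has to do.

```python
import re

# Matches a linemarker such as:  # 12 "foobar.h"
# (GCC may append numeric flags after the file name; they are ignored here.)
LINEMARKER = re.compile(r'^#\s+(\d+)\s+"([^"]*)"')


def print_with_linemarkers(records):
    """Hypothetical "printer": turn (file, line, text) records into output text.

    A linemarker is emitted only when a record's location is not the one
    that would follow naturally from the previous record.
    """
    out = []
    expected = None  # (file, line) the next record has if nothing jumped
    for file_name, line_no, text in records:
        if (file_name, line_no) != expected:
            out.append(f'# {line_no} "{file_name}"')
        out.append(text)
        expected = (file_name, line_no + 1)
    return "".join(line + "\n" for line in out)


def resolve_origins(preprocessed_text):
    """Recover (file, line, text) for every ordinary line of the output."""
    origins = []
    current_file, current_line = None, 1
    for text in preprocessed_text.splitlines():
        match = LINEMARKER.match(text)
        if match:
            # A linemarker describes the line that FOLLOWS it.
            current_line = int(match.group(1))
            current_file = match.group(2)
            continue
        origins.append((current_file, current_line, text))
        current_line += 1
    return origins


# Made-up input: main.c includes foobar.h on its second line.
records = [
    ("main.c", 1, "int a;"),
    ("foobar.h", 1, "int foo(void);"),
    ("foobar.h", 2, "int bar(void);"),
    ("main.c", 3, "int b;"),
]
output = print_with_linemarkers(records)
print(output, end="")
print(resolve_origins(output) == records)
```

Anything that only wants the records can skip the printer entirely, which is the point of moving linemarkers out of the reader.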
There's a special internal routine to scan directive
lines. Among other things, it refuses to scan past the end
of a line - or, more precisely, it refuses to consume the
newline that ends the line. The "printer" has to emit a linemarker
at the beginning of each #included file. It will
not get a chance to do so unless it's invoked before the
#include processor returns. But at that point, the
newline ending the #include line has not been
consumed. Therefore that newline will be counted twice.
It's even worse than that - the only place the "printer"
can get control can't distinguish between #include
and any other directive, which means every single directive
line will be counted twice for line-numbering purposes.
The fix is to make the directive scanner consume
newlines. That will be tricky. The various directive
handlers count on being able to read that newline multiple
times without getting messed up. That works fine when it's
never consumed, but making sure it's consumed exactly once
is harder.
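
A toy model of the double count, under loudly simplified assumptions: one line counter, a printer hook that bumps it before the directive handler returns, and a main loop that bumps it again when it finally consumes the newline. None of this is the real code; reported_line_numbers and its flag are hypothetical names, and the flag stands in for the proposed fix.

```python
def reported_line_numbers(lines, scanner_consumes_newline):
    """Return (reported line number, text) for each non-directive line.

    Toy model only. With scanner_consumes_newline=False the directive
    scanner stops in front of the newline, so the main loop sees it
    again and counts it a second time.
    """
    line = 1
    reported = []
    for text in lines:
        newline_pending = True
        if text.lstrip().startswith("#"):
            # The printer hook runs before the directive handler returns
            # and treats the directive line as finished: first count.
            line += 1
            if scanner_consumes_newline:
                # Proposed fix: the scanner eats the newline itself, so
                # the main loop never sees it.
                newline_pending = False
        else:
            reported.append((line, text))
        if newline_pending:
            # The main loop consumes the newline and counts it.
            line += 1
    return reported


# Made-up input; the true line numbers of the declarations are 1, 3 and 5.
source = ["int a;", "#define X 1", "int b;", "#undef X", "int c;"]

print(reported_line_numbers(source, scanner_consumes_newline=False))
# [(1, 'int a;'), (4, 'int b;'), (7, 'int c;')]
print(reported_line_numbers(source, scanner_consumes_newline=True))
# [(1, 'int a;'), (3, 'int b;'), (5, 'int c;')]
```

In the broken case the error grows by one for every directive seen so far. The model makes the fix look like a one-line change only because it has a single place that looks at the newline; the real handlers read it from several places.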