chump is currently a single-line assembler and disassembler. This is fine for now, when it is used to enter instructions on the fly into KMD. The next milestone will hopefully be the ability to assemble whole programs. This requires some additions to the system.
Firstly, the system will have to cope with labels rather than strict numbers. Numbers are easy to process because you can find and scan them quite easily, but labels, which might not have been defined yet, could be a little more tricky.
Forward labels (i.e. branches, loads etc. pointing to labels later in the code) are even more difficult. Many instruction sets have several branch types for different distances, and these consume different amounts of space. The first pass cannot know how far away the target is. Taking the worst-case strategy and reassembling individual lines in later passes is probably the best way to do this, but it does not get over the issue of instruction sets like ARM, where what matters is not the distance but the number of significant bits. This could lead to infinite loops.
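To make the ARM point concrete, here is a small Python sketch (not chump code) of the classic ARM data-processing immediate rule: a constant fits only if it is an 8-bit value rotated right by an even number of bits within 32 bits. The function name is made up for illustration.

```python
def arm_immediate_encoding(value):
    """Return (imm8, rot) if value fits an ARM data-processing immediate, else None.

    A value fits when it equals an 8-bit constant rotated right by an even
    amount (0, 2, ..., 30) inside a 32-bit word.
    """
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by rot undoes a rotate-right by rot.
        imm8 = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if imm8 < 0x100:
            return imm8, rot
    return None


print(arm_immediate_encoding(0xFF000000))  # a large value that still fits
print(arm_immediate_encoding(0x102))       # a small value that does not fit
```

So a label moving by a couple of bytes can flip an offset between encodable and not encodable in either direction, which is why shrinking an instruction elsewhere is not guaranteed to help.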
The first pass should recognise all labels and read all instructions into a list.
Pass two takes the instructions and assembles them. If a relative instruction needs an address that lies further forward, the worst-case size is taken. All relative instructions are marked as "to_be_reassembled".
Pass three reassembles each "to_be_reassembled" instruction, and when it finds a smaller version it replaces it (thus changing the address of later instructions and labels). After each optimisation, all relative instructions which look over the optimised instruction (forwards only) will be reassembled (again).
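The three passes can be sketched roughly as follows in Python. This is only an illustration under made-up assumptions, not how chump does or will do it: a toy instruction set where every instruction is 2 bytes, a branch `b label` has a 2-byte short form (offset from the branch's own address within -128..127) and a 4-byte long form, and a label is a line ending in `:`. For simplicity it starts every branch at the long size and rechecks all marked branches after each shrink, rather than only those that span the changed instruction.

```python
SHORT, LONG = 2, 4  # hypothetical sizes in bytes of the short and long branch forms


def parse(lines):
    """Pass one: recognise labels and read all instructions into a list."""
    instrs, labels = [], {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instrs)  # label refers to the next instruction
        else:
            op, _, arg = line.partition(" ")
            instrs.append({"op": op, "arg": arg.strip(),
                           "size": SHORT, "to_be_reassembled": False})
    return instrs, labels


def addresses(instrs):
    """Address of every instruction, plus the address just past the last one."""
    addrs, pc = [], 0
    for ins in instrs:
        addrs.append(pc)
        pc += ins["size"]
    addrs.append(pc)
    return addrs


def assemble(lines):
    instrs, labels = parse(lines)

    # Pass two: take the worst-case size for relative instructions and mark them.
    for ins in instrs:
        if ins["op"] == "b":
            ins["size"], ins["to_be_reassembled"] = LONG, True

    # Pass three: shrink marked instructions until nothing changes any more.
    changed = True
    while changed:
        changed = False
        addrs = addresses(instrs)
        for i, ins in enumerate(instrs):
            if ins["to_be_reassembled"] and ins["size"] == LONG:
                offset = addrs[labels[ins["arg"]]] - addrs[i]
                if -128 <= offset <= 127:
                    ins["size"] = SHORT
                    changed = True
                    break  # later addresses have moved, so recompute them

    addrs = addresses(instrs)
    listing = [(addrs[i], ins["op"], ins["arg"], ins["size"])
               for i, ins in enumerate(instrs)]
    return listing, {name: addrs[idx] for name, idx in labels.items()}


program = ["start:", "b end"] + ["nop"] * 60 + ["b start", "end:"]
listing, symbols = assemble(program)
print(symbols)
```

In this toy model sizes only ever shrink and shrinking never pushes a target further away, so the loop must stop. In the example the forward branch only fits the short form after the backward branch has been shrunk. The ARM-style case, where a changed offset can stop fitting, is not covered here.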
I haven't started thinking about macros or pre-assembler operations. There are obvious places (between passes 1 and 2) where they fit in.
There have been two main additions to chump. Firstly, it allows enumerations, to get rid of the laborious process of defining long consecutive lists like registers.
(enum 2 "r0" "r1" "r2" "r3")
rather than
(("r0")(OO)) (("r1")(OI)) (("r2")(IO)) (("r3")(II))
The second addition is the ability to have inline C-type structures in the assembler. It's nice to be able to have lines like:
while (r1 > r2) sub r1, r1, r2;
Or even larger structures like:
do {add r1, r1, #1; ldrb r0, [r2, r1];} while (r0 != #0)
These are easily defined in the chump code and generate several instructions. The disassembler can generate more C-like code, but this is normally turned off.
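Going back to the enum shorthand for a moment, its expansion could be sketched like this in Python. This is a guess at the behaviour, not chump's actual code: it assumes the number after `enum` is the field width in bits, that names are numbered consecutively from zero, and that `O` and `I` stand for the bits 0 and 1. The function names are made up.

```python
def expand_enum(width, *names):
    """Expand (enum width names...) into explicit (name, bit-pattern) pairs."""
    if len(names) > 2 ** width:
        raise ValueError("too many names for a field of this width")
    pairs = []
    for index, name in enumerate(names):
        bits = format(index, "0{}b".format(width))  # zero-padded binary string
        pairs.append((name, bits.replace("0", "O").replace("1", "I")))
    return pairs


def to_chump(pairs):
    """Render the pairs in the long-hand list form shown earlier."""
    return " ".join('(("{}")({}))'.format(name, bits) for name, bits in pairs)


print(to_chump(expand_enum(2, "r0", "r1", "r2", "r3")))
```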
Isn't Perl lovely? I had a little play with it the other day and created a nice link page. Unfortunately, coming back down to C is a little strange. I'm constantly thinking in high-level languages. I managed to convince someone to convince someone (not a mistake) to have a go at writing a good universal schematic capture package. It should be very useful.