Older blog entries for dgatwood (starting at number 5)

Well, I never got around to trying the updated code. Other projects (non-software) got in the way. I'm working on recording a song I wrote about a year ago. I'm basically done recording at this point except for patching a few places here and there, but I'm still heavily editing. That's occupying every evening until I finish it. Once that's done, I'll get back to the Nubus driver.

I recently encountered a handful of old patches for the MkLinux kernel that I don't think I have seen before, though a couple of them looked very familiar. We're due for a new kernel release and a new Linux server release ASAP, in part because of a nasty bug in the 2.0.xx kernel when dealing with OpenSSH's privilege separation.

I'm hoping I can get the Nubus framework in place and working before the next release, even if I don't have time to actually bring up any drivers in it. Of course, the drivers are the easy part. It only takes about an hour to port PCI ethernet drivers from NetBSD. I'd expect the Nubus drivers to be even easier. We'll see. :-)

Latest status... after talking with the NetBSD guys, I came up with a massive code restructuring for the offending function that cuts the number of accesses down from hundreds to... umm... four one-byte reads per slot. Should shave seconds from NetBSD's boot, and hopefully it will make the MkLinux version actually work. I'll try it this weekend.

The VM issues turned out to be nastier than I thought. It would wedge consistently with the hardware mapped. I then tried leaving the RAM backing that chunk of address space, and it -still- hung. Hmm. Changed it to allocate memory in a wired fashion and the hang disappeared. Turned back on the mapping of hardware addresses over that space and the hang came back.

Hmm. Rebuilt the kernel as a debug build so I could see where things were going wrong in the path through the VM system, since clearly something wasn't getting set up correctly in the page tables.

Odd, the debug kernel goes right through the problem spot, eventually generating a panic from calling certain VM routines without being at splvm. D'oh! Wrapped the code with s=splvm() and splx(s).
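For anyone who hasn't seen the idiom, here's a rough user-space sketch of the wrap. The spl levels, the stub bodies, and the names vm_routine_requiring_splvm and map_slot_space are all stand-ins I made up for illustration; only the s=splvm() / splx(s) shape matches what went into the kernel.

```c
#include <assert.h>

/* Hypothetical user-space stand-ins for the kernel's spl primitives. */
typedef int spl_t;
enum { SPL_NONE = 0, SPL_VM = 5 };   /* arbitrary levels, for illustration only */

static spl_t current_spl = SPL_NONE;

/* Raise the priority level to at least SPL_VM; return the previous level. */
static spl_t splvm(void)
{
    spl_t old = current_spl;
    if (current_spl < SPL_VM)
        current_spl = SPL_VM;
    return old;
}

/* Restore a previously saved priority level. */
static void splx(spl_t s)
{
    current_spl = s;
}

/* Stand-in for a VM routine that insists on being called at splvm. */
static int vm_routines_called = 0;

static void vm_routine_requiring_splvm(void)
{
    assert(current_spl >= SPL_VM);   /* a debug kernel would panic here */
    vm_routines_called++;
}

/* The fix: bracket the VM calls with splvm() and splx(). */
static void map_slot_space(void)
{
    spl_t s = splvm();
    vm_routine_requiring_splvm();
    splx(s);
}
```

Calling map_slot_space() leaves the priority level exactly where it started, which is the whole point of saving s.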

At last count, it correctly probes empty slots. When it tries to probe a full slot, as best I can tell, it's wedging whenever it hits the byte lane where the actual Nubus declaration ROM lives. Probing different cards yields different, repeatable hangs.

What's odd is that it is successfully reading from a given address, then wedges on about the fifth or sixth access to that address. At this point, I'm starting to wonder if there's something wrong with the probe code port itself.

I've pinged the netbsd-mac68k mailing list to ask the developers if they've ever seen anything similar on the original platform. Failing that, I'm going to have to hand-compare the code to the original mac68k code, and possibly hand-compare it to the Linux nubus code as well to get a second perspective from code that is known to work on PPC machines to some extent.

Now getting back to drawing my comic strip for tomorrow....

For some reason, the memory mapping problem isn't behaving like it should. One of two things should have always happened: either a successful access or a machine check exception (which would be caught by the trap handler and my setjmp usage and treated as an unsuccessful probe). Instead, the kernel is just wedging. Strangest thing.
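To make the expected behavior concrete, here's a minimal user-space sketch of a setjmp-guarded probe. Everything in it is simulated: fake_bus_read8, fake_machine_check, the fake ROM contents, and the addresses are hypothetical stand-ins, not the real trap handler or the real probe code. It only shows the two outcomes the probe is supposed to have.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

static jmp_buf probe_env;
static volatile int probe_active = 0;

/* Stand-in for the trap handler: during a probe, unwind back to the prober. */
static void fake_machine_check(void)
{
    if (probe_active)
        longjmp(probe_env, 1);
}

/* Simulated bus: only a tiny "ROM" window responds; made-up contents. */
#define FAKE_ROM_BASE 0x1000u
static const uint8_t fake_rom[4] = { 0x5a, 0x93, 0x2b, 0xc7 };

static uint8_t fake_bus_read8(uint32_t addr)
{
    if (addr >= FAKE_ROM_BASE && addr - FAKE_ROM_BASE < sizeof fake_rom)
        return fake_rom[addr - FAKE_ROM_BASE];
    fake_machine_check();   /* does not return while a probe is active */
    return 0xff;            /* only reached outside a probe */
}

/* Returns 1 and stores the byte on success, 0 if the access faulted. */
static int probe_read8(uint32_t addr, uint8_t *out)
{
    assert(out != NULL);
    probe_active = 1;
    if (setjmp(probe_env) != 0) {
        /* We got here via longjmp from the fake trap handler. */
        probe_active = 0;
        return 0;
    }
    *out = fake_bus_read8(addr);
    probe_active = 0;
    return 1;
}
```

A read inside the window succeeds, a read outside it comes back as a failed probe. There is no third path in this model, which is why a silent wedge is so puzzling.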

So my code was actually asking the VM system for a range without backing so that it wouldn't leak memory, but I'd never seen code written quite like that before. However, on further digging, the PCI code did it just with a straight allocation followed by a pmap_map. Tried using -that exact code- and got the same results.

At this point, it seems that either the hardware is in some way failing to generate the correct exception, the exception isn't getting handled (because of interrupts being off?), or the hardware isn't configured correctly for probing. I'm going to try reconfiguring BART and see if that helps.... :-|

Wow, I'd only had an account for all of... maybe five minutes when I got my first cert. Nice to know I'm certifiable.

As of last night, I finished writing the additional VM support routines for mapping and unmapping I/O addresses into the kernel's address space as needed for Nubus support in MkLinux.

In the process, I found that there's a really nasty bug in the pmap system (the bottom half of the VM system, i.e. the part that directly manages page tables and the processor's MMU) in MkLinux. Basically, the specs require that a sync instruction (or sync and tlbsync in 604/604e-based designs) be issued prior to leaving the critical section after a TLB flush. In the MkLinux implementation, the sync and tlbsync instructions occur AFTER releasing the lock, leaving the potential for serious corruption problems in the kernel on SMP systems.
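The ordering problem is easier to see with a toy model. The sketch below just records the order of operations; pmap_lock, pmap_unlock, tlb_invalidate, and tlb_flush_barriers are hypothetical names, and the single "barrier" event stands in for the sync (plus tlbsync where needed) without claiming anything about the exact instruction sequence.

```c
#include <assert.h>
#include <string.h>

enum { MAX_EVENTS = 8 };
static const char *events[MAX_EVENTS];
static int n_events = 0;

static void record(const char *name)
{
    assert(n_events < MAX_EVENTS);
    events[n_events++] = name;
}

/* Hypothetical stand-ins; each one only logs that it was called. */
static void pmap_lock(void)          { record("lock"); }
static void pmap_unlock(void)        { record("unlock"); }
static void tlb_invalidate(void)     { record("tlbie"); }
static void tlb_flush_barriers(void) { record("barrier"); }

/* Position of an event in the log, or -1 if it never happened. */
static int index_of(const char *name)
{
    for (int i = 0; i < n_events; i++)
        if (strcmp(events[i], name) == 0)
            return i;
    return -1;
}

static void reset_log(void) { n_events = 0; }

/* The shape of the bug: barriers are issued after the lock is dropped. */
static void flush_buggy(void)
{
    pmap_lock();
    tlb_invalidate();
    pmap_unlock();
    tlb_flush_barriers();
}

/* The shape of the fix: barriers complete inside the critical section. */
static void flush_fixed(void)
{
    pmap_lock();
    tlb_invalidate();
    tlb_flush_barriers();
    pmap_unlock();
}
```

In the buggy ordering, another processor can take the lock in the window between "unlock" and "barrier", before the flush is known to have completed.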

The code basically calls a Mach VM routine to allocate a chunk of virtual address space large enough for the region, then calls pmap routines directly to add the mappings. It then obtains locked access to the PTEs for the region, marks them as WIMG_IO, and invalidates the TLB entries as it goes. It's the most horrible thing I've ever had to do in my life, and I feel dirty for having written it, but the VM system doesn't allow you to freely map and unmap hardware into the kernel's address space.
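The PTE-marking step looks roughly like the sketch below. This is a simplified user-space model, not the kernel code: fake_pte_t, fake_tlbie, mark_range_io, and FAKE_PAGE_SIZE are invented for illustration, there's no locking, and I'm assuming WIMG_IO means cache-inhibited plus guarded. The bit positions are the WIMG field in the lower word of a 32-bit PowerPC PTE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* WIMG bits in the lower word of a 32-bit PowerPC PTE. */
#define PTE_W 0x40u   /* write-through */
#define PTE_I 0x20u   /* caching inhibited */
#define PTE_M 0x10u   /* memory coherence */
#define PTE_G 0x08u   /* guarded */
#define PTE_WIMG_MASK (PTE_W | PTE_I | PTE_M | PTE_G)

/* Assumption: I/O space is mapped cache-inhibited and guarded. */
#define WIMG_IO_BITS (PTE_I | PTE_G)

#define FAKE_PAGE_SIZE 4096u

/* Simplified stand-in for a hardware page table entry. */
typedef struct {
    uint32_t upper;
    uint32_t lower;
} fake_pte_t;

/* Stand-in for a tlbie on one virtual address; just counts calls. */
static unsigned tlb_invalidations = 0;

static void fake_tlbie(uint32_t va)
{
    (void)va;
    tlb_invalidations++;
}

/* Rewrite the WIMG field of each PTE and invalidate its TLB entry as we go. */
static void mark_range_io(fake_pte_t *ptes, size_t npages, uint32_t va_base)
{
    assert(ptes != NULL);
    for (size_t i = 0; i < npages; i++) {
        ptes[i].lower = (ptes[i].lower & ~PTE_WIMG_MASK) | WIMG_IO_BITS;
        fake_tlbie(va_base + (uint32_t)(i * FAKE_PAGE_SIZE));
    }
}
```

The real version has to do all of this while holding the pmap lock and follow it with the proper barriers, which is where the feeling-dirty part comes in.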
