Older blog entries for dhd (starting at number 98)

I have volunteered myself for a million little projects with unspecified goals and deadlines, and am consequently getting very little actual work done on any of them. I miss my old project manager...

On the bright side, I'm finally fiddling with the voice building tools in greater depth, since we're actually, like, building voices now. I went into the studio to record some unit selection prompts and I'm slowly working on a French diphone voice (if I ever get the diphone list generation and letter-to-sound stuff done...).

Arr. Once again I am in computer telephony hell. In the hopes of achieving reliable echo cancellation and full-duplex performance, we acquired a very expensive Dialogic PCI telephony card and the package of (stupid, proprietary, gross) Linux software support. Well, it appears that the software support ... doesn't. And the documentation ... doesn't.

And as far as I can tell, Dialogic doesn't actually want to support people who try to use their hardware. Their customer support e-mail directs you to a web forum or a pay-per-incident service. After paying $BIGNUM for this piece of crap board and proprietary crap software, the least they could do is write some bloody documentation for it that is good for more than toilet paper and give at least some complimentary e-mail support. Fuck you, Dialogic.

Surveying the landscape of computer telephony fills me with a deep feeling of helplessness and despair. All the software is broken, all the hardware is obscure as hell, it all costs unbelievable amounts of money, and it never seems to work correctly. And when it looks like I've found a board that is actually useful for my applications, it turns out that it only works on Windows NT. (Come on, people, you could at least support Solaris or some kind of semi-reasonable proprietary OS.)

It looks like the Quicknet/IXJ stuff, as quirky as it may be, is the best thing there is for Linux and open source telephony at the moment. I hereby take back all the bad things I said about them :-)

BEWARE!

Oh yeah! That's right! We released some speech stuff a few days ago. Please beat on it, etc. I'm mostly working on higher-level things at the moment (i.e. actual applications that use these modules) so these should be stable for a while.

In other news, Sphinx2 0.3 was put out with little fanfare. So you don't actually need to use the CVS version to compile Speech::Recognizer::SPX. Which reminds me, I should probably update the README.

I've already been asked if this stuff works on Windows; I wonder when someone will ask if it will be rewritten in Python...

I'm playing at rewriting the silence filtering library. Handling all the audio buffering and framing safely and efficiently is unfortunately trickier than it looks. Pointer arithmetic is HARD, let's go SHOPPING!
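As a rough illustration of the buffering-and-framing problem (not the library's actual code, which the entry does not show), here is a minimal sketch that accepts chunks of any size and hands back fixed-size, overlapping frames. The names `Framer`, `push`, and `frame_energy` are hypothetical, and the energy measure is just one plausible input for a silence decision.

```python
class Framer:
    """Accumulate arbitrary-sized chunks and emit fixed-size, overlapping frames."""

    def __init__(self, frame_size, frame_shift):
        if not 0 < frame_shift <= frame_size:
            raise ValueError("need 0 < frame_shift <= frame_size")
        self.frame_size = frame_size
        self.frame_shift = frame_shift
        self.pending = []  # samples not yet consumed by a complete frame

    def push(self, samples):
        """Add samples; return the list of complete frames now available."""
        self.pending.extend(samples)
        frames = []
        while len(self.pending) >= self.frame_size:
            frames.append(self.pending[:self.frame_size])
            # Drop only frame_shift samples so consecutive frames overlap.
            del self.pending[:self.frame_shift]
        return frames


def frame_energy(frame):
    """Mean squared amplitude of one frame (0 for an empty frame)."""
    return sum(s * s for s in frame) / len(frame) if frame else 0
```

The leftover samples stay in `pending` between calls, which is the part that is easy to get wrong with raw pointers: a read from the device rarely lines up with a frame boundary.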

Playing with the iPaq. Wrote some nice feature extraction code for FPU-less machines. Somewhat unsurprisingly, it's three times slower on my P3 laptop (but 80 times faster on the iPaq).
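The entry doesn't say how the FPU-less code works. One common approach on such machines is fixed-point arithmetic, sketched here for a pre-emphasis filter, a typical first step in speech feature extraction. The Q15 format, the 0.97 coefficient, and all function names are assumptions chosen for illustration.

```python
Q15_ONE = 1 << 15                      # 1.0 in Q15 fixed point
ALPHA_Q15 = round(0.97 * Q15_ONE)      # assumed pre-emphasis coefficient (31785)


def q15_mul(a, b):
    """Multiply a Q15 coefficient by an integer sample, rounding to nearest."""
    return (a * b + (1 << 14)) >> 15


def saturate16(v):
    """Clamp to the signed 16-bit range instead of wrapping around."""
    return max(-32768, min(32767, v))


def pre_emphasis(samples):
    """y[n] = x[n] - 0.97 * x[n-1], using integer operations only."""
    out = []
    prev = 0
    for x in samples:
        out.append(saturate16(x - q15_mul(ALPHA_Q15, prev)))
        prev = x
    return out
```

Every operation in the loop is an integer multiply, add, or shift, which is the point on a CPU with no floating-point unit. On a machine with a fast FPU the same tricks buy nothing and can cost time.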

Discovered a nasty bug in the iPaq audio driver, leading to a kernel oops. Predictably, select(2) was broken. Sent off patches. No reply. You'd think that a patch that changes a grand total of 4 lines, and fixes an oops, would get a bit of priority in people's mailboxes. You'd think wrong.

Oh the PLE133 is really lovely. So, to reduce costs, it seems that VIA decided it would be clever for it to only support PC133 memory, and not even all PC133 memory (it has to be CAS3).

I discovered this since I finally broke down and got a better motherboard for my home box and thought it prudent to reuse the old parts for a firewall box at work ... several sticks of SDRAM later and I finally have a working machine built around this thing which seems to be way too overpowered to serve as a firewall. <sigh>

What else ... well, various stuff. Having more fun with audio. I was thinking of presenting on speech and Perl at YAPC but I might just do audio and Perl as there is more than enough there to take up 45 minutes.

It really seems as if you cannot win with Linux audio drivers. If they don't support setting the fragment size, then you throttle with SNDCTL_DSP_GETOPTR (which is possibly a better idea anyway). But then you discover that lots of drivers don't support that either. Does anyone actually use this stuff for anything besides playing MP3s and Quake?

Made up a summary of all the issues with the telephony drivers and sent it off. Now I'm waiting for a reply, and have ended up debugging sound drivers instead. select(2) (and obviously poll(2)) breaks in interesting and different ways on different drivers. Somehow I am not surprised.

On a related subject, the VIA PLE133 chipset is really crap, and I suggest avoiding it. The integrated video won't do over 1024x768 without massive noise in the image, and the on-board audio uses one of those lame-o AC97 codecs that only does 48kHz. Suck.

Of course, if people who wrote sound applications actually knew that SNDCTL_DSP_SPEED returns a meaningful value in its argument (in particular the people who wrote the OSS backend for libao), that would also be helpful.
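A sketch of the check being asked for: request a rate, read back what the driver actually granted, and work out a conversion ratio when the two differ. `set_speed` is a hypothetical callable standing in for the `SNDCTL_DSP_SPEED` ioctl (which writes the granted rate back into its argument), and the tolerance value is an assumption.

```python
from fractions import Fraction


def open_rate(set_speed, requested_hz, tolerance=0.005):
    """Return (actual_hz, ratio); ratio is None when no resampling is needed.

    set_speed is a hypothetical stand-in for the SNDCTL_DSP_SPEED ioctl:
    it takes the requested rate and returns the rate the driver really set.
    """
    actual_hz = set_speed(requested_hz)
    drift = abs(actual_hz - requested_hz) / requested_hz
    if drift <= tolerance:
        return actual_hz, None
    # Output samples needed per input sample when converting to the device rate.
    return actual_hz, Fraction(actual_hz, requested_hz)
```

For a codec that only does 48 kHz, a request for 44100 Hz would come back as 48000 and the caller would learn it has to resample by 160/147 rather than play everything too fast.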

I guess I should probably just install ALSA on that box since its library will do all the necessary sample conversion. The kernel's VIA audio driver is really nice otherwise though, so I kind of fear what might happen.

In general, though, ALSA seems to have evolved to the point where it does a better job of OSS audio than the actual OSS modules. This is fairly impressive.

Also, pondering an enforced vacation from IRC, caffeine, or both. I find myself becoming more and more of a nasty, irritable, misanthropic, self-righteous bastard lately. Communication failures occur with frightening regularity.

20 Feb 2001 (updated 20 Feb 2001 at 02:08 UTC)

Further progress ... my telephony gunk is now able to call me up on my cell phone and annoy me. Ring detection when dialing out turns out to be surprisingly hard, though - it seems that there's no way to tell if someone has picked up the phone in the middle of a ring until we fail to detect the next one. I suspect that voice data will fool the ring-detection filter, too. Perhaps the suggestion to use the speech recognizer to detect rings and pick-up is not so far fetched after all.
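The "missing ring" logic can be sketched as a simple timeout on the ring cadence. This is only an illustration of the reasoning above, not the actual telephony code: the function name is hypothetical, and the 6-second period (a common North American ringback cadence) and the slack value are assumptions.

```python
def call_answered(ring_times, now, period=6.0, slack=1.5):
    """Guess that the call was picked up once the next ring is overdue.

    ring_times holds the start times (in seconds) of the rings detected so far.
    """
    if not ring_times:
        return False  # nothing heard yet, so there is no cadence to compare against
    return now - ring_times[-1] > period + slack
```

This makes the limitation concrete: the answer can only arrive a full period (plus slack) after the last ring, never at the moment of pick-up.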

There is also a fair amount of black magic involved in getting the IXJ card to play DTMF tones correctly; the duration of the tones is magic (180ms on and 45ms off are the magic numbers supplied in the SDK, and other values tend to fail randomly), and also, it seems necessary to pause for at least a hundred milliseconds or so after setting the device off-hook, or it will fail to report tone state appropriately, causing none of the tones to be played correctly at all. Ah well. I keep telling myself "we build voices, not IVR systems" and that excuses all this ad-hockery.
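To make the timing concrete, here is a software sketch of a DTMF sequence using the 180 ms on / 45 ms off figures quoted above. The IXJ card generates its tones through its own driver interface, so this only illustrates the signal and cadence. The frequency pairs are the standard DTMF assignments; the 8 kHz rate, the amplitude, and the function names are assumptions.

```python
import math

ROWS = (697, 770, 852, 941)          # low-group frequencies in Hz
COLS = (1209, 1336, 1477, 1633)      # high-group frequencies in Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_freqs(key):
    """Return the (low, high) frequency pair for one keypad character."""
    if len(key) == 1:
        for r, row in enumerate(KEYPAD):
            c = row.find(key)
            if c >= 0:
                return ROWS[r], COLS[c]
    raise ValueError("not a DTMF key: %r" % (key,))


def dtmf_samples(digits, rate=8000, on_ms=180, off_ms=45, amplitude=0.4):
    """Synthesize 16-bit samples: each tone for on_ms, then off_ms of silence."""
    on_n = rate * on_ms // 1000
    off_n = rate * off_ms // 1000
    out = []
    for key in digits:
        low, high = dtmf_freqs(key)
        for n in range(on_n):
            t = n / rate
            v = math.sin(2 * math.pi * low * t) + math.sin(2 * math.pi * high * t)
            # The sum of two sines peaks at 2, so halve it before scaling.
            out.append(int(amplitude * 0.5 * 32767 * v))
        out.extend([0] * off_n)
    return out
```

At 8 kHz each digit occupies 1440 samples of tone followed by 360 samples of silence.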

I'm gaining a certain amount of sympathy for the idea of retiring to the countryside to grow fruit trees and "dealing in units of time no shorter than a fortnight". Yeah, a Jargon File reference (I think). So shoot me. (I have to be careful about saying that in the US, I guess)

I managed to fry the motherboard in my home machine when transplanting it into a new case, resulting in much cursing, swearing, weeping, and gnashing of teeth. The 'whisper' power supply that I bought is, in fact, very quiet, though. So presuming I don't fry the replacement motherboard, I should finally have a machine suitable for leaving on all the time and hence running mail, web, music, and wireless stuff from.

Still waiting for CMU to make another Sphinx release, as well as releasing the training tools and so forth.

I must admit that there are some things I like quite a lot about Red Hat 7. Being able to enable and disable inetd services with chkconfig(8) is pretty swell. Having POP3 and IMAP over SSL configured by default is as well (though of course one must still upgrade stunnel to a non-vulnerable version). It's a bit frustrating when it takes longer to download and install all the urgently needed updates (over a T1, natch) than it does to install the distribution itself, though.

I'm still regretting that I failed to snag a Conectiva 6.0 CD at LinuxWorld. By all accounts it sounds like the best of the RPM-based systems yet; they seem to have their heads screwed on straight with regard to security (shipping BIND in a chroot jail by default, for instance) and upgradability (well, if their APT port is any good, at least).

That said, I'm still waiting for one of the RPM-based distributions to get rid of Sendmail as the default MTA in favour of Exim or Postfix... then I might actually consider using one on my own machines.

I actually sent a piece of snail-mail today that was not a parcel or rent cheque. Trying to hook up again with yet another old friend who has managed to avoid the rise of the Internet entirely, it seems. I keep half-heartedly searching people's names on Google wondering if they might have resurfaced on-line, but haven't had much luck.

Well, of course, after losing all hope, I finally get the damn phone server to work. Naturally, I spent a few hours looking over logfiles wondering why all the messages were apparently getting delivered in strange orders until I realized that some of my debug printf()s were going via stderr and others via stdout... Anyway, it is doing real full-duplex now, the network protocol bits work, etc., and I am happy.
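The ordering problem comes from buffering: when output is redirected to a file, stdout is normally block-buffered while stderr is not, so messages written to the two streams land in the log out of order. One way around it, sketched here with a hypothetical `debug` helper, is to send every debug message through a single stream and flush after each one.

```python
import sys


def debug(msg, stream=None):
    """Write one debug line to a single stream and flush it immediately."""
    if stream is None:
        stream = sys.stderr  # looked up at call time, so redirection still works
    stream.write(msg + "\n")
    stream.flush()
```

The C equivalent would be to use `fprintf(stderr, ...)` consistently, or to change the buffering mode of stdout with `setvbuf()`.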

Now all I have to do is adapt it to work with soundcards, which should be a lot easier since they do sane things like, duh, actually return 0 from read(2) and write(2) if their buffers are empty/full (respectively).

Also, fixed the configuration for my Edinburgh Speech Tools XS bindings so they will actually compile on other people's machines, which is the first step towards finding an actual use for them. And finally started building a website for the company ... so I'll have to take "I work for a company with no website" out of my web pages :(
