23 Jan 2001 dhd   » (Master)

Cool. I got the Sphinx-II "continuous audio" module working with select(2) in Perl. So I can get rid of this horrible bit of code in my speech I/O framework, since I don't have to fork off a blocking process to do speech recognition anymore:

	# ARGH ... POE has messed with %SIG
	@SIG{keys %SIG} = ('DEFAULT') x keys %SIG;

The Sphinx-II "non-blocking" utterance processing interface is kind of broken... it processes only a single frame of data at a time, which is way less than the amount typically available to read from the audio device, and there's no explicit function to flush unprocessed frames (though you can just call uttproc_rawdata() with an empty buffer and blocking turned on :-). Fortunately, recognition is so much faster than real time that there's absolutely no reason to use non-blocking mode, even in a single-threaded program.
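
For what it's worth, the single-threaded loop this implies looks roughly like the sketch below: wait on the audio handle with select(2), slurp everything that is currently readable, and hand the whole chunk to the recognizer in one blocking call. The drain_ready() helper and the $process callback are made-up names for illustration. In a real program the callback would wrap uttproc_rawdata() with blocking on, and the handle would be the audio device rather than the pipe used here as a stand-in.

```perl
use strict;
use warnings;
use IO::Select;

# Read whatever is readable on $fh right now (waiting up to $timeout
# seconds for the first data) and pass it all to $process in one call.
# Returns the number of bytes handed over; 0 on timeout or EOF.
sub drain_ready {
    my ($fh, $timeout, $process) = @_;
    my $sel   = IO::Select->new($fh);
    my $chunk = '';
    while (my @ready = $sel->can_read($timeout)) {
        my $n = sysread($fh, my $buf, 4096);
        last unless $n;      # EOF or error: stop reading
        $chunk .= $buf;
        $timeout = 0;        # after the first read, only poll
    }
    $process->($chunk) if length $chunk;
    return length $chunk;
}

# Demo: a pipe stands in for the audio device (an assumption, so the
# snippet runs without any sound hardware).
pipe(my $rd, my $wr) or die "pipe: $!";
syswrite($wr, "x" x 10000);
my $seen = 0;
my $got  = drain_ready($rd, 1, sub { $seen += length $_[0] });
print "processed $got bytes in one call\n";
```

The point of gathering everything first is that the recognizer gets called once per wakeup instead of once per frame, which is fine as long as recognition stays faster than real time.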

Once the 0.3 release happens I will volunteer to take a hacksaw to all the redundant and poorly-designed interfaces in Sphinx-II, fix them up, and properly document them.

In other news, I'm learning POE incrementally ... I've just taken the first step towards using it as more than just a convenient wrapper for select(2). Namely, I've taken my random collection of states handling the Festival server, Sphinx, audio I/O, and "dialog management" (such as it is - currently this is just "Hello World" and repeating back what the user says), and split them up into multiple sessions. Soon I'll take a stab at packaging them up into actual components.

And I just discovered that the ALSA emulation of the OSS interface is not quite bug-compatible. In ALSA, select(2) on PCM devices (including /dev/dsp) works as expected. With the kernel OSS drivers, you have to call read(2) on /dev/dsp before you can select it for reading, and if you start writing to it, even if your sound card is capable of full-duplex, you will no longer be woken up on read. Total fucking brain damage. Sigh...
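
The workaround for the kernel OSS drivers can be sketched roughly as follows: do one read on the device up front, then go back to select(2) as usual. This is only a sketch under assumptions: prime_then_select() is a made-up helper name, the read size is arbitrary, and a pipe stands in for /dev/dsp so the snippet runs anywhere. It does nothing about the write-side problem.

```perl
use strict;
use warnings;
use IO::Select;

# Do one initial read on $fh (the step the kernel OSS drivers need before
# select will report the device readable), then return the data from that
# read together with an IO::Select object watching the handle.
sub prime_then_select {
    my ($fh, $size) = @_;
    my $n = sysread($fh, my $first, $size);
    die "initial read failed: $!" unless defined $n;
    return ($first, IO::Select->new($fh));
}

# Demo with a pipe in place of /dev/dsp (an assumption for illustration).
pipe(my $rd, my $wr) or die "pipe: $!";
syswrite($wr, "abcdef");
my ($first, $sel) = prime_then_select($rd, 4);
my @ready = $sel->can_read(0.5);   # the remaining bytes are still in the pipe
print "primed with '$first', ", scalar(@ready), " handle(s) readable\n";
```

With a real device you would open /dev/dsp for reading and pass that handle in instead. Don't throw away the data from the priming read, since it is real audio.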
