Advogato: How did you get started doing free software?
Christopher Montgomery: Programming had already been a hobby most
of my life. When I got
to college, I thought I was pretty hot stuff; I'd never been around
better programmers than I was. That changed drastically at MIT. I
didn't feel stupid (like all my HS guidance counsellors warned me), I
felt like I finally belonged. But it also became very clear
that I had so much to learn, I wasn't even on the map. My mentors
were free software hackers, and it only made sense that I became one
too.
Who are the mentors that influenced you the most?
A few instructors were inspirations, perhaps not personal mentors:
Hal Abelson, Jerry Sussman, Greg Papadopoulos. The people I learned
the most from daily around MIT were my own generation and a little
earlier. I'm a bit afraid to name names only because there were quite
a few, and I'll forget someone important...
Seth Finkelstein, Mike Bauer, Mark Eichin, Marc Horowitz, Greg Hudson
(whether he knows it or not ;-) and a lot of other people whom I'm
inadvertently insulting through omission... maybe I shouldn't have
started listing at all. Many of the folks I'm thinking of may or may
not have known more than me, but they certainly knew a lot I
didn't.
Some people probably know you from cdparanoia and MGM, while others are
more familiar with Vorbis. Why did
you write cdparanoia and why are you working on Vorbis?
Cdparanoia was a spinoff of the Ogg project many years ago; I
needed a reliable way to get long samples of CD audio for testing my
CODECs, and cdda2wav didn't work well on any cdrom drive I could
afford then (I was still a student at the time).
Half the reason I wrote Vorbis is simply because I've been working on
audio CODECs for about six years now, and I'm finally getting good at
it. However, my interest was beginning to wane a bit after a lot of
code and not much success.... then Fraunhofer decided to sue
everyone in sight. That's what galvanized Vorbis. I've been working on
it steadily since November 1998.
Vorbis is [approximately] the sixth generation of Ogg and the first
CODEC that I feel is ready to go forth and do battle in the streaming
arena. It's not enough to be Free and as good as MPEG. I have to be
Free and clearly better.
Could you briefly give an overview of Ogg? The patent situation with
Fraunhofer certainly has made clear the need for a totally free
CODEC. What kind of support have you been getting for this project?
Ogg is the name of the overall project. Vorbis is our first CODEC
(for lossy audio streaming), but we'll continue into video and
lossless audio, as well as 'metacontent', etc.
I've actually received more corporate interest and support than
I expected (formerly the Green Witch LLC of San Francisco, and now
iCast of Woburn, Massachusetts, sponsors Vorbis development). The
small and middle sized players in
the online download and streaming market are the ones being squeezed
by the big players that control the technology, so they're very
interested in Vorbis.
The Open Source community doesn't know much about Vorbis and I've not
been trying to attract attention; I wanted an alpha-grade release of
the Vorbis libraries, tools, and plugins before crowing too hard.
That may prove to be a mistake, but I have been worried about
attracting more attention than I'm ready for (with incomplete code),
or damaging our credibility with something that isn't ready.
Now, though, the code is nearly past the stage where you need to be a
hacker to get it going.
And in the meantime, you've moved from Massachusetts to the
California Bay Area. Is this just to make sure you're on a
different coast than your corporate sponsors?
I came out to the Bay area to get married, nothing more to it. My
wife doesn't have the freedom to move around so I came out here.
Metacontent? What do you mean by that?
'Content about the content'. Things like lyrics, album covers, any
additional data that isn't really part of the core content, but makes
sense to package along anyway.
What kind of licensing arrangements apply to the Vorbis protocol
spec and your implementation? Will the iCast product be free
software?
Vorbis, as I'm working on it, is a xiph.org product; iCast is a sponsor,
but they didn't buy us.
The Vorbis spec is free; the only right we retain over the spec is to
set it and certify compliance. I'd love to see third party
implementations of Vorbis with whatever software license the new
author sees fit to use.
The Vorbis libraries we're writing are all LGPL. The tools for
encode/decode are GPL.
So, how does Vorbis compare to MP3?
Right now, it compares well. I get a few emails a day from folks
who stumble across it, try it, and then tell me it's better.
I think that the current CVS mainline is as good as MP3.
This weekend I'm merging a month's worth of new work that
substantially improves it.
What are the main things you're doing differently?
MP3 and Vorbis aren't actually very similar except in very shallow
ways. Vorbis resembles TwinVQ a bit
more closely, but again, poking at it for more than a few minutes tells
you that the two have very little in common beyond some mathematics.
So, unfortunately, I don't have many shared landmarks between MP3 and
Vorbis to contrast directly.
Vorbis is FFT-based, while MP3 is not?
Vorbis does not subband, but uses an MDCT
directly. In this way, it
resembles TwinVQ. (I was going to grad school in Japan at the time
TwinVQ 1 was being developed, and I had the unbelievable luck of
taking a few classes from the professor who also headed the lab
developing it.)
Is TwinVQ proprietary or free?
Proprietary, but how much so is hard to tell. TwinVQ appears to be
more closed than MP3, but it's been suffering popularity problems. NTT
hasn't been very good at marketing it. It's had less scrutiny because
not many people are paying attention to it.
To elaborate on a previous statement: the frequency domain stage
of Vorbis uses an MDCT directly. Vorbis also provides for time domain
encoding.
Is that a feature unique to Vorbis?
Compared to TwinVQ and MPEG, yes. Compared to the history of audio
coding, no. And this is less finished than the rest of the CODEC and
likely to be released somewhat later. But for strongly non-tonal
audio, we also encode time features using wavelets.
That counts as 'experimental but very promising'. Don't expect to see
it in 1.0, but it would be a priority once the absolutely necessary
features are ready. At some point, you just have to back off a bit,
realize what you already have is excellent, and resist the urge to
spend another four months making it even better before 1.0 :-)
So you're basically combining the best techniques out there that
aren't patented up the wazoo, right?
It's hard to tell with the current patent climate; some very broad,
obviously invalid patents apply to audio and I've been worried in the
past that it doesn't really matter what's really patented; the
lawyers will find a way to sue you. However, there's now a corporate
war chest behind Vorbis and I'm moving ahead.
Speaking of which, when do you plan to release 1.0?
We were hoping for this Tuesday. I'm behind schedule. Not by much,
but it's still annoying. So we'll likely go ahead with the press
releases, update the overly out of date xiph.org Vorbis pages and
continue forth. At this point, Vorbis will benefit from attention;
the majority of the work left to be done doesn't involve wizardry.
What we have in CVS right now has a complete API and runs solidly.
It's missing features (and the stuff I'm committing now alters the
bitstream format incompatibly), but it stands up.
Excellent! I'm sure I'm not alone in looking forward to the
release. But with the huge momentum behind MP3, what chance do you
think Vorbis has to gain a foothold?
Someone always asks that in some form ;-) Forgive me for sounding
cynical and elitist, but the consumers use whatever the industry
settles on. The industry wants Vorbis because MPEG is currently
running itself as an exclusive club. As the industry decides to use
the higher quality Vorbis format for free rather than MP3 with its
steep licensing and royalties, the consumers will get behind it.
The Open Source community is a bit different; here I can stand on
technical merits.
Do you really think so? I'm amazed by the number of people who
still use bladeenc even though LAME is head and shoulders
above it.
Touche :-)
What I meant was that the Open Source community has a relatively much
higher percentage of technically literate early adopters. Taking the
OS community by storm doesn't assure widespread success elsewhere, but
a good project has a better chance there.
But aren't there some people who prefer digital audio to be a
closed club? After all, Vorbis doesn't have any copy protection,
region codes, or any of that.
Neither does the competition. The algorithmic equivalent of a wad
of gum in the keyhole is not real security, regardless of the press
releases that claim otherwise.
But really, this is the real can of worms.
I try to avoid political baggage and philosophical dogma, but I
fundamentally disagree with the amount of control the music industry
is trying to place over distribution. It is not realistic, it is not
practical, and moreover it just worries me.
Let me state for the record that I want the artists to get their
money, and much more so than the record companies do. The RIAA can shout piracy all they want,
but that isn't what it's about. It's about control, and only about
control.
Do you want the RIAA to have the ability to tell you that you may only
play your music on a single 'Walkman'? BTW, they have a deal with
Sony this year, and the price of that Walkman just went up.
Similar scenarios played out just this past month (albeit concerning
MPEG licensing, not the RIAA)! I don't have to construct a 'slippery
slope' argument, because we've already fallen down it. You, Joe
Consumer, are losing the very right to listen to music you've already
bought.
The last piece that completes the absurdity is that the protection
schemes the RIAA demands are impossible to implement securely. They'll
always end up cracked. Thus, the RIAA, DVD Forum and MPAA are trying
to make the act of reverse engineering these flimsy protections itself
a crime.
None of this makes sense, and I'm simply not going to participate in
the madness. It makes no engineering sense, it makes no legal sense,
and my conscience won't let me implement a feature that takes
reasonable rights away from people.
Now that Vorbis is close to being released, what kind of support
would you like to see from free software developers?
First, integrating it into their own projects. I've gone through
great pains to make sure the API is so easy it will make you cry. Of
course, I'm saying that before I actually document it ;-)
Secondly, Vorbis could use a few more steady contributors.
There are pieces of the distribution (the psychoacoustics,
signal processing, etc) that are deep wizardry... only a very few
people would be able to help. Thankfully, there are many more pieces
that any good hacker could wrap their mind around and make magic with.
I don't expect there to be much trouble attracting both kinds of
support now that we're turning off our 'go away and come back later'
field.
Many of the most visible open source projects fall into that "any
good hacker" category. In what ways does "deep wizardry" free software
differ?
A few of us at the Green Witch joked a few months back that
everyone's first program [in Open Source] used to be a mixer; now it's
an IRC client :-)
There are kinds of applications that are 'solved problems'. They use
simple blocks that have been done before. You only need to spend the
time and effort to put them together well. (Which is not a trivial
thing. Doing software well is to be commended).
Then there are the applications that 90% of programmers look at and
say 'I don't know how to do that' and so don't try it. Another 9%
will spend a month or two butting heads with the problem, then wander
away when it turns out to be much more difficult than expected.
Some of the silly fools keep tangling with the puzzle for ten
years.
You can learn a lot in ten years... but spending that long is an
accident. Any sane hacker would find an only mildly less sexy project
that takes 1/10th the time.
No one has written Vorbis yet because it's hard. If I'd had
any idea how hard and frustrating audio compression is at the time I
started, I wouldn't have. It was only supposed to be a small,
weekend-long hack as part of a more sexy package ;-)
But sometimes these hard things are needed. Should we just count on
a steady stream of non-sane hackers?
Well, in my case, I saw that MP3 could exist, so I knew it was
possible. In my case, I was an arrogant brat who didn't know better
but couldn't admit it. Perhaps counting on that motivation is more
practical :-)
Joel Becker of #gimp has just proposed registering theyallsuck.org
as a blanket site for reviewing IRC clients, mailers, window managers,
etc.
lol. I have been growing a pet peeve about people who write their
own version simply because they couldn't be bothered to learn the
software that currently exists. On the upside, the folks who do so
generally learn a lot in the process.
I come from a rabid BSD crowd, and they generally can't believe I'm
still hacking Linux. All I can manage to respond to most of the
arguments is, "Yes... but they're learning!" Which is the real
blessing of Linux. The other more 'elite' OS projects couldn't say
that.
The OSS community has grown by a lot, and we're still assimilating the
people who think that a few months with VB gives them what they need
to write another GIMP. They'll get
better. Of course, they'll be somewhat annoying in the meantime, but
most of them are good folks :-)
Do you think that this atmosphere of learning has anything to do
with the number of scarily smart people doing free software?
The current atmosphere is different than before mainly because of
the influx of new programmers. The scarily smart people have always
been here; perhaps there are more of them here now, but I really don't
know. I do know that I meet more scarily smart people in the
community all the time, and that makes me feel good enough.
Joy's Law says, "Wherever you work, most of the smart people in the
world work for somebody else." Do you think that quote really applies
to free software?
We're not necessarily working hard as much as we're doing what we
love. Of course 'scary smart' people may work very hard at having
fun :-) Seriously, I think and sincerely hope that 'us and them'
doesn't apply in OSS. People from other projects contribute to Ogg
and I contribute to other projects all the time.
When I'm working on code, "beating the competition" isn't what's on
my mind. Hackers get together with other hackers they like,
and art results. Who was working for who? Maybe it doesn't matter.
Are you actively working on the video CODEC side of Ogg?
Actively, now: no. I don't have the resources, and this falls into
the category of 'requires deep wizardry'. That doesn't mean that
folks don't play with it (and we do have code), but this does not
imply any sort of organized development of it yet.
"Folks" plural?
Yes, I slip in and out of 'me/we', but there is more to xiph.org
than just me :-)
Vorbis and the other CODECs have a few steady hackers, and about a
dozen casual contributors. That includes folks that are just in the
same sphere... like the LAME developers, Icecast developers, old
friends and so on.
Vorbis is 99% of the effort right now. The video CODEC sees
disorganized 'poke and twitch' coding. The Ogg umbrella libraries get
work now and then, but have been stable for a while.
The video project was a much easier project before you knew what
you were doing, then?
Oh, I didn't play with video until I knew how hard audio was.
Video is a bit easier than audio, but it's not trivial. I've had at
least two hackers bounce off the video project pretty hard so far this
year :-)
I'm surprised to hear that video is easier than audio. How so?
A couple of reasons; video compression is less mature than audio.
There are still new, simple things to try with it that have a good
chance of working. The brain also tolerates much more quality loss in
video than audio. And so on.
Audio has been studied to death for 80 years; some of the masking
papers I used for data in Vorbis are 40 years old. Video is much,
much younger.
Back to cdparanoia. What's the philosophy behind that project?
cdroms are perhaps the lowest margin piece of hardware in modern
PCs, and the media format doesn't lend itself well to accurate
extraction. Getting skip-free audio off a CD is much harder than it
looks.
Because of the CD format itself, or because the drives can't be
bothered to get it right?
Both. Most of the problem, though, is buggy drives.
Yep. Regular CD players don't seem to have too much problem with
it.
Sure they do. Pick up your discman and shake it. That simulates
your OS going off and doing something else for a while. The CD is like
a phonograph record; it's digital, but it actually is a spiral meant
to be read from start to finish. If the CD is interrupted, loses sync,
hits a bug, whatever... it just skips. Seeks, by the original CD
spec, need only be within about 75 sectors of the intended
destination. The data format doesn't provide for fine grained seeking
like on a data disc; the CD literally is incapable of going back to
the exact spot it lost track.
There's actually much more arcane detail involved; I have about four
pages on the subject in my cdparanoia FAQ that
go into extreme, boring detail about why it doesn't just work.
There are an astounding number of failure modes in reading an audio
disc. "It just is" even if it shouldn't be :-(
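To put the 75-sector seek tolerance mentioned above in perspective, here is a minimal sketch of the arithmetic in Python. The sector rate and sector size are the standard audio CD values rather than figures from the interview, and the function name is made up for illustration.

```python
# Standard audio CD (CD-DA) constants; assumed here, not stated in the interview.
SECTORS_PER_SECOND = 75
BYTES_PER_SECTOR = 2352   # raw audio bytes in one sector
BYTES_PER_FRAME = 4       # one stereo sample frame: 2 channels x 16 bits


def seek_slop(sectors):
    """Return how much audio a seek error of `sectors` sectors covers."""
    size_bytes = sectors * BYTES_PER_SECTOR
    return {
        "seconds": sectors / SECTORS_PER_SECOND,
        "sample_frames": size_bytes // BYTES_PER_FRAME,
        "bytes": size_bytes,
    }


if __name__ == "__main__":
    # The seek tolerance quoted in the interview.
    print(seek_slop(75))
```

A drive that lands anywhere within 75 sectors of the target can therefore be off by a full second of audio, or 44,100 stereo sample frames, which is why a ripper cannot assume that a re-seek resumes at the exact sample where the read was interrupted.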
What kind of cool new features are going to be in Paranoia IV?
Paranoia IV is mostly intended to be a) portable and b) automatic
(with meaningful error handling). For example, rather than 'unable to
open cdrom drive', it should say "you do not have /dev/sg support in
your kernel" or "I need read/write permission on /dev/sg6 to proceed."
I also need to
add features like index/subcode extraction and ECC support for drives
that have it.
But portability is the big issue.
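The kind of error reporting described above could look roughly like the following Python sketch. It is an illustration only, not Paranoia IV code: the function names are hypothetical, and treating a missing device node as a sign of missing /dev/sg support is an assumption.

```python
import errno
import os


def explain_open_failure(path, err):
    """Turn an OSError raised while opening `path` into a specific message."""
    if err.errno == errno.ENOENT:
        # Assumption: a missing device node suggests the driver is absent.
        return f"{path} does not exist; you may not have /dev/sg support in your kernel"
    if err.errno in (errno.EACCES, errno.EPERM):
        return f"I need read/write permission on {path} to proceed"
    # Otherwise pass on the system's own description instead of a generic failure.
    return f"unable to open {path}: {os.strerror(err.errno)}"


def open_device(path):
    """Open `path` read/write; return (fd, None) on success or (None, message)."""
    try:
        return os.open(path, os.O_RDWR), None
    except OSError as err:
        return None, explain_open_failure(path, err)


if __name__ == "__main__":
    # Device name taken from the example in the interview.
    fd, problem = open_device("/dev/sg6")
    if fd is None:
        print(problem)
    else:
        os.close(fd)
```

The point of the design is that the caller always receives a message naming the device and the likely remedy, rather than a bare "unable to open" failure.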
Do you spend all your waking time hacking, or are there things
you do away from the computer?
Right now, due to Vorbis, I spend all my waking time hacking, and I
can't wait for that to end. Mountain biking, ultimate frisbee,
badminton, hiking, hardcore strategy gaming... it occurs to me it's
been too long since I was actively singing. It's two years now since
my last Gilbert & Sullivan show and I haven't sung much since then.
...but you'll have to cut that list in half or more because my wife
will certainly have a longer list of her own to subject me to :-) To be
fair, those lists overlap.
I notice you use the phrase "open source software". Do you have a
position on the whole open source vs free software rhetorical
flamewar?
No, I use them interchangeably when the fine distinction doesn't
really matter. My mixed use is mostly motivated by a) lack of
alternative synonyms and b) desire not to sound like a robot.
Would you be willing to pose for nudenerds.com?
I haven't already? Well, dangit, who were those other people then?
Thanks very much for the interview, and best of luck with
the Vorbis release and your other projects.
Thank you :-) Best of luck with Advogato; I'm definitely hoping
for the best.