Upon reading an article
in The Economist regarding machine translation, I began to see that
there might be some potential for open collaboration in the field of
what is known as "Translation Memory": in other words, the reuse of
already translated phrases to translate new, slightly different phrases.
I haven't done professional translations for a couple of years, and I
never did anything that expensive or high-level, so I don't know what
the state of the art is, even in the free software world. There appear
to be some efforts to pool resources, such as the KBabel
features listed here:
http://i18n.kde.org/translation-howto/gui-specialized-apps.html
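To make the translation memory idea concrete, here is a minimal sketch of
a fuzzy lookup against previously translated phrases. This is not how
KBabel or any other tool does it; the MEMORY contents, the best_match
name and the 0.7 threshold are all made up for illustration, and the
similarity score simply comes from Python's standard difflib module.

```python
from difflib import SequenceMatcher

# Hypothetical memory: source phrase -> stored translation (English -> Italian).
MEMORY = {
    "Open the file": "Apri il file",
    "Save the file": "Salva il file",
    "Close the window": "Chiudi la finestra",
}


def best_match(phrase, memory, threshold=0.7):
    """Return (source, translation, score) for the closest stored phrase.

    Returns None when no stored phrase reaches the (arbitrary) threshold.
    """
    best = None
    for source, translation in memory.items():
        # ratio() is 1.0 for identical strings and 0.0 for nothing in common.
        score = SequenceMatcher(None, phrase.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best


if __name__ == "__main__":
    # A slightly different phrase should still find the stored one.
    print(best_match("Open the files", MEMORY))
```

The translator would still have to adapt the suggestion by hand; the
memory only saves retyping the part that has not changed.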
I wonder, though, if there might be even more benefits to be had by
pooling resources even further, maybe putting the phrases in the
'compendium' in the public domain, so that licensing wouldn't be a
problem. Perhaps it could also be combined with a tag or trust metric
system, so that each translator can choose whose previously translated
phrases to rely on when munging documentation.
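As a rough sketch of how a shared compendium and a trust metric might fit
together: each entry records who contributed it, and the translator only
sees suggestions from contributors they trust enough. Everything here
(the Entry type, the contributor names, the trust scores and the 0.5
cutoff) is hypothetical, and a real trust metric would compute the scores
rather than have them typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    source: str        # phrase in the original language
    translation: str   # contributed translation
    contributor: str   # who added it to the shared compendium


# Hypothetical pooled compendium with competing translations.
COMPENDIUM = [
    Entry("Save the file", "Salva il file", "alice"),
    Entry("Save the file", "Salvare il file", "bob"),
    Entry("Save the file", "Salva file", "mallory"),
]

# Hypothetical per-translator trust scores between 0 and 1.
TRUST = {"alice": 0.9, "bob": 0.6, "mallory": 0.1}


def suggestions(phrase, compendium, trust, min_trust=0.5):
    """Return translations of phrase from trusted contributors, most trusted first."""
    candidates = [
        e for e in compendium
        if e.source == phrase and trust.get(e.contributor, 0.0) >= min_trust
    ]
    # Unknown contributors count as untrusted (score 0.0).
    candidates.sort(key=lambda e: trust.get(e.contributor, 0.0), reverse=True)
    return [e.translation for e in candidates]


if __name__ == "__main__":
    print(suggestions("Save the file", COMPENDIUM, TRUST))
```

Because the trust table belongs to the translator and not to the
compendium, two people could pull from the same pool of phrases and still
see different suggestions.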
Prior to writing this article, I did a bit of reading on the web, and
found that there are tools out there. ForeignDesk
is an open source (albeit for Windows) translation tool. I
have no idea about the quality, but it looks interesting. There is also
a DTD, which the aforementioned program supports, for exchanging
translation memory data. And, as mentioned above, KBabel looks
like it has a lot of features, too.
So - what do you think? Interesting idea? Too much of a niche? Am I
missing something obvious because of my lack of knowledge in this field?
I hope that this article has provided some food for thought - I know
that the idea in the original article in The Economist certainly piqued
my interest.
I think this is an excellent idea, and if you can get the trust metric idea to work it might be pathbreaking.
In another vein, the Economist article reminded me of something I find ironic. There was a big growth in funding in the 1960s due to excitement about MT (machine translation), and one of the things that pushed MT was the Vietnam War: deciphering intercepted Viet Cong messages was obviously of some interest to the military, and many people at the time were bullish about the prospect of taking a language analysis (roughly, dictionary plus grammar) of a foreign language and mapping it onto English in some automatable way.
The principal beneficiaries of the wave of funding that came in were Chomsky and his followers, since their approach to language emphasises syntax above phonology, semantics and pragmatics, and the success of the above approach to MT depends upon translation being reducible to a problem of syntax. Nowadays most people think this is too simplistic, but Chomsky et al. are now `in charge' of American linguistics, thanks to funding they received because of America's war in Vietnam. In view of Chomsky's political views, I find that rather ironic.