Thursday, August 07, 2008

Google Translation Center: The World’s Largest Translation Memory - GigaOM

Google Translation Center: The World’s Largest Translation Memory - GigaOM: "Google has been investing significant resources in a multi-year effort to develop its statistical machine translation technology. Statistical MT works by comparing large numbers of parallel texts that have been translated between languages, and from these it learns which words and phrases usually map to others, similar to the way humans acquire language. The problem with statistical MT is that it requires a large number of directly translated sentences. These are hard to find, and because of this SMT systems use sources like the proceedings of the European Parliament, the United Nations, etc., which are fine if you’re writing in bureaucrat-speak but aren’t so great for other texts. Google Translation Center is a straightforward and very clever way to gather a large corpus of parallel texts to train its machine translation systems.

Part machine translator and part translation memory (a sort of search engine for translation that helps translators to recall translations), GTC will help translators by providing a free, global translation memory, and in turn drive costs down by reducing the amount of work needed to complete a text. It will help Google by providing an excellent source of high quality parallel texts that can be fed back into the statistical translation systems."
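
To make the "search engine for translation" idea concrete, here is a minimal sketch of a translation memory lookup in Python. It is only an illustration, not how GTC works: the `TranslationMemory` class, the 0.7 similarity threshold, and the sample sentence pairs are all assumptions of mine, and the fuzzy matching uses the standard library's `difflib.SequenceMatcher` rather than anything Google has described.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Hypothetical toy translation memory: stores source/target segment pairs."""

    def __init__(self):
        self._segments = []  # list of (source_text, target_text) pairs

    def add(self, source, target):
        """Store a previously translated segment."""
        self._segments.append((source, target))

    def lookup(self, query, threshold=0.7):
        """Return (source, target, score) for the closest stored source
        segment, or None if nothing scores at or above the threshold.
        The threshold value is an arbitrary assumption for this sketch."""
        best_pair, best_score = None, 0.0
        for source, target in self._segments:
            # Character-level similarity in [0, 1]; case is ignored.
            score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
            if score > best_score:
                best_pair, best_score = (source, target), score
        if best_pair is not None and best_score >= threshold:
            return best_pair[0], best_pair[1], best_score
        return None


tm = TranslationMemory()
tm.add("The meeting is adjourned.", "La séance est levée.")
tm.add("Please open the window.", "Veuillez ouvrir la fenêtre.")

# A near-duplicate of a stored sentence recalls the earlier translation.
match = tm.lookup("The meeting is adjourned")
if match is not None:
    print(match[1])
```

Every pair a translator adds to such a memory is also a directly translated sentence pair, which is the kind of parallel text the quoted article says statistical MT needs.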

