Sunday, March 26, 2006

Bibliothèque Numérique Francophone

About a year ago, LanguageLog derided the reaction of the head of the National Library of France. Since then, there have been several attempts to defeat the evil Google. As "La République Internationale des Lettres" writes in very sarcastic terms, there is yet another effort, this time limited to the Francophone world... No hurry, though: "Le projet reste pour l'instant au stade de la réflexion." For now, the project is still at the reflection stage.

Bibliothèque Numérique Francophone: "After Gallica, after the Bibliothèque Numérique Européenne, the course is now set for the Bibliothèque Numérique Francophone. No doubt that, with this visionary squadron leader, the publishers and other sheep-like professional circles of the French book trade who have charged after him into the anti-Google fight are actively contributing to the future reach of French knowledge, culture, and language on the internet. Who was talking about the decline of France?"

Monday, March 13, 2006

Found in Translation - Military Information Technology

Found in Translation - Military Information Technology: "Spurred by the military and intelligence communities’ growing need to translate and retrieve pertinent foreign-language intelligence, the Defense Advanced Research Projects Agency has launched a program aimed at improving automated, searchable translations."

Tuesday, March 07, 2006

Speak It in Chinese, Hear It in English - Newsweek: International Editions - MSNBC.com

Speak It in Chinese, Hear It in English - Newsweek: International Editions - MSNBC.com: "A three-year EU project called TC-STAR is pumping €10 million into language-software R&D."
That's great, but what's new in there? OCR? Siemens' MT (METAL)? In any case, everything seems to be two years away, and even that statement is not new...

Sunday, March 05, 2006

IBM's research juggling act | Tech News on ZDNet

Paul Horn, the director of IBM Research: It continues to be a big thing for IBM and for IBM Research, but it's not just WebFountain. The basic issues are, really, natural language understanding in general. What WebFountain was able to do, which made it powerful, was it would go in and would scan text documents on the Web and it would understand enough about what people were saying that you could query it about what people were saying. You could imagine that there's a lot of countries, including our own, that would care a lot about scanning documents and even open documents and crawling through them to see what people were saying. A lot of the early work on WebFountain was done in three languages--English, Arabic and Chinese--and you can guess who might sponsor that work.

WebFountain is an example of a natural language technology that allows you to essentially analyze from an intelligence point of view what people are saying, but the important point is that this is just a small piece of many, many problems that companies have and where you want to take advantage of natural language understanding, such as translating spoken English to Russian and back again.

We talked about call centers. Natural language understanding can be incredibly powerful, even if you've got a call center operator, just by monitoring the calls and trying to understand what the issues are. There's enormous amounts of natural language and analytic issues in how companies interact with their customers. WebFountain was a specific application of natural language and search technology, but it's just one.