Friday, May 25, 2007
Business Objects to Acquire Text Analytics Leader Inxight Software
Combination of Inxight and Business Objects to Deliver First Full Spectrum Business Intelligence Platform: "With the acquisition of Inxight Software, Inc., Business Objects expands its leadership in extending BI to embrace enterprise search. Going beyond basic keyword searches and solutions that simply provide a ranked listing of searched items, Inxight’s web services-based federated search and extraction capabilities extend the value of enterprise search engines by instantly clustering and filtering results from multiple search engines, including Google Search Appliance and Oracle Secure Enterprise Search. By providing a BI platform that leverages these capabilities, Business Objects will become the first vendor to bridge the gap between search and intelligence – delivering a broader view of data and dramatically accelerating the ability to locate hidden information in search results that might otherwise be overlooked."
Tuesday, May 15, 2007
How Google translates without understanding
How Google translates without understanding: The Register: "The Google approach is a lesson in practical software development: try things and see what sticks. It has just a few major steps:
1. Google starts with lots and lots of paired-example texts, like formal documents from the United Nations, in which identical content is expertly translated into many different languages. With these documents they can discover that 'white house' tends to co-occur with 'casa blanca,' so that the next time they have to translate a text containing 'white house' they will tend to use 'casa blanca' in the output.
2. They have even more untranslated text in each language, which lets them make models of 'well-formed' sentence fragments (for example, preferring 'white house' to 'house white'). So the raw output from the first translation step can be further massaged into (statistically) nicer-sounding text.
3. Their key for improving the system - and winning competitions - is an automated performance metric, which assigns a translation quality number to each translation attempt. More on this fatally weak link below."
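The three steps quoted above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not Google's system: the corpora, the function names, and the Spanish-to-English direction are all made up for the example, and the scoring function is a crude n-gram overlap in the spirit of automated metrics rather than the specific metric the article goes on to discuss.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical toy corpora; a real system would use millions of sentences.
PARALLEL = [
    ("casa blanca", "white house"),
    ("casa grande", "big house"),
    ("camisa blanca", "white shirt"),
]
MONOLINGUAL = ["the white house is big", "a white shirt", "the big house"]


def build_cooccurrence(pairs):
    """Step 1: count how often each source word appears alongside each target word."""
    table = defaultdict(Counter)
    for source, target in pairs:
        for word in source.split():
            table[word].update(target.split())
    return table


def build_bigrams(sentences):
    """Step 2: count adjacent word pairs in untranslated target-language text."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        counts.update(zip(words, words[1:]))
    return counts


def translate(source, table, bigrams):
    """Return (raw, reordered) output. Assumes every source word was seen in training."""
    # Raw output: the most frequently co-occurring target word for each source word.
    raw = [table[word].most_common(1)[0][0] for word in source.split()]

    # Reordering: keep the permutation whose bigrams are most common in the target text.
    def fluency(order):
        return sum(bigrams[pair] for pair in zip(order, order[1:]))

    best = max(permutations(raw), key=fluency)
    return " ".join(raw), " ".join(best)


def ngram_precision(candidate, reference, n=1):
    """Step 3: fraction of candidate n-grams that also occur in the reference."""
    def ngrams(text):
        words = text.split()
        return Counter(zip(*(words[i:] for i in range(n))))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(cand.values())
    return sum((cand & ref).values()) / total if total else 0.0


table = build_cooccurrence(PARALLEL)
bigrams = build_bigrams(MONOLINGUAL)
raw, fluent = translate("casa blanca", table, bigrams)
print(raw)     # house white
print(fluent)  # white house
print(ngram_precision(raw, "white house", n=1))  # 1.0
print(ngram_precision(raw, "white house", n=2))  # 0.0
```

Note that the word-level score gives the badly ordered "house white" a perfect 1.0, and only the bigram score separates it from "white house". A single quality number depends heavily on how it is computed.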
Monday, May 07, 2007
PROMT 8.0: revamped translation software
PROMT revamped translation software product line: OSP International: "Evaluation of machine translation quality is usually quite individual but PROMT claims that PROMT 8.0 analyzes the context and generates grammatically correct translation of most of linguistic structures and set expressions. The user can teach the translator, enriching its vocabulary by adding personal dictionaries and using earlier translated text pieces in further translations. The quality of translation, especially of specialized texts, also largely depends on setting up software according to the document subject. The system set-up procedure, which many users used to ignore because of its length and complexity, has been much simplified in version 8.0."