NAIROBI, Kenya, Mar 13 – IBM has announced several new IBM Watson technologies designed to help organizations begin identifying, understanding and analyzing some of the most challenging aspects of the English language with greater clarity, for greater insights.
The new technologies represent the first commercialization of key Natural Language Processing (NLP) capabilities to come from IBM Research’s Project Debater, the only AI system capable of debating humans on complex topics.
For example, a new advanced sentiment analysis feature is designed to identify and analyze idioms and colloquialisms for the first time. Phrases like ‘hardly helpful’ or ‘hot under the collar’ have been challenging for AI systems because they are difficult for algorithms to spot.
With advanced sentiment analysis, businesses can begin analyzing such language data with Watson APIs for a more holistic understanding of their operation.
Further, IBM is bringing in technology from IBM Research for understanding business documents, such as PDFs and contracts, which businesses can also add to their AI models.
“Language is a tool for expressing thought and opinion, as much as it is a tool for information,” said Rob Thomas, General Manager, IBM Data and AI.
“This is why we’re harvesting technology from Project Debater and integrating it into Watson – to enable businesses to capture, analyze, and understand more from human language and start to transform how they utilize intellectual capital codified in data.”
The company will integrate Project Debater technologies into Watson throughout the year, with a focus on advancing clients’ ability to exploit natural language, beginning with analysis. IBM has enhanced sentiment analysis to better identify and understand complicated word schemes like idioms (phrases and expressions) and so-called sentiment shifters, which are combinations of words that, together, take on new meaning, such as “hardly helpful.”
This technology will be integrated into Watson Natural Language Understanding this month. In addition, IBM is announcing a new classification technology that will enable clients to create AI models that can more easily classify clauses that occur in business documents, like procurement contracts.
Based on Project Debater’s deep learning-based classification technology, the new capability can learn from as few as several hundred samples to do new classifications quickly and easily. It will be added to Watson Discovery later this year.
Watson will also help clients exploit natural language through briefs, or summarization. This technology pulls textual data from a variety of sources to provide users with a summary of what is being said and written about a particular topic. An early version of Summarization was used at The GRAMMYS this year to analyze over 18 million articles, blogs and bios to produce bite-sized insights on hundreds of GRAMMY artists and celebrities.
Watson will also help clients exploit natural language through clustering, specifically advanced topic clustering. Building on insights gained from Project Debater, new topic clustering techniques will enable users to “cluster” incoming data to create meaningful “topics” of related information, which can then be analyzed.
The technique, which will be integrated into Watson Discovery later this year, will also allow subject matter experts to customize and fine-tune the topics to reflect the language of specific businesses or industries, like insurance, healthcare and manufacturing.