IBM Watson: Beyond Jeopardy! Q&A
For context, please make sure you've attended the webinar, presented by Adam Lally.
Q: I am just a beginner in Computer Science. I am interested in learning about technologies used in Watson. What are they?
A: Some of the technology areas to look into are information retrieval (IR), natural-language processing (NLP), knowledge representation and reasoning (KR&R), and machine learning.
Q: How much Prolog or Horn clause-type programming does Watson use?
A: Mainly we used Prolog for pattern matching over natural language parse trees. Here is a link to an article that provides more detail of how we used Prolog: http://www.cs.nmsu.edu/ALP/2011/03/natural-language-processing-with-prolog-in-the-ibm-watson-system/.
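To give a feel for what pattern matching over a parse tree looks like, here is a minimal Python sketch of Prolog-style matching with variables. It is only an analogue for illustration, not one of Watson's actual Prolog rules: the tuple-based tree encoding, the "?name" variable convention, and the example sentence are all assumptions made for this example.

```python
def match(pattern, tree, bindings=None):
    """Match a pattern against a parse tree, Prolog-style.

    Strings starting with "?" are variables. Returns a dict of variable
    bindings on success, or None if the pattern does not match.
    """
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            # A variable seen before must match the same subtree again.
            return bindings if bindings[pattern] == tree else None
        bindings[pattern] = tree
        return bindings
    if isinstance(pattern, tuple) and isinstance(tree, tuple):
        if len(pattern) != len(tree):
            return None
        for sub_pattern, sub_tree in zip(pattern, tree):
            bindings = match(sub_pattern, sub_tree, bindings)
            if bindings is None:
                return None
        return bindings
    # Plain labels and words must be identical.
    return bindings if pattern == tree else None


# Hypothetical parse tree for "he wrote Hamlet".
TREE = ("S", ("NP", "he"), ("VP", ("V", "wrote"), ("NP", "Hamlet")))

# Pattern: someone "wrote" something; capture subject and object.
PATTERN = ("S", ("NP", "?subj"), ("VP", ("V", "wrote"), ("NP", "?obj")))

RESULT = match(PATTERN, TREE)
print(RESULT)
```

In Prolog the same idea comes for free through unification, which is one reason the language suits this kind of rule.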
Q: What are the most important NLP scoring algorithms? Or, based on the data, do different algorithms contribute more?
A: The key to Watson's success was not really a small number of very important algorithms, but the combination of many different algorithms. In evaluations that we did for the papers published in the IBM Journal of Research and Development, when we ablated a single evidence scoring algorithm from the full Watson system, we rarely saw a statistically significant impact on a few thousand questions, and never saw an impact of more than 1%. If you want to get an idea of what all the algorithms are, I highly recommend reading the journal papers.
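The ablation procedure mentioned in this answer (remove one scorer, re-measure accuracy) can be sketched roughly as follows. This is a toy illustration, not Watson's code: the scorer names, the score tables, and the two-question data set are hypothetical, and scores are combined by a plain sum rather than by a learned model.

```python
from typing import Callable, Dict, List, Tuple

Scorer = Callable[[str, str], float]
Question = Tuple[str, List[str], str]  # (question id, candidates, correct answer)


def make_scorer(table: Dict[Tuple[str, str], float]) -> Scorer:
    """Build a toy scorer that looks scores up in a fixed table."""
    return lambda question, candidate: table[(question, candidate)]


def accuracy(scorers: Dict[str, Scorer], questions: List[Question]) -> float:
    """Fraction of questions whose top-ranked candidate is the correct one."""
    correct = 0
    for question, candidates, gold in questions:
        best = max(candidates,
                   key=lambda c: sum(s(question, c) for s in scorers.values()))
        correct += int(best == gold)
    return correct / len(questions)


def ablation_impact(scorers: Dict[str, Scorer],
                    questions: List[Question]) -> Dict[str, float]:
    """Accuracy lost when each scorer is removed from the full set in turn."""
    full = accuracy(scorers, questions)
    impact = {}
    for name in scorers:
        reduced = {k: v for k, v in scorers.items() if k != name}
        impact[name] = full - accuracy(reduced, questions)
    return impact


# Hypothetical scorers and data, invented for this example.
SCORERS = {
    "passage_match": make_scorer({("q1", "a"): 0.9, ("q1", "b"): 0.2,
                                  ("q2", "c"): 0.4, ("q2", "d"): 0.5}),
    "type_match": make_scorer({("q1", "a"): 0.6, ("q1", "b"): 0.5,
                               ("q2", "c"): 0.9, ("q2", "d"): 0.1}),
    "popularity": make_scorer({("q1", "a"): 0.3, ("q1", "b"): 0.35,
                               ("q2", "c"): 0.5, ("q2", "d"): 0.5}),
}
QUESTIONS = [("q1", ["a", "b"], "a"), ("q2", ["c", "d"], "c")]

IMPACT = ablation_impact(SCORERS, QUESTIONS)
print(IMPACT)
```

The toy numbers are chosen so that removing one scorer visibly hurts. In the evaluations described above the opposite was true: no single scorer accounted for more than 1%.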
Q: There is a tech solution provided by the IntelliResponse company for dealing with Q&A. It is used in higher education. Do you feel that the Watson Q&A system can be implemented/adapted to deal with customer service Q&A?
A: See this: http://www-03.ibm.com/press/us/en/pressrelease/41122.wss.
Q: For machine learning, was any ILP or inductive logic programming used?
A: No, Watson did not use ILP.
Q: Is the improvement asymptotic? Any jumps?
A: As you can see from the graph of our improvement over time, we achieved big jumps over the first couple of years of the project, and then progress became more difficult because the easiest problems were all solved. However, we did continue to make steady progress.
Q: Since you do not enter all the facts into a database for Watson to retrieve, how do you feed Watson the data?
A: We feed Watson natural language text, such as encyclopedias and news articles for Jeopardy!, or medical texts for our healthcare applications.
Q: Is the result of the answer also dependent on the quality of the answer and evidence sources?
A: Yes, but a strength of Watson is that it doesn't base its decision on just one passage of text, but on multiple passages from multiple sources. So it can tolerate some amount of incorrect information in its sources if it is outweighed by good information.
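A rough sketch of why pooling evidence across passages tolerates a bad source: if each passage contributes a support score for the candidate it backs, one misleading passage can be outvoted. The source names, candidates, and scores below are made up, and summing is a stand-in for whatever combination Watson actually uses.

```python
from collections import defaultdict


def aggregate(evidence):
    """Sum per-passage support for each candidate and pick the best one.

    `evidence` is a list of (source, candidate, score) triples.
    Returns (best candidate, dict of total support per candidate).
    """
    totals = defaultdict(float)
    for _source, candidate, score in evidence:
        totals[candidate] += score
    best = max(totals, key=totals.get)
    return best, dict(totals)


# Hypothetical evidence: two sources support answer_A, while one
# source supports answer_B with high (but misleading) confidence.
EVIDENCE = [
    ("encyclopedia", "answer_A", 0.8),
    ("news_article", "answer_A", 0.7),
    ("unreliable_page", "answer_B", 0.9),
]

BEST, TOTALS = aggregate(EVIDENCE)
print(BEST, TOTALS)
```

Here answer_A wins even though the single strongest passage points to answer_B, because two independent passages agree on it.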
Q: Did you use multiple ontologies or did you build a Jeopardy! task-specific ontology?
A: Watson doesn't require an ontology, but evidence scoring algorithms can make use of ontologies. For example, we did make extensive use of DBpedia and the YAGO type ontology for Jeopardy!. However, such structured data was just one small part of Watson. Watson gets most of its evidence from unstructured text.
Q: How much hardware is used in processing and running the algorithms required to answer the queries?
A: For the Jeopardy! exhibition, Watson ran on 90 IBM Power 750 machines, each of which had 32 cores and up to 256 GB of RAM. Watson can now run on less hardware than that, though.
Q: In what programming language(s) are the algorithms implemented?
A: Watson is implemented mostly in Java, but there are a few C components and some Prolog.
Q: Shouldn't Toronto have been rejected based on an ontological model?
A: Watson does not use "hard constraints" on answer types, because in general they are too brittle. Instead, Watson uses a technique we named "type coercion," where we attempt to gather evidence that the candidate answer is or is not of the specified type. One of the journal papers is devoted to this and to why it was such a successful technique for Jeopardy!. By the way, here's an interesting true story about our experience with such hard constraints: Say the question asks for a "month." We thought surely it was safe to hard-code into Watson that the twelve months of the year are the only allowed answers, right? Then we got a Jeopardy! clue such as this one: "Id Al-Fitr is a day of feasting that ends the fast at the end of this Islamic holy month." Answer: Ramadan.
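The contrast between a hard type constraint and soft type evidence can be sketched like this, using the "month" example. Everything here is hypothetical: the scores, the weight, and the function names are invented for illustration, and this is not how Watson's type coercion is actually implemented (see the journal paper for that).

```python
GREGORIAN_MONTHS = {
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
}


def hard_filter(candidates, allowed):
    """Hard constraint: drop every candidate not on the allowed list."""
    return [c for c in candidates if c in allowed]


def rank_with_soft_type(candidates, passage_scores, type_evidence,
                        type_weight=0.5):
    """Soft constraint: type evidence is one weighted feature among others."""
    def total(candidate):
        return (passage_scores[candidate]
                + type_weight * type_evidence.get(candidate, 0.0))
    return sorted(candidates, key=total, reverse=True)


CANDIDATES = ["Ramadan", "December"]

# Hypothetical scores: how well the passages support each candidate,
# and how much evidence there is that each candidate is a "month".
PASSAGE_SCORES = {"Ramadan": 0.9, "December": 0.2}
TYPE_EVIDENCE = {"Ramadan": 0.6, "December": 1.0}

HARD = hard_filter(CANDIDATES, GREGORIAN_MONTHS)
SOFT = rank_with_soft_type(CANDIDATES, PASSAGE_SCORES, TYPE_EVIDENCE)
print(HARD, SOFT)
```

The hard filter throws the right answer away before any evidence is weighed, while the soft version keeps Ramadan in play and lets the passage evidence carry it to the top.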
Q: How did the team get along?
A: We got along great (and still do!). I can't emphasize enough what a pleasure it is to work with such a great team.
Q: Watson is impressive on Jeopardy!. But if Watson has the medical diagnosis equivalent of the missed final Jeopardy! question, someone could die. What is your benchmark for when you would consider Watson to be ready for the real world?
A: First of all, remember that Watson had 14% confidence in Toronto being correct. Watson KNEW it was making a guess there, and obviously you wouldn't act on such guesses in a real medical scenario. In real-world applications, having accurate confidence estimation like this is a key advantage of the Watson technology. Second, we envision Watson not as a replacement for a doctor, but as a tool to allow a user such as a doctor to efficiently access and make sense of large amounts of unstructured information, so that he or she can ultimately make better decisions. We are not taking diagnosis decisions out of the hands of doctors.
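The idea of acting only on confident answers can be sketched as a simple threshold. The 14% figure is the one quoted in this answer; the 0.5 threshold, the function name, and the second example candidate are assumptions chosen for illustration.

```python
def answer_if_confident(scored_candidates, threshold=0.5):
    """Return the top candidate only if its confidence clears the threshold.

    `scored_candidates` is a list of (answer, confidence) pairs.
    Returns None when the best answer is just a guess.
    """
    answer, confidence = max(scored_candidates, key=lambda pair: pair[1])
    return answer if confidence >= threshold else None


# Top candidate at 14% confidence: the system should decline to answer.
LOW = answer_if_confident([("Toronto", 0.14)])

# Hypothetical high-confidence case: the answer is returned.
HIGH = answer_if_confident([("some_answer", 0.92), ("other_answer", 0.05)])

print(LOW, HIGH)
```

A sensible threshold depends on the application: a game show can tolerate guesses, while a clinical decision-support tool would surface the confidence and the evidence to the doctor rather than act on its own.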
Q: Does Watson embrace W3C Semantic Technology interface specifications like SPARQL or support Linked Open Data?
A: We did make some use of DBpedia (see my answer to a previous question), which is represented in RDF and queried with SPARQL.
Q: How does this differ from MYCIN or a typical rule-based Expert System?
A: Expert systems use structured data—rules that were manually created by experts. These are extremely expensive to create and maintain. Watson gets its knowledge by analyzing unstructured text written in natural language. There is a large quantity of such information available and it is rapidly growing all the time. The promise of the Watson technology is to get deeper insights from this large quantity of unstructured text.