Around a year ago, Voxxed co-founders Stephan Janssen and Mark Hazell were chatting when three central themes repeatedly arose in their conversation.
- Can we help developers solve problems more easily?
- Is the video and written content we have easy enough to discover, and are people making the most of it?
- Can we do something fun but worthwhile with machine learning and AI?
Fast forward a few months: during a conversation with IBM’s Reggie Hutcherson, they asked him about using Watson in a cool way, and the framework for what we’re now calling Project Sherlock started taking shape.
At Devoxx UK in June, Stephan and IBM’s Sandhya Kapoor pulled back the drapes on Project Sherlock during a conference session titled AI AI Captain.
‘You know when a problem has you stuck, you Google’d it, checked Stack Overflow and still can’t find what you need?’ started the conference session description. Sound familiar? Of course it does. Many a developer has spent hours trawling Stack Overflow threads in the desperate hope that someone else has suffered the same obscure problem as them. But is there another way? How do you search for something seemingly unsearchable?
Stephan and Sandhya outlined some of the vision behind the concept, which would initially let users contextually search the 1,000+ written articles on Voxxed.com and many thousands of hours of video presentations from Devoxx events around the world.
They also enlisted a Pepper robot to help them explain how this could evolve and be used.
Pepper, it turns out, is capable of detecting emotions and adapting its tone to the situation, providing realistic conversation. He can access different software packages and be hooked up to IBM’s Watson, a technology platform powered by machine learning – which you may remember as the winner of Jeopardy in 2011. You can feed data into Watson via a series of APIs. Watson learns from the data, and you can ask Pepper to query the results and gain insights from them.
This is amazing, Stephan assured us, as it demystifies AI and machine learning: instead of writing your own algorithms to cluster unlabelled data or search for patterns in a tsunami of information, you can ask a handy API. Forget brushing up on Support Vector Machines: learn how to query an interface.
Building an intelligent platform becomes simple, as the pair demonstrated with a Sherlock demo built from Voxxed articles and Devoxx videos.
They then fed an article into the AlchemyAPI. We learned that using the API is incredibly simple (and well documented here): you just need an access key. The Alchemy Language Service uses natural language processing and can extract metadata from the text: author, concepts, themes, relationships, sentiment and even emotion (which is in beta).
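As a rough illustration of what consuming such a response might look like, here is a minimal Python sketch. The JSON payload and field names below are invented for illustration (loosely modelled on Alchemy Language responses of the time), not copied from the actual API:

```python
# Toy sketch: pull the interesting metadata out of an analysis response.
# The sample payload is invented; a real response carries many more fields.
import json

sample_response = json.loads("""
{
  "docSentiment": {"type": "positive", "score": "0.54"},
  "concepts": [
    {"text": "Java (programming language)", "relevance": "0.92"},
    {"text": "Lambda calculus", "relevance": "0.71"}
  ],
  "author": "Stephan Janssen"
}
""")

def summarise(response):
    """Extract author, overall sentiment and concepts sorted by relevance."""
    concepts = [(c["text"], float(c["relevance"]))
                for c in response.get("concepts", [])]
    return {
        "author": response.get("author"),
        "sentiment": response.get("docSentiment", {}).get("type", "unknown"),
        "concepts": sorted(concepts, key=lambda c: -c[1]),
    }

summary = summarise(sample_response)
print(summary["sentiment"])         # prints "positive"
print(summary["concepts"][0][0])    # highest-relevance concept
```

Once you have a summary like this per article, the concepts become the hooks for search and recommendation described below.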
Want to feed in an audio file instead of an article? Easy: Watson Speech to Text converts it into text. The transcription doesn’t always get the right words, but each guess comes with a confidence rating which you can weigh according to context. For example, Watson may guess a spoken word is “lamb” with a confidence rating of 80%, and “lambda” with 20% confidence. When an article is about Java, you could configure it so that it always favours the latter option.
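The lamb/lambda idea above can be sketched in a few lines. The word/confidence pairs are a simplification of what a real Speech to Text response nests inside its JSON, and the vocabulary set is invented:

```python
# Sketch: bias transcription alternatives towards known domain vocabulary.
JAVA_TERMS = {"lambda", "jvm", "stream", "heap"}  # illustrative vocabulary

def pick_word(alternatives, domain_terms=frozenset(), min_confidence=0.1):
    """Prefer a domain term whenever the engine considered it at all,
    otherwise fall back to the engine's most confident guess."""
    for word, confidence in alternatives:
        if word.lower() in domain_terms and confidence >= min_confidence:
            return word
    return max(alternatives, key=lambda pair: pair[1])[0]

# Raw engine output: "lamb" at 80% confidence, "lambda" at 20%.
alternatives = [("lamb", 0.8), ("lambda", 0.2)]
print(pick_word(alternatives))              # prints "lamb"
print(pick_word(alternatives, JAVA_TERMS))  # prints "lambda"
```

The `min_confidence` floor stops the bias from inventing domain words the engine never seriously considered.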
Once several articles have been analysed and Watson has been trained, there are patterns at your fingertips. Concept Insights can understand the high-level concepts in the articles and recommend related documents. Now when searching for “lambdas and streams” you could get the most relevant articles. If you’re feeling adventurous, you can present the article thumbnails, related video clips and snippets of text. If you’re feeling really adventurous, you can use Alchemy Vision to detect the keywords from a thumbnail, like “person” or “car”.
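As a toy illustration of concept-based search in this spirit (not the actual Concept Insights API), each document can be reduced to weighted concepts and ranked against a query. The titles and weights here are invented:

```python
# Toy concept index: each document mapped to its concepts and weights.
index = {
    "Streams and Lambdas in Java 8": {"lambda": 0.9, "stream": 0.8, "java": 0.7},
    "Getting Started with Docker":   {"docker": 0.9, "container": 0.8},
    "Functional Style on the JVM":   {"lambda": 0.6, "jvm": 0.8, "scala": 0.5},
}

def related(query_concepts, index, top=3):
    """Rank documents by the summed weight of the query concepts they share."""
    scores = {title: sum(weights.get(c, 0.0) for c in query_concepts)
              for title, weights in index.items()}
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [title for title, score in ranked[:top] if score > 0]

print(related({"lambda", "stream"}, index))
# the Java 8 article ranks first; the Docker article is filtered out
```

Matching on extracted concepts rather than raw keywords is what lets a query like “lambdas and streams” find an article that never uses those exact words.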
But why stop with articles and videos in English? There is also a Language Translation service, which widens our pool of resources to videos from Devoxx France and beyond. This is the next step for Project Sherlock. And why stop there – the full list of services available from IBM Watson is here. With so much data available, it’s easy to imagine a situation where awkward development conundrums are quickly solved with the most relevant content.
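The translate-then-index step hinted at here could be sketched as follows; the `translate()` stub and its toy dictionary are stand-ins for a call to a real translation service:

```python
# Sketch: normalise every transcript to English before analysis, so a
# single index can also cover Devoxx France talks. The dictionary below
# is a toy stand-in for a real translation service.
TRANSLATIONS = {"fr": {"flux": "stream", " et ": " and ", "les lambdas": "lambdas"}}

def translate(text, source_lang):
    if source_lang == "en":
        return text
    for foreign, english in TRANSLATIONS.get(source_lang, {}).items():
        text = text.replace(foreign, english)
    return text

def normalise(documents):
    """Route every (title, body, lang) document through translation."""
    return [(title, translate(body, lang)) for title, body, lang in documents]

docs = [("Java 8", "les lambdas et flux", "fr"),
        ("Java 8 intro", "lambdas and streams", "en")]
print(normalise(docs))
```

The point of the sketch is only the pipeline shape: translation happens once at ingestion time, so search and concept extraction never need to care about the source language.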
It only took Stephan two weeks around a hectic schedule to set up this basic demonstration, with Watson expert Sandhya from IBM acting as the wise whisper in his ear, guiding his progress. Take a look at the GitHub repository if you are interested in getting a feel for the code or contributing.
Watson has clear benefits. It reveals a realm of possibilities to the AI-uninitiated Java developer and, up to a point, Watson is free to play around with. With AlchemyAPI, you can make up to 1000 API calls a day on the free plan. Watson is well documented and there are essential IBM resources to help you get started including a developer community.
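To stay inside a daily allowance like the free tier’s 1,000 calls, one simple client-side approach is a quota guard that resets each day. This is a generic sketch, not part of any Watson SDK:

```python
# Generic client-side guard for a per-day API call allowance.
from datetime import date

class DailyQuota:
    def __init__(self, limit=1000):
        self.limit = limit
        self.day = date.today()
        self.used = 0

    def try_call(self):
        """Return True if another call fits in today's budget."""
        today = date.today()
        if today != self.day:          # new day: counter resets
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False               # back off until tomorrow
        self.used += 1
        return True

quota = DailyQuota(limit=3)
print([quota.try_call() for _ in range(4)])  # [True, True, True, False]
```

Checking the budget before each request keeps experimentation from silently burning through the free plan.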
What Watson will not do is teach you how the machine learning process works. It won’t teach you how to train a clustering algorithm, or predict the toxicity of a chemical compound. However, it will empower you to quickly make intelligent applications and cognitive engines. If you are interested in machine learning, it is a great place to start. Who knows? Perhaps at the next Devoxx UK, Pepper will be ready to answer our Stack Overflow problems with a human-like conversation, video clip or extract from an expert Voxxed article.
Project Sherlock is the start of an initiative we hope many developers will contribute to. During their Devoxx UK session Stephan and Sandhya used Pepper as a user interface, but the system can just as easily be used if you don’t happen to have a state-of-the-art humanoid robot lying around in a corner of your garage. The end goal is contextual search and discovery, with responses anchored directly to confidence-scored sections of text and video content. This will take some time, but for now any developers with an interest in the topic can get involved, with iterations and project progress showcased at the following Devoxx event. If you want to know more, check out the interview with Stephan Janssen below. You can also check out the Sherlock “uploader” project on GitHub and the Sherlock demo here.
If you’re interested in helping out, you can also join our Slack group.