Basho, vendor of the NoSQL key-value data store Riak, has just announced the 1.3 release of Riak TS, its database for time series data. The release brings new seamless integration with Apache Spark, a standard SQL-based query system, and fast write and query performance for time series data. With it, Basho has also decided to open source Riak TS under the Apache v2 licence, and it is available for download. We spoke to Manu Marchal, MD EMEA, Basho, for the full story behind the decision.
Voxxed: What were the key reasons for open sourcing Riak TS?
Marchal: At Basho, we are proud of our open source heritage and are committed to continuing to support the Open Source Riak Community. We’re also aware that it matters to companies and our clients that we are available as open source. While the NHS is a commercial customer, being an open source provider was a part of their selection criteria. Possibly more importantly, it allows the growing community of engineers, architects, academics and enthusiasts to take open source and extend it into places that we can’t easily reach with just our own resources. We’ve recently seen some great examples of this with the work ClusterHQ did with Mesos, Marathon and Flocker, and the message queuing work that Tapjoy developed which they call Dynamiq.
With IoT increasingly touching almost every aspect of our lives, the associated data use cases are on the rise, which means our existing community of Riak KV OSS users will most likely also have time series needs, and we want to be there for them. For example, we’ve seen metrics evolving into a popular need and use case. Whether you are a social media, security, service or infrastructure company, you are storing and analysing log and metric data. Riak TS enables you to aggregate data over time to analyse system and device issues, or take action based on your metric analytics, such as system statistics (I/O and CPU) and SLAs (uptime and performance).
Who do you see as most benefitting from this decision?
The Riak community, which we believe has time series needs, but also the emerging community of people tackling IoT projects. To succeed and to deliver on the promises businesses are making of powerful, real-time and valuable experiences in today’s highly connected world of very savvy customers and digital transformations, they need a highly resilient and scalable time series database. We believe that success in an IoT world works from the ground up. Having a view on how this evolving data can impact success and drive competitive advantage is a business imperative, but one that can really only be realized if you are using the right solutions from the heart of the project up. You need new solutions for new problems.
What’s uptake been like for Riak TS?
We’ve been surprised by the level of interest. Medium to large organizations have been seeking an enterprise grade time series database. Existing solutions either didn’t work at production scale or were too operationally complex. One competitor even says “we don’t efficiently query time stamped data” in some of their presentations. So we were seeing a pent-up demand.
Being interested in Formula E myself, I’ve been excited to see Riak TS implemented by our customer Intellicore to support the viewer experience at events worldwide. As Christian Trotobas, Intellicore’s CEO, highlighted, it can be very challenging to extract insights and value in real time from many different data streams, especially in environments where there are large streams of sensor information and literally hundreds of models running to provide the enhanced user experience. As he said himself: “we quickly reached the limits of existing database technologies. Because Riak TS is engineered and optimised for time series datasets, our ability to rapidly deliver at massive scale has been enhanced.” We’re looking forward to hearing more stories such as this one.
Why did you choose to add SQL support?
Unlike a key value store where you are asked to store almost anything, hence “unstructured data,” time series data is semi structured, so you are able to describe it in a schema. It lends itself to being stored in a table structure and being thought of in that familiar way. SQL is a well-known query language and suits the task of querying tabular data. By using SQL we also reduce the entry barrier to developers using Riak TS because they are highly likely to know SQL, and be comfortable with using it.
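To make the point about schemas and SQL concrete, here is a minimal sketch of time series readings stored in a table and aggregated per time window with plain SQL. It uses SQLite from the Python standard library purely as a stand-in, so it does not show Riak TS syntax or its client API, which may differ. The table name `cpu_metrics`, its columns, the sample rows and the helper functions are all hypothetical, chosen only for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of CPU readings (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE cpu_metrics (
            host    TEXT    NOT NULL,
            ts      INTEGER NOT NULL,  -- Unix epoch seconds
            cpu_pct REAL    NOT NULL,
            PRIMARY KEY (host, ts)
        )
        """
    )
    # Made-up sample readings, not real measurements.
    rows = [
        ("web-1", 0, 10.0),
        ("web-1", 60, 30.0),
        ("web-1", 900, 50.0),
        ("web-2", 30, 20.0),
    ]
    conn.executemany("INSERT INTO cpu_metrics VALUES (?, ?, ?)", rows)
    return conn


def avg_cpu_per_window(conn, host, start, end, window_s=900):
    """Average CPU per fixed-size window for one host in [start, end)."""
    query = """
        SELECT (ts / ?) * ? AS window_start, AVG(cpu_pct) AS avg_cpu
        FROM cpu_metrics
        WHERE host = ? AND ts >= ? AND ts < ?
        GROUP BY window_start
        ORDER BY window_start
    """
    # Integer division floors each timestamp to the start of its window.
    params = (window_s, window_s, host, start, end)
    return conn.execute(query, params).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for window_start, avg_cpu in avg_cpu_per_window(conn, "web-1", 0, 1800):
        print(window_start, avg_cpu)
```

Because each reading has a fixed shape (host, timestamp, value), the query filters on a time range and groups by window, which is the kind of tabular question SQL expresses naturally.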
From your perspective, how healthy is the IoT market looking beneath all the hype?
Sure, there is some hype, but this really is the third industrial revolution. It is in our homes, our cars, our factories, our hospitals. They have even instrumented the bicycles used in the Tour de France and, as discussed, Formula E cars. Maybe we will see accelerometers in the heads of golf clubs to help a player’s game.
As with every new revolution there will be growth as well as failings as we attempt to unearth the valuable use-cases that add to our lives versus those that sounded great but didn’t really deliver any benefit. Ultimately technology is constantly evolving, and as it does we will continue to discover those moments where it enhances experiences and changes the way we view and engage with the world. How fast that will happen is maybe another question but even with all the hype, IoT isn’t a passing phase. It is adding value now and will continue to do so as it becomes a part of our everyday lives.
Do you see a need for IoT-oriented database providers to step up more when it comes to issues like security and data privacy?
Security and privacy are always issues when you are dealing with data, to different degrees depending on the type and source of the data. The deeper IoT moves into our lives the more attention there will need to be on these issues, without question.