With an accelerating movement towards graph models, we caught up with Matthias Broecheler, Director of Engineering for DSE Graph at DataStax, to get his take on the advance.
Voxxed: There seems to be an increasing movement among big data vendors towards graph models – what’s your take on this? Do you see it as a trend or an evolution of the data model?
Matthias Broecheler: We see a trend in companies being faced with increasingly connected and complex structured data, either because they are now starting to gather or process such data or because they are starting to extract value from it. In either case, they are finding existing database technologies poorly suited to handling this type of data and are looking for better alternatives, such as graph databases.
What makes this offering unique in the market?
In a nutshell, scalability and enterprise-readiness. DataStax Enterprise (DSE) Graph scales out horizontally to accommodate more data, more user requests and more write traffic, or to distribute the data geographically for resilience. Most existing graph databases are scale-up or single-point-of-failure architectures.
By building on top of the DSE platform, DSE Graph provides the features that enterprises have been asking for in order to confidently move graph applications to production. This includes security, encryption, managed services, visual administration, certified software, expert 24×7 support, driver support for a preferred language, a visual development studio, auditing, and much more.
What were the challenges in building this release? How do you ensure the multimodal model doesn’t come at the expense of performance?
With the DSE platform, we are implementing each model separately, with a special-purpose query execution path. Only when the data is at rest do the different models converge. This provides an operational advantage: DSE can be managed the same way irrespective of the model being used. It also means that we can optimize each model individually for its particular data access paths and usage patterns. The downside is that users cannot seamlessly move their application from one model to another, since each one is purpose-built, but we believe that delivering the best possible performance and feature support for each model is most important.
What problems does it aim to solve?
Because cloud applications involve numerous components that can differ in their data model support requirements, a database that provides adaptive data management (or multi-model) capabilities will deliver a simpler and more agile solution for quickly bringing cloud applications to market. DSE Graph is part of DSE’s multi-model platform, which provides support for Apache Cassandra’s key-value and tabular data models, JSON and graph. Because data from all models is stored in a single persistence layer, each data model inherits Cassandra’s benefits as well as DSE’s enterprise-grade functionality.
What use cases do you envision for this software – and where would it not be the best case solution?
There are a variety of use cases where a graph database is a better fit than other database management systems including relational or general NoSQL database systems. They include:
- Master Data Management: A company must understand the data relationships across its multiple business units to create a holistic view of its customers. A graph model consolidates disparate data for use by both business intelligence tools and business applications.
- Recommendation and Personalisation: Enterprises need to understand how to quickly and effectively influence customers to purchase their product. Graph analysis is the most effective tool for handling recommendation and personalisation tasks in an application and making key decisions from the value found in data.
- Security and Fraud Detection: In a complex and highly interrelated network of users, entities, transactions, events and interactions, a graph database can help determine which interactions are fraudulent or pose a security risk or compliance concern.
- IoT and Networking: As IoT use cases commonly involve devices or machines that generate time-series information such as event and status data, graph databases work well because the streams from individual points create a high degree of complexity when blended together. Additionally, the analytics involved in tasks such as root cause analysis involve numerous relationships that form among the data elements and tend to be of much greater interest when examined collectively rather than in isolation.
All of these use cases require you to build a system that integrates highly connected and heterogeneous data in order to get a comprehensive picture of what is going on and to be able to react in real time. DSE Graph is not a good fit for loosely connected or unconnected data, like time-series, for instance.
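To make the recommendation use case above more concrete, here is a minimal sketch in plain Python. It is an illustration only: it does not use DSE Graph or any graph query language, and the customers, products and the `recommend` helper are all hypothetical. It walks a small customer-to-product graph two hops out (customer, shared products, other customers, their products) and ranks unseen products by how much purchase history the customers share.

```python
from collections import Counter

# Hypothetical purchase graph: each customer vertex links to product vertices.
purchases = {
    "alice": {"laptop", "mouse"},
    "bob": {"laptop", "mouse", "keyboard"},
    "carol": {"laptop", "monitor", "keyboard"},
    "dave": {"phone"},
}


def recommend(customer, purchases, limit=3):
    """Suggest products bought by customers who share purchases with `customer`."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        shared = owned & items  # products both customers bought
        if not shared:
            continue  # no connecting path between the two customers
        for product in items - owned:
            # Weight each candidate by the overlap with the other customer.
            scores[product] += len(shared)
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:limit]]


print(recommend("alice", purchases))
```

In this toy data set, a customer with no purchases in common with anyone else gets an empty list, which mirrors the point that unconnected data gains little from a graph model.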
What’s on the roadmap ahead for DataStax this year?
As we’ve discussed, a big objective is to build out a strong multi-model platform in DSE, but there are additional focuses as well. We’re also looking at simplifying and automating various tasks and operations that both architects and administrators tackle. You’ll see this manifest at both the server level and in our DataStax OpsCenter management solution that is bringing brand new levels of ease-of-use and maturity to central IT departments that are tasked with managing large, distributed database clusters.