Apache Storm PMC Chair & Streaming Data Tech Lead at Hortonworks P. Taylor Goetz gives us an overview of the real-time computation framework, which has just hit its 1.0 release milestone. Goetz explains why he increasingly sees batch as becoming an “augmentation” to real-time processing, and where he sees Storm in the expanding Hadoop ecosystem. […]

On the tenth anniversary of Hadoop, Hortonworks co-founder Arun Murthy gives us an overview of the latest from the ecosystem around the original big data superstar technology. We also discuss the skill gap hampering Hadoop adoption, and how the company is working to lower the barriers to adoption. Filmed at the Hadoop Summit 2016 […]

With Amazon Web Services’ re:Invent 2015 conference taking place this week, there’s been a steady stream of cloudy news releases, with new additions from both Elastic (formerly Elasticsearch) and MariaDB cropping up in the feeds. The Elastic collaboration comes in the form of a new Amazon Elasticsearch Service (Amazon ES for short). Developers can quickly […]

Couchbase is an open source, document-oriented NoSQL database for modern web, mobile, and IoT applications. It is designed for ease of development and Internet-scale performance. Couchbase 4.0 is a major release of the NoSQL database server that includes significant advances in both architecture and features. Download Couchbase Server 4.0 now! The key features introduced in this version are: […]

Key-value NoSQL people Basho have been steadfast in their quest to expand the database’s reach in the enterprise over the past 12 months. Having launched the Basho Data Platform earlier this year – an amalgamation of existing Riak technologies, including Riak KV (for key-value), Riak CS cloud storage, Apache Spark cluster framework, Redis caching, […]

Following up from last week’s story, this is Part 2 of Adrian Colyer’s journey to get out of the fire swamp when it comes to data persistence and why these apps seem to be breaking down more and more. This article appeared originally on Adrian’s blog. Peering into the mist: In Part I we examined […]

In this article, Sunila Gollapudi, author of Practical Machine Learning, introduces the key aspects of machine learning semantics and various toolkit options in Python. Machine learning has been around for many years now and all of us, at some point in time, have been consumers of machine learning technology. One of the most common examples […]

With the release of Spark 1.5, which brings better performance, usability, and operational stability, many predicted a further droop in MapReduce mind share. And it seems those dark mutterings may have had a kernel of truth. A survey released today by Spark warders Databricks – the company founded by the creators of the hot young data processing technology […]

Appearing originally on his blog, Adrian Colyer embarks on a journey to get out of the fire swamp when it comes to data persistence and why these apps seem to be breaking down more and more. (*) with apologies to Moseley, Marks, and Westley. Something a little different to the regular paper reviews for the […]
