Enormous-giant-humongous fund-raising database provider MongoDB has consistently outperformed all contenders when it comes to NoSQL evangelism. And with today’s official GA green light for major releases MongoDB 3.0 and MongoDB Ops Manager, the company formerly known as 10gen is hoping to cut an even wider swath through the market. In this interview, Kelly Stirman walks us through the big things you should know about this double drop.
Voxxed: What are the headline changes in this release?
Stirman: MongoDB captivated developers because it was designed with the developer in mind – it’s just natural and intuitive for developers to use. On the theme of making things easier for developers, and with the growth of DevOps, there are a couple of exciting things in this release.
We’ve dramatically redesigned the architecture of the database to incorporate a pluggable storage engine. Let’s think about it this way: cars are a little bit like databases in that they have major functional systems. Some only come with one engine option – but, with others, you get the choice between diesel and gasoline, so you can choose between fuel efficiency and power. Databases also have many functional components, and a lot of them only have one storage engine. Some of them, however, give you a choice of several – MySQL, for example. And for the first time, MongoDB is going from one to three. The motivation is to let developers pick the engine that’s best for the tech they are building.
Can you walk us through these engines?
The first is MMAPv1. It’s an upgraded version of the storage engine that’s always been in MongoDB, but we’ve reworked its concurrency control model, so it now handles concurrent workloads more efficiently.
The second is called WiredTiger. We acquired the company WiredTiger in December, and that’s the same team that created Berkeley DB – the most widely used embedded database in the world. That team sold Berkeley DB to Oracle ten years ago. 15 years on, things are very different. Hardware has evolved, computer algorithms are different, and so the team decided to re-examine the technology. We met them and it was love at first sight. Whilst Mongo was the complete car, they had an engine we thought would fit perfectly. It’s got document-level concurrency control, and it makes compression available to reduce the footprint of your database by up to 80%.
The third is an in-memory storage engine, which is exciting. As CPU speeds have improved, memory has become cheaper and faster, but disk storage has changed little in the past few decades. People minimise writes to disk, or move them out of the critical path so they don’t affect the application. This third engine lets you write to RAM without using your disk.
Another important aspect I should mention is that, uniquely, you can mix these three up in one deployment.
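As a rough illustration of what mixing engines in one deployment could look like, the sketch below builds the `mongod` command line for two members of the same replica set, one on MMAPv1 and one on WiredTiger with zlib block compression. The replica set name, ports, data paths and the helper `mongod_command` are hypothetical, chosen only for this example; the in-memory engine is left out because the interview does not say how it is selected.

```python
from typing import List, Optional


def mongod_command(port: int, dbpath: str, engine: str,
                   repl_set: str = "rs0",
                   compressor: Optional[str] = None) -> List[str]:
    """Build the argument list for one replica set member (hypothetical helper)."""
    cmd = [
        "mongod",
        "--replSet", repl_set,       # every member joins the same replica set
        "--port", str(port),
        "--dbpath", dbpath,
        "--storageEngine", engine,   # the engine is chosen per mongod process
    ]
    if compressor is not None:
        if engine != "wiredTiger":
            raise ValueError("block compression here applies only to wiredTiger")
        cmd += ["--wiredTigerCollectionBlockCompressor", compressor]
    return cmd


# Assumed layout: two members of one replica set, each on a different engine.
MEMBERS = [
    {"port": 27017, "dbpath": "/data/rs0-a", "engine": "mmapv1"},
    {"port": 27018, "dbpath": "/data/rs0-b", "engine": "wiredTiger",
     "compressor": "zlib"},
]

if __name__ == "__main__":
    for member in MEMBERS:
        print(" ".join(mongod_command(**member)))
```

Because the engine is a per-process startup option, each member can be started on its own engine while the application still talks to a single replica set.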
There’s been a trend for enterprises to employ multiple database systems to address a variety of issues – is the introduction of additional storage engines a response to this?
NoSQL is a poor name for a group of heterogeneous technologies. MongoDB is by far the most popular, and I think that’s because of the data model, which lets you do lots of different things. However, it’s not that easy to scale. For that reason, these enterprises have looked at other options. With 3.0 we’ve introduced WiredTiger as an alternative, and it removes the performance gap with competing products. If we can evolve and create specialised offerings, we can remove the need to use a whole load of technologies.
A lot of people think of operations as a tax on their business – nobody is beating their competitors on the strength of their ops – it’s just an additional cost. And the majority of these costs come from the people running the systems. A few decades ago, it was hardware that was the greatest expenditure – now it’s staff. At the end of the day, automation saves you money. This is what we’re helping people do with Ops Manager. Also, developers don’t have to take their apps offline, which ties into Continuous Deployment/Continuous Integration and agile trends.
What’s on the roadmap ahead for MongoDB this year?
Expect a couple of things. We will continue to develop storage engines, both ourselves and with our partners. We know Facebook has integrated RocksDB, which they created with us. We expect others to follow suit – maybe not writing an entire storage engine, but using our open and published API to meet their needs. We’ll also continue to develop Ops Manager, adding new capabilities to help you accomplish more with less effort.
Finally, one of the things people say about NoSQL is that it’s a bit Wild West. Unlike SQL, there’s no schema, which gives more flexibility, but means ops teams can’t set rules to ensure quality data. There’s no reason MongoDB can’t supply a schema to ensure, for example, that all data fields match a pattern you define. We are absolutely hoping to make schema validation a part of MongoDB in the future.
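To make the idea of fields that "match a pattern you define" concrete, here is a minimal client-side sketch. It is purely illustrative and is not a MongoDB feature or API: the rule set `RULES`, the field names and the helper `violations` are all assumptions made up for this example.

```python
import re
from typing import Any, Dict, List, Pattern

# Hypothetical rules: each field name maps to the pattern its value must match.
RULES: Dict[str, Pattern[str]] = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "zip": re.compile(r"\d{5}"),
}


def violations(doc: Dict[str, Any], rules: Dict[str, Pattern[str]]) -> List[str]:
    """Return the names of fields that are missing, non-string, or do not match."""
    problems = []
    for field, pattern in rules.items():
        value = doc.get(field)
        # fullmatch requires the whole value to fit the pattern, not just a prefix
        if not isinstance(value, str) or pattern.fullmatch(value) is None:
            problems.append(field)
    return problems


if __name__ == "__main__":
    doc = {"email": "dev@example.com", "zip": "1234"}
    print(violations(doc, RULES))
```

An application could run such a check before each insert and reject documents with a non-empty result; a check built into the database would enforce the same rules for every client.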