Optimizing the order of MongoDB aggregation steps

MongoDB does have a query optimizer, and in most cases it's effective at picking the best of multiple possible plans. However, it's worth remembering that with the aggregate function, the sequence in which the various steps are executed is completely under your control. The optimizer won't reorder steps into the optimal sequence to get you out of trouble.
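As a rough illustration of what "under your control" means, the sketch below writes the same aggregation twice as plain Python data, once filtering after a $group and once before it. The field names (customer, total) and the value "c1" are hypothetical, and the sketch makes no assumption about whether a given server version would rewrite either form.

```python
# Two orderings of the same logical aggregation, as plain pipeline data.
# Field names ("customer", "total") and the value "c1" are hypothetical.
# A driver such as PyMongo would accept either list via collection.aggregate().

# Groups every customer first, then discards all groups but one.
filter_late = [
    {"$group": {"_id": "$customer", "spend": {"$sum": "$total"}}},
    {"$match": {"_id": "c1"}},
]

# Filters down to one customer first, so $group sees far fewer documents.
filter_early = [
    {"$match": {"customer": "c1"}},
    {"$group": {"_id": "$customer", "spend": {"$sum": "$total"}}},
]


def stage_names(pipeline):
    """Return the operator name of each stage, in the order it is written."""
    return [next(iter(stage)) for stage in pipeline]


print(stage_names(filter_late))
print(stage_names(filter_early))
```

Both lists describe the same result; only the amount of data the $group stage has to process differs.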


Optimizing the order of steps probably comes down mainly to reducing the amount (more...)

Bulk inserts in MongoDB

Like most database systems, MongoDB provides API calls that allow multiple documents to be inserted in a single operation. I’ve written about similar interfaces in Oracle in the past – for instance in this post.
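A minimal sketch of the difference between single and multi-document inserts, assuming a collection object that exposes insert_one and insert_many (the method names PyMongo uses). FakeCollection is a hypothetical stand-in that only counts calls, so the example runs without a server; treating one call as one round trip is a simplification.

```python
class FakeCollection:
    """Hypothetical stand-in for a driver collection; counts calls only."""

    def __init__(self):
        self.calls = 0
        self.docs = []

    def insert_one(self, doc):
        self.calls += 1
        self.docs.append(doc)

    def insert_many(self, docs):
        self.calls += 1
        self.docs.extend(docs)


def insert_in_batches(collection, docs, batch_size=1000):
    """Insert docs in chunks, issuing one insert_many call per chunk."""
    for start in range(0, len(docs), batch_size):
        collection.insert_many(docs[start:start + batch_size])


docs = [{"n": i} for i in range(2500)]

# One call per document.
one_by_one = FakeCollection()
for doc in docs:
    one_by_one.insert_one(doc)

# One call per batch of up to 1000 documents.
batched = FakeCollection()
insert_in_batches(batched, docs, batch_size=1000)

print(one_by_one.calls, batched.calls)  # 2500 3
```

The batch size of 1000 is an arbitrary choice for the illustration, not a recommendation.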

Array/bulk interfaces improve database performance dramatically by reducing the number of round trips between the client and the database. To realize how fundamental an optimization this is, consider that you have a bunch of people (more...)

Graph Lookup in MongoDB 3.3

Graph databases such as Neo4j specialize in traversing graphs of relationships, such as those you might find in a social network. Many non-graph databases have been incorporating graph compute engines to perform similar tasks. In the MongoDB 3.3 release, we now have the ability to perform simple graph traversal using the $graphLookup aggregation framework function. This will become a production feature in the 3.4 release.
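For orientation, here is a sketch of what a $graphLookup stage might look like when built as plain Python data. The option names (from, startWith, connectFromField, connectToField, as, maxDepth) are the ones MongoDB documents for this stage; the collection name "people" and the fields "name" and "friends" are hypothetical, as is the helper function itself.

```python
def graph_lookup_stage(collection, start_with, connect_from, connect_to,
                       as_field, max_depth=None):
    """Build a $graphLookup stage as a plain dict (hypothetical helper)."""
    spec = {
        "from": collection,                # collection to traverse
        "startWith": start_with,           # expression giving the starting values
        "connectFromField": connect_from,  # field followed on each hop
        "connectToField": connect_to,      # field matched against on each hop
        "as": as_field,                    # output array field
    }
    if max_depth is not None:
        spec["maxDepth"] = max_depth       # optional limit on recursion depth
    return {"$graphLookup": spec}


# Assumed schema: each document in "people" has a "name" and a "friends"
# array holding the names of other people.
pipeline = [
    {"$match": {"name": "Alice"}},
    graph_lookup_stage("people", "$friends", "friends", "name",
                       "network", max_depth=2),
]

print(pipeline[1])
```

With a driver such as PyMongo, a list like this would be passed to collection.aggregate(); the sketch stops short of that so it runs without a server.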

The new feature (more...)

Join performance in MongoDB 3.2 using $lookup

One of the key tenets of MongoDB schema design is to account for the absence of server-side joins. Data is joined all the time inside of application code of course, but traditionally there’s been no way to perform joins within the server itself.

This changed in 3.2 with the introduction of the $lookup operator within the aggregation framework. $lookup performs the equivalent of a left outer join – i.e., it retrieves (more...)
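To make the left outer join behaviour concrete, the sketch below shows a $lookup stage as plain data and then emulates its equality-match semantics in memory. The collection and field names (orders, customers, customer_id, name) are hypothetical, and left_outer_join is an illustrative helper, not part of any driver; it assumes the join values are hashable.

```python
# A $lookup stage as plain data; collection and field names are hypothetical.
lookup_stage = {
    "$lookup": {
        "from": "customers",
        "localField": "customer_id",
        "foreignField": "_id",
        "as": "customer",
    }
}


def left_outer_join(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """In-memory emulation: every local doc is kept, matches go in an array."""
    index = {}
    for doc in foreign_docs:
        index.setdefault(doc.get(foreign_field), []).append(doc)
    return [
        {**doc, as_field: index.get(doc.get(local_field), [])}
        for doc in local_docs
    ]


orders = [
    {"_id": 1, "customer_id": "c1"},
    {"_id": 2, "customer_id": "c9"},  # no such customer
]
customers = [{"_id": "c1", "name": "Ann"}]

joined = left_outer_join(orders, customers, "customer_id", "_id", "customer")
for doc in joined:
    print(doc)
```

The order with no matching customer is still returned, with an empty array in the joined field, which is what makes the join "outer".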

Goodbye Quest!

You may have read that Francisco Partners and Elliott Management have entered into an agreement to acquire the Dell Software Group – largely composed of the Quest software company bought by Dell in 2012. I’ve worked at Quest since 1998, but alas I will not be participating in this next stage of the Quest journey.

Although the timing of the announcement was influenced by the logistics of this sale, it is actually a (more...)

Next Generation Databases

My latest book Next Generation Databases is now available to purchase! You can buy it from Amazon here, or directly from Apress here. The e-book versions are not quite ready but if you prefer the print version you’re good to go.

I wrote this book as an attempt to share what I’ve learned about non-relational databases in the last decade and position these in the context of the relational database landscape that (more...)

Blockchain and databases of the future

You would have to have been living under a rock for the past few years not to have heard of Bitcoin. Bitcoin is an electronic cryptocurrency which can be used like cash in many web transactions. At the time of writing there are about 15 million Bitcoins in circulation, trading at approximately US$360 each, for a total value of about US$5.3 billion.

Bitcoin combines peer-to-peer technology and public key cryptography.  (more...)

Vector clocks

One of the concepts I found difficult initially when looking at non-relational systems is the concept of the vector clock. Some databases – like Cassandra – use timestamps to work out which is the “latest” transaction. If there are two conflicting modifications to a column value, the one with the highest timestamp will be considered the most recent and the most correct.
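The timestamp rule described above can be sketched in a few lines. The Write type and resolve function are hypothetical names for illustration, and the tie-breaking here (the first argument wins on equal timestamps) is a simplification rather than what any particular database does.

```python
from typing import NamedTuple


class Write(NamedTuple):
    """A modification to a column value (hypothetical type for illustration)."""
    value: str
    timestamp: int  # e.g. microseconds since the epoch, supplied by the writer


def resolve(a, b):
    """Last write wins: keep whichever write carries the higher timestamp.

    Ties go to the first argument, which is a simplification.
    """
    return a if a.timestamp >= b.timestamp else b


earlier = Write(value="blue", timestamp=1000)
later = Write(value="green", timestamp=2000)

winner = resolve(earlier, later)
print(winner.value)  # green
```

The outcome depends entirely on the timestamps the writers supply, so the argument order of the two conflicting writes does not matter unless the timestamps are equal.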

Other Dynamo systems use a more complex mechanism known as a (more...)

Exploring Couchbase N1QL

Couchbase recently announced Non-first Normal Form Query Language (N1QL) – pronounced “Nickel” – a virtually complete SQL implementation for use with document databases, implemented within Couchbase Server 4.0.

I recently took a quick look. 

Most of the examples use the sample films documents shown below (this is the same sample data we created for MongoDB in this post):

[Image: sample films documents]

N1QL allows us to perform basic queries to retrieve selected documents (more...)

On my way to Collaborate 2015

As I write this I’m at 35,000 ft (or so) on my way to yet another Collaborate.  This year I’ll be presenting two sessions and participating in one panel:

963: Writing to Lead
Monday, April 13  |  Banyan E, South Convention Center, Level 2
10:30 a.m. – 11:30 a.m.
Jonathan Gennick will be moderating a panel including myself, Bobby Curtis, Charles Kim, Darl Kuhn and Michael Rosenblum.  We’ll be talking about how writing (more...)