What improvements were made to multikey index query planning in MongoDB version 3.4?

MongoDB 3.4 improves query planning for multikey indexes, which can significantly speed up affected queries. In particular, SERVER-15086 lets the query planner apply tighter index bounds when scanning a multikey index on 3.4 servers running WiredTiger.
To take advantage of these improvements, you must rebuild existing multikey indexes after completing the full 3.4 upgrade process. An eligible index returns "v" : 2 in the output of getIndexes() or "indexVersion" : 2 in the output of explain().
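As a rough sketch of that check, the plain-JavaScript helper below filters index specifications shaped like the documents getIndexes() returns and lists those still below version 2. The name indexesNeedingRebuild and the sample specifications are hypothetical. The version field alone does not say whether an index is multikey, so the result only narrows down the candidates.

```javascript
// Hypothetical helper: list the names of indexes whose "v" field is below 2.
// Input is an array shaped like the output of db.collection.getIndexes().
function indexesNeedingRebuild(indexSpecs) {
  return indexSpecs
    .filter(function (spec) { return spec.v < 2; })
    .map(function (spec) { return spec.name; });
}

// Hypothetical sample: one index built before the upgrade, one built after.
const sampleSpecs = [
  { v: 1, key: { tags: 1, createdDate: 1 }, name: "tags_1_createdDate_1", ns: "test.collection" },
  { v: 2, key: { author: 1 }, name: "author_1", ns: "test.collection" }
];

indexesNeedingRebuild(sampleSpecs); // [ "tags_1_createdDate_1" ]
```

In the mongo shell, the same function could be called with db.collection.getIndexes() in place of sampleSpecs.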

Bounds for Multikey Indexes in 3.4

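In addition to the existing isMultiKey flag, SERVER-15086 adds multiKeyPaths information to the query planner. For each indexed field, multiKeyPaths records which path components hold an array in at least one indexed document. The planner can use this information to decide when tighter index bounds are safe for the data in the target collection.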
For example:
> db.collection.find()
{ "_id" : ObjectId("58910378aa8624eb3d6638bc"), "tags" : [ "3.2", "Driver" ], "createdDate" : ISODate("2017-01-15T00:00:00Z") }
>
> db.collection.explain().find({tags: "3.2", createdDate:{$gte:ISODate('2017-01-10'), $lte:ISODate('2017-01-20')}})
...
"isMultiKey" : true,
"multiKeyPaths" : {
   "tags" : [
      "tags"
   ],
   "createdDate" : [ ]
},
...
"indexBounds" : {
   "tags" : [
      "[\"3.2\", \"3.2\"]"
   ],
   "createdDate" : [
      "[new Date(1484006400000), new Date(1484870400000)]"
   ]
}
...
The existence of multiKeyPaths alone is not enough to allow the optimizer to tightly bound the index. Rather, it is the information provided by this field that allows the optimizer to understand when the collection schema will support tighter index bounds without inadvertently missing data.
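The decision can be approximated with a small plain-JavaScript model. This is a simplified sketch, not the server's implementation: the function name canIntersectBounds is hypothetical, and it covers only the case of combining $gte and $lte bounds on a single field.

```javascript
// Simplified model (not the server's implementation): the $gte and $lte
// bounds of a range predicate can be intersected on an indexed field only
// when multiKeyPaths records no array-valued path components for it.
function canIntersectBounds(multiKeyPaths, field) {
  const arrayPaths = multiKeyPaths[field];
  // Missing information (for example an index without path tracking)
  // is treated as unknown, so fall back to the conservative answer.
  if (!Array.isArray(arrayPaths)) {
    return false;
  }
  return arrayPaths.length === 0;
}

// multiKeyPaths as reported by the explain output above.
const explainBefore = { tags: ["tags"], createdDate: [] };

canIntersectBounds(explainBefore, "createdDate"); // true
canIntersectBounds(explainBefore, "tags");        // false
```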
To demonstrate this, consider what happens to the index bounds when we add another document with a different shape (array for createdDate) to this collection in MongoDB 3.4:
> db.collection.find()
{ "_id" : ObjectId("58910378aa8624eb3d6638bc"), "tags" : [ "3.2", "Driver" ], "createdDate" : ISODate("2017-01-15T00:00:00Z") }
{ "_id" : ObjectId("58910538aa8624eb3d6638bd"), "tags" : "3.2", "createdDate" : [ ISODate("2017-01-15T00:00:00Z"), ISODate("2017-01-30T00:00:00Z") ] }
>
> db.collection.explain().find({tags: "3.2", createdDate:{$gte:ISODate('2017-01-10'), $lte:ISODate('2017-01-20')}})
...
"isMultiKey" : true,
"multiKeyPaths" : {
   "tags" : [
      "tags"
   ],
   "createdDate" : [
      "createdDate"
   ]
},
...
"indexBounds" : {
   "tags" : [
      "[\"3.2\", \"3.2\"]"
   ],
   "createdDate" : [
      "(true, new Date(1484870400000)]"
   ]
}
...
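Both tags and createdDate now appear in multiKeyPaths. Because createdDate can now hold an array, the planner can no longer safely intersect the $gte and $lte bounds, so the lower bound applied to createdDate in the first explain is dropped.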
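Why keeping both bounds would be unsafe can be sketched in plain JavaScript. Without $elemMatch, a range predicate on an array field matches when one element satisfies $gte and any element, possibly a different one, satisfies $lte. The functions and the sample value below are hypothetical; they model only this matching rule and a naive index scan, not the server's implementation.

```javascript
// Hypothetical model of matching {field: {$gte: low, $lte: high}} without
// $elemMatch: each operator may be satisfied by a different array element.
function matchesRange(value, low, high) {
  const elements = Array.isArray(value) ? value : [value];
  const someGte = elements.some(function (e) { return e >= low; });
  const someLte = elements.some(function (e) { return e <= high; });
  return someGte && someLte;
}

// A multikey index stores one key per array element. This models a scan
// that only visits keys inside the intersected range [low, high].
function foundByIntersectedBounds(value, low, high) {
  const keys = Array.isArray(value) ? value : [value];
  return keys.some(function (k) { return k >= low && k <= high; });
}

const low = new Date("2017-01-10T00:00:00Z");
const high = new Date("2017-01-20T00:00:00Z");
// Hypothetical createdDate value: neither element lies inside the range.
const createdDate = [
  new Date("2017-01-05T00:00:00Z"),
  new Date("2017-01-25T00:00:00Z")
];

matchesRange(createdDate, low, high);             // true
foundByIntersectedBounds(createdDate, low, high); // false
```

The document matches the query, yet a scan restricted to both bounds would never visit either of its keys. With only the upper bound kept, the scan reaches the 2017-01-05 key, and the $gte condition can then be checked against the full document.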
