The Perils of Building Indexes on MongoDB
Indexes are a critical part of any database operation. Defining the right indexes can make a huge difference to the performance of your database servers. However, creating indexes in MongoDB has several pitfalls that you need to be aware of in your day-to-day operations. At a high level, MongoDB supports three techniques for building indexes on your collections.
1. Foreground Index Build
When you build an index in the foreground, it blocks all other operations on the database. On a large collection, this can take several hours, which means your database is effectively down for the duration of the index build. Given that this is the default mode of building indexes, it is not surprising that a lot of developers shoot themselves in the foot by triggering accidental index builds. There is really no good reason to trigger a foreground index build on a production server (unless you know that the collection holds only a small amount of data).
2. Background Index Build
As the name implies, the background indexing process builds the index in the background without affecting the availability of your database server. However, it is still a resource-intensive operation, and you should expect to see performance degradation. Also, since it happens in the background, it can take a lot longer than a foreground build. In previous versions of MongoDB (< 2.6), a background index build on the primary of a replica set would run as a “foreground” build on the secondary servers. Thankfully, this is no longer the case: it is now a background build on all the nodes. If a background index build is interrupted, it will resume as a foreground index build on server restart.
3. Rolling Index Build
The rolling index build process builds the index on only one node at a time. It goes something like this:
1. Rotate a secondary node out of the replica set (you can do this by changing ports or restarting in standalone mode)
2. Build the index on this node in the foreground. Once the index is built, rotate the node back into the replica set
3. Once the node has caught up with the changes, move on to the next node. If the next node is the primary, you will need to run rs.stepDown() to turn it into a secondary.
4. Rinse and repeat.
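The ordering of the steps above can be sketched in a few lines of JavaScript. This is only an illustrative planner, not the actual orchestration: it assumes a members array shaped like rs.status().members (with name and stateStr fields), and the function name planRollingIndexBuild and the step descriptions are made up for this example.

```javascript
// Sketch: work out the order of a rolling index build.
// `members` is assumed to look like rs.status().members (name, stateStr).
// Members in any other state (for example RECOVERING) are ignored here.
function planRollingIndexBuild(members) {
  // Arbiters hold no data, so there is nothing to index on them.
  const secondaries = members.filter((m) => m.stateStr === "SECONDARY");
  const primaries = members.filter((m) => m.stateStr === "PRIMARY");

  const steps = [];
  const addNodeSteps = (node) => {
    steps.push(`${node.name}: rotate out of the replica set`);
    steps.push(`${node.name}: build the index`);
    steps.push(`${node.name}: rotate back in and wait for it to catch up`);
  };

  // Secondaries go first, one node at a time.
  secondaries.forEach(addNodeSteps);

  // The primary goes last and must be stepped down before it is rotated out.
  for (const node of primaries) {
    steps.push(`${node.name}: rs.stepDown() so it becomes a secondary`);
    addNodeSteps(node);
  }
  return steps;
}
```

For a set with one primary, one secondary and one arbiter, the plan lists the secondary's three steps first, then the stepDown and the former primary's three steps. The arbiter does not appear at all.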
More details of the index build process are in the MongoDB documentation.
Using rolling index builds, you can build an index without any significant performance impact on your application. However, there is a failover involved, so your application should be able to handle that (which it needs to do anyway).
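One common way for an application to ride out a failover is to retry the failed operation after a short pause. The helper below is a minimal sketch of that idea, not a driver feature: retryDuringFailover, its options and the isRetryable predicate are hypothetical names, and the operation is whatever async function your application supplies.

```javascript
// Sketch: retry an operation that may fail while a new primary is elected.
// `operation` is any async function supplied by the caller.
// `isRetryable` is a hypothetical predicate; by default every error is retried.
async function retryDuringFailover(
  operation,
  { attempts = 5, delayMs = 500, isRetryable = () => true } = {}
) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Give up on non-retryable errors or once the attempts are used up.
      if (!isRetryable(err) || attempt === attempts) break;
      // Wait a little so the replica set has time to elect a primary.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

In practice you would pass a predicate that only accepts the errors your driver raises during an election, so that genuine failures surface immediately.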
Can you do a rolling index build if you don’t have a replica set? Unfortunately, for standalone instances the only option is a background index build.
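As a rough illustration of what requesting a background build looks like, the sketch below assembles a createIndexes command document with the background flag set. The helper name backgroundIndexCommand and the users collection are made up for this example, and the generated index name simply follows the usual field_direction pattern.

```javascript
// Sketch: build a createIndexes command that asks for a background build.
// `collection` and `keys` are supplied by the caller; nothing is sent to a server here.
function backgroundIndexCommand(collection, keys) {
  // Derive an index name such as "email_1" or "email_1_createdAt_-1".
  const name = Object.entries(keys)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");

  return {
    createIndexes: collection,
    indexes: [{ key: keys, name: name, background: true }],
  };
}

// In the mongo shell the result could be passed to db.runCommand(...), e.g.
//   db.runCommand(backgroundIndexCommand("users", { email: 1 }))
// The shell helper form of the same request is
//   db.users.createIndex({ email: 1 }, { background: true })
```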
The rolling index build is our favored approach to building indexes at ScaleGrid. We even provide a UI that makes it easy to kick off the whole process. Our backend does all the orchestration necessary for the full index build and triggers the build server by server. You just need to point and click!
As always, if you have further questions you can reach out to us at [email protected]