September 10th, 2021

Finding Undetected Jumbo Chunks in MongoDB

https://www.percona.com/blog/finding-undetected-jumbo-chunks-in-mongodb/

https://www.percona.com/blog/?p=78021

Jumbo Chunks in MongoDB

Jumbo Chunks in MongoDBI recently came across an interesting case of performance issues during balancing in a MongoDB cluster. Digging through the logs, it became clear the problem was related to chunk moves taking a long time.

As we know, the default maximum chunk size is 64 MB. So these migrations are supposed to be very fast in most of the hardware in use nowadays.

In this case, there were several chunks way above that limit being moved around. How can this happen? Shouldn’t those chunks have been marked as Jumbo?

Recap on Chunk Moves

For starters, let’s review what we know about chunk moves.

MongoDB does not move a chunk if the number of documents in it is greater than 1.3 times the result of dividing the configured chunk size (default 64 MB) by the average document size.

If we have a collection with an average document size of 2 KB, a chunk with more than 42597 (65535 / 2 * 1.3) documents won’t be balanced. This might be an issue for collections with non-uniform document sizes, as the estimation won’t be very good in that case.

Also, if a chunk grows past the maximum size, it is flagged as Jumbo. This means the balancer will not try to move it. You can check out the following article for more information about dealing with jumbo chunks.

Finally, we must keep in mind that chunk moves take a metadata lock at a certain part of the process, so write operations are blocked momentarily near the end of a chunk move.

The autoSplitter in Action

Before MongoDB 4.2, the autoSplitter process was responsible for ensuring chunks do not exceed the maximum size runs on the mongos router. The router process keeps some statistics about the operations it performs. These statistics are consulted to make chunk-splitting decisions.

The problem is that in a production environment, there are usually many of these processes deployed. An individual mongos router only has a partial idea of what’s happening with a particular chunk of data.

If multiple mongos routers change data on the same chunk, it could grow beyond the maximum size and still go unnoticed.

There are many reported issues related to this topic (for example, see SERVER-44088, SERVER-13806, and SERVER-16715).

Frequently restarting mongos processes could also lead to the same problem. This causes the statistics used to make chunk-splitting decisions to be reset.

Because of these issues, the autoSplitter process was changed to run on the primary member of each shard’s replica set. This happened in MongoDB version 4.2 (see SERVER-9287 for more details).

The process should better prevent jumbo chunks now, as each shard has a more accurate view of what data it contains (and how it is being changed).

The MongoDB version, in this case, was 3.6, and the autoSplitter was not doing as good a job as expected.

Finding Undetected Jumbo Chunks in MongoDB

For starters, I came across a script to print a summary of the chunks statistics here. We can build upon it to also display information about Jumbo chunks and generate the commands required to manually split any chunks exceeding the maximum size.

var allChunkInfo = function(ns){
    var chunks = db.getSiblingDB("config").chunks.find({"ns" : ns}).sort({min:1}).noCursorTimeout(); //this will return all chunks for the ns ordered by min
    var totalChunks = 0;
    var totalJumbo = 0;
    var totalSize = 0;
    var totalEmpty = 0;
 
    chunks.forEach( 
        function printChunkInfo(chunk) { 
          var db1 = db.getSiblingDB(chunk.ns.split(".")[0]) // get the database we will be running the command against later
          var key = db.getSiblingDB("config").collections.findOne({_id:chunk.ns}).key; // will need this for the dataSize call
         
          // dataSize returns the info we need on the data, but using the estimate option to use counts is less intensive
          var dataSizeResult = db1.runCommand({datasize:chunk.ns, keyPattern:key, min:chunk.min, max:chunk.max, estimate:false});
         
          if(dataSizeResult.size > 67108864) {
            totalJumbo++;
            print('sh.splitFind("' + chunk.ns.toString() + '", ' + JSON.stringify(chunk.min) + ')' + ' // '+  chunk.shard + '    ' + Math.round(dataSizeResult.size/1024/1024) + ' MB    ' + dataSizeResult.numObjects );
          }
          totalSize += dataSizeResult.size;
          totalChunks++;
          
          if (dataSizeResult.size == 0) { totalEmpty++ }; //count empty chunks for summary
          }
    )
    print("***********Summary Chunk Information***********");
    print("Total Chunks: "+totalChunks);
    print("Total Jumbo Chunks: "+totalJumbo);
    print("Average Chunk Size (Mbytes): "+(totalSize/totalChunks/1024/1024));
    print("Empty Chunks: "+totalEmpty);
    print("Average Chunk Size (non-empty): "+(totalSize/(totalChunks-totalEmpty)/1024/1024));
}

The script has to be called from a mongos router as follows:

mongos> allChunkInfo("db.test_col")

And it will print any commands to perform chunk splits as required:

sh.splitFind("db.test_col", {"_id":"jhxT2neuI5fB4o4KBIASK1"}) // shard-1    222 MB    7970
sh.splitFind("db.test_col", {"_id":"zrAESqSZjnpnMI23oh5JZD"}) // shard-2    226 MB    7988
sh.splitFind("db.test_col", {"_id":"SgkCkfSDrY789e9nD4crk9"}) // shard-1    218 MB    7986
sh.splitFind("db.test_col", {"_id":"X5MKEH4j32OhmAhY7LGPMm"}) // shard-1    238 MB    8338
...
***********Summary Chunk Information***********
Total Chunks: 5047
Total Jumbo Chunks: 120
Average Chunk Size (Mbytes): 19.29779934868946
Empty Chunks: 1107
Average Chunk Size (non-empty): 24.719795257064895

You can see the script prints the commands to split the jumbo chunks, along with the shard where each one lives, the size of the chunk, and the number of documents on it. At the end, summary information is also displayed.

The estimate: true option in the dataSize command causes it to use the average document size of the collection for the estimation. This is not very accurate if the document size has many variances but runs faster and is less resource-intensive. Consider setting that option if you know your document size is more or less uniform.

All that is left is to run the commands generated by the script, and the jumbo chunks should be gone. The balancer will then move chunks during the next balancing window as required to even out the data. Consider scheduling the balancing window outside the peak workload hours to reduce the impact.

Conclusion

Even though MongoDB 3.6 has been EOL for some time now, many people are still running it, mostly due to outdated drivers on the application side that are not compatible with newer releases. Nowadays, the autoSplitter should be doing a better job of keeping chunk size at bay.

If you are still running a MongoDB version older than 4.2, it would be a good idea to review your chunk stats from time to time to avoid some nasty surprises.

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

Download Percona Distribution for MongoDB Today!