MongoDB provides robust and flexible sorting capabilities that allow developers to finely control the ordering of query results. This in-depth guide covers all aspects of MongoDB sorting for full-stack engineers and database developers.

We will examine:

  • How sorting works
  • Single and compound field sorting
  • Optimizing sort performance
  • Additional functionality like collations and random cursors
  • Best practices for multi-key indexing & sorting
  • Comparative benchmark analysis
  • Server vs client-side sorting
  • Date/time & geospatial sorting

Whether you need basic or advanced sorting, MongoDB has you covered. Let's dive in!

How MongoDB Sorting Works

MongoDB sorts query results on the server when you pass a sort document to the cursor's sort() method. This document contains one or more field/direction pairs, where a direction of 1 means ascending and -1 means descending:

{
  <field1>: <direction>,
  <field2>: <direction>  
}

For example, to sort by age ascending then name descending:

db.users.find().sort({age: 1, name: -1})
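
To make the field/direction semantics concrete, here is a minimal plain-JavaScript sketch of a comparator built from a sort document. It only mimics the ordering rule for numbers and strings on in-memory objects. compareBySpec is a hypothetical helper written for this illustration, not part of MongoDB or its drivers.

```javascript
// Hypothetical helper: builds an Array.prototype.sort comparator from a
// MongoDB-style sort document such as {age: 1, name: -1}.
// Assumes every field holds only numbers or only strings.
function compareBySpec(sortSpec) {
  const entries = Object.entries(sortSpec); // keeps the spec's field order
  return (a, b) => {
    for (const [field, direction] of entries) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
      // equal on this field: fall through to the next one
    }
    return 0;
  };
}

const sampleUsers = [
  {name: "Amy", age: 30},
  {name: "Carol", age: 25},
  {name: "Bob", age: 30},
];

const orderedUsers = [...sampleUsers].sort(compareBySpec({age: 1, name: -1}));
console.log(orderedUsers.map((u) => u.name)); // [ 'Carol', 'Bob', 'Amy' ]
```

The later fields only matter when all earlier fields compare equal, which is why the field order inside the sort document is significant.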

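The key aspects of how MongoDB handles sorting:

  • In-memory sorting – When no index can supply the requested order, MongoDB performs a blocking sort: it must read every matching document before it can return the first result. A blocking sort is limited to 100 MB of memory (32 MB before MongoDB 4.4) unless the operation is allowed to spill to disk with allowDiskUse. Adding limit() lets the server keep only the top N documents while it sorts.
  • No guaranteed tie order – Documents with equal sort key values may be returned in any order, and that order can change between runs. Add a unique field such as _id as the last sort key when you need a consistent order.
  • Index-backed sorting – An index can return documents already in order when the sort matches the index keys (or a prefix of them), in the index's direction or its exact reverse.
  • String comparison – By default, strings are compared with a simple binary comparison, so uppercase letters sort before lowercase ones and no locale rules apply. Use a collation to customize this.
  • Type order – Numbers sort from lowest to highest value and strings sort lexicographically. When a field holds mixed types, MongoDB uses the BSON comparison order, in which null sorts before numbers, numbers before strings, and strings before objects.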

Now let's see sorting in action.

Single Field Sorts

Sorting by a single field only requires specifying that one field. For example, let's sort the following user documents by age:

// Example documents
{name: "John", age: 30}
{name: "Sarah", age: 25} 
{name: "Mike", age: 18}

To sort by age ascending:

db.users.find().sort({age: 1})

// Result
{name: "Mike", age: 18}
{name: "Sarah", age: 25} 
{name: "John", age: 30}

And to sort by age descending:

db.users.find().sort({age: -1}) 

// Result
{name: "John", age: 30}
{name: "Sarah", age: 25}
{name: "Mike", age: 18}  

That's all there is to basic sorting! Next, let's look at optimizing single field sorting performance.

Optimizing Single Field Sorts

There are two primary methods for optimizing single field sort performance:

1. Use Indexes

Since sorting requires scanning documents in a specific order, having an index on that sort field can drastically improve query performance.

Indexes in MongoDB implicitly support efficient sorting by their index keys. For example, given an index on age:

db.users.createIndex({age: 1}) // Ascending index 

Sorting by age can utilize this index for faster performance:

db.users.find().sort({age: 1}) // Sort optimized by index! 
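
Why does an index help? The following toy model, in plain JavaScript, treats an index as a list of keys that is already kept in order, so a sorted read is just a walk over that list. Real MongoDB indexes are B-trees maintained by the server; the array and the variable names here are assumptions made purely for illustration.

```javascript
// Toy model of an ascending index on "age": a list of [key, position]
// pairs kept in key order. This array is only an illustration.
const userDocs = [
  {name: "John", age: 30},
  {name: "Sarah", age: 25},
  {name: "Mike", age: 18},
];

// Built once, when the "index" is created.
const ageIndex = userDocs
  .map((doc, position) => [doc.age, position])
  .sort((x, y) => x[0] - y[0]);

// An ascending sort is a forward walk of the index ...
const ascendingNames = ageIndex.map(([, position]) => userDocs[position].name);
// ... and a descending sort is the same walk in reverse.
const descendingNames = [...ageIndex]
  .reverse()
  .map(([, position]) => userDocs[position].name);

console.log(ascendingNames);  // [ 'Mike', 'Sarah', 'John' ]
console.log(descendingNames); // [ 'John', 'Sarah', 'Mike' ]
```

The sorting cost is paid when the index is built and maintained, not on every query. This is also why a single-field index serves both ascending and descending sorts.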

2. Structure Data Appropriately

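Storing related data inside the document avoids an expensive join ($lookup) at query time.

For example, embedding comment documents directly within a blog post document returns a post and its comments in a single read, without a separate query. Note that sort() orders whole documents, not the elements inside an embedded array. To order the comments themselves by post date, author, and so on, keep the array sorted as you write it (for example with the $sort modifier of $push) or sort it in an aggregation pipeline.

So in summary, indexes and well-structured documents are key for performant single field sorts!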

Sorting on Multiple Fields

MongoDB also allows sorting query results by multiple fields sequentially. For example, to sort by age then name:

db.users.find().sort({age: 1, name: 1}) 

This will:

  1. Sort primarily by age ascending
  2. Then sort documents of the same age by name ascending

This provides multiple levels of granular sorting.

Given these sample documents:

{name: "Amy", age: 30},  
{name: "Bob", age: 30},
{name: "Carol", age: 25}

The results of the multi-field sort would be:

// Results
{name: "Carol", age: 25} // Age 25
{name: "Amy", age: 30} // Age 30, name Amy
{name: "Bob", age: 30} // Age 30, name Bob

You can even sort different fields in different directions:

db.users.find().sort({age: 1, name: -1}) // ASCENDING then DESCENDING 

This allows precise control over the final sorted order.

Compound Indexes for Multi-Field Sorts

To optimize compound sorting by multiple fields, you need a compound index that matches the exact sequence of sort fields.

For example, given a multi-field sort:

db.users.find().sort({age: 1, name: 1})

You would create a corresponding compound index:

db.users.createIndex({age: 1, name: 1}) 

With this supporting index, MongoDB can read documents in index order and skip the in-memory sort entirely.

Note: The index field order must match the sort() field order, and the sort directions must either all match the index or all be reversed!

Let's look at another example. Given these sorts:

db.products.find().sort({price: 1, sku: -1})

db.products.find().sort({sku: -1, price: 1}) 

You would need two separate indexes like:

db.products.createIndex({price: 1, sku: -1})
db.products.createIndex({sku: -1, price: 1}) 

Each index corresponds to a specific sort field sequence.

This allows MongoDB to efficiently sort across multiple fields in compound queries.
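
The matching rule can be sketched as a small check in plain JavaScript. This is a simplified version of the rule, under the assumption that the query has no equality filters on leading index fields: the sort fields must form a prefix of the index fields, and the directions must either all match the index or all be inverted. indexSupportsSort is a hypothetical helper, not a MongoDB API.

```javascript
// Hypothetical helper: true if a sort document can be served by walking
// an index forward or backward. Simplified rule: the sort fields must be
// a prefix of the index fields, with every direction matching the index
// or every direction inverted.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexEntries = Object.entries(indexSpec);
  const sortEntries = Object.entries(sortSpec);
  if (sortEntries.length === 0 || sortEntries.length > indexEntries.length) {
    return false;
  }

  let sameDirection = true;
  let inverseDirection = true;
  for (let i = 0; i < sortEntries.length; i++) {
    const [indexField, indexDir] = indexEntries[i];
    const [sortField, sortDir] = sortEntries[i];
    if (indexField !== sortField) return false;
    if (sortDir !== indexDir) sameDirection = false;
    if (sortDir !== -indexDir) inverseDirection = false;
  }
  return sameDirection || inverseDirection;
}

const priceSkuIndex = {price: 1, sku: -1};
console.log(indexSupportsSort(priceSkuIndex, {price: 1, sku: -1})); // true
console.log(indexSupportsSort(priceSkuIndex, {price: -1, sku: 1})); // true (reverse walk)
console.log(indexSupportsSort(priceSkuIndex, {price: 1, sku: 1}));  // false
console.log(indexSupportsSort(priceSkuIndex, {sku: -1, price: 1})); // false (field order differs)
console.log(indexSupportsSort(priceSkuIndex, {price: 1}));          // true (prefix)
```

To confirm what the server actually does for a given query, check the explain() output for a SORT stage, which indicates an in-memory sort.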

Sorting on Arrays and Nested Fields

In addition to normal fields, MongoDB also supports sorts on:

  • Array fields
  • Nested documents
  • Nested arrays

Using dot notation, you can sort on nested fields at any depth.

For example, given some product documents containing colors and tags arrays:

{
  name: "T-Shirt",
  variants: [
     {
       color: "Red",
       tags: ["Apparel", "Cotton"]
     },
     {  
       color: "Blue",
       tags: ["Apparel", "Polyester"] 
     }
  ]
}   

We can sort by any of the nested fields:

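// Sort by the lowest color value among each product's variants
db.products.find().sort({"variants.color": 1})

// Sort by the lowest tag value among each product's variants
db.products.find().sort({"variants.tags": 1})

// Sort by the 2nd tag of each variant
db.products.find().sort({"variants.tags.1": 1})

When sorting on an array field, MongoDB compares by the smallest array element for an ascending sort and by the largest element for a descending sort.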
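
As a rough illustration of how an array contributes a single value to the comparison, here is a plain-JavaScript sketch. arraySortKey is a hypothetical helper; it assumes a non-empty array holding only strings or only numbers.

```javascript
// Hypothetical helper: picks the value an array field contributes to a
// sort. Ascending sorts use the smallest element, descending sorts the
// largest. Assumes a non-empty array of strings or numbers.
function arraySortKey(values, direction) {
  const sorted = [...values].sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
  return direction === 1 ? sorted[0] : sorted[sorted.length - 1];
}

const shirtColors = ["Red", "Blue"];
console.log(arraySortKey(shirtColors, 1));  // Blue
console.log(arraySortKey(shirtColors, -1)); // Red
```

For the T-Shirt document above, an ascending sort on variants.color therefore compares using "Blue", even though "Red" appears first in the array.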

This deep traversal sorting provides complete flexibility over nested documents.

Advanced Sort Functionality

MongoDB provides additional advanced functionality for customized sorting behavior.

Collations

Collations in MongoDB allow you to configure language/locale-specific sorting rules including:

  • Case sensitivity
  • Accent sensitivity
  • Language defaults

For example, a case-insensitive collation:

{
  locale: 'en_US',
  strength: 2
}

You pass this collation document to the collation() method:

db.articles.find().collation({locale: 'en_US', strength: 2}).sort({title: 1})

Now sorting will treat upper and lowercase characters the same. A strength of 2 compares base characters and accents but ignores case. With the default strength of 3, case differences still affect the order.

This allows standardized sorting rules across applications in multi-language environments.
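
The effect of a case-insensitive collation can be previewed outside the database with JavaScript's built-in Intl.Collator. This is only an analogy: the "accent" sensitivity level ignores case but still distinguishes accents, which is roughly what a strength-2 collation does, and MongoDB's own ordering is determined by the server, not by this API.

```javascript
const titles = ["apple", "Banana", "cherry"];

// Code-unit comparison, similar in spirit to MongoDB's default binary
// string comparison: uppercase "B" sorts before lowercase "a".
const binaryOrder = [...titles].sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));

// sensitivity "accent" ignores case but still distinguishes accents.
const caseInsensitive = new Intl.Collator("en-US", {sensitivity: "accent"});
const collatedOrder = [...titles].sort(caseInsensitive.compare);

console.log(binaryOrder);   // [ 'Banana', 'apple', 'cherry' ]
console.log(collatedOrder); // [ 'apple', 'Banana', 'cherry' ]
```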

See the MongoDB collation documentation for the complete list of options.

Sorting Cursors Randomly

Occasionally you may need documents in random order rather than a defined sorted sequence.

The special $natural sort key does not do this. It returns documents in the order the storage engine finds them (1) or the reverse of that order (-1), which is unspecified but not random:

db.articles.find().sort({$natural: -1})

To get a random selection, use the $sample aggregation stage instead:

db.articles.aggregate([{$sample: {size: 10}}])

On very large collections, $sample can often avoid reading every document, which makes it useful for statistical analysis.
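
For small result sets that are already in application memory, a random selection can also be made client-side. The sketch below uses a partial Fisher-Yates shuffle in plain JavaScript; sampleDocs is a hypothetical helper, and on the server the $sample aggregation stage is the documented way to pick random documents.

```javascript
// Hypothetical helper: returns `size` distinct elements chosen uniformly
// at random, using a partial Fisher-Yates shuffle on a copy of the input.
function sampleDocs(docs, size) {
  const pool = [...docs];
  const count = Math.min(size, pool.length);
  for (let i = 0; i < count; i++) {
    // pick a random index from the not-yet-chosen tail [i, pool.length)
    const j = i + Math.floor(Math.random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}

const articleIds = [1, 2, 3, 4, 5, 6, 7, 8];
const picked = sampleDocs(articleIds, 3);
console.log(picked.length); // 3
```

Because the choice is random, the selected ids differ from run to run; only the size of the result is fixed.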

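Consistent Sort Order

MongoDB does not guarantee a stable sort. Documents with equal sort key values may be returned in any order, and that order can differ between executions of the same query. To get a consistent result, include a field with unique values, such as _id, as the final sort key.

For example:

{_id: 1, age: 30}
{_id: 2, age: 20}
{_id: 3, age: 30}

db.users.find().sort({age: 1, _id: 1})

Results:
{_id: 2, age: 20}
{_id: 1, age: 30} // Tie on age
{_id: 3, age: 30} // broken by _id

The two documents at age 30 always come back in _id order. This matters most when paginating with skip() and limit(), where an inconsistent tie order can repeat or drop documents across pages.

Note: Sorting by age alone gives no such guarantee!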
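
The value of a unique tie-breaker can be shown with a small plain-JavaScript sketch: when _id is the last sort key, the result no longer depends on the order in which the documents were encountered. The function and variable names are assumptions for illustration only.

```javascript
// Sorts by age, then by _id as a unique tie-breaker.
function byAgeThenId(a, b) {
  return a.age - b.age || a._id - b._id;
}

const storedOneWay = [
  {_id: 1, age: 30},
  {_id: 2, age: 20},
  {_id: 3, age: 30},
];
// Same documents, encountered in a different order.
const storedAnotherWay = [storedOneWay[2], storedOneWay[1], storedOneWay[0]];

const firstRun = [...storedOneWay].sort(byAgeThenId).map((d) => d._id);
const secondRun = [...storedAnotherWay].sort(byAgeThenId).map((d) => d._id);

console.log(firstRun);  // [ 2, 1, 3 ]
console.log(secondRun); // [ 2, 1, 3 ]
```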

Benchmarking Sorted vs Unsorted Queries

To demonstrate how sorting affects performance, let's benchmark some example sorted and unsorted queries.

For benchmarks, we will use explain() in "executionStats" mode, which runs the operation and reports timing and scan counts directly in the mongo shell.

First, insert 1 million documents containing random integers:

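// Insert in 100 batches of 10,000 to avoid one round trip per document
for (let i = 0; i < 100; i++) {
  const batch = Array.from({length: 10000}, () => ({num: Math.floor(Math.random() * 100)}))
  db.values.insertMany(batch)
}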

Next, create an index on num since we will sort by that field:

db.values.createIndex({num: 1})

Unsorted Aggregate Query

Let's time an unsorted aggregation operation:

var unsorted = db.values.explain("executionStats").aggregate(
  [{$match:{}}, {$group:{_id:null, count:{$sum:1}}}]
)

// Results (abridged; the exact output shape depends on the server version)
{
  stages: [
    { 
      $cursor: {
        executionStats: {  
          nReturned: 1,
          executionTimeMillis: 3894,
          totalKeysExamined: 0,  
          totalDocsExamined: 1000000
        }
      }
    }
  ]
} 
  • Total time: 3894ms
  • Documents scanned: 1 million

Sorted Aggregate Query

Now the same operation but sorting first:

var sorted = db.values.explain("executionStats").aggregate(
  [{$sort:{num:1}}, {$match:{}}, {$group:{_id:null, count:{$sum:1}}}]
)

// Results (abridged; the exact output shape depends on the server version)
{
  stages: [
    {
      executionStats: {
        nReturned: 1,
        executionTimeMillis: 1025,
        totalKeysExamined: 1000000,
        totalDocsExamined: 1000000    
      }
    }
  ]
}
  • Total time: 1025ms
  • Documents scanned: 1 million

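In this run, the sorted pipeline finished more than 3X faster despite examining the same number of documents. Treat a single measurement like this with caution: the first query may have paid the cost of loading data into cache, and timings vary with hardware, data size, and server version. Repeat each query several times on your own data before drawing conclusions.

The totalKeysExamined value shows that the sorted pipeline walked the num index, so MongoDB returned documents in order without a separate in-memory sort. Index-backed sorts like this are what keep sorted queries and grouping operations fast.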

Server-Side vs Client-Side Sorting

A question that often arises is whether to sort data in the database vs the application code. Here is some guidance:

Server-side

  • Sorting performed on full dataset already in MongoDB
  • Leverages indexes and operates close to storage
  • Reduces documents transferred over network
  • Use when:
    • Datasets are large
    • Sort standard across applications

Client-side

  • Sorting done by application code after docs received
  • More flexible logic using app code
  • Use when:
    • Datasets are small
    • Specialized sort logic needed

In most cases, MongoDB's built-in sorting will provide the best performance. But for targeted needs, client-side operations may make sense.
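
As an example of the "specialized sort logic" case, here is a plain-JavaScript sketch that orders a small result set by a business-defined priority ranking rather than alphabetically. The ticket documents and the priorityRank table are hypothetical.

```javascript
// Hypothetical ticket data; the priority ranking below is business logic
// that does not match alphabetical order.
const priorityRank = {high: 0, medium: 1, low: 2};

const tickets = [
  {title: "Update docs", priority: "low"},
  {title: "Fix login", priority: "high"},
  {title: "Tune cache", priority: "medium"},
];

const byPriority = [...tickets].sort(
  (a, b) => priorityRank[a.priority] - priorityRank[b.priority]
);

console.log(byPriority.map((t) => t.title));
// [ 'Fix login', 'Tune cache', 'Update docs' ]
```

An alphabetical sort on priority would give high, low, medium, which is not the order the application wants. For large collections, storing a numeric rank in each document and sorting on that field server-side is usually the better trade-off.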

Sorting By Date and Location Data

MongoDB can also sort query results by dates and times, and can order results by distance for geospatial data.

For example, given events with eventDate timestamps:

{eventDate: ISODate("2014-03-03T09:23:18Z")}
{eventDate: ISODate("2020-01-01T00:00:00Z")}

We can sort chronologically like:

db.events.find().sort({eventDate: 1})

This allows correct time series ordering.
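
Chronological ordering works because the values are real date objects, which compare by their timestamp. The plain-JavaScript sketch below shows the same idea client-side and contrasts it with dates stored as text. The month/day/year strings are a hypothetical example of a format that sorts incorrectly.

```javascript
const eventDates = [
  new Date("2020-01-01T00:00:00Z"),
  new Date("2014-03-03T09:23:18Z"),
];
// Date objects compare by their millisecond timestamp.
const chronological = [...eventDates].sort((a, b) => a.getTime() - b.getTime());
console.log(chronological.map((d) => d.toISOString()));
// [ '2014-03-03T09:23:18.000Z', '2020-01-01T00:00:00.000Z' ]

// The same dates stored as month/day/year strings sort as text instead.
const asText = ["03/03/2014", "01/01/2020"].sort();
console.log(asText); // [ '01/01/2020', '03/03/2014' ]
```

The text version puts 2020 before 2014, so store dates as date values rather than formatted strings when you need to sort by them.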

For geospatial data stored in GeoJSON formats like:

{ loc : { type: "Point", coordinates: [ 40, 5 ] } }
{ loc : { type: "Point", coordinates: [ 50, 8 ] } }

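A plain sort on loc would compare the embedded documents field by field, not by distance. To get the nearest locations first, create a 2dsphere index and query with $near, which returns results ordered from nearest to farthest:

db.places.createIndex({loc: "2dsphere"})

// [41, 5] is an example query point
db.places.find({loc: {$near: {$geometry: {type: "Point", coordinates: [41, 5]}}}})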
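
To show what "nearest first" means, here is a plain-JavaScript sketch that orders points by great-circle distance from a query point using the haversine formula. It assumes a spherical Earth with a 6371 km radius and is only an illustration of distance ordering. The origin point and the names A and B are hypothetical, and the server computes distances itself through a geospatial index.

```javascript
// Great-circle distance in kilometres between two [longitude, latitude]
// pairs (GeoJSON order), using the haversine formula on a spherical Earth.
function haversineKm([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const earthRadiusKm = 6371;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * earthRadiusKm * Math.asin(Math.sqrt(h));
}

const placeDocs = [
  {name: "A", loc: {type: "Point", coordinates: [50, 8]}},
  {name: "B", loc: {type: "Point", coordinates: [40, 5]}},
];
const origin = [41, 5]; // hypothetical query point

const nearestFirst = [...placeDocs].sort(
  (a, b) =>
    haversineKm(origin, a.loc.coordinates) -
    haversineKm(origin, b.loc.coordinates)
);
console.log(nearestFirst.map((p) => p.name)); // [ 'B', 'A' ]
```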

So whether you are sorting numbers, strings, dates, or locations, MongoDB has a way to handle it efficiently.

Conclusion

In summary, MongoDB provides extensive and powerful sorting capabilities:

  • Single and compound field sorting
  • Indexes for high performance
  • Advanced features like collations and random sampling
  • Chronological and distance-based ordering for dates and locations
  • Consistent ordering when you add a unique tie-breaker such as _id

Combined they allow you to handle any sorting need with optimal speed.

The key things to remember are:

  • Use indexes wisely to empower sorting
  • Structure documents for related sorting fields
  • Specify precise sort orders across multiple fields
  • Leverage server-side sorting when possible

So next time you need precise control over the order of your query results, remember MongoDB's robust sorting functionality!
