2.3. Defining documents

In MongoDB, a document is roughly equivalent to a row in an RDBMS. When working with relational databases, rows are stored in tables, which have a strict schema that the rows follow. MongoDB stores documents in collections rather than tables; the principal difference is that no schema is enforced at the database level. MongoEngine lets you define schemata for documents, which helps to reduce coding errors and allows utility methods to be defined on fields that may be present.

2.3.1. Defining a document's schema

To define a schema for a document, create a class that inherits from Document. Fields are specified by adding field objects as class attributes to the document class:

from mongoengine import *
import datetime

class Page(Document):
    title = StringField(max_length=200, required=True)
    date_modified = DateTimeField(default=datetime.datetime.utcnow)

As BSON (the binary format for storing data in MongoDB) is order dependent, documents are serialized based on their field order.

2.3.2. Dynamic Document Schemas

One of the benefits of MongoDB is dynamic schemas for a collection. Whilst data should be planned and organised (after all, explicit is better than implicit!), there are scenarios where dynamic, expando-style documents are desirable. DynamicDocument documents work in the same way as Document, but any data or attributes set on them will also be saved:

from mongoengine import *

class Page(DynamicDocument):
    title = StringField(max_length=200, required=True)

# Create a new page and add tags
>>> page = Page(title='Using MongoEngine')
>>> page.tags = ['mongodb', 'mongoengine']
>>> page.save()

>>> Page.objects(tags='mongodb').count()
1

2.3.3. Fields

By default, fields are not required. To make a field mandatory, set the required keyword argument of a field to True. Fields may also have validation constraints (such as max_length in the example above). Fields may also take default values, which will be used if a value is not provided. A default value may optionally be a callable, which will be called to retrieve the value (as in the example above).

For a full list of the available field types, refer to the API Reference.

Field arguments

Each field type can be customised by keyword arguments. The following keyword arguments can be set on all fields:

List fields

MongoDB allows storing lists of items. To add a list to a document, use the ListField field type. ListField takes another field object as its first argument, which specifies the type of the elements that may be stored within the list:

class Page(Document):
    tags = ListField(StringField(max_length=50))

Embedded documents

MongoDB has the ability to embed documents within other documents. Schemata may be defined for these embedded documents, just as they may be for regular documents. To create an embedded document, define a document that inherits from EmbeddedDocument rather than Document:

class Comment(EmbeddedDocument):
    content = StringField()

To embed the document within another document, use the EmbeddedDocumentField field type, providing the embedded document class as the first argument:

class Page(Document):
    comments = ListField(EmbeddedDocumentField(Comment))

comment1 = Comment(content='Good work!')
comment2 = Comment(content='Nice article!')
page = Page(comments=[comment1, comment2])

Dictionary Fields

Often, an embedded document may be used instead of a dictionary. Embedded documents are generally recommended, because dictionaries don't support validation or custom field types. However, sometimes a dictionary is more appropriate, for example when the structure of a field is unknown. In this situation, DictField may be used:

class SurveyResponse(Document):
    date = DateTimeField()
    user = ReferenceField(User)
    answers = DictField()

2.3.4. Document collections

Document classes that inherit directly from Document will have their own collection in the database. The name of the collection is by default the name of the class converted to snake_case (for example, a class named CompanyUser would use the collection company_user). If you need to change the name of the collection (e.g. to use MongoEngine with an existing database), then create a class dictionary attribute called meta on your document, and set collection to the name of the collection that you want your document class to use:

# Will work with data in an existing collection named 'cmsPage'
class Page(Document):
    title = StringField(max_length=200, required=True)
    meta = {
        'collection': 'cmsPage'
    }

2.3.5. Indexes

You can specify indexes on collections to make querying faster. This is done by creating a list of index specifications in the meta dictionary. Indexes may be specified as a single field, a tuple of fields, or a dictionary of options and the fields to index:

class Page(Document):
    category = IntField()
    title = StringField()
    rating = StringField()
    created = DateTimeField()
    meta = {
        'indexes': [
            'title',               # single-field index
            '$title',              # text index
            '#title',              # hashed index
            ('title', '-rating'),  # compound index
            ('category', '_cls'), # compound index
            {
                'fields': ['created'],
                'expireAfterSeconds': 3600  # ttl index
            }
        ]
    }

Note: Inheritance adds extra field indices. See Document inheritance.

There are a few top level defaults for all indexes that can be set:

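class Page(Document):
    title = StringField()
    rating = StringField()
    meta = {
        'index_opts': {},                    # default options applied to every index
        'index_background': True,            # build indexes in the background
        'index_cls': False,                  # do not add _cls to the indexes
        'auto_create_index': True,           # ensure indexes exist when the document is first used
        'auto_create_index_on_save': False,  # do not re-check indexes on every save()
    }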

2.3.6. Ordering

A default ordering can be specified for your QuerySet using the ordering attribute of meta. Ordering will be applied when the QuerySet is created, and can be overridden by subsequent calls to order_by():

from datetime import datetime

class BlogPost(Document):
    title = StringField()
    published_date = DateTimeField()

    meta = {
        'ordering': ['-published_date']
    }

2.3.7. Shard keys

If you have a sharded MongoDB cluster, you can specify the shard key as a tuple of fields using the shard_key attribute of meta. MongoEngine then includes the shard key fields in the query it sends when save() or update() is called on an existing document:

class LogEntry(Document):
    machine = StringField()
    app = StringField()
    timestamp = DateTimeField()
    data = StringField()

    meta = {
        'shard_key': ('machine', 'timestamp',)
    }

2.3.8. Document inheritance

To create a specialised type of a Document you have defined, you may subclass it and add any extra fields or methods you may need. Because the stored documents can now belong to more than one class, you must tell MongoEngine that inheritance is permitted on your original document by setting allow_inheritance to True in the meta:

class Page(Document):
    title = StringField(max_length=200, required=True)
    meta = {'allow_inheritance': True}

class DatedPage(Page):
    date = DateTimeField()

Behind the scenes, MongoEngine deals with inheritance by adding a _cls attribute that contains the class name in every document. When a document is loaded, MongoEngine checks its _cls attribute and uses that class to construct the instance:

from datetime import datetime

Page(title='a funky title').save()
DatedPage(title='another title', date=datetime.utcnow()).save()

print(Page.objects().count())      # 2
print(DatedPage.objects().count()) # 1

# print documents in their native form
qs = Page.objects.exclude('id').as_pymongo()
print(list(qs))
# [
#   {'_cls': 'Page', 'title': 'a funky title'},
#   {'_cls': 'Page.DatedPage', 'title': 'another title', 'date': datetime(...)}
# ]

2.3.9. Abstract classes

If you want to add some extra functionality to a group of Document classes but you don't need or want the overhead of inheritance, you can use the abstract attribute of meta. This won't turn on document inheritance but will allow you to keep your code DRY:

class BaseDocument(Document):
    meta = {
        'abstract': True,
    }
    def check_permissions(self):
        ...

class User(BaseDocument):
    ...

Now the User class will have access to the inherited check_permissions method and won't store any of the extra _cls information.