2.3. Defining documents
In MongoDB, a document is roughly equivalent to a row in an RDBMS. In a relational database, rows are stored in tables, and each table has a strict schema that its rows must follow. MongoDB stores documents in collections rather than tables; the principal difference is that the database itself does not enforce a schema. MongoEngine lets you define schemata for documents, which helps to reduce coding errors and allows utility methods to be defined on fields that may be present.
2.3.1. Defining a document's schema
To define a schema for a document, create a class that inherits from
Document. Fields are specified by adding field objects as class attributes to
the document class:
from mongoengine import *
import datetime

class Page(Document):
    title = StringField(max_length=200, required=True)
    date_modified = DateTimeField(default=datetime.datetime.utcnow)
As BSON (the binary format for storing data in MongoDB) is order dependent, documents are serialized based on their field order.
2.3.2. Dynamic document schemas
One of the benefits of MongoDB is dynamic schemas for a collection. While data should be
planned and organised (after all, explicit is better than implicit!), there are scenarios where
dynamic, expando-style documents are desirable. DynamicDocument documents
work in the same way as Document, but any data or attributes set on them will also
be saved:
from mongoengine import *

class Page(DynamicDocument):
    title = StringField(max_length=200, required=True)

# Create a new page and add tags
>>> page = Page(title='Using MongoEngine')
>>> page.tags = ['mongodb', 'mongoengine']
>>> page.save()
>>> Page.objects(tags='mongodb').count()
1
2.3.3. Fields
By default, fields are not required. To make a field mandatory, set the
required keyword argument of a field to True. Fields may also have
validation constraints (such as max_length in the example above).
Fields may also take default values, which are used if no value is provided. A default
may be a callable, which is called to obtain the value (such as datetime.datetime.utcnow
in the first Page example). The field types available are as follows:
BinaryField
BooleanField
ComplexDateTimeField
DateField
DateTimeField
DecimalField
DictField
EmailField
EmbeddedDocumentField
EmbeddedDocumentListField
EnumField
FileField
FloatField
GenericEmbeddedDocumentField
GenericLazyReferenceField
GenericReferenceField
GeoJsonBaseField
GeoPointField
ImageField
IntField
LazyReferenceField
LineStringField
ListField
LongField
MapField
MultiLineStringField
MultiPointField
MultiPolygonField
ObjectIdField
PointField
PolygonField
ReferenceField
SequenceField
SortedListField
StringField
URLField
UUIDField
For a full list, refer to the API Reference.
Field arguments
Each field type can be customised by keyword arguments. The following keyword arguments can be set on all fields:
db_field: The MongoDB field name.
required: If set to True and the field is not set on the document instance, a ValidationError will be raised when the document is validated.
default: A value to use when no value is set for this field.
unique: When True, no documents in the collection will have the same value for this field.
unique_with: A field name (or list of field names) that when taken together with this field, will not have two documents in the collection with the same value.
primary_key: When True, use this field as the primary key for the collection.
choices: An iterable of choices to which the value of this field should be limited.
validation: A callable to validate the value of the field.
name: A name to use when defining indexes in the meta dictionary.
List fields
MongoDB allows storing lists of items. To add a list to a document, use the
ListField field type. ListField takes another field object as its
first argument, which specifies which type elements may be stored within the list:
class Page(Document):
    tags = ListField(StringField(max_length=50))
Embedded documents
MongoDB has the ability to embed documents within other documents. Schemata may be defined
for these embedded documents, just as they may be for regular documents. To create an
embedded document, define a document that inherits from
EmbeddedDocument rather than Document:
class Comment(EmbeddedDocument):
    content = StringField()
To embed the document within another document, use the
EmbeddedDocumentField field type, providing the embedded document class as the
first argument:
class Page(Document):
    comments = ListField(EmbeddedDocumentField(Comment))

comment1 = Comment(content='Good work!')
comment2 = Comment(content='Nice article!')
page = Page(comments=[comment1, comment2])
Dictionary fields
Often, an embedded document may be used instead of a dictionary — generally embedded
documents are recommended as dictionaries don't support validation or custom field types.
However, sometimes a dictionary is more appropriate, for example when the structure of a
field is unknown. In this situation, DictField may be used:
class SurveyResponse(Document):
    date = DateTimeField()
    user = ReferenceField(User)
    answers = DictField()
2.3.4. Document collections
Document classes that inherit directly from Document will have their own
collection in the database. The name of the collection is by default the name of the class
converted to snake_case (for example, a class named CompanyUser uses the collection
company_user). If you need to change the name of the collection (e.g. to use
MongoEngine with an existing database), then create a class dictionary attribute called
meta on your document, and set collection to the name of the
collection that you want your document class to use:
# Will work with data in an existing collection named 'cmsPage'
class Page(Document):
    title = StringField(max_length=200, required=True)
    meta = {
        'collection': 'cmsPage'
    }
2.3.5. Indexes
You can specify indexes on collections to make querying faster. This is done by creating a
list of index specifications in the meta dictionary. Indexes may be specified as
a single field, a tuple of fields, or a dictionary of options and the fields to index:
class Page(Document):
    category = IntField()
    title = StringField()
    rating = StringField()
    created = DateTimeField()
    meta = {
        'indexes': [
            'title',                   # single-field index
            '$title',                  # text index
            '#title',                  # hashed index
            ('title', '-rating'),      # compound index
            ('category', '_cls'),      # compound index
            {
                'fields': ['created'],
                'expireAfterSeconds': 3600   # ttl index
            }
        ]
    }
There are a few top level defaults for all indexes that can be set:
class Page(Document):
    title = StringField()
    rating = StringField()
    meta = {
        'index_opts': {},
        'index_background': True,
        'index_cls': False,
        'auto_create_index': True,
        'auto_create_index_on_save': False,
    }
2.3.6. Ordering
A default ordering can be specified for your QuerySet using the
ordering attribute of meta. Ordering will be applied when the
QuerySet is created, and can be overridden by subsequent calls to
order_by():
from datetime import datetime

class BlogPost(Document):
    title = StringField()
    published_date = DateTimeField()

    meta = {
        'ordering': ['-published_date']
    }
2.3.7. Shard keys
If you have a sharded MongoDB cluster, you can specify the shard key as a tuple of fields
using the shard_key attribute of meta. This ensures the shard key
fields are provided when saving a document:
class LogEntry(Document):
    machine = StringField()
    app = StringField()
    timestamp = DateTimeField()
    data = StringField()

    meta = {
        'shard_key': ('machine', 'timestamp',)
    }
2.3.8. Document inheritance
To create a specialised type of a Document you have defined, you may subclass
it and add any extra fields or methods you may need. As the document is now made up of two
classes you must tell MongoEngine that inheritance is permitted on your original document by
setting allow_inheritance to True in the meta:
class Page(Document):
    title = StringField(max_length=200, required=True)
    meta = {'allow_inheritance': True}

class DatedPage(Page):
    date = DateTimeField()
Behind the scenes, MongoEngine deals with inheritance by adding a _cls
attribute that contains the class name in every document. When a document is loaded,
MongoEngine checks its _cls attribute and uses that class to construct the
instance:
from datetime import datetime

Page(title='a funky title').save()
DatedPage(title='another title', date=datetime.utcnow()).save()

print(Page.objects().count())         # 2
print(DatedPage.objects().count())    # 1

# print documents in their native form
qs = Page.objects.exclude('id').as_pymongo()
print(list(qs))
# [
#   {'_cls': 'Page', 'title': 'a funky title'},
#   {'_cls': 'Page.DatedPage', 'title': 'another title', 'date': datetime(...)}
# ]
2.3.9. Abstract classes
If you want to add some extra functionality to a group of Document classes but you don't
need or want the overhead of inheritance, you can use the abstract attribute of
meta. This won't turn on document inheritance but will allow you to keep your
code DRY:
class BaseDocument(Document):
    meta = {
        'abstract': True,
    }

    def check_permissions(self):
        ...

class User(BaseDocument):
    ...
Now the User class will have access to the inherited
check_permissions method and won't store any of the extra _cls
information.