MongoDB Data Modelling Tutorial

In this MongoDB tutorial we learn the difference between traditional relational databases and NoSQL databases like MongoDB.

We also cover embedded and normalized data models and when to use each.

Let's jump right in.

How does a traditional relational database work?

In traditional relational databases like MySQL, we store our data in a table structure with columns and rows.

Each table contains multiple columns that define the data we want to store, and each row then contains one entry.

As an example, let’s say we want to store some employee information like name and date of birth.

We would create a table in the database, with the name ‘employee’, that holds three columns called ‘id’, ‘name’ and ‘dob’.

id | name | dob

The ‘id’ column is simply an auto-generated unique identifier for each entry in the table.

We can now add some employees to the table, so it would look something like the following.

id | name | dob
01 | John | 1990-03-14
02 | Jane | 1984-07-01
03 | Jack | 2001-04-20
04 | Jill | 1996-11-27

We can then search or filter employees by column based on the data we want.
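
The table and a filter on it can be sketched in Python. This is a minimal sketch that uses the standard library's sqlite3 module as a stand-in for MySQL, an assumption made so the example runs without a database server. The helper name find_born_before is hypothetical and introduced only for illustration.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "name TEXT NOT NULL, "
    "dob TEXT NOT NULL)"
)

# The id column is generated automatically for each new row.
conn.executemany(
    "INSERT INTO employee (name, dob) VALUES (?, ?)",
    [
        ("John", "1990-03-14"),
        ("Jane", "1984-07-01"),
        ("Jack", "2001-04-20"),
        ("Jill", "1996-11-27"),
    ],
)
conn.commit()


def find_born_before(connection, cutoff):
    """Return the names of employees born before the cutoff (ISO date string)."""
    rows = connection.execute(
        "SELECT name FROM employee WHERE dob < ? ORDER BY id", (cutoff,)
    )
    return [name for (name,) in rows]


print(find_born_before(conn, "1995-01-01"))  # ['John', 'Jane']
```

Dates are stored as ISO strings here, so a plain string comparison orders them correctly.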

How is MongoDB different from a relational database?

Simply put, in MongoDB we can store all related data in a single document instead of spreading it across several tables.

As an example, let’s consider that we want to create a blog with the following requirements:

  • Every post has a unique title, description and URL
  • Every post has the name of its author and any likes it got
  • Every post may have multiple tags
  • Every post can have zero or more comments with the commenter’s name, timestamp, message and likes

In a traditional relational database, we would use multiple tables linked by the post's ID.

Posts Table:

id | title        | description        | url            | author   | likes
01 | How to Mongo | Learn how to Mongo | /how-to-mongo/ | John Doe | 10

Tags Table:

id | post_id | tag_name
04 | 01      | Learn
09 | 01      | Database
13 | 01      | MongoDB

Comments Table:

comment_id | post_id | user | comment                  | timestamp  | likes
01         | 01      | Jane | Thanks for the toot      | 2020-11-09 | 4
02         | 01      | Jack | You forgot DB at the end | 2020-11-10 | 3
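
To read one post back from these three tables, we have to follow the post ID across all of them. The sketch below shows this, again assuming SQLite (through Python's sqlite3 module) as a stand-in for a relational database such as MySQL. The function name load_post is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY, title TEXT, description TEXT,
        url TEXT, author TEXT, likes INTEGER
    );
    CREATE TABLE tags (
        id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES posts(id),
        tag_name TEXT
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES posts(id),
        user TEXT, comment TEXT, timestamp TEXT, likes INTEGER
    );
    INSERT INTO posts VALUES
        (1, 'How to Mongo', 'Learn how to Mongo', '/how-to-mongo/', 'John Doe', 10);
    INSERT INTO tags VALUES
        (4, 1, 'Learn'), (9, 1, 'Database'), (13, 1, 'MongoDB');
    INSERT INTO comments VALUES
        (1, 1, 'Jane', 'Thanks for the toot', '2020-11-09', 4),
        (2, 1, 'Jack', 'You forgot DB at the end', '2020-11-10', 3);
    """
)


def load_post(connection, post_id):
    """Collect one post with its tags and comments via the shared post ID."""
    title, author = connection.execute(
        "SELECT title, author FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    tags = [
        tag
        for (tag,) in connection.execute(
            "SELECT tag_name FROM tags WHERE post_id = ? ORDER BY id", (post_id,)
        )
    ]
    comments = connection.execute(
        "SELECT user, comment FROM comments WHERE post_id = ? ORDER BY comment_id",
        (post_id,),
    ).fetchall()
    return {"title": title, "author": author, "tags": tags, "comments": comments}


post = load_post(conn, 1)
print(post["tags"])  # ['Learn', 'Database', 'MongoDB']
```

Three separate lookups are needed to rebuild one post, which is the cost the single-document approach avoids.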

In MongoDB, we put all of the above in a single document that looks something like the following.

Example:
_id:         01
title:       How to Mongo
description: Learn how to Mongo
author:      John Doe
url:         /how-to-mongo/
tags:        [MongoDB, Database, Learn]
likes:       10
comments:
    user:      Jane
    comment:   Thanks for the toot
    timestamp: 2020-11-09
    likes:     4

    user:      Jack
    comment:   You forgot DB at the end
    timestamp: 2020-11-10
    likes:     3

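As you can imagine, there are several benefits to storing data this way, especially when it comes to speed: a single read returns the post together with its tags and comments, with no need to combine several tables.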
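
The blog post document can be written as a plain Python dictionary, as in the minimal sketch below. No MongoDB server or driver is used here. With a driver such as PyMongo, a dictionary shaped like this could be passed to a collection's insert_one method, but that step is left out so the example runs on its own. The helper total_comment_likes is a hypothetical name.

```python
# One blog post as a single document. The comments are embedded as a list
# of smaller documents, and the tags as a list of strings.
post = {
    "_id": 1,
    "title": "How to Mongo",
    "description": "Learn how to Mongo",
    "author": "John Doe",
    "url": "/how-to-mongo/",
    "tags": ["MongoDB", "Database", "Learn"],
    "likes": 10,
    "comments": [
        {
            "user": "Jane",
            "comment": "Thanks for the toot",
            "timestamp": "2020-11-09",
            "likes": 4,
        },
        {
            "user": "Jack",
            "comment": "You forgot DB at the end",
            "timestamp": "2020-11-10",
            "likes": 3,
        },
    ],
}


def total_comment_likes(document):
    """Add up the likes of every embedded comment."""
    return sum(comment["likes"] for comment in document["comments"])


# Everything needed to display the post is available from one lookup.
print(post["title"], post["tags"])
print(total_comment_likes(post))  # 7
```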

Data Models: Embedded vs Normalized

MongoDB allows us to structure our database in two ways.

  • We can have (embed) everything in a single document, known as the embedded or de-normalized data model.
  • We can split the data into separate documents (called sub-documents in this lesson) and refer to them from the original document by using references, known as the normalized data model.

Let’s go back to our employee example and expand it a little, but this time we use MongoDB.

We’ll add the following details:

  • Personal details: First and last name and date of birth
  • Contact information: Mobile number and email address
  • Address: Area, City, State/Province

First, let’s use the embedded data model and place everything in a single document that will look similar to the one below.

Example:
_id:            01
personal:
    first_name: John
    last_name:  Doe
    dob:        1990-03-14

contact:
    mobile_no:  123 555 4567
    email:      john.doe@example.com

address:
    area:       Manhattan Beach
    city:       Los Angeles
    state_prov: California
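
As a minimal sketch, the embedded employee document maps onto a nested Python dictionary. MongoDB queries address nested fields with dot notation such as address.city. The get_path helper below is a hypothetical function that imitates that idea on a plain dictionary, and it does not talk to a database.

```python
# The whole employee record lives in one document with nested sections.
employee = {
    "_id": 1,
    "personal": {
        "first_name": "John",
        "last_name": "Doe",
        "dob": "1990-03-14",
    },
    "contact": {
        "mobile_no": "123 555 4567",
        "email": "john.doe@example.com",
    },
    "address": {
        "area": "Manhattan Beach",
        "city": "Los Angeles",
        "state_prov": "California",
    },
}


def get_path(document, dotted_path):
    """Follow a dotted path such as 'address.city' through nested dictionaries."""
    value = document
    for key in dotted_path.split("."):
        value = value[key]
    return value


print(get_path(employee, "address.city"))  # Los Angeles
print(get_path(employee, "personal.dob"))  # 1990-03-14
```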

Now, let’s use the normalized data model and separate the example above into three sub-documents.

Example: Employee (Main)
_id: 01

Example: Personal (Sub)
_id:        02
emp_id:     01

first_name: John
last_name:  Doe
dob:        1990-03-14

Example: Contact (Sub)
_id:       03
emp_id:    01

mobile_no: 123 555 4567
email:     john.doe@example.com

Example: Address (Sub)
_id:        04
emp_id:     01

area:       Manhattan Beach
city:       Los Angeles
state_prov: California

As in a relational database, we link the sub-documents to the main document with an ID (here, emp_id).
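
The linking can be sketched with plain Python lists standing in for separate collections, which is an assumption made to keep the example free of a database server. The names find_by_emp_id and assemble_employee are hypothetical helpers that follow the emp_id reference by hand.

```python
# Each list stands in for its own collection of documents.
employees = [{"_id": 1}]
personal = [
    {"_id": 2, "emp_id": 1, "first_name": "John", "last_name": "Doe",
     "dob": "1990-03-14"},
]
contacts = [
    {"_id": 3, "emp_id": 1, "mobile_no": "123 555 4567",
     "email": "john.doe@example.com"},
]
addresses = [
    {"_id": 4, "emp_id": 1, "area": "Manhattan Beach", "city": "Los Angeles",
     "state_prov": "California"},
]


def find_by_emp_id(collection, emp_id):
    """Return the first document linked to emp_id, or None if there is none."""
    return next((doc for doc in collection if doc["emp_id"] == emp_id), None)


def assemble_employee(emp_id):
    """Gather the linked sub-documents for one employee into a single view."""
    return {
        "_id": emp_id,
        "personal": find_by_emp_id(personal, emp_id),
        "contact": find_by_emp_id(contacts, emp_id),
        "address": find_by_emp_id(addresses, emp_id),
    }


# Reading only one part touches only one collection.
print(find_by_emp_id(contacts, 1)["email"])  # john.doe@example.com

# Reading everything needs one lookup per collection.
print(assemble_employee(1)["address"]["city"])  # Los Angeles
```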

So which one do you use then? This really depends on your requirements.

  • If you don’t need to have all the information available at the same time, separate it into multiple documents.
  • If you’re going to use the data together, store everything in a single document.

Let’s consider our previous two examples, the blog data and employee information.

The normalized data model would be better suited to the employee information because we won’t always need all the data at the same time. If we wanted to just filter by date of birth to see whose birthday it is today, we only need the data in the Personal sub-document.
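
The birthday check can be sketched over the Personal sub-documents alone, as below. This is a minimal sketch on plain Python dictionaries. The function name birthdays_on is hypothetical, and the second document (Jane, with made-up IDs) is illustrative sample data added so the filter has something to exclude.

```python
from datetime import date

# Only the Personal sub-documents are needed for this check.
personal = [
    {"_id": 2, "emp_id": 1, "first_name": "John", "dob": "1990-03-14"},
    # Illustrative second entry with hypothetical IDs.
    {"_id": 6, "emp_id": 5, "first_name": "Jane", "dob": "1984-07-01"},
]


def birthdays_on(documents, today):
    """Return the emp_id of everyone whose birthday falls on the given day."""
    matches = []
    for doc in documents:
        born = date.fromisoformat(doc["dob"])
        if (born.month, born.day) == (today.month, today.day):
            matches.append(doc["emp_id"])
    return matches


print(birthdays_on(personal, date(2024, 3, 14)))  # [1]
```

In normal use the second argument would be date.today(). A fixed date is passed here so the result is predictable.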

On the other hand, everything in the blog post document is typically needed because of the way a blog post is displayed. In this case the embedded data model is the better choice.