MongoDB Data Modelling Tutorial

In this MongoDB tutorial we learn the difference between traditional relational databases and NoSQL databases like MongoDB.

We also cover embedded and normalized data models and when to use each.

How does a traditional relational database work
How is MongoDB different from a relational database
Data Models: Embedded vs Normalized

How does a traditional relational database work

In traditional relational databases like MySQL, we store our data in a table structure with columns and rows.

Each table contains multiple columns that define the data we want to store, and each row contains one entry.

As an example, let’s say we want to store some employee information like name and date of birth.

We would create a table in the database, with the name ‘employee’, that holds three columns called ‘id’, ‘name’ and ‘dob’.

id	name	dob

The ‘id’ column is simply an auto-generated unique identifier for each entry in the table.

We can now add some employees to the table, so it would look something like the following.

id	name	dob
01	John	1990-03-14
02	Jane	1984-07-01
03	Jack	2001-04-20
04	Jill	1996-11-27

We can then search or filter employees by any column, for example by date of birth.
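The table above can be sketched in a few lines of Python. This is only an illustration: it uses SQLite through Python's built-in sqlite3 module instead of MySQL so that it runs without a database server, and the cut-off date in the filter is our own choice.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# 'id' is generated automatically by SQLite for each new row.
conn.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "name TEXT NOT NULL, "
    "dob TEXT NOT NULL)"
)

employees = [
    ("John", "1990-03-14"),
    ("Jane", "1984-07-01"),
    ("Jack", "2001-04-20"),
    ("Jill", "1996-11-27"),
]
conn.executemany("INSERT INTO employee (name, dob) VALUES (?, ?)", employees)

# Filter by the 'dob' column: everyone born on or after 1990-01-01.
# Dates stored as YYYY-MM-DD text compare correctly as strings.
born_since_1990 = conn.execute(
    "SELECT id, name, dob FROM employee WHERE dob >= ? ORDER BY id",
    ("1990-01-01",),
).fetchall()

for row in born_since_1990:
    print(row)
```

The filter only looks at one column, but every row still has exactly the same three columns. That fixed structure is what the rest of this tutorial contrasts with MongoDB documents.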

How is MongoDB different from a relational database

Simply put, in MongoDB we store all related data in a single document instead of spreading it across multiple tables.

As an example, let’s consider that we want to create a blog with the following requirements:

Every post has a unique title, description and URL
Every post has the name of its author and any likes it got
Every post may have multiple tags
Every post can have zero or more comments with the commenter’s name, timestamp, message and likes

In a traditional database, we would use multiple tables linked by an ID.

Posts Table:

id	title	description	url	author	likes
01	How to Mongo	Learn how to Mongo	/how-to-mongo/	John Doe	10

Tags Table:

id	post_id	tag_name
04	01	Learn
09	01	Database
13	01	MongoDB

Comments Table:

comment_id	post_id	user	comment	timestamp	likes
01	01	Jane	Thanks for the toot	2020-11-09	4
02	01	Jack	You forgot DB at the end	2020-11-10	3
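To see what "linked by an ID" means in practice, here is a rough sketch of the three tables above, again using SQLite through Python's sqlite3 module as a stand-in for a relational database. The column types are assumptions, since the tables above only show the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three tables; tags and comments point back to a post through post_id.
conn.executescript("""
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY, title TEXT, description TEXT,
        url TEXT, author TEXT, likes INTEGER
    );
    CREATE TABLE tags (
        id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES posts(id),
        tag_name TEXT
    );
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES posts(id),
        user TEXT, comment TEXT, timestamp TEXT, likes INTEGER
    );

    INSERT INTO posts VALUES
        (1, 'How to Mongo', 'Learn how to Mongo', '/how-to-mongo/', 'John Doe', 10);
    INSERT INTO tags VALUES
        (4, 1, 'Learn'), (9, 1, 'Database'), (13, 1, 'MongoDB');
    INSERT INTO comments VALUES
        (1, 1, 'Jane', 'Thanks for the toot', '2020-11-09', 4),
        (2, 1, 'Jack', 'You forgot DB at the end', '2020-11-10', 3);
""")

post_id = 1

# Displaying one post means reading from all three tables.
title = conn.execute(
    "SELECT title FROM posts WHERE id = ?", (post_id,)
).fetchone()[0]

tags = [
    row[0]
    for row in conn.execute(
        "SELECT tag_name FROM tags WHERE post_id = ? ORDER BY id", (post_id,)
    )
]

comments = conn.execute(
    "SELECT user, comment FROM comments WHERE post_id = ? ORDER BY comment_id",
    (post_id,),
).fetchall()

print(title, tags, comments)
```

Every piece of the post lives in a different table, so the application (or a JOIN) has to stitch the pieces back together using post_id.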

In MongoDB, we put all of the above in a single document that looks something like the following.

Example:

_id:         01
title:       How to Mongo
description: Learn how to Mongo
author:      John Doe
url:         /how-to-mongo/
tags:        [MongoDB, Database, Learn]
likes:       10
comments:
    user:      Jane
    comment:   Thanks for the toot
    timestamp: 2020-11-09
    likes:     4

    user:      Jack
    comment:   You forgot DB at the end
    timestamp: 2020-11-10
    likes:     3

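As you can imagine, there are several benefits to storing data this way, especially when it comes to read speed: a single lookup returns the post together with its tags and comments, with no joins across tables.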
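As a minimal sketch, the blog post document above maps naturally onto a Python dictionary with a nested list of comments. We use a plain dictionary here so the example runs without a database. With a driver such as PyMongo, a dictionary like this could be handed to a collection's insert_one method, but that step is left out and is an assumption on our part.

```python
# The whole blog post, including tags and comments, as one document.
post = {
    "_id": 1,
    "title": "How to Mongo",
    "description": "Learn how to Mongo",
    "author": "John Doe",
    "url": "/how-to-mongo/",
    "tags": ["MongoDB", "Database", "Learn"],
    "likes": 10,
    "comments": [
        {
            "user": "Jane",
            "comment": "Thanks for the toot",
            "timestamp": "2020-11-09",
            "likes": 4,
        },
        {
            "user": "Jack",
            "comment": "You forgot DB at the end",
            "timestamp": "2020-11-10",
            "likes": 3,
        },
    ],
}

# Everything needed to render the post is already in one place.
commenters = [comment["user"] for comment in post["comments"]]
total_comment_likes = sum(comment["likes"] for comment in post["comments"])

print(post["title"], post["tags"], commenters, total_comment_likes)
```

Compared with the relational version, there is no post_id anywhere: the tags and comments belong to the post simply because they sit inside it.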

Data Models: Embedded vs Normalized

MongoDB allows us to structure our database in two ways.

We can have (embed) everything in a single document, known as the embedded or de-normalized data model.
We can split the data into separate documents (called sub-documents in this tutorial) and link them to the original document by using references, known as the normalized data model.

Let’s go back to our employee example and expand it a little, but this time we use MongoDB.

We’ll add the following details:

Personal details: First and last name and date of birth
Contact information: Mobile number and email address
Address: Area, City, State/Province

First, let’s use the embedded data model and place everything in a single document that will look similar to the one below.

Example:

_id:            01
personal:
    first_name: John
    last_name:  Doe
    dob:        1990-03-14

contact:
    mobile_no:  123 555 4567
    email:      john.doe@example.com

address:
    area:       Manhattan Beach
    city:       Los Angeles
    state_prov: California

Now, let’s use the normalized data model and separate the example above into a main document and three sub-documents.

Example: Employee (Main)

_id: 01

Example: Personal (Sub)

_id:        02
emp_id:     01

first_name: John
last_name:  Doe
dob:        1990-03-14

Example: Contact (Sub)

_id:       03
emp_id:    01

mobile_no: 123 555 4567
email:     john.doe@example.com

Example: Address (Sub)

_id:        04
emp_id:     01

area:       Manhattan Beach
city:       Los Angeles
state_prov: California

As in a relational database, we link each sub-document to the main document with an ID, here the ‘emp_id’ field.
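The linking can be sketched with plain Python lists standing in for separate collections. The helper names find_by_emp_id, without_links and load_employee are hypothetical and only exist for this illustration. In a real application the lookups would be queries against MongoDB.

```python
# Each list stands in for a separate collection.
employees = [{"_id": 1}]
personal = [
    {"_id": 2, "emp_id": 1, "first_name": "John", "last_name": "Doe",
     "dob": "1990-03-14"},
]
contact = [
    {"_id": 3, "emp_id": 1, "mobile_no": "123 555 4567",
     "email": "john.doe@example.com"},
]
address = [
    {"_id": 4, "emp_id": 1, "area": "Manhattan Beach",
     "city": "Los Angeles", "state_prov": "California"},
]


def find_by_emp_id(collection, emp_id):
    """Return the first document that references emp_id, or None."""
    return next((doc for doc in collection if doc["emp_id"] == emp_id), None)


def without_links(doc):
    """Copy a sub-document without its own _id and its emp_id reference."""
    return {key: value for key, value in doc.items()
            if key not in ("_id", "emp_id")}


def load_employee(emp_id):
    """Follow the references to rebuild the full employee record.

    Assumes every sub-document exists for the given emp_id.
    """
    return {
        "_id": emp_id,
        "personal": without_links(find_by_emp_id(personal, emp_id)),
        "contact": without_links(find_by_emp_id(contact, emp_id)),
        "address": without_links(find_by_emp_id(address, emp_id)),
    }


employee = load_employee(1)
print(employee["personal"]["first_name"], employee["contact"]["email"])
```

Rebuilding the full record takes one lookup per sub-document, which is the price of the normalized model. The benefit is that each sub-document can also be read on its own.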

So which one do you use then? This really depends on your requirements.

If you don’t need to have all the information available at the same time, separate it into linked documents.
If you’re going to use the data together, store everything in a single document.

Let’s consider our previous two examples, the blog data and employee information.

The normalized data model would be better suited to the employee information because we won’t always need all the data at the same time. If we wanted to just filter by date of birth to see whose birthday it is today, we only need the data in the Personal sub-document.

On the other hand, everything in the blog post document is typically needed because of the way a blog post is displayed. In this case the embedded data model is the better choice.
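The birthday case can be sketched as follows, again with a plain Python list standing in for the Personal collection. The function name birthdays_on is hypothetical, and the second record (its IDs and last name) is made up so that the filter has something to exclude.

```python
from datetime import date

# Only the Personal sub-documents are needed for a birthday check.
personal = [
    {"_id": 2, "emp_id": 1, "first_name": "John", "last_name": "Doe",
     "dob": "1990-03-14"},
    # Hypothetical second employee, for illustration only.
    {"_id": 6, "emp_id": 5, "first_name": "Jane", "last_name": "Roe",
     "dob": "1984-07-01"},
]


def birthdays_on(day, docs):
    """Return the emp_id of everyone whose dob falls on the given day.

    Assumes dob is stored as a YYYY-MM-DD string.
    """
    month_and_day = day.strftime("-%m-%d")
    return [doc["emp_id"] for doc in docs if doc["dob"].endswith(month_and_day)]


# In real use we would pass date.today(); a fixed date keeps the sketch repeatable.
print(birthdays_on(date(2024, 3, 14), personal))
```

The contact and address data is never touched, which is the point of keeping it in separate sub-documents.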
