Relational Model Versus Document Model


In the ever-expanding field of data management, the decision to use either the Relational Model or the Document Model can have a profound impact on system design and efficiency. This post explores their contrasts, supplemented with examples and diagrams to make the differences more tangible.


Schema Flexibility

In document databases, there’s no need for predefined schemas, allowing you to dynamically adjust the data structure. This is particularly useful in applications where frequent schema updates are necessary.

Document Model Example: JSON Flexibility

{  
    "id": 1,  
    "fullName": "Jane Doe",  
    "contact": {  
        "email": "jane.doe@example.com",  
        "phone": "123-456-7890"  
    },  
    "preferences": ["email", "sms"]  
}  

Now, if you decide to add an additional field, such as address, you can include it directly without changing any predefined schema:

{  
    "id": 1,  
    "fullName": "Jane Doe",  
    "contact": {  
        "email": "jane.doe@example.com",  
        "phone": "123-456-7890"  
    },  
    "preferences": ["email", "sms"],  
    "address": "123 Main Street"  
}  

Relational Model Example: Schema-On-Write Rigor

In relational databases, the schema must be predefined:

CREATE TABLE Users (  
    id INT PRIMARY KEY,  
    fullName VARCHAR(100),  
    email VARCHAR(100),  
    phone VARCHAR(15)  
);  

If you want to add the address field, you must first alter the table schema:

ALTER TABLE Users  
ADD COLUMN address VARCHAR(255);  

Performance and Locality

In document models, related data often resides in the same document for fast access.

Document Model Example

Imagine retrieving a user’s profile and preferences in one shot:

{  
    "id": 1,  
    "fullName": "John Smith",  
    "education": [  
        { "degree": "B.Sc.", "year": 2010 },  
        { "degree": "M.Sc.", "year": 2015 }  
    ],  
    "jobs": [  
        { "title": "Software Engineer", "company": "ABC Inc.", "years": 3 },  
        { "title": "Senior Developer", "company": "XYZ Ltd.", "years": 5 }  
    ]  
}  

Relational Model Example

Here, you must query at least two tables (Users and Jobs) and join them:

SELECT U.fullName, J.title, J.company  
FROM Users U  
JOIN Jobs J ON U.id = J.userId  
WHERE U.id = 1;  

The results would require combining multiple sets of data:

+----------------+------------------+------------+  
| fullName       | title            | company    |  
+----------------+------------------+------------+  
| John Smith     | Software Engineer| ABC Inc.   |  
| John Smith     | Senior Developer | XYZ Ltd.   |  
+----------------+------------------+------------+  

Handling Data Relationships

Relational databases excel at normalizing datasets into smaller, related tables.

Example Relational Data: Jobs and Users

Users Table  
+----+-----------+  
| id | fullName  |  
+----+-----------+  
| 1  | John Smith|  
+----+-----------+  
   
Jobs Table  
+----+-----------+------------+  
| id | title     | userId     |  
+----+-----------+------------+  
| 1  | Engineer  | 1          |  
| 2  | Developer | 1          |  
+----+-----------+------------+  

Example in Document Model (Denormalized)

In document databases, relationship data is often embedded to reduce joins:

{  
    "id": 1,  
    "fullName": "John Smith",  
    "jobs": [  
        { "title": "Engineer" },  
        { "title": "Developer" }  
    ]  
}  

However, denormalization may increase redundancy and complexities if jobs are related to multiple users.


Image Illustrations

Relational Model (Normalized)

+-----------+      +-----------+  
|  Users    |      |   Jobs    |  
+-----------+      +-----------+  
| id        |<-+   | userId    |  
| fullName  |  +-> | title     |  
+-----------+      +-----------+  

Document Model (Denormalized)

+-------------------------------------------+  
|                 User                      |  
+-------------------------------------------+  
| id          | fullName                   |  
| jobs        | [                          |  
|             |   { title: "Engineer" },   |  
|             |   { title: "Developer" }   |  
|             | ]                          |  
+-------------------------------------------+  

Use Case Suitability

  • Document Model: Ideal for loosely coupled, hierarchical data such as content management systems, catalogs, or social media profiles.
  • Relational Model: Preferred for highly interconnected data structures like financial systems or inventory management where consistency and normalization are critical.

Convergence of Models

The lines are blurring, with relational databases supporting semi-structured data (e.g., PostgreSQL’s JSONB) and document databases offering query functionalities akin to joins (e.g., MongoDB’s $lookup).

Relational Support for JSON (PostgreSQL Example):

SELECT *  
FROM Users  
WHERE data->'preferences' @> '"sms"';  

Document Model Joins (MongoDB Aggregation Example):

db.users.aggregate([  
    {  
        $lookup: {  
            from: "jobs",  
            localField: "id",  
            foreignField: "userId",  
            as: "associatedJobs"  
        }  
    }  
]);  

Final Thoughts

The Relational Model and Document Model cater to different use cases but increasingly overlap in their capabilities. Developers can now choose based not only on current needs but also taking into account hybrid possibilities. By leveraging their evolving features, modern solutions can strike a balance between flexibility, performance, and relational robustness.

Series Designing Data-Intensive Applications Part 4 of 41
  1. Designing Reliable Data Systems
  2. What is Scalability in Data Systems?
  3. Building Maintainable Software Systems
  4. Relational Model Versus Document Model
  5. Speaking the Language of Data- A Guide to Query Languages
  6. Unraveling Connections- Exploring Graph-Like Data Models
  7. The Backbone of Databases- Data Structures that Power Storage
  8. Transaction Processing vs. Analytics Let's understand the divide
  9. Understanding Column-Oriented Storage- A Deep Dive into Analytics Optimization
  10. Formats for Encoding Data
  11. Modes of Dataflow in Distributed Systems
  12. Leaders and Followers - The Core of Replication
  13. Problems with Replication Lag - Challenges and Solutions
  14. Multi-Leader Replication in Distributed Databases
  15. Leaderless Replication Flexibility for Distributed Databases
  16. Partitioning and Replication in Scaling Distributed Databases
  17. Partitioning of Key-Value Data- Strategies and Challenges
  18. Partitioning and Secondary Indexes- Balancing Efficiency and Complexity
  19. Efficient Methods for Rebalancing Data in Distributed Systems
  20. Ensuring Accurate Request Routing in Distributed Databases
  21. The Slippery Concept of a Transaction
  22. Exploring Weak Isolation Levels in Databases
  23. Achieving Serializability in Transactions
  24. Faults and Partial Failures in Distributed Systems
  25. Navigating Unreliable Networks in Distributed Systems
  26. The Challenges of Unreliable Clocks in Distributed Systems
  27. Knowledge Truth and Lies in Distributed Systems
  28. Consistency Guarantees in Distributed Systems
  29. Linearizability in Distributed Systems
  30. Understanding Ordering Guarantees in Distributed Systems
  31. Achieving Reliability with Distributed Transactions and Consensus Mechanisms
  32. Leveraging Unix Tools for Efficient Batch Processing
  33. MapReduce and Distributed Filesystems- Foundations of Scalable Data Processing
  34. Advancing Beyond MapReduce- Modern Frameworks for Scalable Data Processing
  35. Enabling Reliable and Scalable Event Streams in Distributed Systems
  36. Synchronizing Databases with Real-Time Streams
  37. Unifying Batch and Stream Processing for Modern Pipelines
  38. Integrating Distributed Systems for Unified Data Pipelines
  39. Unbundling Monolithic Databases for Flexibility
  40. Building Correct Systems in Distributed Environments
  41. Ethical Data Practices for Building Better Systems

Want to get blog posts over email?

Enter your email address and get notified when there's a new post!