NoSQL based Data Modelling in Database Management Systems
Most applications are built by abstracting one data model on top of another. Each layer hides the complexity of the layers below by providing a different data model. These abstractions allow different groups of people and different software applications to work effectively.
The roots of relational databases lie in business data processing, transaction processing and batch processing. The goal was to hide the implementation details behind a cleaner interface.
Not Only SQL has a few driving forces:
- Greater scalability
- preference for free and open source software
- Specialised query optimisations
- Desire for a more dynamic and expressive data model
With a SQL model, if data is stored in a relational tables, an awkward translation layer is translated, this is called impedance mismatch. JSON model reduces the impedance mismatch and the lack of schema is often cited as an advantage. JSON representation has better locality than the multi-table SQL schema. All the relevant information is in one place, and one query is sufficient.
In relational databases, it’s normal to refer to rows in other tables by ID, because joins are easy. In document databases, joins are not needed for one-to-many tree structures, and support for joins is often weak. If the database itself does not support joins, you have to emulate a join in application code by making multiple queries. The most popular database for business data processing in the 1970s was the IBM’s Information Management System (IMS). IMS used a hierarchical model and like document databases worked well for one-to-many relationships, but it made many-to-,any relationships difficult, and it didn’t support joins.
The main arguments in favour of the document data model are schema flexibility, better performance due to locality, and sometimes closer data structures to the ones used by the applications. The relation model counters by providing better support for joins, and many-to-one and many-to-many relationships. If the data in your application has a document-like structure, then it’s probably a good idea to use a document model. The relational technique of shredding can lead unnecessary complicated application code. The poor support for joins in document databases may or may not be a problem. If you application does use many-to-many relationships, the document model becomes less appealing. Joins can be emulated in application code by making multiple requests. Using the document model can lead to significantly more complex application code and worse performance.
Most document databases do not enforce any schema on the data in documents. Arbitrary keys and values can be added to a document, when reading, clients have no guarantees as to what fields the documents may contain.
Document databases are sometimes called schemaless, but maybe a more appropriate term is schema-on-read, in contrast to schema-on-write. Schema-on-read is similar to dynamic (runtime) type checking, whereas schema-on-write is similar to static (compile-time) type checking. The schema-on-read approach if the items on the collection don’t have all the same structure (heterogeneous).
Data locality for queries
If your application often needs to access the entire document, there is a performance advantage to this storage locality. The database typically needs to load the entire document, even if you access only a small portion of it. On updates, the entire document usually needs to be rewritten, it is recommended that you keep documents fairly small.
Convergence of document and relational databases
PostgreSQL does support JSON documents. RethinkDB supports relational-like joins in its query language and some MongoDB drivers automatically resolve database references. Relational and document databases are becoming more similar over time.