When to Use Graph Databases?

Relational databases still lead the database race by a large margin, but the past few years have seen a significant surge in other database paradigms such as key-value, document-based, and graph databases. Broadly speaking, all of these paradigms serve the purpose of storing and retrieving data; more specifically, each one is best suited to particular tasks. Unfortunately, what I have observed in my career as a consultant is that many enterprises choose their database rather lightly, without enough caution or study, and this often leads to significant losses and other undesirable consequences.

In this article, we are going to discuss the circumstances in which a graph database is the best choice. More specifically, I am going to give you a guideline on which symptoms in your enterprise may indicate that a graph database best suits your data model.

Why Graph Databases?

As mentioned before, regardless of the paradigm behind their design, all databases can store and retrieve data, so fundamentally there is nothing that can be done with a graph database that cannot be done with a relational one; the question is how efficiently and easily it can be done. When a database is well suited to its purpose, data storage and retrieval should be efficient and easy. So, as a rule of thumb, if you find yourself struggling with your database, it is likely that you have chosen the wrong database paradigm for your use case. For instance, if getting a simple task done requires a very long query that takes an unacceptable amount of time to execute, you should probably consider other database paradigms. After all, each paradigm was designed by people who found that the existing paradigms could not serve their purposes well and who therefore created a new one for their particular case.

Graph databases are one of these paradigms. They are particularly suited to complex and hierarchical data models in which long chains of relations between entities matter for the purpose at hand. The fundamental difference between relational databases and graph databases is in how they view the relations between data entities. Relational databases organize data in rows and tables that are linked to each other via foreign keys and combined through join operations. In this model, relations between data entities cannot be represented or queried as things in their own right unless they are explicitly stored as data in separate tables. Graph databases, on the other hand, treat relations as a special type of data entity that can have its own properties and is natively supported in the query language. In addition, graph databases usually provide built-in, efficient support for various types of traversal through the relations and among data entities. Consequently, queries that involve distant relations between two data entities are usually far more efficient (and easier to express in the query language) in graph databases.
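
To make the idea of relations as first-class data more concrete, here is a minimal in-memory sketch in Python. It is not how any particular graph database is implemented; the names (alice, bob, MANAGES, find_path, and so on) are hypothetical and chosen only for illustration. Each relation is a record with its own properties, and a breadth-first traversal follows the chain of relations between two distant entities.

```python
from collections import deque

# Hypothetical toy data: every relation is a record with its own properties.
edges = [
    {"src": "alice", "dst": "bob", "type": "MANAGES", "since": 2019},
    {"src": "bob", "dst": "carol", "type": "MANAGES", "since": 2021},
    {"src": "carol", "dst": "dave", "type": "MENTORS", "since": 2022},
]

# Adjacency index: node -> list of outgoing relations.
outgoing = {}
for edge in edges:
    outgoing.setdefault(edge["src"], []).append(edge)


def find_path(start, goal):
    """Breadth-first search; returns the chain of relations from start to goal, or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for edge in outgoing.get(node, []):
            if edge["dst"] not in visited:
                visited.add(edge["dst"])
                queue.append((edge["dst"], path + [edge]))
    return None


path = find_path("alice", "dave")
chain = [(e["src"], e["type"], e["dst"]) for e in path]
print(chain)
```

Because each step of the path is the relation record itself, properties such as "since" remain available at every hop. A graph query language typically lets you express this kind of traversal directly instead of writing it by hand.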

Should You Migrate Your Current Data to Graph Databases?

This really depends on your data model, but there are some signs that may help you make the right decision. Generally speaking, if your database has tables dedicated to storing the relations among other tables, then a graph database might be useful for you, as it offers native support for relations among data entities. Likewise, if queries in your database are inefficient because they have to perform too many joins or subqueries, a graph database might help, because it is optimized to traverse among data entities efficiently.
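
As an illustration of both symptoms, the following sketch uses Python's built-in sqlite3 module with a hypothetical schema (the person and knows tables and their contents are made up for this example). The knows table exists only to store relations, and answering "who is three hops away from alice?" takes one self-join per hop.

```python
import sqlite3

# Hypothetical schema: "knows" is a table dedicated to storing relations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (src INTEGER REFERENCES person(id),
                        dst INTEGER REFERENCES person(id));
    INSERT INTO person VALUES (1, 'alice'), (2, 'bob'), (3, 'carol'), (4, 'dave');
    INSERT INTO knows VALUES (1, 2), (2, 3), (3, 4);
""")

# Three hops need three joins on the relation table, plus one to fetch the name.
three_hops = conn.execute("""
    SELECT p.name
    FROM knows AS k1
    JOIN knows AS k2 ON k2.src = k1.dst
    JOIN knows AS k3 ON k3.src = k2.dst
    JOIN person AS p ON p.id = k3.dst
    WHERE k1.src = 1
""").fetchall()
print(three_hops)
```

Every additional hop adds another join, and the query has to be rewritten whenever the depth changes. If many of your queries look like this, that is the kind of sign described above.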

When Not to Use Graph Databases?

Unlike relational databases, which have been under development for decades, many graph databases are relatively new and not as scalable or mature. So, when a task is doable with decent performance using a relational database, it might be a good idea to stay with the more mature relational option rather than switch to a graph database.

In addition, in order to accelerate graph traversal operations, graph databases often need to keep multiple maps in memory, from nodes to relations and vice versa. As a result, graph databases are usually slow at write operations and relatively memory hungry. Therefore, for write-intensive use cases, graph databases are far from ideal.
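
The cost of maintaining those maps can be sketched with a toy structure. This is only an assumption-laden illustration, not the storage layout of any real graph database; the TinyGraph class and its index_writes counter are hypothetical. Each relation is recorded in two indexes so that traversal is fast in both directions, which means every write touches more than one structure.

```python
from collections import defaultdict


class TinyGraph:
    """Toy graph that keeps both directions indexed (illustrative only)."""

    def __init__(self):
        self.outgoing = defaultdict(set)  # node -> nodes it points to
        self.incoming = defaultdict(set)  # node -> nodes pointing to it
        self.index_writes = 0             # counts index updates

    def add_edge(self, src, dst):
        # One logical write updates two indexes.
        self.outgoing[src].add(dst)
        self.incoming[dst].add(src)
        self.index_writes += 2

    def remove_node(self, node):
        # Removing a node means cleaning up every neighbour's index entry.
        for dst in self.outgoing.pop(node, set()):
            self.incoming[dst].discard(node)
            self.index_writes += 1
        for src in self.incoming.pop(node, set()):
            self.outgoing[src].discard(node)
            self.index_writes += 1


g = TinyGraph()
g.add_edge("a", "b")
g.add_edge("a", "c")
writes_after_inserts = g.index_writes
g.remove_node("a")
print(writes_after_inserts, g.index_writes)
```

Here two logical inserts cause four index updates, and deleting one node causes further updates proportional to its number of relations. Both indexes also stay resident in memory, which matches the memory appetite mentioned above.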

Likewise, when the data to be stored and retrieved is semi-structured, graph databases are not a very suitable choice. For such workloads, document-based databases are usually more suitable.