I have been looking into Graph Databases, specifically Neo4j, for the past couple of months for the purposes of building a project. After spending time reading articles on https://neo4j.com/ and browsing through their documentation to understand where the dev team is taking Neo4j, it has become more and more apparent that while Graph Databases solve many of the problems that Data Engineers, Analysts and Scientists face with a number of RDBMS, that is not an indication that we will leave the relational paradigm anytime soon.
Even with all that, I do not believe that they will at any time replace SQL. The two have completely separate use cases in the industry. While the RDBMS setup works as needed in some use cases, it is not well suited to things like analysis within strongly and sparsely connected graphs, which Graph Databases rely on as a selling point. And while Graph Databases are much, much better at searching nodes to find insights and giving easier access to detailed analysis, there will always be a use case for simply wanting to store relations in tabular form, which the different RDBMS and SQL provide many more optimizations for. Adding onto that, Graph Databases, I believe, are in-memory databases with a master-child architecture when it comes to scaling horizontally, which allows room for clusters. SQL and RDBMS are notorious for being hard to scale horizontally. Not that it cannot be done, but it is one of the considerations that DBAs have to look into.
This is an opinion piece, so feel free to disagree and let me know what your thoughts are!
Graph Databases, such as GrapheneDB or Neo4j, offer a set of advantages that help to counter the enormous performance overhead that comes with performing joins across a large number of relations, where the depth grows past 6-7, or maybe even up to 10, joins to find and retrieve meaningful information. Writing really long queries for that does not help either, since every join you add means double-checking that the relations are retrieving the information we expect them to. On top of that, the JOIN operation performs a very mitigated version of a Cartesian product on the two relations it is applied to.
Please note that the topic of RDBMS, joins, and the Cartesian product can quickly heat up into a very strong discussion among DBA professionals, which is out of the scope of this blog. JOINs are certainly optimized in different RDBMS, and I even quote myself saying that relational databases are among the most complex beasts in Computer Science, along with compilers, operating systems, and distributed computing. There are ways to optimize these operations, and you should definitely look into discussions about this before believing that JOINs are not really optimized.
Take a look at "When And Why Are Joins Expensive?". For an article that dives into optimizing JOINs and SELECT in SQL, see "How to design SQL queries with better performance: SELECT * and EXISTS vs IN vs JOINs".
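To make the "one join per level of depth" point a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `person` and `knows` tables, the sample rows, and the `people_at_depth` helper are all made up for illustration. It only shows how the query text grows with depth, and says nothing about how any particular RDBMS would actually plan or optimize it.

```python
import sqlite3

# Hypothetical schema: people, plus a "knows" edge table (src -> dst).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows (src INTEGER, dst INTEGER);
    INSERT INTO person VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO knows VALUES (1, 2), (2, 3), (3, 4);
""")


def people_at_depth(conn, start_name, depth):
    """Build and run a query with one self-join on `knows` per hop (depth >= 1)."""
    joins = ["JOIN knows k1 ON k1.src = p0.id"]
    for i in range(2, depth + 1):
        # Each extra hop chains another copy of the edge table.
        joins.append(f"JOIN knows k{i} ON k{i}.src = k{i - 1}.dst")
    sql = (
        "SELECT DISTINCT target.name FROM person p0 "
        + " ".join(joins)
        + f" JOIN person target ON target.id = k{depth}.dst "
        "WHERE p0.name = ? ORDER BY target.name"
    )
    return sql, [row[0] for row in conn.execute(sql, (start_name,))]


sql, names = people_at_depth(conn, "A", 3)
print(sql)
print(names)
```

Going from depth 3 to depth 10 in this sketch means seven more joins in the generated SQL, which is the part that gets painful to write and double-check by hand.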
Most of the references and information below I got from Neo4j themselves, so it may be a biiittt biased. Take it with a grain of salt.
Graph Databases, in all respects, and just so we are on the same page, are based on 4 main ideas,
MATCH (a:Person),(b:Person)
WHERE a.name = 'A' AND b.name = 'B'
CREATE (a)-[r:RELTYPE]->(b)
RETURN type(r)
Here, the above query is written in Cypher, Neo4j's query language, and is trying to 'match' nodes with certain characteristics. In this case, it matches two nodes, 'a' and 'b', with the label 'Person', each carrying the attribute 'name'. It then creates 'RELTYPE', the relationship that joins 'a' and 'b', with the 'direction' going from 'a' to 'b', and returns the type of the relationship that now exists between the two variables.
So, now that the above is out of the way, let's discuss some drawbacks,
And since the idea of Graph Databases is to connect node to node and traverse the depth of relations, if you're in a team there needs to be a discussion of exactly how different kinds of nodes relate to other kinds of nodes, and vice versa.
It completely comes down to what your use case is. If you have the freedom to have tons and tons of disk space, and the capacity to scale nodes pretty efficiently for your application, then you are most probably more concerned with the degree and number of relationships forming between those nodes than with the overall number of nodes. That serves as a pretty good use case, since traversal becomes much simpler.
Or are you unsure how deep the information you are looking for sits? You may not be sure whether applying JOINs to a certain depth will get you what you need. With a graph, you can go as deep as you want to find the information you need.
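If it helps to see the moving parts outside of a database, the query above can be mimicked in plain Python. This is only a toy model under simplifying assumptions: the `Node`, `Relationship`, and `match` names are invented for the sketch, each node gets a single label, and nothing here reflects how Neo4j stores or executes anything internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # e.g. 'Person' (one label only, for simplicity)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node                     # the relationship is directed: start -> end
    rel_type: str
    end: Node


def match(nodes, label, **props):
    """Return nodes with the given label whose properties equal `props`."""
    return [
        n for n in nodes
        if n.label == label
        and all(n.properties.get(k) == v for k, v in props.items())
    ]


nodes = [Node("Person", {"name": "A"}), Node("Person", {"name": "B"})]
relationships = []

# MATCH (a:Person),(b:Person) WHERE a.name = 'A' AND b.name = 'B'
for a in match(nodes, "Person", name="A"):
    for b in match(nodes, "Person", name="B"):
        r = Relationship(a, "RELTYPE", b)   # CREATE (a)-[r:RELTYPE]->(b)
        relationships.append(r)
        print(r.rel_type)                   # RETURN type(r)
```

The nested loops mirror the fact that the pattern matches every combination of 'a' and 'b' that satisfies the WHERE clause, and creates one relationship per combination.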
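As a rough illustration of "going as deep as you want", here is a small breadth-first traversal over an adjacency list in Python. The `graph` dictionary and the `reachable` function are hypothetical stand-ins for a real graph store. The point is only that the depth is a parameter you can leave open, rather than something baked into the shape of the query.

```python
from collections import deque

# Hypothetical directed graph as an adjacency list.
graph = {
    "A": ["B"],
    "B": ["C"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
    "F": [],
}


def reachable(graph, start, max_depth=None):
    """Return {node: hops from start}; max_depth=None means no depth limit."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if max_depth is not None and depths[node] >= max_depth:
            continue  # do not expand past the requested depth
        for neighbour in graph.get(node, []):
            if neighbour not in depths:
                depths[neighbour] = depths[node] + 1
                queue.append(neighbour)
    return depths


print(reachable(graph, "A"))               # every node reachable from A
print(reachable(graph, "A", max_depth=2))  # only nodes within two hops of A
```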
Perhaps the use case that you are looking for falls under one of these suggestions. If so, check out the Use Cases On Neo4j.
That's pretty much it. Happy Database Designing!