Conceptual Model vs Graph Model
We’ve started running some sessions on graph modelling in London, and during the first session it was pointed out that the process I’d described was very similar to the one used when modelling for a relational database.
I thought I’d better do some reading on the way relational models are derived, and I came across an excellent video by Joe Maguire titled 'Data Modelers Still Have Jobs: Adjusting For the NoSQL Environment'.
Joe starts off by showing the following 'big picture framework' which describes the steps involved in coming up with a relational model:
A couple of slides later he points out that we often blur the lines between the different stages and end up designing a model which contains a lot of implementation details:
If, on the other hand, we compare a conceptual model with a graph model this is less of an issue as the two models map quite closely:
- Entities -> Nodes / Labels
- Attributes -> Properties
- Relationships -> Relationships
- Identifiers -> Unique Constraints
Unique Constraints don’t quite capture everything that Identifiers do, since it’s possible to create a node with a given label without setting the property that is uniquely constrained. Other than that, though, the concepts match one for one.
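To make that gap concrete, here is a minimal sketch. It assumes a recent Neo4j release that uses the `FOR ... REQUIRE` constraint syntax (older releases used `CREATE CONSTRAINT ON ... ASSERT ...`), and the constraint name `person_name_unique` is made up for illustration:

```cypher
// Assumption: recent Neo4j constraint syntax; the constraint name is illustrative.
CREATE CONSTRAINT person_name_unique IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS UNIQUE;

// Allowed: the uniqueness check only applies to nodes that have a name.
CREATE (:Person);

// The first statement succeeds on an empty database;
// the second is rejected because the name already exists.
CREATE (:Person {name: "Ian"});
CREATE (:Person {name: "Ian"});
```

As far as I know, insisting that every Person also has a name needs a property existence or node key constraint, and those are not available in every Neo4j edition.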
We often say that graphs are whiteboard friendly, by which we mean that the model you sketch on a whiteboard is the same as the one stored in the database.
For example, consider the following sketch of people and their interactions with various books:
If we were to translate that into a write query using Neo4j’s Cypher query language, it would look like this:
CREATE (ian:Person {name: "Ian"})
CREATE (alan:Person {name: "Alan"})
CREATE (gg:Person:Author {name: "Graham Greene"})
CREATE (jlc:Person:Author {name: "John Le Carre"})
CREATE (omih:Book {name: "Our Man in Havana"})
CREATE (ttsp:Book {name: "Tinker Tailor, Soldier, Spy"})
CREATE (gg)-[:WROTE]->(omih)
CREATE (jlc)-[:WROTE]->(ttsp)
CREATE (ian)-[:PURCHASED {date: "05-02-2011"}]->(ttsp)
CREATE (ian)-[:PURCHASED {date: "08-09-2011"}]->(omih)
CREATE (alan)-[:PURCHASED {date: "05-07-2014"}]->(ttsp)
There are a few extra brackets and the 'CREATE' keyword, but we haven’t lost any of the fidelity of the domain, and in my experience a non-technical / commercial person would be able to understand the query.
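Reading the data back keeps the same shape. As a small sketch (the question is my own example, not one from the session), this asks which books Ian purchased and who wrote them, using only the labels, relationship types and properties from the write query above:

```cypher
// Which books did Ian purchase, and who wrote each one?
MATCH (buyer:Person {name: "Ian"})-[purchase:PURCHASED]->(book:Book)<-[:WROTE]-(author:Author)
RETURN book.name AS book, author.name AS author, purchase.date AS purchased
```

The pattern in the MATCH clause is the whiteboard sketch written left to right: a person, a purchase, a book and its author.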
By contrast this article shows the steps we might take from a conceptual model describing employees, departments and unions to the eventual relational model.
If you don’t have the time to read through that, we start with this initial model...
...and end up with this model, which can be stored in our relational database:
You’ll notice we’ve lost the relationship types and they’ve been replaced by 4 foreign keys that allow us to join the different tables/sets together.
In a graph model we’d have been able to stay much closer to the conceptual model and therefore closer to the language of the business.
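As a rough sketch of what that could look like, here is one way to write down an employee, a department and a union in Cypher. Every name below is hypothetical: the labels, property values and relationship types are my own guesses at a plausible conceptual model, not the ones used in the linked article.

```cypher
// Hypothetical labels, names and relationship types, for illustration only.
CREATE (alice:Employee {name: "Alice"})
CREATE (eng:Department {name: "Engineering"})
CREATE (guild:TradeUnion {name: "Engineers Guild"})
// The relationship types carry the meaning that foreign keys would otherwise hide.
CREATE (alice)-[:WORKS_IN]->(eng)
CREATE (alice)-[:MEMBER_OF]->(guild)
```

The relationship types stay in the model as named things, rather than being implied by which column joins to which table.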
I’m still exploring the world of data modelling and next up for me is to read Joe’s 'Mastering Data Modeling' book. I’m also curious how normal forms and data redundancy apply to graphs so I’ll be looking into that as well.
Thoughts welcome, as usual!
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.