Neo4j: Cypher - Finding directors who acted in their own movie
I’ve been doing quite a few Intro to Neo4j sessions recently and since it contains a lot of problems for the attendees to work on I get to see how first time users of Cypher actually use it.
A couple of hours in we want to write a query to find directors who acted in their own film based on the following model.
A common answer is the following:
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
WHERE a.name = d.name
RETURN a
We’re matching an actor 'a', finding the movie they acted in and then finding the director of that movie. We now have pairs of actors and directors which we filter down by comparing their 'name' property.
I haven’t written SQL for a while but if my memory serves me correctly comparing properties or attributes in this way is quite a common way to test for equality.
In a graph we don’t need to compare properties - what we actually want to check is if 'a' and 'd' are the same node:
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(d)
WHERE a = d
RETURN a
We’ve simplifed the query a bit but we can actually go one better by binding the director to the same identifier as the actor like so:
MATCH (a)-[:ACTED_IN]->(m)<-[:DIRECTED]-(a)
RETURN a
So now we’re matching an actor 'a', finding the movie they acted in and then finding the director if they happen to be the same person as 'a'.
The code is now much simpler and more revealing of its intent too.
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.