Neo4j: The foul revenge graph
Last week I was showing the foul graph to my colleague Alistair who came up with the idea of running a 'foul revenge' query to find out which players gained revenge for a foul with one of their own later in them match.
Queries like this are very path centric and therefore work well in a graph. To recap, this is what the foul graph looks like:
The first thing that we need to do is connect the fouls in a linked list based on time so that we can query their order more easily.
We can do this with the following query:
MATCH (foul:Foul)-[:COMMITTED_IN_MATCH]->(match)
WITH foul,match
ORDER BY match.id, foul.sortableTime
WITH match, COLLECT(foul) AS fouls
FOREACH(i in range(0, length(fouls) -2) |
FOREACH(foul1 in [fouls[i]] | FOREACH (foul2 in [fouls[i+1]] |
MERGE (foul1)-[:NEXT]->(foul2)
)));
This query collects fouls grouped by match and then adds a 'NEXT' relationship between adjacent fouls. The graph now looks like this:
Now let’s find the revenge foulers in the Bayern Munich vs Barcelona match. We’re looking for the following pattern:
This translates to the following cypher query:
match (foul1:Foul)-[:COMMITTED_AGAINST]->(app1)-[:COMMITTED_FOUL]->(foul2)-[:COMMITTED_AGAINST]->(app2)-[:COMMITTED_FOUL]->(foul1),
(player1)-[:MADE_APPEARANCE]->(app1), (player2)-[:MADE_APPEARANCE]->(app2),
(foul1)-[:COMMITTED_IN_MATCH]->(match:Match {id: "32683310"})<-[:COMMITTED_IN_MATCH]-(foul2)
WHERE (foul1)-[:NEXT*]->(foul2)
RETURN player2.name AS firstFouler, player1.name AS revengeFouler, foul1.time, foul1.location, foul2.time, foul2.location
I’ve added in a few extra parts to the pattern to pull out the players involved and to find the revenge foulers in a specific match - the Bayern Munich vs Barcelona Semi Final 2nd leg.
We end up with the following revenge fouls:
We can see here that Dani Alves actually gains revenge on Bastian Schweinsteiger twice for a foul he made in the 10th minute.
If we tweak the query to the following we can get a visual representation of the revenge fouls as well:
match (foul1:Foul)-[:COMMITTED_AGAINST]->(app1)-[:COMMITTED_FOUL]->(foul2)-[:COMMITTED_AGAINST]->(app2)-[:COMMITTED_FOUL]->(foul1),
(player1)-[:MADE_APPEARANCE]->(app1), (player2)-[:MADE_APPEARANCE]->(app2),
(foul1)-[:COMMITTED_IN_MATCH]->(match:Match {id: "32683310"})<-[:COMMITTED_IN_MATCH]-(foul2),
(foul1)-[:NEXT*]->(foul2)
RETURN *
At the moment I’ve restricted the revenge concept to single matches but I wonder whether it’d be more interesting to create a linked list of fouls which crosses matches between teams in the same season.
The code for all of this is on github - the README is a bit sketchy at the moment but I’ll be fixing that up soon.
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.