· neo4j cypher

Neo4j 2.0.0-M06 \-> 2.0.0-RC1: Working with path expressions

We recently released Neo4j 2.0.0-RC1 and since there were some breaking changes from Neo4j 2.0.0-M06 I decided to check if I needed to update any of my football graph queries.

On query which no longer worked as I expected was the following one which calculated the top goal scorers for televised games:

MATCH (player:Player)-[:played|subbed_on]->stats
WITH stats.goals AS goals, player, stats-[:in]->()-[:on_tv]-() as onTv
RETURN player.name,
       SUM(CASE WHEN onTv = FALSE THEN goals ELSE 0 END) as nonTvGoals,
       SUM(CASE WHEN onTv = TRUE THEN goals ELSE 0 END) as tvGoals,
       SUM(goals) as allGoals
ORDER BY tvGoals DESC
LIMIT 10

This is what that section of the graph looks like visually:

2013 11 23 10 11 24

When I run that query in 2.0.0-M06 I get the following output:

==> +------------------------------------------------------+
==> | player.name        | nonTvGoals | tvGoals | allGoals |
==> +------------------------------------------------------+
==> | "Robin Van Persie" | 11         | 15      | 26       |
==> | "Gareth Bale"      | 8          | 13      | 21       |
==> | "Luis Suárez"      | 12         | 11      | 23       |
==> | "Theo Walcott"     | 5          | 9       | 14       |
==> | "Demba Ba"         | 7          | 8       | 15       |
==> | "Santi Cazorla"    | 5          | 7       | 12       |
==> | "Edin Dzeko"       | 7          | 7       | 14       |
==> | "Carlos Tevez"     | 5          | 6       | 11       |
==> | "Juan Mata"        | 6          | 6       | 12       |
==> | "Steven Gerrard"   | 3          | 6       | 9        |
==> +------------------------------------------------------+

However, when I tried it in 2.0.0-RC1 the 'nonTvGoals' and 'tvGoals' columns don’t show any goals!

==> +--------------------------------------------------------+
==> | player.name          | nonTvGoals | tvGoals | allGoals |
==> +--------------------------------------------------------+
==> | "Djibril Cissé"      | 0          | 0       | 3        |
==> | "Gabriel Agbonlahor" | 0          | 0       | 9        |
==> | "Shaun Derry"        | 0          | 0       | 0        |
==> | "Marouane Fellaini"  | 0          | 0       | 11       |
==> | "Jermaine Jenas"     | 0          | 0       | 2        |
==> | "Claudio Yacob"      | 0          | 0       | 0        |
==> | "Michael Owen"       | 0          | 0       | 1        |
==> | "Sean Morrison"      | 0          | 0       | 2        |
==> | "Kolo Touré"         | 0          | 0       | 0        |
==> | "Antonio Valencia"   | 0          | 0       | 1        |
==> +--------------------------------------------------------+

I simplified the query a bit and then ran it against both versions to try and work out what was going on:

MATCH (player:Player)-[:played|subbed_on]->stats
RETURN stats-[:in]->()-[:on_tv]-() as onTv
LIMIT 1

on 2.0.0-M06

==> +-------+
==> | onTv  |
==> +-------+
==> | false |
==> +-------+

on 2.0.0-RC1

==> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
==> | onTv                                                                                                                                                                                                                                                                                                                                                                                         |
==> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
==> | [[Node[9058]{name:"Alou Diarra stats for 345824",node_id:"9058",match_id:"345824",type:"stats",goals:0},:in[47190]{},Node[9032]{name:"Swansea City vs West Ham United",node_id:"9032",friendly_date:"2012-08-25 12:45:00 +0100",time:1245,home_goals:3,match_id:"345824",away_goals:0,attendance:"20424",date:1345898700,type:"match"},:on_tv[47059]{},Node[143]{name:"Sky",node_id:"143"}]] |
==> +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

As you can see the path expression which indicates whether a match was televised now returns the matching path if there is one rather than a boolean value indicating if there is a path.

Now that we know that we can tweak the query to check if the onTV path contains any relationships which would indicate that a path does exist:

MATCH (player:Player)-[:played|subbed_on]->stats
WITH stats.goals AS goals, player, stats-[:in]->()-[:on_tv]-() as onTv
RETURN player.name,
       SUM(CASE WHEN LENGTH(onTv) = 0 THEN goals ELSE 0 END) as nonTVGoals,
       SUM(CASE WHEN LENGTH(onTv) > 0 THEN goals ELSE 0 END) as tvGoals,
       SUM(goals) AS allGoals
ORDER BY tvGoals DESC
LIMIT 10

If we run that it works as before:

==> +------------------------------------------------------+
==> | player.name        | nonTVGoals | tvGoals | allGoals |
==> +------------------------------------------------------+
==> | "Robin Van Persie" | 11         | 15      | 26       |
==> | "Gareth Bale"      | 8          | 13      | 21       |
==> | "Luis Suárez"      | 12         | 11      | 23       |
==> | "Theo Walcott"     | 5          | 9       | 14       |
==> | "Demba Ba"         | 7          | 8       | 15       |
==> | "Santi Cazorla"    | 5          | 7       | 12       |
==> | "Edin Dzeko"       | 7          | 7       | 14       |
==> | "Carlos Tevez"     | 5          | 6       | 11       |
==> | "Juan Mata"        | 6          | 6       | 12       |
==> | "Steven Gerrard"   | 3          | 6       | 9        |
==> +------------------------------------------------------+

I’m still going through the rest of my queries to check they work but if any of your pre 2.0.0-RC1 queries are breaking do post them up on the mailing list and someone will help you sort them out.

  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket