Neo4j: Cypher - Finding movies by decade

I was recently asked how to find the number of movies produced per decade in the movie data set that comes with the Neo4j browser and can be imported with the following command:

``````
:play movies
``````

We want to get one row per decade and have a count alongside so the easiest way is to start with one decade and build from there.

``````
MATCH (movie:Movie)
WHERE movie.released >= 1990 and movie.released <= 1999
RETURN 1990 + "-" + 1999 as years, count(movie) AS movies
ORDER BY years
``````

Note that we're doing a label scan of all nodes of type Movie as there are no indexes for range queries. In this case it's fine as we have few movies but If we had 100s of thousands of movies then we'd want to optimise the WHERE clause to make use of an IN which would then use any indexes.

If we run the query we get the following result:

``````
==> +----------------------+
==> | years       | movies |
==> +----------------------+
==> | "1990-1999" | 21     |
==> +----------------------+
==> 1 row
``````

Let's pull out the start and end years so they're explicitly named:

``````
MATCH (movie:Movie)
ORDER BY years
``````

Now we need to create a collection of start and end years so we can return more than one. We can use the UNWIND function to take a collection of decades and run them through the rest of the query:

``````
UNWIND [{start: 1970, end: 1979}, {start: 1980, end: 1989}, {start: 1980, end: 1989}, {start: 1990, end: 1999}, {start: 2000, end: 2009}, {start: 2010, end: 2019}] AS row
MATCH (movie:Movie)
ORDER BY years
``````
``````
==> +----------------------------+
==> | years       | count(movie) |
==> +----------------------------+
==> | "1970-1979" | 2            |
==> | "1980-1989" | 2            |
==> | "1990-1999" | 21           |
==> | "2000-2009" | 13           |
==> | "2010-2019" | 1            |
==> +----------------------------+
==> 5 rows
``````

Alistair pointed out that we can simplify this even further by using the RANGE function:

``````