· neo4j

Neo4j: Cypher - Flatten a collection

Every now and then in Cypher land we'll end up with a collection of arrays, often created via the COLLECT function, that we want to squash down into one array.

For example let's say we have the following array of arrays...


$ RETURN [[1,2,3], [4,5,6], [7,8,9]] AS result;
==> +---------------------------+
==> | result                    |
==> +---------------------------+
==> | [[1,2,3],[4,5,6],[7,8,9]] |
==> +---------------------------+
==> 1 row

...and we want to return the array [1,2,3,4,5,6,7,8,9].

Many programming languages have a 'flatten' function and although cypher doesn't we can make our own by using the REDUCE function:


$ WITH  [[1,2,3], [4,5,6], [7,8,9]] AS result 
  RETURN REDUCE(output = [], r IN result | output + r) AS flat;
==> +---------------------+
==> | flat                |
==> +---------------------+
==> | [1,2,3,4,5,6,7,8,9] |
==> +---------------------+
==> 1 row

Here we're passing the array 'output' over the collection and adding the individual arrays ([1,2,3], [4,5,6] and [7,8,9]) to that array as we iterate over the collection.

If we're working with numbers in Neo4j 2.0.1 we'll get this type exception with this version of the code:


==> SyntaxException: Type mismatch: expected Any, Collection<Any> or Collection<Collection<Any>> but was Integer (line 1, column 148)

We can easily work around that by coercing the type of 'output' like so:


WITH  [[1,2,3], [4,5,6], [7,8,9]] AS result 
RETURN REDUCE(output = range(0,-1), r IN result | output + r);

Of course this is quite a simple example but we can handle more complicated scenarios as well by using nested calls to REDUCE. For example let's say we wanted to completely flatten this array:


$ RETURN [[1,2,3], [4], [5, [6, 7]], [8,9]] AS result;
==> +-------------------------------+
==> | result                        |
==> +-------------------------------+
==> | [[1,2,3],[4],[5,[6,7]],[8,9]] |
==> +-------------------------------+
==> 1 row

We could write the following cypher code:


$ WITH  [[1,2,3], [4], [5, [6, 7]], [8,9]] AS result 
  RETURN REDUCE(output = [], r IN result | output + REDUCE(innerOutput = [], innerR in r | innerOutput + innerR)) AS flat;
==> +---------------------+
==> | flat                |
==> +---------------------+
==> | [1,2,3,4,5,6,7,8,9] |
==> +---------------------+
==> 1 row

Here we have an outer REDUCE function which iterates over [1,2,3], [4], [5, [6,7]] and [8,9] and then an inner REDUCE function which iterates over those individual arrays.

If we had more nesting then we could just introduce another level of nesting!

  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket