Neo4j's Cypher vs Clojure - Group by and Sorting
One of the points that I emphasised during my talk on building Neo4j backed applications using Clojure last week is understanding when to use Cypher to solve a problem and when to use the programming language.
A good example of this is in the meetup application I’ve been working on. I have a collection of events and want to display past events in descending order and future events in ascending order.
First let’s create some future and some past events based on the current timestamp of 1404006050535:
CREATE (event1:Event {name: "Future Event 1", timestamp: 1414002772427 })
CREATE (event2:Event {name: "Future Event 2", timestamp: 1424002772427 })
CREATE (event3:Event {name: "Future Event 3", timestamp: 1416002772427 })
CREATE (event4:Event {name: "Past Event 1", timestamp: 1403002772427 })
CREATE (event5:Event {name: "Past Event 2", timestamp: 1402002772427 })
If we return all the events we see the following:
$ MATCH (e:Event) RETURN e;
==> +------------------------------------------------------------+
==> | e |
==> +------------------------------------------------------------+
==> | Node[15414]{name:"Future Event 1",timestamp:1414002772427} |
==> | Node[15415]{name:"Future Event 2",timestamp:1424002772427} |
==> | Node[15416]{name:"Future Event 3",timestamp:1416002772427} |
==> | Node[15417]{name:"Past Event 1",timestamp:1403002772427} |
==> | Node[15418]{name:"Past Event 2",timestamp:1402002772427} |
==> +------------------------------------------------------------+
==> 5 rows
==> 13 ms
We can achieve the desired grouping and sorting with the following cypher query:
(def sorted-query "MATCH (e:Event)
WITH COLLECT(e) AS events
WITH [e IN events WHERE e.timestamp <= timestamp()] AS pastEvents,
[e IN events WHERE e.timestamp > timestamp()] AS futureEvents
UNWIND pastEvents AS pastEvent
WITH pastEvent, futureEvents ORDER BY pastEvent.timestamp DESC
WITH COLLECT(pastEvent) as orderedPastEvents, futureEvents
UNWIND futureEvents AS futureEvent
WITH futureEvent, orderedPastEvents ORDER BY futureEvent.timestamp
RETURN COLLECT(futureEvent) AS orderedFutureEvents, orderedPastEvents")
We then use the following function to call through to the Neo4j server using the excellent neocons library:
(ns neo4j-meetup.db
(:require [clojure.walk :as walk])
(:require [clojurewerkz.neocons.rest.cypher :as cy])
(:require [clojurewerkz.neocons.rest :as nr]))
(def NEO4J_HOST "http://localhost:7521/db/data/")
(defn cypher
([query] (cypher query {}))
([query params]
(let [conn (nr/connect! NEO4J_HOST)]
(->> (cy/tquery query params)
walk/keywordize-keys))))
We call that function and grab the first row since we know there won’t be any other rows in the result:
(def query-result (->> ( db/cypher sorted-query) first))
Now we need to extract the past and future collections so that we can display them on the page which we can do like so:
> (map #(% :data) (query-result :orderedPastEvents))
({:timestamp 1403002772427, :name "Past Event 1"} {:timestamp 1402002772427, :name "Past Event 2"})
> (map #(% :data) (query-result :orderedFutureEvents))
({:timestamp 1414002772427, :name "Future Event 1"} {:timestamp 1416002772427, :name "Future Event 3"} {:timestamp 1424002772427, :name "Future Event 2"})
An alternative approach is to return the events from cypher and then handle the grouping and sorting in clojure. In that case our query is much simpler:
(def unsorted-query "MATCH (e:Event) RETURN e")
We’ll use the clj-time library to determine the current time:
(def now (clj-time.coerce/to-long (clj-time.core/now)))
First let’s split the events into past and future:
> (def grouped-by-events
(->> (db/cypher unsorted-query)
(map #(->> % :e :data))
(group-by #(> (->> % :timestamp) now))))
> grouped-by-events
{true [{:timestamp 1414002772427, :name "Future Event 1"} {:timestamp 1424002772427, :name "Future Event 2"} {:timestamp 1416002772427, :name "Future Event 3"}],
false [{:timestamp 1403002772427, :name "Past Event 1"} {:timestamp 1402002772427, :name "Past Event 2"}]}
And finally we sort appropriately using these functions:
(defn time-descending [row] (* -1 (->> row :timestamp)))
(defn time-ascending [row] (->> row :timestamp))
> (sort-by time-descending (get grouped-by-events false))
({:timestamp 1403002772427, :name "Past Event 1"} {:timestamp 1402002772427, :name "Past Event 2"})
> (sort-by time-ascending (get grouped-by-events true))
({:timestamp 1414002772427, :name "Future Event 1"} {:timestamp 1416002772427, :name "Future Event 3"} {:timestamp 1424002772427, :name "Future Event 2"})
I used Clojure to do the sorting and grouping in my project because the query to get the events was a bit more complicated and became very difficult to read with the sorting and grouping mixed in.
Unfortunately cypher doesn’t provide an easy way to sort within a collection so we need our sorting in the row context and then collect the elements back again afterwards.
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.