
neo4j/cypher: Finding football stadiums near a city using spatial

I wanted to add something location-related to my football graph so that I could try out neo4j spatial, and modelling the location of football stadiums seemed the easiest way to do that.

To start with, I needed to build neo4j spatial and add it as a server plugin to my neo4j plugins folder, which involved doing the following:

$ git clone git://github.com/neo4j/spatial.git spatial
$ cd spatial
$ mvn clean package -Dmaven.test.skip=true install
$ unzip target/neo4j-spatial-0.11-SNAPSHOT-server-plugin.zip -d /path/to/neo4j-community-1.9.M04/plugins/
$ /path/to/neo4j-community-1.9.M04/bin/neo4j restart

If it’s installed correctly, you should see 'SpatialPlugin' listed under 'extensions' when you issue a 'curl' against the REST API root:

$ curl -L http://localhost:7474/db/data
{
  "extensions" : {
...
    "SpatialPlugin" : {
      "addEditableLayer" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addEditableLayer",
      "addCQLDynamicLayer" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addCQLDynamicLayer",
      "findGeometriesWithinDistance" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findGeometriesWithinDistance",
      "updateGeometryFromWKT" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/updateGeometryFromWKT",
      "addGeometryWKTToLayer" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addGeometryWKTToLayer",
      "getLayer" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/getLayer",
      "addSimplePointLayer" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addSimplePointLayer",
      "findGeometriesInBBox" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/findGeometriesInBBox",
      "addNodeToLayer" : "http://localhost:7474/db/data/ext/SpatialPlugin/graphdb/addNodeToLayer"
    },
...
  },
...
  "neo4j_version" : "1.9.M04"
}

The next step was to create a spatial index containing the stadiums' latitudes and longitudes.

There’s a good example in https://github.com/mneedham/spatial/blob/master/src/test/java/org/neo4j/gis/spatial/IndexProviderTest.java#L251 which I was able to adapt to do what I wanted.

I got a list of stadiums along with their locations as a CSV from Chris Bell’s blog.

The file looks like this:

Name,Team,Capacity,Latitude,Longitude
"Adams Park","Wycombe Wanderers",10284,51.6306,-0.800299
"Almondvale Stadium","Livingston",10122,55.8864,-3.52207
"Amex Stadium","Brighton and Hove Albion",22374,50.8609,-0.08014
"Anfield","Liverpool",45522,53.4308,-2.96096
"Ashton Gate","Bristol City",21497,51.44,-2.62021
"B2net Stadium","Chesterfield",10400,53.2535,-1.4272

I ended up with the following code to create a node for each stadium and add it to the spatial index:

// imports excluded

public class SampleSpatialGraph {
    public static void main(String[] args) throws IOException {
        // Assumes readFile returns the data rows only, without the CSV header line.
        List<String> lines = readFile("/path/to/stadiums.csv");

        EmbeddedGraphDatabase db = new EmbeddedGraphDatabase("/path/to/neo4j-community-1.9.M04/data/graph.db");
        Index<Node> index = createSpatialIndex(db, "stadiumsLocation");
        Transaction tx = db.beginTx();
        try {
            for (String stadium : lines) {
                // Columns: Name, Team, Capacity, Latitude, Longitude
                String[] columns = stadium.split(",");
                Node stadiumNode = db.createNode();
                // WKT points are "x y", so longitude comes before latitude.
                stadiumNode.setProperty("wkt", String.format("POINT(%s %s)", columns[4], columns[3]));
                // The name keeps the double quotes it has in the CSV.
                stadiumNode.setProperty("name", columns[0]);
                // The geometry comes from the node's "wkt" property; key and value are placeholders.
                index.add(stadiumNode, "dummy", "value");
            }
            tx.success();
        } finally {
            // Always close the transaction and release the store, even on failure.
            tx.finish();
            db.shutdown();
        }
    }

    private static Index<Node> createSpatialIndex(EmbeddedGraphDatabase db, String indexName) {
        return db.index().forNodes(indexName, SpatialIndexProvider.SIMPLE_WKT_CONFIG);
    }

    // readFile function excluded
}

The full code is on this gist if you’re interested.
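
One thing to be aware of: the import code splits each line on commas and stores the name exactly as it appears in the CSV, quotes included, which is why the names in the query results further down come back wrapped in an extra pair of quotes. A quote-aware split would avoid that. The following is only a minimal sketch using plain Java, not what the gist does; StadiumCsv, parseLine and toWkt are hypothetical names, and it assumes fields never contain escaped quotes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original import code.
final class StadiumCsv {

    /** Splits one CSV line on commas that sit outside double quotes and drops the quotes. */
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;            // toggle state; the quote itself is discarded
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());  // field boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());          // last field has no trailing comma
        return fields;
    }

    /** Builds a WKT point: x (longitude) first, then y (latitude). */
    static String toWkt(String latitude, String longitude) {
        return String.format("POINT(%s %s)", longitude, latitude);
    }

    public static void main(String[] args) {
        List<String> columns = parseLine("\"Anfield\",\"Liverpool\",45522,53.4308,-2.96096");
        System.out.println(columns.get(0) + " -> " + toWkt(columns.get(3), columns.get(4)));
    }
}
```

With something like this in place, the name property would be set from parseLine(stadium).get(0) rather than columns[0].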

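We can now query the stadiums using cypher to find, say, the stadiums within 5 kilometres of Manchester. The withinDistance arguments here are the latitude, the longitude and the distance in kilometres: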

START n=node:stadiumsLocation('withinDistance:[53.489271, -2.246704, 5.0]')
RETURN n.name, n.wkt;
==> +------------------------------------------------+
==> | n.name             | n.wkt                     |
==> +------------------------------------------------+
==> | ""Etihad Stadium"" | "POINT(-2.20024 53.483)"  |
==> | ""Old Trafford""   | "POINT(-2.29139 53.4631)" |
==> +------------------------------------------------+
==> 2 rows
==> 214 ms
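
As a sanity check on that result, we can work out the distances by hand. The sketch below uses the haversine formula on a spherical Earth with a mean radius of 6371 km, which is an assumption of mine and not necessarily the calculation neo4j spatial performs internally, so the figures may differ slightly from the plugin's. GreatCircle and haversineKm are hypothetical names:

```java
// Hypothetical helper for checking distances independently of the spatial plugin.
final class GreatCircle {
    // Mean Earth radius; treating the Earth as a sphere is an approximation.
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in kilometres between two latitude/longitude points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        double manchesterLat = 53.489271, manchesterLon = -2.246704;
        // Coordinates taken from the wkt values in the query output (POINT(lon lat)).
        System.out.printf("Etihad Stadium: %.2f km%n",
                haversineKm(manchesterLat, manchesterLon, 53.483, -2.20024));
        System.out.printf("Old Trafford:   %.2f km%n",
                haversineKm(manchesterLat, manchesterLon, 53.4631, -2.29139));
        System.out.printf("Anfield:        %.2f km%n",
                haversineKm(manchesterLat, manchesterLon, 53.4308, -2.96096));
    }
}
```

Both stadiums in the result come out under 5 km from the query point, while Anfield, from the CSV sample above, is well outside that radius.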

Or we could use a bounding box query, which returns all the stadiums inside a box defined by coordinates. The arguments here are the minimum and maximum longitude followed by the minimum and maximum latitude. For example, the following query returns all the stadiums which are within the M25:

START n=node:stadiumsLocation('bbox:[-0.519104,0.22934, 51.279958,51.69299]')
RETURN n.name, n.wkt;
==> +----------------------------------------------------+
==> | n.name                | n.wkt                      |
==> +----------------------------------------------------+
==> | ""White Hart Lane""   | "POINT(-0.065684 51.6033)" |
==> | ""Wembley""           | "POINT(-0.279543 51.5559)" |
==> | ""Victoria Road""     | "POINT(0.159739 51.5478)"  |
==> | ""Vicarage Road""     | "POINT(-0.401569 51.6498)" |
==> | ""Underhill Stadium"" | "POINT(-0.191789 51.6464)" |
==> | ""The Valley""        | "POINT(0.036757 51.4865)"  |
==> | ""The Den""           | "POINT(-0.050743 51.4859)" |
==> | ""Stamford Bridge""   | "POINT(-0.191034 51.4816)" |
==> | ""Selhurst Park""     | "POINT(-0.085455 51.3983)" |
==> | ""Craven Cottage""    | "POINT(-0.221619 51.4749)" |
==> | ""Griffin Park""      | "POINT(-0.302621 51.4882)" |
==> | ""Loftus Road""       | "POINT(-0.232204 51.5093)" |
==> | ""Boleyn Ground""     | "POINT(0.039225 51.5321)"  |
==> | ""Emirates Stadium""  | "POINT(-0.108436 51.5549)" |
==> | ""Brisbane Road""     | "POINT(-0.012551 51.5601)" |
==> +----------------------------------------------------+
==> 15 rows
==> 23 ms
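
The bounding box test itself is just a pair of range checks. Here is a minimal sketch of the idea in plain Java, with the constructor arguments in the same order as the values in the query above; BoundingBox and contains are hypothetical names, and it assumes the box doesn't cross the 180° meridian:

```java
// Hypothetical illustration of a bounding box check; not the plugin's implementation.
final class BoundingBox {
    final double minLon, maxLon, minLat, maxLat;

    BoundingBox(double minLon, double maxLon, double minLat, double maxLat) {
        this.minLon = minLon;
        this.maxLon = maxLon;
        this.minLat = minLat;
        this.maxLat = maxLat;
    }

    /** True when the point lies inside the box or on its edge. */
    boolean contains(double lon, double lat) {
        return lon >= minLon && lon <= maxLon
            && lat >= minLat && lat <= maxLat;
    }

    public static void main(String[] args) {
        BoundingBox m25 = new BoundingBox(-0.519104, 0.22934, 51.279958, 51.69299);
        System.out.println("Wembley inside:    " + m25.contains(-0.279543, 51.5559));
        System.out.println("Adams Park inside: " + m25.contains(-0.800299, 51.6306));
    }
}
```

A box like this is only a rough stand-in for the M25, which isn't rectangular, so stadiums near the corners could be included even though they sit outside the motorway.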

Now I just need to wire the stadiums in with the rest of the graph, and I’ll be able to write queries based on players' performance in different parts of the country.
