· clojure

Clojure: Paging meetup data using lazy sequences

I’ve been playing around with the meetup API to do some analysis on the Neo4j London meetup and one thing I wanted to do was download all the members of the group.

A feature of the meetup API is that each end point will only allow you to return a maximum of 200 records so I needed to make use of offsets and paging to retrieve everybody.

It seemed like a good chance to use some lazy sequences to keep track of the offsets and then stop making calls to the API once I wasn’t retrieving any more results.

I wrote the following functions to take care of that bit:

(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

(defn offsets []
  (unchunk (range)))


(defn get-all [api-fn]
  (flatten
   (take-while seq
               (map #(api-fn {:perpage 200 :offset % :orderby "name"}) (offsets)))))

I previously wrote about the chunking behaviour of lazy collections which meant that I ended up with a minimum of 32 calls to each URI which wasn’t what I had in mind!

To get all the members in the group I wrote the following function which is passed to get-all:

(:require [clj-http.client :as client])

(defn members
  [{perpage :perpage offset :offset orderby :orderby}]
  (->> (client/get
        (str "https://api.meetup.com/2/members?page=" perpage
             "&offset=" offset
             "&orderby=" orderby
             "&group_urlname=" MEETUP_NAME
             "&key=" MEETUP_KEY)
        {:as :json})
       :body :results))

So to get all the members we’d do this:

(defn all-members []
  (get-all members))

I’m told that using lazy collections when side effects are involved is a bad idea - presumably because the calls to the API might never end - but since I only run it manually I can just kill the process if anything goes wrong.

I’d be interested in how others would go about solving this problem - core.async was suggested but that seems to result in much more / more complicated code than this version.

The code is on github if you want to take a look.

  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket