Clojure: Paging meetup data using lazy sequences
I’ve been playing around with the Meetup API to do some analysis on the Neo4j London meetup, and one thing I wanted to do was download all the members of the group.
A feature of the Meetup API is that each endpoint returns a maximum of 200 records per call, so I needed to use offsets and paging to retrieve everybody.
It seemed like a good chance to use some lazy sequences to keep track of the offsets and then stop making calls to the API once I wasn’t retrieving any more results.
I wrote the following functions to take care of that bit:
;; Re-wrap a (possibly chunked) seq so it is realised one element at a time.
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))
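;; Page offsets 0, 1, 2, ... produced one at a time.
(defn offsets []
  (unchunk (range)))

;; Call api-fn for successive offsets, stopping at the first empty page.
;; The result is a lazy sequence of pages (one collection per API call).
(defn get-all [api-fn]
  (take-while seq
              (map #(api-fn {:perpage 200 :offset % :orderby "name"}) (offsets))))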
I previously wrote about the chunking behaviour of lazy collections, which is why unchunk is there: without it I ended up with a minimum of 32 calls to each URI, which wasn’t what I had in mind!
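To make the chunking effect visible, here is a minimal sketch that counts how many times a mapped function runs when only the first result is requested. The count-calls helper is hypothetical (it is not part of the original code), and a vector stands in for the offsets because vector seqs are chunked. The exact chunk size is an implementation detail, so the sketch prints the counts rather than promising a number.

```clojure
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

;; Hypothetical helper: maps a counting function over coll, asks for
;; only the first result, and reports how many calls that triggered.
(defn count-calls [coll]
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) coll))
    @calls))

;; A vector seq is chunked, so map processes a whole chunk up front.
(def chunked-calls (count-calls (vec (range 100))))

;; After unchunk, map only touches the element that was asked for.
(def unchunked-calls (count-calls (unchunk (vec (range 100)))))

(println "chunked:" chunked-calls "unchunked:" unchunked-calls)
```

If each call were an HTTP request, the chunked version would fire a whole batch of requests just to look at the first page.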
To get all the members in the group I wrote the following function which is passed to get-all:
(:require [clj-http.client :as client])

(defn members
  [{perpage :perpage offset :offset orderby :orderby}]
  (->> (client/get
        (str "" perpage
             "&offset=" offset
             "&orderby=" orderby
             "&group_urlname=" MEETUP_NAME
             "&key=" MEETUP_KEY)
        {:as :json})
       :body :results))
So to get all the members we’d do this:
(defn all-members []
  (get-all members))
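The paging behaviour can be tried without touching the network. The sketch below swaps the real members function for a stub, fake-members, that pages through 450 in-memory records and counts how often it is called. The stub, the fake-records data and the api-calls counter are all hypothetical, and the stub assumes that :offset counts pages rather than records.

```clojure
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

(defn offsets []
  (unchunk (range)))

(defn get-all [api-fn]
  (take-while seq
              (map #(api-fn {:perpage 200 :offset % :orderby "name"}) (offsets))))

;; Hypothetical stand-in data: 450 fake member records.
(def fake-records (mapv (fn [i] {:name (str "member-" i)}) (range 450)))

;; Counts how many "API calls" the stub receives.
(def api-calls (atom 0))

;; Hypothetical stub with the same argument shape as members.
;; Assumes :offset is a page number, so page n starts at n * perpage.
(defn fake-members [{:keys [perpage offset]}]
  (swap! api-calls inc)
  (->> fake-records
       (drop (* perpage offset))
       (take perpage)))

;; doall forces every page now, so the call count can be inspected.
(def pages (doall (get-all fake-members)))

(println "pages:" (count pages)
         "calls:" @api-calls
         "records:" (count (apply concat pages)))
```

There are three non-empty pages (200, 200 and 50 records), and the stub is called one extra time to fetch the empty page that tells take-while to stop. Because get-all returns pages, apply concat is used here to get a single flat collection of records.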
I’m told that using lazy collections when side effects are involved is a bad idea, presumably because the calls to the API might never end, but since I only run it manually I can just kill the process if anything goes wrong.
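If killing the process by hand isn't an option, one possible guard is to cap the number of pages that can ever be requested. This is only a sketch: get-all-capped and its max-pages parameter are hypothetical names, not part of the original code, and it uses (iterate inc 0) for the offsets, which is a different way of producing them one at a time.

```clojure
;; Hypothetical variant of get-all that gives up after max-pages calls,
;; even if the API keeps returning non-empty pages.
(defn get-all-capped [api-fn max-pages]
  (->> (iterate inc 0)
       (map #(api-fn {:perpage 200 :offset % :orderby "name"}))
       (take max-pages)
       (take-while seq)))

;; Counts calls made to the misbehaving stub below.
(def endless-calls (atom 0))

;; Hypothetical stub that never returns an empty page.
(defn endless-api [_]
  (swap! endless-calls inc)
  [{:name "someone"}])

;; doall forces the sequence so the number of calls can be checked.
(def capped-pages (doall (get-all-capped endless-api 5)))

(println "pages:" (count capped-pages) "calls:" @endless-calls)
```

With a cap of 5, the stub is called five times and then the sequence ends, whereas the uncapped version would keep calling it for as long as the sequence was being consumed.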
I’d be interested in how others would go about solving this problem; core.async was suggested, but that seems to result in more code, and more complicated code, than this version.
The code is on GitHub if you want to take a look.
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.