Clojure: Thread last (->>) vs Thread first (->)
In many of the Clojure examples that I’ve come across, the thread last (->>) macro is used to make it easier (for people from a non-Lispy background!) to see the transformations that the initial data structure goes through.
In one of my recent posts I showed how Jen & I had rewritten Mahout’s entropy function in Clojure:
(defn calculate-entropy [counts data-size]
  (->> counts
       (remove #{0})
       (map (partial individual-entropy data-size))
       (reduce +)))
Here we are using the thread last macro to pass counts as the last argument of the remove function on the next line, then to pass the result of that as the last argument of the map function on the line after, and so on.
Step by step, the threaded form expands out like this:
(remove #{0} counts)
(map (partial individual-entropy data-size) (remove #{0} counts))
(reduce + (map (partial individual-entropy data-size) (remove #{0} counts)))
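Here's a minimal sketch for checking at the REPL that the threaded and nested forms agree. The individual-entropy function isn't shown in this post, so the definition below is an assumption (the usual -p * ln p term, taking data-size first so that partial works), and calculate-entropy-nested is a hypothetical name for the hand-expanded version.

```clojure
;; Assumed definition: the post doesn't show individual-entropy.
;; data-size comes first so that (partial individual-entropy data-size) works.
(defn individual-entropy [data-size n]
  (let [p (double (/ n data-size))]
    (- (* p (Math/log p)))))

;; Threaded version, as in the post.
(defn calculate-entropy [counts data-size]
  (->> counts
       (remove #{0})
       (map (partial individual-entropy data-size))
       (reduce +)))

;; Hand-expanded version (hypothetical name), matching the nested form above.
(defn calculate-entropy-nested [counts data-size]
  (reduce + (map (partial individual-entropy data-size) (remove #{0} counts))))

;; Both versions should return the same number for the same input.
(= (calculate-entropy [2 0 2] 4)
   (calculate-entropy-nested [2 0 2] 4))
```

Since ->> is a macro, the two definitions should compile down to the same nested call, so the comparison is a sanity check rather than a proof of anything deeper.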
We can also use clojure.walk/macroexpand-all to see the expanded form of this function:
user> (use 'clojure.walk)
user> (macroexpand-all '(->> counts
                             (remove #{0})
                             (map (partial individual-entropy data-size))
                             (reduce +)))
(reduce + (map (partial individual-entropy data-size) (remove #{0} counts)))
I recently came across the thread first (->) macro while reading one of Jay Fields' blog posts and thought I’d have a play around with it.
The thread first (->) macro is similar but it passes its first argument as the first argument to the next form, then passes the result of that as the first argument to the next form and so on.
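A small sketch of the difference, using plain arithmetic where argument position changes the result. The numbers and the names thread-first-result and thread-last-result are my own illustration, not taken from the entropy code.

```clojure
;; -> inserts the threaded value as the FIRST argument of each form.
(def thread-first-result
  (-> 10
      (- 3)    ; (- 10 3)
      (/ 2)))  ; (/ (- 10 3) 2)

;; ->> inserts the threaded value as the LAST argument of each form.
(def thread-last-result
  (->> 10
       (- 3)    ; (- 3 10)
       (/ 2)))  ; (/ 2 (- 3 10))

[thread-first-result thread-last-result]
```

The first evaluates to 7/2 and the second to -2/7, even though the two forms look identical apart from the macro name.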
It’s pointless to convert calculate-entropy to use ->, because all of the functions it calls take the previous result as their last argument, but if we wanted to, the equivalent function would look like this:
(defn calculate-entropy [counts data-size]
  (-> counts
      (->> (remove #{0}))
      (->> (map (partial individual-entropy data-size)))
      (->> (reduce +))))
As you can see, we end up using ->> to pass counts as the last argument to remove, and then to pass each result as the last argument to map and reduce in turn.
The function would expand out like this:
(->> counts (remove #{0}))
(->> (->> counts (remove #{0})) (map (partial individual-entropy data-size)))
(->> (->> (->> counts (remove #{0})) (map (partial individual-entropy data-size))) (reduce +))
If we then expand each ->> macro in turn, we end up with the nested form:
(->> (->> (remove #{0} counts) (map (partial individual-entropy data-size))) (reduce +))
(->> (map (partial individual-entropy data-size) (remove #{0} counts)) (reduce +))
(reduce + (map (partial individual-entropy data-size) (remove #{0} counts)))
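Rather than expanding by hand, we can ask clojure.walk/macroexpand-all to confirm that both versions end up as the same nested call. This is a minimal sketch; thread-last-form and thread-first-form are hypothetical names for the two quoted forms.

```clojure
(require '[clojure.walk :as walk])

;; The original thread-last form, quoted so it is expanded rather than evaluated.
(def thread-last-form
  '(->> counts
        (remove #{0})
        (map (partial individual-entropy data-size))
        (reduce +)))

;; The thread-first rewrite, with each step wrapped in ->>.
(def thread-first-form
  '(-> counts
       (->> (remove #{0}))
       (->> (map (partial individual-entropy data-size)))
       (->> (reduce +))))

;; Fully expanded, both should be the same nested call.
(= (walk/macroexpand-all thread-last-form)
   (walk/macroexpand-all thread-first-form))
```

Because the forms are quoted, counts, data-size and individual-entropy don't need to be defined for the expansion to work.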
I haven’t written enough Clojure to come across a real use for the thread first macro, but Jay has an example on his blog showing how he refactored some code that initially used the thread last macro to use thread first instead.
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.