DuckDB: Create a function in SQL
I’ve been learning about Hybrid Search via this blog post, which describes the Reciprocal Rank Fusion algorithm, and I wanted to implement and use it in a DuckDB query.
The formula for the function is shown below:
RRF(d) = Σ(r ∈ R) 1 / (k + r(d))
Where:
d is a document
R is the set of rankers (retrievers)
k is a constant (typically 60)
r(d) is the rank of document d in ranker r
This isn’t too tricky to implement in SQL, but I didn’t realise that in DuckDB it’s called a MACRO. We can write the following query to create the function:
CREATE OR REPLACE MACRO rrf(rank, k:=60) AS
coalesce((1 / (k + rank)), 0)
And then we can call it like this:
SELECT rrf(1).round(5) as "1",
rrf(2).round(5) as "2",
rrf(100).round(5) as "100",
rrf(101).round(5) as "101";
┌─────────┬─────────┬─────────┬─────────┐
│ 1 │ 2 │ 100 │ 101 │
│ double │ double │ double │ double │
├─────────┼─────────┼─────────┼─────────┤
│ 0.01639 │ 0.01613 │ 0.00625 │ 0.00621 │
└─────────┴─────────┴─────────┴─────────┘
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.