ClickHouse - How to get the first 'n' values from an array
I was recently working with some very long arrays in ClickHouse and I wanted to select just a few values so that they didn’t take up the entire screen. The way I thought would 'just work' ™ didn’t, so this blog documents how to do it.
If you want to follow along, you’ll need to install ClickHouse. On a Mac, Brew is a pretty good option:
brew install clickhouse
Once you’ve done that, launch ClickHouse Local:
clickhouse local -m
ClickHouse local version 23.10.1.1976 (official build).
MN :)
And then we’re going to return an array that contains 10 values:
SELECT range(0,10);
┌─range(0, 10)──────────┐
│ [0,1,2,3,4,5,6,7,8,9] │
└───────────────────────┘
I wanted to get just the first 5 values, so I tried to use the Python-esque syntax:
SELECT range(0,10)[:5];
But that wasn’t too happy with me:
Expected one of: token, Comma, ClosingSquareBracket, CAST operator, NOT, INTERVAL, CASE, DATE, TIMESTAMP, tuple, collection of literals, array, number, literal, NULL, Bool, true, false, string literal, asterisk, qualified asterisk, compound identifier, list of elements, identifier, COLUMNS matcher, COLUMNS, qualified COLUMNS matcher, substitution, MySQL-style global variable
What we need is the arraySlice
function, which lets us extract a variable number of values from an array.
We have to specify an offset
(remembering that array indexing in ClickHouse starts from 1
) and then a length
, which is the number of values to return.
The following does the job:
SELECT arraySlice(range(0,10), 1, 5);
┌─arraySlice(range(0, 10), 1, 5)─┐
│ [0,1,2,3,4] │
└────────────────────────────────┘
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.