ClickHouse: Unknown setting 'allow_nullable_key'
I’ve been playing around with ClickHouse’s Amazon reviews dataset and ran into an interesting problem when trying to set the allow_nullable_key
setting.
In this blog post, we’ll learn how and why we might choose to set it.
I started off with the following SQL statement to create a table called reviews
based on the structure of the Parquet file:
CREATE TABLE reviews
ENGINE = MergeTree
ORDER BY review_date EMPTY AS (
SELECT *
FROM s3(concat(
'https://datasets-documentation.s3.eu-west-3.amazonaws.com/',
'amazon_reviews/amazon_reviews_2015.snappy.parquet'
))
);
This throws the following exception:
Received exception:
Code: 44. DB::Exception: Sorting key contains nullable columns, but merge tree setting `allow_nullable_key` is disabled. (ILLEGAL_COLUMN)
The problem is that the Parquet file allows null values in every column, so all the columns are nullable. But, by default, the sorting key of a ClickHouse table can’t be nullable.
You used to be able to override this nullable issue using the |
So we try to set allow_nullable_key
:
CREATE TABLE reviews
ENGINE = MergeTree
ORDER BY review_date EMPTY AS (
SELECT *
FROM s3(concat(
'https://datasets-documentation.s3.eu-west-3.amazonaws.com/',
'amazon_reviews/amazon_reviews_2015.snappy.parquet'
))
)
SETTINGS allow_nullable_key = 1;
Received exception:
Code: 115. DB::Exception: Unknown setting 'allow_nullable_key'. (UNKNOWN_SETTING)
That can’t be right - I’m certain it does exist, and I can find it in the documentation.
After flailing around for a bit, Ilya showed me the errors of my ways.
This setting applies to the CREATE TABLE
statement rather than the whole query, so we need to specify it further up the query, like this:
CREATE TABLE reviews
ENGINE = MergeTree
ORDER BY review_date
SETTINGS allow_nullable_key = 1
EMPTY AS (
SELECT *
FROM s3(concat(
'https://datasets-documentation.s3.eu-west-3.amazonaws.com/',
'amazon_reviews/amazon_reviews_2015.snappy.parquet'
))
);
And now, it’s happy!
Ok.
0 rows in set. Elapsed: 0.120 sec.
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.