Apache Pinot: Inserts from SQL - Unable to get tasks states map - No task is generated for table
I recently wrote a post on the StarTree blog describing the inserts from SQL feature that was added in Apache Pinot 0.11, and while writing it I came across some interesting exceptions caused by configuration mistakes I'd made. In this post I'm going to describe one of those exceptions.
To recap, I was trying to ingest a bunch of JSON files from an S3 bucket using the following SQL query:
INSERT INTO "events"
FROM FILE 's3://marks-st-cloud-bucket/events/*.json'
Don’t worry, those credentials were deactivated and deleted several days ago.
When I ran this query against a Pinot cluster that contained a controller, broker, and server, I got the following exception:
"message": "QueryExecutionError:\norg.apache.commons.httpclient.HttpException: Unable to get tasks states map. Error code 400, Error message: {\"code\":400,\"error\":\"No task is generated for table: events, with task type: SegmentGenerationAndPushTask\"}\n\tat org.apache.pinot.common.minion.MinionClient.executeTask(MinionClient.java:123)\n\tat org.apache.pinot.core.query.executor.sql.SqlQueryExecutor.executeDMLStatement(SqlQueryExecutor.java:102)\n\tat org.apache.pinot.controller.api.resources.PinotQueryResource.executeSqlQuery(PinotQueryResource.java:145)\n\tat org.apache.pinot.controller.api.resources.PinotQueryResource.handlePostSql(PinotQueryResource.java:103)",
"errorCode": 200
My mistake here was that I didn’t have a minion in the cluster. The ingestion job is run by the minion component, so without one of those this feature doesn’t work!
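One way to catch this before running the query is to check that at least one minion has registered with the cluster. The following is a minimal sketch, not an official tool. It assumes the controller's REST API lists instance names at `GET /instances`, and that minion instance names start with `Minion_`. The controller address `http://localhost:9000` and the class name `MinionCheck` are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class MinionCheck {
    // Assumption: minion instance names begin with "Minion_" in the
    // JSON returned by the controller, e.g. "Minion_172.17.0.5_9514".
    static boolean hasMinion(String instancesJson) {
        return instancesJson.contains("\"Minion_");
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical controller address; pass your own as the first argument.
        String controller = args.length > 0 ? args[0] : "http://localhost:9000";

        HttpRequest request = HttpRequest
            .newBuilder(URI.create(controller + "/instances"))
            .GET()
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(hasMinion(response.body())
            ? "minion found"
            : "no minion registered");
    }
}
```

The string check is deliberately crude so that the sketch needs no JSON library. Swap in a proper parser if you use this for anything beyond a quick sanity check.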
An update (30th June 2023)
Today I learned that you can get this error even if you do have a minion configured. It also happens when no files are found for ingestion, because then there's nothing to generate a task for.
This might happen if the glob expression in includeFileNamePattern doesn't match any of the files under the input directory.
For example, the following throws the exception:
SET taskName = 'events-task7';
SET input.fs.className = 'org.apache.pinot.spi.filesystem.LocalPinotFS';
SET includeFileNamePattern='glob:customers.csv';
INSERT INTO customers
FROM FILE 'file:///input/';
We can fix the query by adding **/ at the beginning of the pattern:
SET taskName = 'events-task7';
SET input.fs.className = 'org.apache.pinot.spi.filesystem.LocalPinotFS';
SET includeFileNamePattern='glob:**/customers.csv';
INSERT INTO customers
FROM FILE 'file:///input/';
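To see why the `**/` prefix matters, it helps to try both patterns against a full file path. The sketch below assumes that Pinot evaluates includeFileNamePattern with Java's standard `PathMatcher` against the file's full path, which is consistent with the behaviour above but is not something I've confirmed in the Pinot source. The path `/input/customers.csv` and the class name `GlobCheck` are made up for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobCheck {
    // Returns true when the pattern (including its "glob:" prefix)
    // matches the whole path, not just the file name.
    static boolean matches(String pattern, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(pattern);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        // Hypothetical full path of the file we want to ingest.
        String file = "/input/customers.csv";

        // A bare file name has to match the entire path, so this is false.
        System.out.println(matches("glob:customers.csv", file));

        // "**" crosses directory boundaries, so this is true.
        System.out.println(matches("glob:**/customers.csv", file));
    }
}
```

In Java's glob syntax a single `*` does not cross directory separators, so `glob:*.csv` would also fail to match this path, while `glob:**/*.csv` would match it.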
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.