Node.js: Minifying JSON documents
I often need to minimise the schema and table config files used to configure Apache Pinot so that they don’t take up so much space. After doing this manually for ages, I came across the json-stringify-pretty-compact library, which speeds up the process.
We can install it like this:
npm install json-stringify-pretty-compact
And then I have the following script, saved as minify.mjs:
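import pretty from 'json-stringify-pretty-compact';

// Decode stdin as UTF-8 so multi-byte characters split across chunks stay intact.
process.stdin.setEncoding('utf8');

// Accumulate everything piped in on stdin.
let inputData = '';
process.stdin.on('data', (chunk) => {
  inputData += chunk;
});

// Once stdin closes, parse the JSON and print the compact version.
process.stdin.on('end', () => {
  const value = JSON.parse(inputData);
  console.log(pretty(value));
});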
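To get a feel for what the library is doing, here is a rough sketch of the idea as I understand it: keep an object or array on one line if it fits within a maximum line length, otherwise put each entry on its own line and try again one level down. This is not the library's actual implementation. The names `oneLine` and `compact` are made up for illustration, and the defaults (80 characters, 2-space indent) are my reading of the library's documented defaults.

```javascript
// Simplified sketch, not the library's real code. Assumes `value` came from
// JSON.parse, so it only holds objects, arrays, strings, numbers, booleans and null.

// Render a value on a single line, with a space after each ':' and ','.
function oneLine(value) {
  if (Array.isArray(value)) {
    return `[${value.map(oneLine).join(', ')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const pairs = Object.entries(value).map(
      ([key, item]) => `${JSON.stringify(key)}: ${oneLine(item)}`
    );
    return `{${pairs.join(', ')}}`;
  }
  return JSON.stringify(value);
}

// Hypothetical name. `prefix` is the current indentation; `reserved` is the room
// already taken on the line by the key and a trailing comma.
function compact(value, maxLength = 80, prefix = '', reserved = 0) {
  const flat = oneLine(value);
  const isContainer = value !== null && typeof value === 'object';
  const isEmpty = isContainer && Object.keys(value).length === 0;
  if (!isContainer || isEmpty || prefix.length + reserved + flat.length <= maxLength) {
    return flat;
  }

  // Too long for one line: put each entry on its own line and recurse.
  const inner = `${prefix}  `;
  const isArray = Array.isArray(value);
  const entries = isArray
    ? value.map((item) => ['', item])
    : Object.entries(value).map(([key, item]) => [`${JSON.stringify(key)}: `, item]);
  const lines = entries.map(([label, item], index) => {
    const comma = index < entries.length - 1 ? ',' : '';
    return inner + label + compact(item, maxLength, inner, label.length + comma.length) + comma;
  });
  const [open, close] = isArray ? ['[', ']'] : ['{', '}'];
  return `${open}\n${lines.join('\n')}\n${prefix}${close}`;
}

const spec = { metricFieldSpecs: [{ name: 'distance', dataType: 'DOUBLE' }] };
console.log(compact(spec));     // fits within 80 characters, so stays on one line
console.log(compact(spec, 40)); // too long for 40 characters, so it is split across lines
```

The real library handles more cases (custom indent strings, a replacer, and so on), so treat this only as a mental model for why short field specs collapse onto one line while longer ones stay expanded.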
Imagine we then have the following file, config/schema.json:
{
  "schemaName": "parkrun",
  "primaryKeyColumns": ["competitorId"],
  "dimensionFieldSpecs": [
    {
      "name": "runId",
      "dataType": "STRING"
    },
    {
      "name": "eventId",
      "dataType": "STRING"
    },
    {
      "name": "competitorId",
      "dataType": "LONG"
    },
    {
      "name": "rawTime",
      "dataType": "INT"
    },
    {
      "name": "lat",
      "dataType": "DOUBLE"
    },
    {
      "name": "lon",
      "dataType": "DOUBLE"
    },
    {
      "name": "location",
      "dataType": "BYTES"
    },
    {
      "name": "course",
      "dataType": "STRING"
    }
  ],
  "metricFieldSpecs": [
    {
      "name": "distance",
      "dataType": "DOUBLE"
    }
  ],
  "dateTimeFieldSpecs": [{
    "name": "timestamp",
    "dataType": "TIMESTAMP",
    "format" : "1:MILLISECONDS:EPOCH",
    "granularity": "1:MILLISECONDS"
  }]
}
Each field spec is spread over four lines, which takes up a lot of unnecessary space, so let’s get our script to sort that out:
cat config/schema.json | node minify.mjs
{
  "schemaName": "parkrun",
  "primaryKeyColumns": ["competitorId"],
  "dimensionFieldSpecs": [
    {"name": "runId", "dataType": "STRING"},
    {"name": "eventId", "dataType": "STRING"},
    {"name": "competitorId", "dataType": "LONG"},
    {"name": "rawTime", "dataType": "INT"},
    {"name": "lat", "dataType": "DOUBLE"},
    {"name": "lon", "dataType": "DOUBLE"},
    {"name": "location", "dataType": "BYTES"},
    {"name": "course", "dataType": "STRING"}
  ],
  "metricFieldSpecs": [{"name": "distance", "dataType": "DOUBLE"}],
  "dateTimeFieldSpecs": [
    {
      "name": "timestamp",
      "dataType": "TIMESTAMP",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
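Since these are config files, it can be reassuring to check that reformatting changed only the whitespace and not the data. Below is a minimal sketch of such a check using nothing but the built-in JSON functions. The helper name `sameJson` is made up for this example, and the comparison is sensitive to key order, which should be fine here because the formatter does not reorder keys.

```javascript
// Hypothetical helper: true when two JSON documents contain the same data,
// ignoring whitespace and line breaks (but not key order).
function sameJson(before, after) {
  // Re-serialising without indentation strips all formatting differences.
  return JSON.stringify(JSON.parse(before)) === JSON.stringify(JSON.parse(after));
}

const original = `{
  "name": "runId",
  "dataType": "STRING"
}`;
const compacted = '{"name": "runId", "dataType": "STRING"}';
const changed = '{"name": "runId", "dataType": "LONG"}';

console.log(sameJson(original, compacted)); // true
console.log(sameJson(original, changed));   // false
```

In practice you would pass in the contents of the original file and the script's output rather than inline strings.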
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.