Unix: tar - Extracting, creating and viewing archives
I’ve been playing around with the Unix tar command a bit this week and realised that I’d memorised some of the flag combinations but didn’t actually know what each of them meant.
For example, one of the most common things that I want to do is extract a gripped neo4j archive:
$ wget http://dist.neo4j.org/neo4j-community-1.9.2-unix.tar.gz
$ tar -xvf neo4j-community-1.9.2-unix.tar.gz
where:
-
-x means extract
-
-v means produce verbose output i.e. print out the names of all the files as you unpack it
-
-f means unpack the file which follows this flag i.e. neo4j-community-1.9.2-unix.tar.gz in this example
I didn’t realise that by default tar runs against standard input so we could actually achieve the above in one go with the following combination:</li> ~bash $ wget http://dist.neo4j.org/neo4j-community-1.9.2-unix.tar.gz -o - | tar -xv ~
The other thing I wanted to do was create a gripped archive from the contents of a folder, something which I do much less frequently and am therefore much more rusty at! The following does the trick: ~bash $ tar -cvzpf neo4j-football.tar.gz neo4j-football/ $ ls -alh neo4j-football.tar.gz -rw-r—r-- 1 markhneedham staff 526M 22 Aug 23:38 neo4j-football.tar.gz ~
where:
-
-c means create a new archive
-
-z means gzip that archive
-
-p means preserve file permissions
Sometimes we’ll want to exclude some things from our archive which is where the '--exclude' flag comes in handy.
For example I want to exclude the data, git and neo4j folders which sit inside 'neo4j-football' which I can do with the following: ~bash $ tar --exclude "data*" --exclude "neo4j-community*" --exclude ".git" -cvzpf neo4j-football.tar.gz neo4j-football/ $ ls -alh neo4j-football.tar.gz -rw-r—r-- 1 markhneedham staff 138M 22 Aug 23:36 neo4j-football.tar.gz ~
If we want to quickly check that our file has been created correctly we can run the following: ~bash $ tar -tvf neo4j-football.tar.gz ~
where:
-
-t means list the contents of the archive to standard out
And that pretty much covers my main use cases for the moment!
About the author
I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.