How to Compress and Extract Files in Linux Using tar and gzip Commands

File compression is an efficient way to transfer files between computers and servers. Linux administrators routinely archive, compress, and extract files. An archive is a single file that contains multiple files or directories. If you've encountered files with the .tar, .tar.gz, .tar.xz, or .tar.bz2 extensions, they were most likely created with the tar utility.

These archives save disk space on local drives and make internet downloads more convenient. Once an archive is downloaded, users can extract its files on their own systems. Linux distributions offer various command-line utilities for this, most notably tar and gzip, which compress and extract large archives efficiently.

This guide demonstrates how to use the tar and gzip commands to archive, compress, and extract files on Linux systems. All commands are run on Ubuntu 22.04.

What Is tar?

tar stands for “tape archive” and is a widely used command-line archiving utility. It is used for creating and extracting archives, and it can compress them with the help of a compression tool such as gzip.

An archive bundles multiple files and directories into one file. Such archives, commonly known as tarballs, carry extensions such as .tar, .tar.gz, .tar.xz, or .tar.bz2. In Linux, developers often distribute source code and other important files as .tar.gz files. tar can also compress and decompress archives in different formats using the gzip and bzip2 utilities.

What Is gzip?

gzip is a lossless compression tool for Linux that compresses or decompresses a single file or data stream at a time. gzip itself is a command-line tool; graphical archive managers can also handle the .gz format.

The key features of the gzip utility include terminal-based execution, the DEFLATE algorithm (a combination of LZ77 and Huffman coding), the ability to compress several files in one command (each into its own .gz file), truncation of long file names on file systems that limit name length, and automatic addition of the .gz extension. Overall, gzip is a robust, easy-to-use compression tool for Linux environments.
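The following is a minimal sketch of gzip working on a single file. The scratch directory and the file names notes.txt and original.txt are assumptions made up for the illustration, not part of the tutorial's setup.

```shell
set -eu
workdir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits
cd "$workdir"

seq 1 1000 > notes.txt           # sample file (name is an assumption)
cp notes.txt original.txt        # untouched copy for comparison

gzip notes.txt                   # creates notes.txt.gz and removes notes.txt
gzip -l notes.txt.gz             # shows compressed size, uncompressed size, ratio
gzip -d notes.txt.gz             # restores notes.txt and removes notes.txt.gz

# Confirm that the restored file is identical to the original.
cmp notes.txt original.txt && echo "round trip OK"
```

Note that gzip replaces the input file by default, which is why the sketch keeps a separate copy for the comparison.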

tar and gzip (File Compression Tools)

The basic difference between the two tools lies in their primary functions: 

The tar utility creates archives from multiple files, whereas gzip compresses individual files. Despite this difference, the tools are commonly used together: tar bundles the files into one archive, and gzip compresses that archive. With the “z” switch, tar runs the archive through gzip itself, so a compressed archive is created in a single step.
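To make the relationship concrete, here is a small sketch that builds the same compressed archive in two ways: first with tar and gzip as separate steps, then with tar's “z” switch. The directory name project and the archive names are hypothetical.

```shell
set -eu
workdir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits
cd "$workdir"

mkdir project                    # sample directory (name is an assumption)
echo "hello" > project/a.txt
echo "world" > project/b.txt

# Two steps: archive first, then compress the archive.
tar -cf two_step.tar project
gzip two_step.tar                # produces two_step.tar.gz

# One step: -z asks tar to compress the archive with gzip.
tar -czf one_step.tar.gz project

# Both archives contain the same members.
tar -tzf two_step.tar.gz
tar -tzf one_step.tar.gz
```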

How to Compress a File or Directory Using tar Command

The basic syntax to create a compressed archive with tar is:

$ tar -czvf <archive name> <path to file or directory>

For example, let’s say you have a directory named “/home/samreena”. To compress this directory into an archive named testfile.tar.gz in the current directory, execute the following command:

$ tar -czvf testfile.tar.gz /home/samreena

In the above command, tar is the command-line utility that creates the archive. The “-czvf” flags (also called switches) mean the following:

-c: Creates a new archive.

-z: Compresses the archive using gzip.

-v: Enables verbose mode, which lists each file in the terminal window as it is added to the archive.

-f: Specifies the name of the archive file; the name must follow this flag directly.
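When an absolute path such as /home/samreena is archived, GNU tar strips the leading “/” and stores the rest of the path in the archive. The sketch below shows that behavior and, as an alternative, uses the “-C” option to change directory before archiving so that only the directory name is stored. The scratch directory and the names data, abs.tar.gz, and rel.tar.gz are assumptions for the illustration.

```shell
set -eu
workdir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits

mkdir -p "$workdir/data/reports" # sample tree (names are assumptions)
echo "q1" > "$workdir/data/reports/q1.txt"

# Absolute path: GNU tar warns, strips the leading "/", and keeps
# the remaining directories in every member name.
tar -czf "$workdir/abs.tar.gz" "$workdir/data"
tar -tzf "$workdir/abs.tar.gz"

# -C changes into the parent directory first, so members are stored
# as data/... instead of the full path.
tar -czf "$workdir/rel.tar.gz" -C "$workdir" data
tar -tzf "$workdir/rel.tar.gz"
```

Storing short relative names tends to make an archive easier to extract elsewhere, since the original directory layout above the archived folder is not recreated.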

To verify that the archive was created, use the “ls” command to list all files and directories in the current directory:

$ ls

How to Compress Multiple Files or Directories to an Archive File

To compress multiple files or directories into a single archive, first navigate to the directory that contains them using the “cd” command. Then run tar with the “-czvf” flags and list every file or directory to include after the archive name. The general form is:

$ tar -czvf <archive_name.tar.gz> <file1 file2 file3 …>

For instance, if the current Documents directory contains “index.html”, “testfile.tar.gz”, and “testfolder2”, and you want to compress them into an archive named “test_archive.tar.gz”, run the following command (the “-v” flag is omitted here, so tar prints no file list):

$ tar -czf test_archive.tar.gz index.html testfile.tar.gz testfolder2

This will create a compressed archive named test_archive.tar.gz containing the specified files.

How to Exclude Files or Directories When Creating an Archive

When archiving a directory, you often want to leave certain files out of the archive. The --exclude option does this: any file whose name matches the pattern given to --exclude is left out of the tar archive. For example, to archive /home/samreena without its .html files, use the following command:

$ tar --exclude='*.html' -czvf test_archive.tar.gz /home/samreena
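As a quick check of the --exclude behavior, the sketch below archives a small sample directory and lists the result. The directory name site and its files are hypothetical.

```shell
set -eu
workdir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits
cd "$workdir"

mkdir site                       # sample directory (name is an assumption)
echo "<h1>home</h1>" > site/index.html
echo "body {}"       > site/style.css
echo "notes"         > site/readme.txt

# Quote the pattern so the shell does not expand it before tar sees it.
tar --exclude='*.html' -czf site.tar.gz site

# The listing shows style.css and readme.txt, but not index.html.
tar -tzf site.tar.gz
```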

How to Extract Files or Directories from an Archive Using the tar Command

tar is installed by default on most Linux distributions and supports various compression formats, including bzip2, gzip, lzip, lzop, lzma, xz, and compress. When creating a tar archive, it's advisable to give the file name a suffix that indicates the compression format. When extracting, GNU tar detects the compression format automatically, so no format flag is needed. For example, let’s say you have a tar archive that is compressed with gzip and named “test_archive.tar.gz”. To extract it, use the following command:

$ tar -xf test_archive.tar.gz

If you want tar to print each file name as it is extracted, add the “-v” option:

$ tar -xvf test_archive.tar.gz

It's important to note that this command extracts the contents of the tar archive into the current working directory by default.
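A minimal round trip can confirm that extraction restores the archived files in the current directory. The directory name docs and the file ch1.txt below are assumptions made up for the sketch.

```shell
set -eu
workdir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits
cd "$workdir"

mkdir docs                       # sample directory (name is an assumption)
echo "chapter one" > docs/ch1.txt

tar -czf docs.tar.gz docs        # create the compressed archive
rm -r docs                       # delete the originals

tar -xvf docs.tar.gz             # recreates docs/ in the current directory
cat docs/ch1.txt                 # the file content is back
```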

How to Extract an Archive Into a Specific Directory

To extract a tar file into a different directory instead of the current directory, use the “-C” option and specify the directory path.

Suppose you want to extract the archive contents into “/home/samreena”. In that case, specify the archive name and the directory path using the following command (the target directory must already exist):

$ tar -xvf test_archive.tar.gz -C /home/samreena
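Because tar does not create the directory passed to “-C”, a script would typically create it first. The sketch below assumes hypothetical names (src, backup.tar.gz, restore) in a scratch directory.

```shell
set -eu
workdir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits
cd "$workdir"

mkdir src                        # sample directory (name is an assumption)
echo "data" > src/file.txt
tar -czf backup.tar.gz src

# Create the target directory first; -C fails if it does not exist.
mkdir -p restore
tar -xzf backup.tar.gz -C restore

ls restore/src                   # shows file.txt
```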

How to Extract a Specific File or Directory From an Archive

Sometimes you need only specific files or directories from a tar archive rather than its entire content. To extract them, append a space-separated list of file or directory names after the archive name. The names must match the paths stored in the archive, exactly as tar lists them.

For example, let’s say you want to extract "index.html" and "testfolder2" from the tar archive. In that case, the following command can be executed in the terminal:

$ tar -xf test_archive.tar.gz index.html testfolder2

If you try to extract a file that is not present in the archive, tar displays a “Not found in archive” error message:

$ tar -xf test_archive.tar.gz README
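The sketch below puts both cases together: it extracts two selected members and then handles a request for a member that does not exist. GNU tar exits with a non-zero status in the second case, which a script can test. The extra file other.txt and the out directory are assumptions for the illustration.

```shell
set -eu
workdir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits
cd "$workdir"

# Build a sample archive similar to the one used in this guide.
mkdir testfolder2
echo "x"         > testfolder2/a.txt
echo "<p>hi</p>" > index.html
echo "other"     > other.txt     # extra member (name is an assumption)
tar -czf test_archive.tar.gz index.html other.txt testfolder2

# Extract only index.html and testfolder2 into a separate directory.
mkdir out
tar -xzf test_archive.tar.gz -C out index.html testfolder2
ls out                           # other.txt is not extracted

# A missing member makes tar fail, so the condition below is true.
if ! tar -xzf test_archive.tar.gz -C out README 2>/dev/null; then
    echo "README is not in the archive"
fi
```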

The tar switches used for extracting and listing archives:

x: Extracts the whole archive or specific files from it.

v: Displays verbose information, such as file names, while the archive is processed.

f: Specifies the archive file that tar should operate on.

t: Lists the file names contained in an archive without extracting them.
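The “t” switch is handy for inspecting an archive before extracting it. A minimal sketch, assuming a sample archive built from hypothetical files in a scratch directory:

```shell
set -eu
workdir=$(mktemp -d)             # hypothetical scratch directory
trap 'rm -rf "$workdir"' EXIT    # remove it when the script exits
cd "$workdir"

mkdir testfolder2
echo "x"         > testfolder2/a.txt
echo "<p>hi</p>" > index.html
tar -czf test_archive.tar.gz index.html testfolder2

tar -tf test_archive.tar.gz      # member names only
tar -tvf test_archive.tar.gz     # adds permissions, owner, size, and timestamp
```

The names printed by the listing are the ones to pass on the command line when extracting specific members.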

Conclusion

In this tutorial, we demonstrated how to compress and extract files using the tar and gzip tools in Linux. Both tools are useful for compressing and decompressing files and directories on your system. If you are using a Linux VPS, these utilities let you handle file compression and extraction efficiently. Feel free to ask questions if you have any.