Introduction
Working with compressed recordsdata is a standard incidence in programming, and of the varied forms of compressed recordsdata, .gz recordsdata are fairly widespread within the Unix and Linux world.
On this Byte, we’ll discover a number of methods to unzip a .gz file utilizing Python. We’ll get into Python’s gzip
module, learn the contents of a .gz file, and extra.
What are .gz recordsdata?
.gz is the file extension for Gzip recordsdata (GNU zip), which is a file compression and decompression device created by the GNU challenge. It is typically utilized in Unix and Linux methods for compressing issues like log recordsdata, knowledge backups, and so forth.
In Python, we will use the built-in gzip
module to learn and write .gz recordsdata. This module gives a simple and easy-to-use interface for dealing with .gz recordsdata.
Studying Contents of a .gz File
To learn the contents of a .gz file in Python, we will use the gzip.open()
operate. This opens the .gz file in binary or textual content mode, identical to the built-in Python open()
operate. Here is a easy instance:
import gzip
with gzip.open('file.txt.gz', 'rt') as f:
file_content = f.learn()
print(file_content)
Within the above code, file.txt.gz
is the .gz file we need to learn, and 'rt'
is the mode wherein we need to open the file. The ‘r’ stands for learn mode and ‘t’ stands for textual content mode. The output would be the content material of the .gz file.
Utilizing Python’s gzip Module
The gzip
module in Python gives the GzipFile
class which is definitely modeled after Python’s File
object. This class has strategies and properties much like those out there within the File
object, which makes it simpler to work with .gz recordsdata.
Here is an instance of how one can use the gzip
module to learn a .gz file:
import gzip
with gzip.GzipFile('file.txt.gz', 'r') as f:
file_content = f.learn()
print(file_content)
Within the above code, we’re utilizing the GzipFile
class to open the .gz file. The ‘r’ mode signifies that we need to open the file for studying. The output would be the content material of the .gz file.
Notice: Whereas each gzip.open()
and gzip.GzipFile()
can be utilized to learn .gz recordsdata, gzip.open()
is extra handy if you need to learn the file as textual content, whereas gzip.GzipFile()
is healthier for studying binary recordsdata.
Utilizing the sh Module for Unzipping
The sh
module is a full-fledged subprocess interface for Python that means that you can name any program as if it have been a operate. To unzip a .gz file, we will use the gunzip
command via the sh
module. Let’s examine how one can do it.
First, you’ll want to set up the sh
module if you have not achieved it but. You’ll be able to set up it utilizing pip:
$ pip set up sh
As soon as put in, you should use it to unzip a .gz file:
from sh import gunzip
gunzip("/path/to/yourfile.gz")
This script will unzip the yourfile.gz
file in the identical listing. If you wish to specify a special output listing, you are able to do so by passing it as a second argument to the gunzip
operate:
from sh import gunzip
gunzip("/path/to/yourfile.gz", "-c > /path/to/output/yourfile")
Utilizing the tarfile Module
The tarfile
module makes it doable to learn and write tar archives, together with these utilizing gzip compression. Right here is an instance:
import tarfile
with tarfile.open("/path/to/yourfile.tar.gz", "r:gz") as tar:
tar.extractall(path="/path/to/output")
Notice: The tarfile
module is simply relevant for .tar.gz recordsdata. In case you’re coping with a easy .gz file, you need to use the gzip
module or the sh
module with the gunzip
command.
Conclusion
On this Byte, now we have seen how one can unzip .gz recordsdata in Python utilizing the sh
module, the gzip
module, and the tarfile
module. Every methodology has its personal benefits and use instances, so select the one which most accurately fits your wants.