.GZ Gzip Compressed

Gzip compresses a single file or data stream using the DEFLATE algorithm (LZ77 + Huffman coding), producing a .gz file with a CRC-32 integrity check. For multi-file archives, combine with TAR to create .tar.gz tarballs — the standard packaging format for Unix software distribution.

Compression · DEFLATE · Single-file · CRC-32 · 1992
By FileDex
Not convertible

Gzip is single-file compression; converting it to another archive format requires recompression libraries that are not available in browser WASM.

Common questions

What is the difference between .gz and .zip?

Gzip compresses a single file or stream using DEFLATE. ZIP is a multi-file archive format that also uses DEFLATE but bundles multiple files with a central directory index. For multiple files with gzip, you first bundle them with tar (.tar.gz). Both use the same DEFLATE algorithm, so compression ratios are similar.

Can I compress multiple files into one .gz file?

Gzip compresses only one file at a time. To compress multiple files, first create a tar archive and then compress it: tar czf archive.tar.gz folder/. The resulting .tar.gz contains all files in a single compressed archive.
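The same tar-then-gzip step can be done programmatically. The following is a minimal sketch using Python's standard tarfile module; the function name make_tarball and its arguments are illustrative and not part of any tool mentioned here.

```python
import tarfile
from pathlib import Path


def make_tarball(source_dir: str, archive_path: str) -> list[str]:
    """Bundle source_dir into a gzip-compressed tar archive and return its member names."""
    source = Path(source_dir)
    # Mode "w:gz" writes the tar stream through a gzip compressor,
    # roughly what `tar czf archive.tar.gz folder/` does.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source, arcname=source.name)
    # Re-open in read mode to list what was stored (similar to `tar tzf`).
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Calling make_tarball("folder", "archive.tar.gz") would return the stored paths, which is a quick way to confirm that the archive contains every file.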

How do I check if a .gz file is corrupted?

Run gzip -t file.gz to verify the CRC-32 checksum without extracting. A successful test exits silently with status 0; a corrupt or truncated file prints an error (for example, a CRC error or an unexpected end of file) and exits non-zero. For .tar.gz files, use tar tzf archive.tar.gz to verify both the gzip layer and the tar structure.

Is gzip still used for web compression?

Gzip remains the most widely supported HTTP compression method: every mainstream browser and web server supports Content-Encoding: gzip. Brotli (br) is now commonly preferred on HTTPS connections, where it typically achieves 15-25% better compression, but gzip is still the universal fallback. HTTP/1.1 defines gzip as a standard content coding; it does not require servers to use it.

What makes .GZ special

What is a GZ file?

GZ (GNU Zip) is a file compression format based on the DEFLATE algorithm, created by Jean-loup Gailly and Mark Adler in 1992 as a free replacement for the Unix compress utility, which relied on the patented LZW algorithm. Unlike ZIP, GZ compresses a single file or data stream without bundling multiple files. For multi-file archives, GZ is combined with TAR, producing the ubiquitous .tar.gz (also written .tgz) format.


GZ is deeply embedded in Unix/Linux infrastructure: log rotation, web server content encoding, software distribution, and database backups all rely on it. The format is lossless and patent-free, making it universally available.

How to open GZ files

  • gunzip / gzip -d (macOS, Linux) — Built-in CLI: gunzip file.gz
  • tar (macOS, Linux) — Extract .tar.gz in one step: tar -xzf archive.tar.gz
  • 7-Zip (Windows) — Free, open-source
  • WinRAR (Windows) — Built-in GZ support
  • The Unarchiver (macOS) — Free, handles .gz and .tar.gz
  • PeaZip (Windows, Linux) — Free alternative

Technical specifications

Property | Value
Algorithm | DEFLATE (LZ77 + Huffman coding)
Single-file | Compresses one file or stream
Checksum | CRC-32 for integrity verification
Concatenation | Multiple GZ streams can be concatenated
Common pairs | .tar.gz / .tgz, .sql.gz, .log.gz
Magic bytes | 1F 8B
MIME type | application/gzip
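
Two of the properties above, the magic bytes and stream concatenation, are easy to check by hand. This is a small sketch using Python's standard gzip module; the sample payloads are made up for illustration.

```python
import gzip

MAGIC = b"\x1f\x8b"  # gzip magic bytes listed in the specifications

first = gzip.compress(b"first member\n")
second = gzip.compress(b"second member\n")

# Each member is a complete gzip stream that starts with the magic bytes.
assert first[:2] == MAGIC and second[:2] == MAGIC

# Byte-level concatenation yields a valid multi-member .gz stream.
combined = first + second
restored = gzip.decompress(combined)
print(restored.decode())
```

The decompressed result is the two payloads back to back, which is why tools such as cat can join .gz files without recompressing them.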

Common use cases

  • Web compression: servers compress text responses (HTML, CSS, JS, JSON) and label them with the Content-Encoding: gzip header for faster transfer, typically 60-80% smaller
  • Log file rotation: logrotate on Linux compresses old log files with gzip to save disk space
  • Source code distribution: .tar.gz tarballs are the standard format for open-source software releases
  • Database backups: mysqldump | gzip > backup.sql.gz creates compressed SQL dumps
  • Pipeline compression: GZ can compress data streams in real-time (piped between processes)
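
The pipeline case in the list above can be sketched in Python as an incremental compressor that emits gzip output while input is still arriving. This assumes the standard zlib module; gzip_stream is a hypothetical helper name, not an existing API.

```python
import gzip
import zlib
from typing import Iterable, Iterator


def gzip_stream(chunks: Iterable[bytes], level: int = 6) -> Iterator[bytes]:
    """Yield gzip-framed output incrementally as input chunks arrive."""
    # wbits=31 (16 + 15) asks zlib for a gzip header and trailer
    # instead of the default zlib wrapper.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:
            yield out
    # flush() emits any buffered data plus the CRC-32 and ISIZE trailer.
    yield compressor.flush()


# Usage: join the pieces and confirm the result is an ordinary gzip stream.
data = b"".join(gzip_stream([b"line one\n", b"line two\n"]))
print(gzip.decompress(data).decode())
```

Because the compressor keeps only a bounded window of state, memory use stays flat regardless of how much data flows through.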

GZ vs other compression formats

Format | Algorithm | Ratio | Speed | Single-file
GZ | DEFLATE | Good | Fast | ✅
BZ2 | BWT + Huffman | Better | Slower | ✅
XZ | LZMA2 | Best | Slowest | ✅
ZIP | DEFLATE | Good | Fast | ❌ (multi-file)
Zstandard | zstd | Very good | Very fast | ✅

For most use cases, GZ offers a good balance of speed, compression, and near-universal tool support. XZ is preferred when maximum compression matters more than time. Zstandard (.zst) is increasingly replacing GZ in high-performance scenarios.
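
To get a rough feel for these trade-offs on your own data, the three codecs that ship with Python's standard library can be compared directly. This is an illustrative sketch only: the sample is artificial and highly repetitive, so real ratios will differ, and Zstandard is left out because it is not assumed to be available in the standard library.

```python
import bz2
import gzip
import lzma

# Hypothetical sample: repetitive log-like text, which favors all three codecs.
sample = b"2024-01-01 INFO request served in 12 ms\n" * 5000

results = {
    "gzip": gzip.compress(sample),
    "bzip2": bz2.compress(sample),
    "xz": lzma.compress(sample),
}

for name, blob in results.items():
    ratio = len(blob) / len(sample)
    print(f"{name:6s} {len(blob):8d} bytes ({ratio:.2%} of original)")
```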

Working with GZ on the command line

# Compress a file (replaces original with .gz)
gzip file.sql

# Compress while keeping the original
gzip -k file.sql

# Decompress
gunzip file.sql.gz
# or
gzip -d file.sql.gz

# Create .tar.gz archive
tar -czf archive.tar.gz /path/to/folder/

# Extract .tar.gz archive
tar -xzf archive.tar.gz

# View compressed file without extracting
zcat file.txt.gz | head -20
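
When a script is more convenient than the shell, two of the commands above have close equivalents in Python's standard gzip module. This is a sketch; the helper names compress_keep and head are made up for illustration.

```python
import gzip
import shutil
from itertools import islice


def compress_keep(path: str) -> str:
    """Like `gzip -k path`: write path + '.gz' and leave the original in place."""
    target = path + ".gz"
    with open(path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # streams in chunks, so large files are fine
    return target


def head(path_gz: str, lines: int = 20) -> list[str]:
    """Like `zcat path_gz | head -20`: decompress only as far as needed."""
    with gzip.open(path_gz, "rt", encoding="utf-8") as fh:
        return list(islice(fh, lines))
```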

HTTP gzip compression

Web servers enable gzip compression for text-based responses to reduce bandwidth and improve load times. In Nginx: gzip on; gzip_types text/css application/javascript application/json; (text/html is always compressed and does not need to be listed). In Apache: enable mod_deflate. Browsers advertise support via the Accept-Encoding: gzip, deflate, br request header, and servers respond with Content-Encoding: gzip when they compress the body. This is transparent to the end user but significantly reduces page load times on slow connections.
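
The negotiation described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular server implements it: encode_body is a hypothetical helper, and the header parsing ignores quality values such as gzip;q=0.

```python
import gzip


def encode_body(body: bytes, accept_encoding: str) -> tuple[bytes, dict[str, str]]:
    """Return (payload, headers), compressing only if the client offered gzip."""
    # Simplified parsing: split on commas and drop any ";q=" parameters.
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in offered:
        payload = gzip.compress(body)
        # Vary tells caches that the response depends on Accept-Encoding.
        return payload, {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
    return body, {}
```

For example, encode_body(html, "gzip, deflate, br") would return a compressed payload with Content-Encoding: gzip, while a request without gzip in Accept-Encoding gets the body unchanged.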

Integrity verification

GZ files include a CRC-32 checksum and the uncompressed file size. If a GZ file is truncated or corrupted during transfer, gunzip will report an error. Verify integrity with gzip -t file.gz (test mode) before depending on the contents.
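
A comparable check can be scripted. The sketch below approximates gzip -t with Python's standard gzip module by decompressing the whole file and discarding the output; is_valid_gzip is a hypothetical helper name, and it also returns False when the file cannot be opened at all.

```python
import gzip
import zlib


def is_valid_gzip(path: str, chunk_size: int = 1 << 16) -> bool:
    """Decompress fully, discard the output, and report whether it succeeded."""
    try:
        with gzip.open(path, "rb") as fh:
            # Reading to the end forces the CRC-32 and ISIZE trailer checks.
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # BadGzipFile (an OSError subclass) covers bad magic and CRC mismatches;
        # EOFError signals truncation; zlib.error signals a damaged DEFLATE stream.
        return False
```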

.GZ compared to alternatives

Formats | Criteria | Winner | Notes
.GZ vs .BZ2 | Compression speed | GZ wins | Gzip compresses and decompresses 3-5x faster than bzip2. For real-time pipelines and web serving, gzip's speed advantage outweighs bzip2's 10-15% better compression ratio.
.GZ vs .XZ | Compression ratio | XZ wins | XZ (LZMA2) produces 20-30% smaller output than gzip on text-heavy data. XZ is 5-10x slower to compress but only 2x slower to decompress.
.GZ vs .ZSTANDARD | Overall performance | ZSTANDARD wins | Zstandard matches gzip compression ratios at 3-5x higher speed, and exceeds gzip ratios at comparable speed. It is increasingly replacing gzip in high-performance systems.
.GZ vs .ZIP | Architecture | Draw | Both use DEFLATE compression, but gz compresses a single data stream while ZIP is a multi-file archive with a central directory. gz + tar is the Unix equivalent of ZIP's combined archive-and-compress approach.

Technical reference

MIME Type: application/gzip
Magic Bytes: 1F 8B (gzip magic number)
Developer: Jean-loup Gailly / Mark Adler
Year Introduced: 1992
Open Standard: Yes (RFC 1952)

Binary Structure

A gzip file consists of a 10-byte fixed header, optional header fields, DEFLATE-compressed data, and an 8-byte trailer.

The header starts with magic bytes 1F 8B, followed by the compression method byte (08 = DEFLATE, the only defined method), a flags byte (FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT bits), a 4-byte modification time (Unix timestamp, little-endian), an extra flags byte (XFL: 02 = best compression, 04 = fastest), and an OS byte (00=FAT, 03=Unix, 07=Macintosh, 0B=NTFS, FF=unknown).

Optional fields follow the fixed header in this order. If FEXTRA is set, a 2-byte little-endian length is followed by that many bytes of extra data. If FNAME is set, the original filename follows as a null-terminated Latin-1 string. If FCOMMENT is set, a null-terminated comment string follows. If FHCRC is set, a 2-byte header checksum appears: the low 16 bits of the CRC-32 of the header bytes (RFC 1952 calls this CRC16).

The compressed data is raw DEFLATE (RFC 1951) without a zlib wrapper. The 8-byte trailer contains a CRC-32 of the uncompressed data (4 bytes, little-endian) and ISIZE (4 bytes, little-endian), the original size modulo 2^32. Multiple gzip streams (members) can be concatenated into a single .gz file, and compliant decompressors must process all of them sequentially.

Offset | Length | Field | Example | Description
0x00 | 2 bytes | Magic Number | 1F 8B | Fixed gzip identification bytes. Any other value means the file is not a valid gzip stream.
0x02 | 1 byte | Compression Method | 08 (DEFLATE) | Compression algorithm identifier. Only 08 (DEFLATE) is defined. Values 0-7 are reserved.
0x03 | 1 byte | Flags (FLG) | 08 (FNAME set) | Bit flags: FTEXT(0x01), FHCRC(0x02), FEXTRA(0x04), FNAME(0x08), FCOMMENT(0x10). Upper 3 bits reserved.
0x04 | 4 bytes | Modification Time | varies (Unix timestamp, LE) | Little-endian Unix timestamp of the original file. Value 0 means no timestamp is available.
0x08 | 1 byte | Extra Flags (XFL) | 02 (max compression) | Compressor hint: 02 = best compression (slowest), 04 = fastest compression. Informational only.
0x09 | 1 byte | Operating System | 03 (Unix) | OS where the file was compressed. 00=FAT, 03=Unix, 07=Macintosh, 0B=NTFS, FF=unknown.
EOF-8 | 4 bytes | CRC-32 | varies | CRC-32 checksum of the uncompressed data. Detects corruption during transfer or storage.
EOF-4 | 4 bytes | ISIZE | varies | Original uncompressed file size modulo 2^32 (little-endian). Files larger than 4 GB wrap around.
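
The fixed header and trailer fields listed above can be decoded with Python's struct module. This is a minimal sketch that assumes a single-member stream held in memory and does not walk the optional FEXTRA, FNAME, FCOMMENT, or FHCRC fields; GzipInfo and read_gzip_info are illustrative names.

```python
import gzip
import struct
import zlib
from typing import NamedTuple


class GzipInfo(NamedTuple):
    method: int
    flags: int
    mtime: int
    xfl: int
    os_id: int
    crc32: int
    isize: int


def read_gzip_info(blob: bytes) -> GzipInfo:
    """Decode the 10-byte fixed header and 8-byte trailer of a single-member gzip blob."""
    if len(blob) < 18 or blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    # "<" = little-endian, no padding; B = 1 byte, I = 4 bytes (offsets 0x02..0x09).
    method, flags, mtime, xfl, os_id = struct.unpack_from("<BBIBB", blob, 2)
    # Trailer: CRC-32 at EOF-8, then ISIZE at EOF-4, both little-endian.
    crc32, isize = struct.unpack_from("<II", blob, len(blob) - 8)
    return GzipInfo(method, flags, mtime, xfl, os_id, crc32, isize)


# Usage: compare the stored trailer values with values computed from the payload.
payload = b"hello gzip"
info = read_gzip_info(gzip.compress(payload, mtime=0))
print(info.method == 8, info.isize == len(payload), info.crc32 == zlib.crc32(payload))
```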
  • 1992: Jean-loup Gailly and Mark Adler release gzip as a free replacement for Unix compress (which used patented LZW)
  • 1996: RFC 1952 published, formally specifying the GZIP file format version 4.3
  • 1999: HTTP/1.1 (RFC 2616) standardizes Content-Encoding: gzip for transparent web compression
  • 2004: pigz released, a parallel gzip implementation using multiple CPU cores
  • 2015: Brotli (RFC 7932) begins displacing gzip for HTTP compression with 15-25% better ratios

Compress a file with gzip (replaces original)
gzip file

Compresses the file in-place, replacing the original with file.gz. The original file is deleted after compression. Use gzip -k to keep the original.

Decompress a gzip file
gunzip file.gz

Restores the original uncompressed file and removes the .gz file. Equivalent to gzip -d file.gz. Use gunzip -k or gzip -dk to keep the .gz file.

View compressed file contents without extracting
zcat file.gz

Decompresses to stdout without modifying the .gz file. Pipe to head, grep, or other tools for inspection. zless provides pager support for large files.

Test gzip file integrity without extracting
gzip -t file.gz

Verifies the CRC-32 checksum and ISIZE field without producing output. Returns exit code 0 on success, non-zero if the file is corrupt or truncated.

GZ to UNCOMPRESSED (export, lossless): Decompressing a .gz file restores the original uncompressed file. Required when downstream tools cannot read gzip-compressed input directly.
GZ to BZ2 (transcode, lossless): Bzip2 achieves 10-15% better compression than gzip on text-heavy data. Recompressing a .gz file as .bz2 reduces storage footprint for long-term archival at the cost of slower decompression.
GZ to XZ (transcode, lossless): XZ (LZMA2) provides 20-30% better compression ratios than gzip. Modern Linux distributions and source code releases have shifted from .tar.gz to .tar.xz for smaller download sizes.
Security risk: LOW

Attack Vectors

  • Gzip bomb: a small .gz file (e.g., 45 KB) decompresses to gigabytes or terabytes of data, exhausting disk space and memory on the target system
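  • HTTP gzip decompression attacks: malicious servers send gzip-compressed responses that expand to massive payloads, overwhelming client memory. (This is a separate issue from BREACH, a side-channel attack that infers secrets from the compressed size of HTTPS responses.)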
  • Directory traversal via FNAME header: the original filename field in the gzip header could contain path separators, but modern decompressors strip path components

Mitigation: FileDex does not decompress arbitrary gzip files server-side. All format analysis is reference-only. When handling untrusted .gz files, set decompression size limits (gunzip does not enforce limits by default). For HTTP gzip, configure a maximum body size in your web server or proxy, and cap the decompressed size in the application as well.
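
One way to enforce such a limit in application code is to cap the decompressor's output. The sketch below uses Python's standard zlib module; bounded_gunzip is a hypothetical helper, the limit is whatever your application can tolerate, and only the first member of a multi-member stream is handled.

```python
import gzip
import zlib


def bounded_gunzip(blob: bytes, limit: int) -> bytes:
    """Decompress a single-member gzip blob, refusing to produce more than `limit` bytes."""
    # wbits=31 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(31)
    # max_length caps the output of this call; asking for limit + 1 bytes
    # detects overflow without inflating the whole payload.
    out = decomp.decompress(blob, limit + 1)
    if len(out) > limit:
        raise ValueError(f"decompressed size exceeds {limit} bytes")
    return out
```

With a 1 MB limit, a small archive that expands to 10 MB of zeros is rejected after at most about 1 MB has been inflated, instead of after the full expansion.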

gzip tool
Standard GNU compression utility, pre-installed on virtually all Unix-like systems
pigz tool
Parallel gzip — uses multiple CPU cores for 4-8x faster compression
zlib library
C library implementing DEFLATE compression, used by gzip, PNG, HTTP, and Git
pako library
JavaScript zlib port for browser and Node.js gzip/inflate operations
Nginx tool
Web server with built-in gzip_static and dynamic gzip compression modules