Gzip Compressed
Gzip compresses a single file or data stream using the DEFLATE algorithm (LZ77 + Huffman coding), producing a .gz file with a CRC-32 integrity check. For multi-file archives, combine with TAR to create .tar.gz tarballs — the standard packaging format for Unix software distribution.
Gzip performs single-file compression only; converting between archive formats requires recompression libraries that are not available in browser WASM.
Frequently asked questions
What is the difference between .gz and .zip?
Gzip compresses a single file or stream using DEFLATE. ZIP is a multi-file archive format that also uses DEFLATE but bundles multiple files with a central directory index. For multiple files with gzip, you first bundle them with tar (.tar.gz). Both use the same DEFLATE algorithm, so compression ratios are similar.
Can I compress multiple files into one .gz file?
Gzip compresses only one file at a time. To compress multiple files, first create a tar archive and then compress it: tar czf archive.tar.gz folder/. The resulting .tar.gz contains all files in a single compressed archive.
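The same tar-then-gzip bundling can be sketched with Python's standard tarfile module. This is a minimal illustration rather than a replacement for the tar command; the helper names make_tar_gz and list_tar_gz and the in-memory file names are hypothetical.

```python
import io
import tarfile

def make_tar_gz(files: dict[str, bytes]) -> bytes:
    """Bundle in-memory files into one gzip-compressed tar archive."""
    buf = io.BytesIO()
    # Mode "w:gz" writes the tar stream through a gzip compressor.
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def list_tar_gz(blob: bytes) -> list[str]:
    """Return the member names stored in a .tar.gz blob."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        return tar.getnames()

archive = make_tar_gz({"a.txt": b"alpha", "b.txt": b"beta"})
print(list_tar_gz(archive))  # ['a.txt', 'b.txt']
```

The outer layer is still a single gzip stream, which is why the result starts with the gzip magic bytes.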
How do I check if a .gz file is corrupted?
Run gzip -t file.gz to verify the CRC-32 checksum without extracting. A successful test exits silently; a corrupt file prints an error message with the specific CRC mismatch. For .tar.gz files, use tar tzf archive.tar.gz to verify both the gzip layer and the tar structure.
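A rough programmatic equivalent of the test mode, assuming the data fits in memory, is to decompress it and treat any decoding error as corruption. The function name gzip_is_intact is hypothetical; the exception types are the ones Python's gzip and zlib modules raise for bad headers, CRC mismatches, truncation, and invalid DEFLATE data.

```python
import gzip
import zlib

def gzip_is_intact(data: bytes) -> bool:
    """Return True if the gzip data decompresses and passes its CRC-32 check."""
    try:
        # Full decompression is required: the CRC-32 covers the uncompressed data.
        gzip.decompress(data)
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile (bad magic, CRC or length mismatch);
        # EOFError signals a truncated stream; zlib.error an invalid DEFLATE body.
        return False
```

For large or untrusted files, a streaming check with an output limit is safer than decompressing everything at once.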
Is gzip still used for web compression?
Gzip remains the most widely supported HTTP compression method: every major browser and web server supports Content-Encoding: gzip. Brotli (br) is now widely preferred over HTTPS and typically compresses text 15-25% better, but gzip is still the universal fallback. HTTP/1.1 does not require gzip; clients and servers negotiate it through the Accept-Encoding and Content-Encoding headers.
What sets .GZ apart
What is a GZ file?
GZ (GNU Zip) is a file compression format based on the DEFLATE algorithm, created by Jean-loup Gailly and Mark Adler in 1992 as a free replacement for the proprietary compress format on Unix systems. Unlike ZIP, GZ compresses a single file or data stream without bundling multiple files — for multi-file archives, GZ is combined with TAR, producing the ubiquitous .tar.gz (also written .tgz) format.
GZ is deeply embedded in Unix/Linux infrastructure: log rotation, web server content encoding, software distribution, and database backups all rely on it. The format is lossless and patent-free, making it universally available.
How to open GZ files
- gunzip / gzip -d (macOS, Linux): built-in CLI, e.g. gunzip file.gz
- tar (macOS, Linux): extracts .tar.gz in one step, e.g. tar -xzf archive.tar.gz
- 7-Zip (Windows): free, open-source
- WinRAR (Windows): built-in GZ support
- The Unarchiver (macOS): free, handles .gz and .tar.gz
- PeaZip (Windows, Linux): free alternative
Technical specifications
| Property | Value |
|---|---|
| Algorithm | DEFLATE (LZ77 + Huffman coding) |
| Single-file | Compresses one file or stream |
| Checksum | CRC-32 for integrity verification |
| Concatenation | Multiple GZ streams can be concatenated |
| Common pairs | .tar.gz / .tgz, .sql.gz, .log.gz |
| Magic bytes | 1F 8B |
| MIME type | application/gzip |
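Two rows of the table, the magic bytes and stream concatenation, can be checked with a short sketch using Python's standard gzip module. The sample payloads are arbitrary.

```python
import gzip

part1 = gzip.compress(b"first\n")
part2 = gzip.compress(b"second\n")

# Each member is a complete gzip stream and starts with the magic bytes 1F 8B.
assert part1[:2] == b"\x1f\x8b"
assert part2[:2] == b"\x1f\x8b"

# Byte-level concatenation of two streams is itself a valid .gz file.
combined = part1 + part2
print(gzip.decompress(combined))  # b'first\nsecond\n'
```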
Common use cases
- Web compression: HTTP Content-Encoding: gzip compresses text responses (HTML, CSS, JS, JSON) for faster transfer, typically 60-80% smaller
- Log file rotation: logrotate on Linux compresses old log files with gzip to save disk space
- Source code distribution: .tar.gz tarballs are the standard format for open-source software releases
- Database backups: mysqldump | gzip > backup.sql.gz creates compressed SQL dumps
- Pipeline compression: GZ can compress data streams in real-time (piped between processes)
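The pipeline case can be sketched in Python with zlib's streaming compressor, which accepts data chunk by chunk instead of needing the whole input up front. The generator name gzip_stream is hypothetical; passing wbits=31 is the documented way to ask zlib for a gzip header and trailer.

```python
import zlib
from typing import Iterable, Iterator

def gzip_stream(chunks: Iterable[bytes], level: int = 6) -> Iterator[bytes]:
    """Compress an iterable of byte chunks into gzip output, piece by piece."""
    # wbits=31 (16 + 15) makes zlib emit a gzip wrapper around the DEFLATE data.
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:  # the compressor may buffer input and return nothing yet
            yield out
    # flush() emits any buffered data plus the CRC-32 / ISIZE trailer.
    yield comp.flush()
```

Joining the yielded pieces gives a regular .gz stream, so the consumer on the other end of a pipe can decompress it with any gzip tool.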
GZ vs other compression formats
| Format | Algorithm | Ratio | Speed | Single-file |
|---|---|---|---|---|
| GZ | DEFLATE | Good | Fast | ✅ |
| BZ2 | BWT + Huffman | Better | Slower | ✅ |
| XZ | LZMA2 | Best | Slowest | ✅ |
| ZIP | DEFLATE | Good | Fast | ❌ (multi-file) |
| Zstandard | zstd | Very good | Very fast | ✅ |
For most use cases, GZ offers the best balance of speed and compression. XZ is preferred when maximum compression matters more than time. Zstandard (.zst) is increasingly replacing GZ in high-performance scenarios.
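For a rough feel of the ratio column, the three single-file formats that ship in Python's standard library can be compared on the same input. This is only a sketch: the sample is a hypothetical, highly repetitive log line, so the numbers it prints will not be representative of real data, and it does not measure speed.

```python
import bz2
import gzip
import lzma

# Hypothetical sample: repeated log lines compress unusually well.
sample = b"2024-01-01 INFO request served in 12 ms\n" * 5000

def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the compressed size in bytes for gzip, bzip2, and xz."""
    return {
        "gz": len(gzip.compress(data)),
        "bz2": len(bz2.compress(data)),
        "xz": len(lzma.compress(data)),
    }

for name, size in compressed_sizes(sample).items():
    print(f"{name}: {size} bytes ({size / len(sample):.2%} of original)")
```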
Working with GZ on the command line
```shell
# Compress a file (replaces original with .gz)
gzip file.sql

# Compress while keeping the original
gzip -k file.sql

# Decompress
gunzip file.sql.gz
# or
gzip -d file.sql.gz

# Create .tar.gz archive
tar -czf archive.tar.gz /path/to/folder/

# Extract .tar.gz archive
tar -xzf archive.tar.gz

# View compressed file without extracting
zcat file.txt.gz | head -20
```
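The last command, viewing the start of a compressed file without extracting it, has a simple counterpart in Python. A minimal sketch, assuming the file is UTF-8 text; head_gz is a hypothetical helper and the demo file is created in a temporary directory.

```python
import gzip
import itertools
import os
import tempfile

def head_gz(path: str, n: int = 20) -> list[str]:
    """Return the first n lines of a gzip-compressed text file."""
    # gzip.open decompresses lazily, so only the start of the file is read.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [line.rstrip("\n") for line in itertools.islice(fh, n)]

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "file.txt.gz")
    with gzip.open(demo_path, "wt", encoding="utf-8") as fh:
        fh.write("".join(f"line {i}\n" for i in range(100)))
    first = head_gz(demo_path, 3)

print(first)  # ['line 0', 'line 1', 'line 2']
```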
HTTP gzip compression
Web servers enable gzip compression for text-based responses to reduce bandwidth and improve load times. In Nginx: gzip on; gzip_types text/css application/javascript application/json; (text/html is always compressed once gzip is on, so it does not need to be listed). In Apache: enable mod_deflate. Browsers advertise support via the Accept-Encoding: gzip, deflate, br request header, and servers respond with Content-Encoding: gzip when they compress the body. This is transparent to the end user and can significantly reduce page load times on slow connections.
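The negotiation described above can be sketched as a small server-side helper. This is a simplified illustration, not how Nginx or Apache implement it: encode_response is a hypothetical function, and the parsing deliberately ignores q-values such as gzip;q=0.

```python
import gzip

def encode_response(body: bytes, accept_encoding: str) -> tuple[bytes, dict[str, str]]:
    """Gzip the body only if the client listed gzip in Accept-Encoding."""
    # Simplified parsing: strips parameters and ignores q-value weighting.
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in offered:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    # No gzip support advertised: send the body unmodified.
    return body, {}

payload, headers = encode_response(b"<p>hello</p>" * 200, "gzip, deflate, br")
print(headers, len(payload))
```

The Vary header is included so that caches keep compressed and uncompressed variants apart.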
Integrity verification
GZ files include a CRC-32 checksum and the uncompressed file size. If a GZ file is truncated or corrupted during transfer, gunzip will report an error. Verify integrity with gzip -t file.gz (test mode) before depending on the contents.
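The two trailer fields can be read directly from the last 8 bytes of a single-stream .gz file. The sketch below uses Python's struct and zlib modules; read_trailer is a hypothetical helper, and it assumes the data contains exactly one gzip stream.

```python
import gzip
import struct
import zlib

def read_trailer(blob: bytes) -> tuple[int, int]:
    """Return (crc32, isize) from the final 8 bytes of a single gzip stream."""
    # "<II" = two little-endian unsigned 32-bit integers.
    return struct.unpack("<II", blob[-8:])

data = b"hello gzip" * 10
blob = gzip.compress(data)
crc, isize = read_trailer(blob)

# The stored values match a CRC-32 and length computed over the original data.
assert crc == zlib.crc32(data)
assert isize == len(data) % 2**32
```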
Technical reference
- MIME type: application/gzip
- Magic bytes: 1F 8B (gzip magic number)
- Developer: Jean-loup Gailly / Mark Adler
- Year introduced: 1992
- Open standard: Yes (specification: RFC 1952)
Binary structure
A gzip file consists of a 10-byte fixed header, optional extra fields, DEFLATE-compressed data, and an 8-byte trailer.

The header starts with magic bytes 1F 8B, followed by the compression method byte (08 = DEFLATE, the only defined method), a flags byte (FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT bits), 4-byte modification time (Unix timestamp, little-endian), extra flags byte (XFL: 02 = best compression, 04 = fastest), and OS byte (00=FAT, 03=Unix, 07=Macintosh, 0B=NTFS, FF=unknown).

The optional fields follow the fixed header in this order. If the FEXTRA flag is set, a 2-byte length is followed by that many bytes of extra data. If FNAME is set, an original filename string (null-terminated, Latin-1) follows. If FCOMMENT is set, a null-terminated comment string follows. If FHCRC is set, a 2-byte CRC-16 of the header appears.

The compressed data block uses raw DEFLATE (RFC 1951) without a zlib wrapper. The 8-byte trailer contains a CRC-32 of the uncompressed data (4 bytes, little-endian) and ISIZE (4 bytes, little-endian), which is the original file size modulo 2^32.

Multiple gzip streams can be concatenated into a single .gz file, and compliant decompressors must process all streams sequentially.
| Offset | Length | Field | Example | Description |
|---|---|---|---|---|
| 0x00 | 2 bytes | Magic Number | 1F 8B | Fixed gzip identification bytes. Any other value means the file is not a valid gzip stream. |
| 0x02 | 1 byte | Compression Method | 08 (DEFLATE) | Compression algorithm identifier. Only 08 (DEFLATE) is defined. Values 0-7 are reserved. |
| 0x03 | 1 byte | Flags (FLG) | 08 (FNAME set) | Bit flags: FTEXT(0x01), FHCRC(0x02), FEXTRA(0x04), FNAME(0x08), FCOMMENT(0x10). Upper 3 bits reserved. |
| 0x04 | 4 bytes | Modification Time | varies (Unix timestamp, LE) | Little-endian Unix timestamp of the original file. Value 0 means no timestamp is available. |
| 0x08 | 1 byte | Extra Flags (XFL) | 02 (max compression) | Compressor hint: 02 = best compression (slowest), 04 = fastest compression. Informational only. |
| 0x09 | 1 byte | Operating System | 03 (Unix) | OS where the file was compressed. 00=FAT, 03=Unix, 07=Macintosh, 0B=NTFS, FF=unknown. |
| EOF-8 | 4 bytes | CRC-32 | varies | CRC-32 checksum of the uncompressed data. Detects corruption during transfer or storage. |
| EOF-4 | 4 bytes | ISIZE | varies | Original uncompressed file size modulo 2^32 (little-endian). Files larger than 4 GB wrap around. |
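The fixed 10-byte header laid out in the table can be decoded with a single struct.unpack call. This is a minimal sketch that reads only the fixed fields and does not walk the optional FEXTRA, FNAME, FCOMMENT, or FHCRC sections; GzipHeader, parse_header, and flag_names are hypothetical names.

```python
import gzip
import struct
from typing import NamedTuple

class GzipHeader(NamedTuple):
    method: int  # 8 = DEFLATE
    flags: int   # FLG bit field
    mtime: int   # Unix timestamp, 0 = not available
    xfl: int     # extra flags (compressor hint)
    os: int      # originating OS identifier

FLAG_BITS = {0x01: "FTEXT", 0x02: "FHCRC", 0x04: "FEXTRA", 0x08: "FNAME", 0x10: "FCOMMENT"}

def parse_header(blob: bytes) -> GzipHeader:
    """Decode the fixed 10-byte gzip header at offsets 0x00 to 0x09."""
    if len(blob) < 10 or blob[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    # "<" = little-endian, B = 1 byte, I = 4 bytes: CM, FLG, MTIME, XFL, OS.
    return GzipHeader(*struct.unpack("<BBIBB", blob[2:10]))

def flag_names(flags: int) -> list[str]:
    """List the names of the FLG bits that are set."""
    return [name for bit, name in FLAG_BITS.items() if flags & bit]

header = parse_header(gzip.compress(b"example", mtime=0))
print(header.method, header.mtime, flag_names(header.flags))
```

Passing mtime=0 to gzip.compress makes the timestamp field deterministic, which is convenient when comparing archives byte for byte.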
Vulnerabilities
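- Gzip bomb: a small .gz file expands to a far larger output, exhausting disk space and memory on the target system. A single DEFLATE stream tops out at roughly 1000:1 (about 1032:1), so a 10 MB file can decompress to around 10 GB; nested layers of compression multiply the effect
- HTTP gzip decompression attacks: malicious servers send gzip-compressed responses that expand to massive payloads, overwhelming client memory. This is separate from BREACH, which is a compression side channel that leaks secrets through the size of compressed HTTPS responses, not a decompression attack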
- Directory traversal via FNAME header: the original filename field in the gzip header could contain path separators, but modern decompressors strip path components
Protection: FileDex does not decompress arbitrary gzip files server-side. All format analysis is reference-only. When handling untrusted .gz files, set decompression size limits (gunzip does not enforce limits by default). For HTTP gzip, configure a maximum body size in your web server or proxy (for example, client_max_body_size in Nginx, which caps request bodies).
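One way to enforce a decompression size limit in application code is zlib's max_length argument, which caps how much output a single call may produce. The sketch below is an illustration under stated assumptions, not FileDex's implementation: safe_gunzip is a hypothetical helper, the 10 MB default is arbitrary, and it handles a single gzip stream only.

```python
import gzip
import zlib

def safe_gunzip(blob: bytes, max_bytes: int = 10 * 1024 * 1024) -> bytes:
    """Decompress one gzip stream, refusing to produce more than max_bytes."""
    # wbits=31 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=31)
    # Ask for one byte more than allowed so an oversized stream is detectable.
    out = decomp.decompress(blob, max_bytes + 1)
    if len(out) > max_bytes:
        raise ValueError(f"decompressed size exceeds {max_bytes} bytes")
    if not decomp.eof:
        raise ValueError("gzip stream is truncated")
    return out

bomb = gzip.compress(b"\0" * 1_000_000)  # about 1 MB of zeros in roughly 1 KB
print(len(bomb))
```

Because decompression stops as soon as the limit is crossed, memory use stays bounded no matter how large the payload claims to be.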