ZIP Archive
ZIP is the universal archive format supported natively by Windows, macOS, Linux, Android, and iOS. It bundles multiple files with DEFLATE compression, CRC-32 integrity checks, and optional AES-256 encryption — all within an open specification that any tool can read and write.
Common questions
How do I open a ZIP file without installing software?
Windows, macOS, and most Linux distributions support ZIP natively. On Windows, double-click or right-click and choose 'Extract All'. On macOS, double-click to auto-extract via Archive Utility. On Linux, run `unzip archive.zip` from the terminal.
How do I password-protect a ZIP file securely?
Use 7-Zip with AES-256 encryption: `7z a -tzip -mem=AES256 -p'password' archive.zip files/`. Avoid the default ZipCrypto encryption: it is cryptographically broken, and a known-plaintext attack can recover the keys once an attacker knows only a few bytes of any encrypted file's contents.
What is the maximum file size a ZIP can handle?
Standard ZIP is limited to 4 GB per file and 4 GB total archive size because sizes and offsets are stored in 32-bit fields. ZIP64 extensions remove this limit, supporting files up to 16 exabytes. Most modern tools create ZIP64 automatically when needed.
What is a ZIP bomb and how do I protect against it?
A ZIP bomb is a small archive that expands to enormous size when extracted, sometimes petabytes from kilobytes. Protect against them by using extraction tools or libraries that enforce limits on total decompressed size and compression ratio, and by scanning archives with antivirus before extraction.
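A decompressed-size limit can be sketched with Python's standard `zipfile` module. The function name and both thresholds below are hypothetical values chosen for illustration. The check relies on the sizes declared in the archive, which an attacker can forge, so a robust extractor would also count bytes while decompressing.

```python
import zipfile

# Hypothetical limits chosen for illustration; tune them for your workload.
MAX_TOTAL_BYTES = 1 * 1024**3   # 1 GiB of declared uncompressed data
MAX_RATIO = 100                 # declared uncompressed / compressed size


def looks_like_zip_bomb(source, max_total=MAX_TOTAL_BYTES, max_ratio=MAX_RATIO):
    """Return True if the archive's declared sizes exceed the given limits.

    `source` is a path or a seekable binary file object.
    """
    with zipfile.ZipFile(source) as zf:
        total = 0
        for info in zf.infolist():
            total += info.file_size
            if total > max_total:
                return True
            # compress_size is 0 for empty files; skip the ratio check there.
            if info.compress_size and info.file_size / info.compress_size > max_ratio:
                return True
    return False
```

Calling `looks_like_zip_bomb("upload.zip")` before extraction gives a cheap first filter; it does not detect nested archives.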
Can I recover files from a corrupted ZIP archive?
Partial recovery is possible using `zip -FF corrupt.zip --out fixed.zip`, which reconstructs the archive from local file headers when the central directory is damaged. 7-Zip can also extract individual files with warnings. Full recovery depends on which sections of the archive are corrupted.
How do I password-protect a ZIP file?
Use 7-Zip (free): right-click your files, choose 7-Zip > Add to archive, set format to ZIP, enter a password, and select AES-256 encryption. Avoid the legacy ZipCrypto encryption — it is weak and easily cracked. From the command line: `zip -e -r protected.zip folder/` (uses legacy encryption) or `7z a -tzip -mem=AES256 -p'password' archive.zip files/` for proper AES-256.
My ZIP file is too large to open or extract — what can I do?
ZIP files over 4 GB require ZIP64 support. All modern tools support ZIP64, but very old software may not. If you are hitting memory limits, extract with 7-Zip or the command-line `unzip`, which streams directly to disk. On Windows, the built-in 'Extract All' is often reported to be slower and heavier on resources than 7-Zip for very large archives.
ZIP vs RAR — which is better?
ZIP is better for sharing files publicly because every operating system supports it natively. RAR is better when you control both ends (sender and receiver), need recovery records, or want slightly smaller archives. For maximum compression, use 7z instead of either.
What makes .ZIP special
What is a ZIP file?
ZIP is a lossless data compression and archive format created by Phil Katz in 1989. It can contain one or more files or directories that have been compressed to reduce total file size. ZIP is the most universally supported archive format across all operating systems.
Unlike formats such as GZ or BZ2 that compress a single stream, ZIP is a true container: it stores each file independently with its own compression method and CRC-32 checksum. This means individual files can be extracted without decompressing the entire archive — a property called random access. The trade-off is slightly lower compression ratios compared to solid-mode formats like 7z or RAR.
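As a minimal sketch of random access, Python's standard `zipfile` module can read one member without touching the others. The helper name `read_member` is hypothetical.

```python
import zipfile


def read_member(source, name):
    """Read one member's bytes without extracting the rest of the archive."""
    with zipfile.ZipFile(source) as zf:
        # ZipFile reads the central directory, then seeks straight to the
        # requested entry's local header; other entries are not decompressed.
        return zf.read(name)
```

For example, `read_member("docs.zip", "README.txt")` returns only that file's contents, however many other entries the archive holds.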
How to open ZIP files
- Windows Explorer (Windows) — Built-in extraction; right-click > Extract All
- Finder / Archive Utility (macOS) — Double-click to auto-extract
- 7-Zip (Windows) — Free, open-source; superior to built-in tools for large or complex archives
- WinRAR (Windows) — Popular archive manager with ZIP support
- The Unarchiver (macOS) — Free, handles edge cases Windows apps create
- unzip (Linux/macOS) — CLI tool pre-installed on most Unix systems
Technical specifications
| Property | Value |
|---|---|
| Compression | DEFLATE, Store (none), BZip2, LZMA |
| Encryption | AES-256 (modern), ZipCrypto (legacy, insecure) |
| Max file size | 4 GB per file (ZIP64: 16 EB) |
| Max archive size | 4 GB (ZIP64: 16 EB) |
| Multi-volume | Supported (split archives: .zip, .z01, .z02...) |
| Unicode filenames | UTF-8 supported (general purpose bit 11) |
| Random access | Yes — each file stored independently |
| Magic bytes | 50 4B 03 04 (PK\x03\x04 at start of each entry) |
Structure: how a ZIP file is laid out
ZIP has an unusual structure: the central directory (the master index of all files) lives at the end of the file, not the beginning. This enables efficient appending of files without rewriting the archive. The file on disk looks like:
[Local file header + data] × N files
[Central directory headers] × N files
[End of central directory record] ← parsers start here
This design makes ZIP efficient to append to but fragile to truncation — if the end-of-central-directory record is lost, the whole archive appears invalid even if all file data is intact.
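The "parsers start here" step can be sketched in Python as a backward search for the end-of-central-directory signature. This is an illustrative sketch, not a full parser: the function name is hypothetical, it ignores ZIP64 and multi-volume archives, and it can be fooled by a signature embedded in the archive comment.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_MIN_SIZE = 22            # fixed part of the record, without a comment
MAX_COMMENT = 0xFFFF          # comment length is a 16-bit field


def find_eocd(data: bytes):
    """Return (entry_count, cd_size, cd_offset) or None if no EOCD is found."""
    # The record must sit within the last 22 + 65535 bytes of the file.
    start = max(0, len(data) - EOCD_MIN_SIZE - MAX_COMMENT)
    pos = data.rfind(EOCD_SIGNATURE, start)
    if pos < 0 or pos + EOCD_MIN_SIZE > len(data):
        return None
    # Layout: signature, two disk numbers, entries on this disk, total
    # entries, central directory size, central directory offset, comment length.
    (_sig, _disk, _cd_disk, _n_disk, n_total,
     cd_size, cd_offset, _comment_len) = struct.unpack_from("<4s4H2LH", data, pos)
    return n_total, cd_size, cd_offset
```

If the function returns `None` for a truncated file, the local file headers may still be intact, which is why repair tools can often rebuild the index.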
Common use cases
- File distribution: Software downloads, installers, and web attachments
- Email attachments: ZIP compresses multiple files into one attachment
- Backup: Compress folders for archival storage
- Software packaging: Application packages and documents (.jar, .apk, .docx, and .xlsx files are all ZIPs internally)
- Web assets: .war and .ear Java deployment archives are ZIP-based
ZIP64 — overcoming the 4 GB limit
The original ZIP specification stores sizes in 32-bit fields, limiting both individual files and total archive size to ~4 GB. ZIP64 extensions (added in 2001) use 64-bit fields and allow archives up to 16 exabytes. All modern tools support ZIP64 automatically. Legacy extractors from before 2001 cannot read ZIP64 archives.
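The arithmetic behind the two limits can be checked in a few lines of Python. The helper `needs_zip64` is a hypothetical name; it reflects the convention that the value 0xFFFFFFFF in a 32-bit field is reserved as a marker pointing to the ZIP64 field.

```python
ZIP32_MAX = 0xFFFFFFFF          # largest value a 32-bit size or offset field holds
ZIP64_MAX = 0xFFFFFFFFFFFFFFFF  # largest value a 64-bit field holds


def needs_zip64(size: int, offset: int = 0) -> bool:
    """True when a size or header offset no longer fits a 32-bit field.

    0xFFFFFFFF itself is reserved as the "look in the ZIP64 extra field"
    marker, so values equal to it also need ZIP64.
    """
    return size >= ZIP32_MAX or offset >= ZIP32_MAX


limit_32_gib = ZIP32_MAX / 2**30   # just under 4 GiB
limit_64_eib = ZIP64_MAX / 2**60   # about 16 EiB
```

A 5 GiB file, or a small file placed more than 4 GiB into the archive, both make `needs_zip64` return `True`.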
.ZIP compared to alternatives
| Formats | Criteria | Winner |
|---|---|---|
| .ZIP vs .RAR | Platform compatibility: ZIP is natively supported by Windows, macOS, Linux, iOS, and Android without any software installation. RAR requires third-party tools on every platform, except for extraction on Windows 11. | ZIP wins |
| .ZIP vs .7Z | Compression ratio: 7z's LZMA2 algorithm typically produces smaller archives than ZIP's DEFLATE (gains of 30-70% are often quoted, but results depend heavily on the data), particularly on text, code, and repetitive data. ZIP's advantage is compatibility, not compression efficiency. | 7Z wins |
| .ZIP vs .TAR.GZ | Random access extraction: ZIP stores each file independently with its own local header, enabling extraction of individual files without processing the entire archive. TAR.GZ requires sequential reading through all preceding entries. | ZIP wins |
| .ZIP vs .RAR | Corruption recovery: RAR supports recovery records (redundant data blocks that allow repairing partially corrupted archives). ZIP has no built-in recovery records; a corrupted central directory makes the archive unreadable to standard tools, although repair tools can sometimes rebuild it from the local file headers. | RAR wins |
Technical reference
- MIME Type: application/zip
- Magic Bytes: 50 4B 03 04 (PK signature, Phil Katz's initials)
- Developer: Phil Katz / PKWARE
- Year Introduced: 1989
- Open Standard: Yes
Binary Structure
ZIP uses a three-section layout: local file entries, central directory, and end-of-central-directory record (EOCD). Each local file entry begins with signature 50 4B 03 04 (PK\x03\x04), followed by version, flags, compression method, timestamps, CRC-32, sizes, filename and extra-field lengths, the filename itself, and the compressed data. The central directory at the end of the file mirrors each local entry with signature 50 4B 01 02, adding file attributes, comments, and the byte offset to each local header. The EOCD record (50 4B 05 06) stores the total entry count and the offset to the central directory start. Because the EOCD can be followed by a variable-length archive comment, parsers locate it by scanning backward from the file end. This end-anchored design makes ZIP efficient for appending files but fragile to truncation: losing the EOCD makes the archive look unreadable to standard tools even if all file data is intact.
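The fixed 30-byte part of a local file header can be decoded with Python's `struct` module. The sketch below is illustrative only: the function name and the returned dictionary keys are hypothetical, and it does not handle ZIP64 extra fields or entries whose sizes are deferred to a trailing data descriptor.

```python
import struct

# signature, 5 x 16-bit fields, 3 x 32-bit fields, 2 x 16-bit lengths
LOCAL_HEADER = struct.Struct("<4s5H3L2H")   # 30 bytes, little-endian


def parse_local_header(data: bytes, offset: int = 0) -> dict:
    """Decode the local file header that starts at `offset` in `data`."""
    (sig, _version, flags, method, _mod_time, _mod_date,
     crc32, comp_size, uncomp_size, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, offset)
    if sig != b"PK\x03\x04":
        raise ValueError("not a local file header")
    name_start = offset + LOCAL_HEADER.size
    raw_name = data[name_start:name_start + name_len]
    # Bit 11 of the flags marks the name as UTF-8; otherwise assume CP437.
    name = raw_name.decode("utf-8" if flags & 0x0800 else "cp437")
    return {
        "name": name,
        "method": method,
        "crc32": crc32,
        "compressed_size": comp_size,
        "uncompressed_size": uncomp_size,
        # Compressed data begins after the filename and the extra field.
        "data_offset": name_start + name_len + extra_len,
    }
```

Walking an archive this way, header by header, is roughly how repair tools rebuild an index when the central directory is missing.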
| Offset | Length | Field | Example | Description |
|---|---|---|---|---|
| 0x00 | 4 bytes | Local File Header Signature | 50 4B 03 04 (PK\x03\x04) | Marks the start of each file entry. 'PK' honors Phil Katz, creator of the ZIP format. |
| 0x04 | 2 bytes | Version Needed | 14 00 (v2.0) | Minimum ZIP spec version required to extract. 0x0014 = 2.0 (DEFLATE). 0x002D = 4.5 (ZIP64). |
| 0x08 | 2 bytes | Compression Method | 08 00 (DEFLATE) | 0x0000 = Store (no compression). 0x0008 = DEFLATE. 0x000E = LZMA. |
| 0x0A | 4 bytes | Last Modified Time and Date | varies | MS-DOS timestamp format (2-byte time followed by 2-byte date): 2-second granularity, local time, no timezone information. |
| 0x0E | 4 bytes | CRC-32 | varies | CRC-32 checksum of uncompressed file data. Used to verify extraction integrity. |
| 0x12 | 4 bytes | Compressed Size | varies | Compressed data size in bytes. 0xFFFFFFFF signals ZIP64 extension with 64-bit size field. |
| EOF-22 | 22 bytes | EOCD Record | 50 4B 05 06 (PK\x05\x06) | End of Central Directory. Contains entry count and offset to central directory start. Sits 22 bytes before EOF when there is no archive comment; parsers scan backward from EOF to find it. |
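The MS-DOS timestamp listed in the table packs a 16-bit time and a 16-bit date into bit fields. A minimal Python sketch of the decoding, with a hypothetical function name, might look like this:

```python
from datetime import datetime


def decode_dos_datetime(dos_date: int, dos_time: int) -> datetime:
    """Convert the 16-bit MS-DOS date and time fields to a naive datetime."""
    year = ((dos_date >> 9) & 0x7F) + 1980   # 7 bits, years since 1980
    month = (dos_date >> 5) & 0x0F           # 4 bits
    day = dos_date & 0x1F                    # 5 bits
    hour = (dos_time >> 11) & 0x1F           # 5 bits
    minute = (dos_time >> 5) & 0x3F          # 6 bits
    second = (dos_time & 0x1F) * 2           # 5 bits, stored in 2-second units
    return datetime(year, month, day, hour, minute, second)
```

The result carries no timezone because the format stores local time only, and odd seconds cannot be represented.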
Attack Vectors
- Zip Slip (path traversal)
- ZIP Bomb (decompression bomb)
- Malicious macros in contained Office files
- Symlink attacks
- ZipCrypto weak encryption
Mitigation:
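As one possible mitigation for the Zip Slip entry above, an extractor can refuse any member whose resolved path falls outside the destination directory. The sketch below is an assumption-laden illustration in Python: `safe_extract` is a hypothetical helper, and Python's own `zipfile` already strips `..` components and leading slashes during extraction, so the explicit check mainly serves to reject a suspicious archive outright rather than silently sanitize it.

```python
import os
import zipfile


def safe_extract(source, dest_dir):
    """Extract all members, refusing any whose path would leave dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(source) as zf:
        for info in zf.infolist():
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            # commonpath equals dest_root only when target is inside it.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {info.filename!r}")
        # Only extract once every member name has passed the check.
        zf.extractall(dest_root)
```

The same validation pass is a natural place to add size limits against ZIP bombs and to skip symlink entries.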