Microsoft Word Document (Open XML)
DOCX is Microsoft Word's default format since Office 2007, storing documents as compressed XML inside a ZIP container (Office Open XML, ISO/IEC 29500). Convert DOCX to PDF using LibreOffice headless mode or open in Google Docs, Apple Pages, or OnlyOffice.
Office Open XML requires a complex rendering engine for styles, templates, and embedded objects, which is not available in browser WASM.
Common questions
How do I convert DOCX to PDF without Microsoft Word?
Install LibreOffice and run: libreoffice --headless --convert-to pdf input.docx. This writes input.pdf to the current directory. Formatting and images generally carry over, although fonts that are not installed on the system are substituted. Google Docs also exports DOCX to PDF via File > Download > PDF.
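The conversion command can also be driven from a script. The following is a minimal sketch, assuming LibreOffice is installed and on the PATH; the function names `build_convert_command` and `convert_to_pdf` are hypothetical helpers, and `--outdir` is used only to make the output location explicit.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir=".", binary="libreoffice"):
    """Return the argument list for a headless DOCX -> PDF conversion."""
    return [
        binary,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_to_pdf(docx_path, out_dir=".", timeout=120):
    """Run the conversion and return the path where the PDF is expected."""
    # Some installs expose the executable as "soffice" rather than "libreoffice".
    binary = shutil.which("libreoffice") or shutil.which("soffice")
    if binary is None:
        raise FileNotFoundError("LibreOffice executable not found on PATH")
    subprocess.run(
        build_convert_command(docx_path, out_dir, binary),
        check=True,       # raise CalledProcessError on a non-zero exit code
        timeout=timeout,  # assumed limit; tune for large documents
    )
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")
```

Calling `convert_to_pdf("input.docx")` would be the scripted equivalent of the command above. The timeout value is an arbitrary assumption, not a LibreOffice recommendation.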
Can I open a DOCX file on Linux?
LibreOffice Writer opens DOCX files natively on all Linux distributions. Install via your package manager (apt install libreoffice-writer on Debian/Ubuntu). OnlyOffice is another option with high DOCX fidelity.
Why does my DOCX look different in LibreOffice vs Word?
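Font substitution is the primary cause. DOCX files reference Windows fonts (Calibri, Cambria) that may not be installed on Linux. Install metric-compatible substitutes such as Carlito and Caladea (fonts-crosextra-carlito and fonts-crosextra-caladea on Debian/Ubuntu), install the Microsoft core fonts package (ttf-mscorefonts-installer) for older fonts such as Arial and Times New Roman, or embed fonts in the document before sharing.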
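To see which fonts a document expects before opening it on another system, the font table inside the archive can be read directly. This is a rough sketch, assuming the document uses the transitional WordprocessingML namespace and stores its font table at word/fontTable.xml; `declared_fonts` is a hypothetical helper name, and the list it returns may not cover every font applied in individual runs.

```python
import zipfile
import xml.etree.ElementTree as ET

# Transitional WordprocessingML namespace (assumed; strict-mode files differ).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def declared_fonts(docx):
    """Return the sorted font names declared in word/fontTable.xml.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as zf:
        if "word/fontTable.xml" not in zf.namelist():
            return []
        root = ET.fromstring(zf.read("word/fontTable.xml"))
    names = set()
    for font in root.iter(f"{{{W_NS}}}font"):
        name = font.get(f"{{{W_NS}}}name")
        if name:
            names.add(name)
    return sorted(names)
```

Comparing the result against the fonts installed locally (for example the output of `fc-list` on Linux) shows which names are likely to be substituted.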
Is DOCX the same as DOC?
No. DOC is a proprietary binary format used by Word 97-2003. DOCX is an open XML format (ISO/IEC 29500) introduced with Office 2007. DOCX files are smaller, more interoperable, and easier to recover from corruption because the XML is human-readable.
What makes .DOCX special
What is a DOCX file?
DOCX is the default document format for Microsoft Word since Office 2007. It uses the Office Open XML (OOXML) standard, storing content as compressed XML files inside a ZIP container. This makes it more compact and resilient than the legacy binary DOC format.
How to open DOCX files
- Microsoft Word (Windows, macOS, Web) — Full editing
- Google Docs (Web) — Free, online editing
- LibreOffice Writer (Windows, macOS, Linux) — Free, open-source
- Apple Pages (macOS, iOS) — Free
- OnlyOffice (Windows, macOS, Linux) — Free, open-source
Technical specifications
| Property | Value |
|---|---|
| Format | Office Open XML (OOXML) |
| Container | ZIP archive |
| Content | XML documents + media resources |
| Standard | ISO/IEC 29500, ECMA-376 |
| Macros | .docm extension for macro-enabled |
Programs that open DOCX files
- Microsoft Word — Native editor
- Google Docs — Free online editing
- LibreOffice Writer — Free office suite
- WPS Office — Free alternative
- OnlyOffice — Open-source office
Common use cases
- Business documents: Reports, letters, proposals
- Academic papers: Essays, theses, dissertations
- Resumes: Job applications and CVs
- Legal documents: Contracts and agreements
.DOCX compared to alternatives
| Formats | Criteria | Winner |
|---|---|---|
| .DOCX vs .DOC | File size: DOCX uses ZIP compression, typically producing files 30-50% smaller than equivalent binary DOC files. The XML structure also compresses text content more efficiently than the binary compound document format. | DOCX wins |
| .DOCX vs .ODT | Interoperability: Both are XML-in-ZIP formats with ISO standards (29500 vs 26300). DOCX has wider application support due to Office market share. ODT has better fidelity in LibreOffice. Complex features like SmartArt and ActiveX controls exist only in DOCX. | Draw |
| .DOCX vs .PDF | Editability: DOCX stores content as semantic paragraphs with styles, sections, and tracked changes, a structure designed for editing. PDF stores positioned glyphs in content streams optimized for rendering, not modification. | DOCX wins |
| .DOCX vs .RTF | Feature support: RTF supports basic formatting, tables, and images but lacks track changes, SmartArt, themes, content controls, and structured document properties. DOCX supports the full Office feature set including OLE embedding. | DOCX wins |
Technical reference
| Property | Value |
|---|---|
| MIME Type | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| Magic Bytes | 50 4B 03 04 (ZIP archive header; the archive contains [Content_Types].xml and a word/ directory) |
| Developer | Microsoft / Ecma International |
| Year Introduced | 2007 |
| Open Standard | Yes |
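Because the magic bytes are shared with every ZIP-based format, a reliable check has to look inside the archive as well. The sketch below is one way to combine both tests; `looks_like_docx` is a hypothetical name, and the heuristic (presence of [Content_Types].xml plus any entry under word/) is an assumption based on the layout described here, not a full validation of the package.

```python
import io
import zipfile

ZIP_MAGIC = b"PK\x03\x04"  # 50 4B 03 04


def looks_like_docx(data: bytes) -> bool:
    """Heuristic check: ZIP signature plus the expected DOCX entries."""
    if not data.startswith(ZIP_MAGIC):
        return False
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            names = set(zf.namelist())
    except zipfile.BadZipFile:
        return False
    has_content_types = "[Content_Types].xml" in names
    has_word_dir = any(name.startswith("word/") for name in names)
    return has_content_types and has_word_dir
```

An .xlsx or .pptx file passes the signature test but fails the word/ test, which is why the signature alone is not enough.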
Binary Structure
A DOCX file is a standard ZIP archive that begins with the magic bytes 50 4B 03 04 (PK). Inside, [Content_Types].xml at the root declares the content type of each part, and _rels/.rels defines the package-level relationships, including the one that points to the main document part. The main document body lives in word/document.xml, containing paragraphs (<w:p>), runs (<w:r>), and text nodes (<w:t>) in the WordprocessingML namespace. Styles are defined in word/styles.xml, numbering definitions in word/numbering.xml, and font tables in word/fontTable.xml. Images and media are stored in word/media/ as binary files referenced by relationship IDs. Headers and footers are separate XML files (word/header1.xml, word/footer1.xml) linked via section properties. The word/settings.xml file controls document-level settings such as track changes, compatibility mode, and zoom level.
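The paragraph, run, and text-node structure can be walked with nothing more than the standard library. This is a minimal sketch, assuming the transitional WordprocessingML namespace; `paragraph_texts` is a hypothetical helper, and it deliberately ignores tabs, line breaks, fields, and nested content such as text boxes.

```python
import zipfile
import xml.etree.ElementTree as ET

# Transitional WordprocessingML namespace in ElementTree's {uri}tag form (assumed).
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def paragraph_texts(docx):
    """Return one string per <w:p>, joining the <w:t> text of its runs.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    texts = []
    for paragraph in root.iter(W + "p"):
        # A paragraph's visible text is split across runs; concatenate them.
        pieces = [node.text or "" for node in paragraph.iter(W + "t")]
        texts.append("".join(pieces))
    return texts
```

For anything beyond plain text extraction, a dedicated library is likely a better fit than hand-written XML traversal.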
| Offset | Length | Field | Example | Description |
|---|---|---|---|---|
| 0x00 | 4 bytes | ZIP Signature | 50 4B 03 04 (PK) | Standard ZIP local file header. Shared with all OOXML formats (.xlsx, .pptx) and other ZIP-based files. |
| 0x04 | 2 bytes | Version needed | 14 00 (v2.0) | Minimum ZIP version needed to extract. OOXML typically uses version 2.0 (value 20). |
| 0x1A | 2 bytes | Filename length | 13 00 | Length of the first archived filename, usually [Content_Types].xml. |
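The offsets in the table can be read with `struct`. The sketch below unpacks the standard 30-byte ZIP local file header, which is little-endian, and returns the three fields listed above plus the filename that follows the header. `read_first_local_header` is a hypothetical name, and decoding the filename as UTF-8 is an assumption that holds for ASCII names such as [Content_Types].xml.

```python
import struct

# ZIP local file header: signature, version needed, flags, compression method,
# mod time, mod date, CRC-32, compressed size, uncompressed size,
# filename length, extra field length (30 bytes, little-endian).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def read_first_local_header(data: bytes) -> dict:
    """Parse the local file header at offset 0 of a ZIP/DOCX byte string."""
    if len(data) < LOCAL_HEADER.size:
        raise ValueError("too short for a ZIP local file header")
    (signature, version_needed, _flags, _method, _mtime, _mdate,
     _crc32, _comp_size, _uncomp_size, name_len, _extra_len) = (
        LOCAL_HEADER.unpack_from(data, 0)
    )
    if signature != b"PK\x03\x04":
        raise ValueError("missing PK 03 04 signature")
    name_bytes = data[LOCAL_HEADER.size:LOCAL_HEADER.size + name_len]
    return {
        "version_needed": version_needed,   # offset 0x04, e.g. 20 for v2.0
        "filename_length": name_len,        # offset 0x1A
        "filename": name_bytes.decode("utf-8", errors="replace"),
    }
```

For a typical DOCX, the bytes 14 00 at offset 0x04 decode to 20 and the bytes 13 00 at offset 0x1A decode to 19, the length of "[Content_Types].xml".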
Attack Vectors
- Macro-enabled DOCM files (renamed to DOCX) can execute VBA code on opening if macros are enabled in the user's Office security settings
- External data connections and linked OLE objects can fetch remote payloads when the document is opened, bypassing initial file scanning
- Embedded ActiveX controls in DOCX files can execute arbitrary code in Office versions prior to the Protected View sandbox
- Template injection via word/_rels/settings.xml.rels can redirect the document's attached template to a remote URL hosting a macro-enabled template
Mitigation: FileDex does not execute DOCX files. The format page is reference-only. For safe handling, always open untrusted DOCX files in Protected View (Office) or upload to Google Docs (which strips macros and active content).
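Some of the indicators listed above can be spotted without opening the document in an office suite. The following is a rough triage sketch, not a malware scanner: it flags a VBA project part and any relationship marked as external. `scan_docx` is a hypothetical helper, and the parser used here is the standard library one, so a hardened XML parser would be a safer choice for genuinely untrusted input.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship (.rels) parts.
REL = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def scan_docx(docx):
    """Return a list of (kind, part, detail) tuples for suspicious items.

    `docx` may be a file path or a binary file-like object.
    """
    findings = []
    with zipfile.ZipFile(docx) as zf:
        for part in zf.namelist():
            lower = part.lower()
            if lower.endswith("vbaproject.bin"):
                # Macro storage; unexpected in a file named .docx.
                findings.append(("vba-project", part, ""))
            if lower.endswith(".rels"):
                root = ET.fromstring(zf.read(part))
                for rel in root.iter(REL + "Relationship"):
                    if rel.get("TargetMode") == "External":
                        # Includes ordinary hyperlinks; attachedTemplate or
                        # oleObject targets deserve closer inspection.
                        rel_type = (rel.get("Type") or "").rsplit("/", 1)[-1]
                        detail = f"{rel_type} -> {rel.get('Target')}"
                        findings.append(("external-target", part, detail))
    return findings
```

An empty result does not mean a file is safe; it only means none of these specific markers were found.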