Plain Text
TXT is a raw character stream with no formatting, no metadata, and no binary structure. Virtually every operating system, programming language, and text editor reads TXT natively, which makes it one of the most interoperable file formats in existence.
Plain text has no binary structure to convert. Content interpretation depends on context, not format.
Common questions
How do I find out what encoding a TXT file uses?
Run file --mime-encoding filename.txt on Linux/macOS to get a heuristic guess at the encoding based on byte patterns. On Windows, Notepad++ shows the detected encoding in the status bar. If the file shows mojibake (garbled characters), it was likely saved in one encoding and opened in another.
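Where the file command is not available, a similar guess can be approximated in a script. The following is a minimal sketch, not a full detector: guess_encoding is a hypothetical helper that relies on the fact that legacy 8-bit text is rarely well-formed UTF-8 by accident. The Windows-1252 fallback is an assumption, not a detection, and UTF-16/UTF-32 input is not handled here.

```python
def guess_encoding(data: bytes) -> str:
    """Return a best-effort encoding label for raw file bytes (hypothetical helper)."""
    # Pure 7-bit data decodes identically under ASCII, UTF-8, and most legacy code pages.
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        # Strict decoding fails on byte sequences that are not well-formed UTF-8.
        data.decode("utf-8", errors="strict")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: fall back to a common Western legacy encoding.
        return "cp1252"
```

A typical call would read the file in binary mode first, for example guess_encoding(open(path, "rb").read()), where path is whatever file is being inspected.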
Why does my TXT file show weird characters when I open it?
An encoding mismatch is the usual cause. Typically the file was saved in Windows-1252 or ISO-8859-1 but your editor is reading it as UTF-8, or vice versa. Use iconv to convert the file to UTF-8 (for example, iconv -f WINDOWS-1252 -t UTF-8 input.txt > output.txt), or manually select the correct encoding in your text editor.
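The mismatch and its repair can be illustrated in a few lines. This is a sketch under the assumption that the source encoding is already known to be Windows-1252; to_utf8 is a hypothetical helper name, and bytes that are undefined in the source encoding will raise an error rather than be silently replaced.

```python
def to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode bytes from a known legacy encoding to UTF-8 (hypothetical helper)."""
    # Decode with the encoding the file was actually saved in...
    text = data.decode(source_encoding)
    # ...then write the same characters back out as UTF-8.
    return text.encode("utf-8")


# Reproduce the mismatch: UTF-8 bytes for "é" (C3 A9) read as Windows-1252.
garbled = "\u00e9".encode("utf-8").decode("cp1252")

# Repair a Windows-1252 byte string: E9 becomes the two-byte sequence C3 A9.
fixed = to_utf8(b"caf\xe9")
```

Here garbled holds the two characters Ã©, which is the classic mojibake pattern for accented Latin letters.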
What is the difference between LF and CRLF line endings?
LF (0x0A) is the line terminator on Unix, Linux, and macOS. CRLF (0x0D 0x0A) is the line terminator on Windows. Mixing them in one file shows stray ^M characters in many Unix tools and can break shell scripts, because the carriage return becomes part of the command or the interpreter path. Git's core.autocrlf setting can convert line endings automatically on checkout and commit.
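To check whether a file mixes conventions, the terminators can be counted directly on the raw bytes. The sketch below uses hypothetical helper names and assumes an ASCII-compatible encoding such as UTF-8 or Windows-1252; it would miscount UTF-16 or UTF-32 data, where each terminator spans several bytes.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone LF, and lone CR terminators in raw bytes."""
    crlf = data.count(b"\r\n")
    # Subtract CRLF pairs so their CR and LF bytes are not counted twice.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "LF": lf, "CR": cr}


def normalize_to_lf(data: bytes) -> bytes:
    """Convert every terminator to LF."""
    # Replace CRLF first, then any remaining lone CR.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

A file with more than one non-zero count has mixed line endings and is a candidate for normalization.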
Should I add a UTF-8 BOM to my text files?
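No, unless a specific tool requires it. The UTF-8 BOM (EF BB BF) is unnecessary because UTF-8 has no byte-order ambiguity. A BOM can break JSON parsing (RFC 8259 says generators must not add one, and parsers are allowed but not required to ignore it), prevents the #! line of a shell script from being recognized, and can corrupt the first column name during CSV header detection. Windows Notepad historically added a BOM when saving as UTF-8; modern Notepad saves without it by default.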
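When a BOM is already present, it can be removed before the data reaches a strict parser. A minimal sketch follows, with strip_utf8_bom as a hypothetical helper; it also shows the standard utf-8-sig codec, which drops a leading BOM during decoding if one exists.

```python
import codecs
import json


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM (EF BB BF) if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


raw = codecs.BOM_UTF8 + b'{"ok": true}'

# Decoding as plain UTF-8 keeps the BOM as a leading U+FEFF character.
text_with_bom = raw.decode("utf-8")

# The utf-8-sig codec drops the BOM, so the JSON parser sees clean text.
text_clean = raw.decode("utf-8-sig")
parsed = json.loads(text_clean)
```

As one example of the breakage described above, Python's own json.loads rejects a string that still starts with U+FEFF, which is why the decode step matters.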
What makes .TXT special
What is a TXT file?
TXT (Plain Text) is the simplest and most universal text format. It contains only raw text characters with no formatting, styling, or embedded objects. TXT files can be read and edited by virtually any application on any platform.
How to open TXT files
- Notepad (Windows) — Built-in
- TextEdit (macOS) — Built-in
- nano / vim (Linux) — Terminal editors
- VS Code (Windows, macOS, Linux) — Free editor
- Any web browser — Direct rendering
Technical specifications
| Property | Value |
|---|---|
| Encoding | ASCII, UTF-8, UTF-16, etc. |
| Formatting | None (plain text only) |
| Line Endings | LF (Unix), CRLF (Windows), CR (old Mac) |
| Max Size | Limited only by filesystem |
| Metadata | None |
Programs that open TXT files
- Notepad / Notepad++ — Windows text editors
- VS Code — Cross-platform code editor
- Sublime Text — Fast, feature-rich editor
- TextEdit — macOS built-in editor
- Any text editor — Universal support
Common use cases
- Notes: Quick notes and reminders
- Configuration: Config files (often .conf or .cfg)
- Logs: Application and system logs
- README files: Project documentation
- Data exchange: Simple data interchange
.TXT compared to alternatives
| Formats | Criteria | Winner |
|---|---|---|
| .TXT vs .RTF | Formatting support: RTF supports bold, italic, fonts, colors, and tables through control words. TXT carries zero formatting; what you see is the raw character data. | RTF wins |
| .TXT vs .RTF | Interoperability: TXT opens in every application on every platform without a parser. RTF requires a word processor or RTF-aware editor to render formatting correctly. | TXT wins |
| .TXT vs .CSV | Tabular data: CSV has a defined delimiter and quoting convention (RFC 4180) for structured rows and columns. TXT has no built-in column semantics, so tabular data in TXT requires ad-hoc parsing. | CSV wins |
| .TXT vs .MARKDOWN | Readability vs formatting: Markdown adds lightweight formatting (headings, links, code blocks) while remaining human-readable as plain text. TXT has no formatting syntax at all. | MARKDOWN wins |
Technical reference
- MIME Type: text/plain
- Developer: N/A (universal standard)
- Year Introduced: 1963
- Open Standard: Yes
Binary Structure
TXT files have no binary structure. The file is a raw stream of encoded characters in which each line is terminated by an OS-dependent convention: LF (0x0A) on Unix/Linux/macOS, CRLF (0x0D 0x0A) on Windows, or CR (0x0D) on classic Mac OS (version 9 and earlier). An optional Byte Order Mark (BOM) may appear at byte 0: EF BB BF indicates UTF-8, FF FE indicates UTF-16LE, FE FF indicates UTF-16BE, FF FE 00 00 indicates UTF-32LE, and 00 00 FE FF indicates UTF-32BE. Because the UTF-32LE mark begins with the same two bytes as the UTF-16LE mark, a detector has to test the longer pattern first. A BOM is optional, and many tools strip or ignore it. Without a BOM, encoding detection relies on heuristics or external metadata (HTTP Content-Type charset, XML declaration, locale defaults). Null bytes (0x00) are invalid in most plain text contexts and indicate either binary contamination or UTF-16/UTF-32 encoding. The file has no header, no footer, no index, and no length field; EOF is determined entirely by the filesystem.
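The BOM signatures listed above can be checked with a short lookup. This is a sketch rather than a complete detector: sniff_bom and has_null_bytes are hypothetical helper names, the BOM constants come from Python's standard codecs module, and the UTF-32 marks are tested first because FF FE 00 00 begins with the UTF-16LE mark FF FE. A UTF-16LE file whose first character is U+0000 would be misread as UTF-32LE, which is a known ambiguity of this approach.

```python
import codecs

# Order matters: the UTF-32LE BOM (FF FE 00 00) starts with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding name, BOM length), or (None, 0) when no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def has_null_bytes(data: bytes) -> bool:
    """Null bytes suggest binary content or a UTF-16/UTF-32 encoding."""
    return b"\x00" in data
```

When sniff_bom returns (None, 0), the caller still has to fall back to heuristics or external metadata, as described above.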
Attack Vectors
- Homoglyph substitution: visually identical Unicode characters (Cyrillic а vs Latin a) used in phishing URLs or spoofed filenames embedded in text
- Bidirectional text override (U+202E): the right-to-left override character reverses the display order of the characters that follow it, which can disguise an executable extension (e.g., a name whose stored ending is U+202E followed by txt.exe is displayed as ending in exe.txt)
- Oversized file denial of service: a multi-gigabyte TXT file exhausts memory in editors that load the entire file into RAM
Mitigation: TXT files contain no executable code, no macros, and no embedded objects. FileDex processes text files entirely in the browser with no server upload. The primary risk vector is encoding-based deception, not code execution.
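The two encoding-based deceptions above can be flagged with a simple scan of the decoded text. The following is a rough sketch, not a complete confusables check: the function names are hypothetical, the script of each letter is approximated by the first word of its Unicode character name, and legitimate multilingual text will also trigger the mixed-script check.

```python
import unicodedata

# Explicit bidirectional embedding, override, and isolate control characters.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def find_bidi_controls(text: str) -> list:
    """Return (index, code point) pairs for each bidi control character found."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in BIDI_CONTROLS]


def scripts_in_word(word: str) -> set:
    """Collect a rough script label for every letter in the word."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            # The first token of the Unicode name (LATIN, CYRILLIC, ...) is a rough proxy.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def mixed_script_words(text: str) -> list:
    """Return whitespace-separated words that mix letters from more than one script."""
    return [w for w in text.split() if len(scripts_in_word(w)) > 1]
```

Any hit from find_bidi_controls in a filename is worth treating as suspicious, since ordinary filenames have little reason to contain direction overrides.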