.TXT Plain Text

TXT is a raw character stream with no formatting, no metadata, and no binary structure. Every operating system, programming language, and text editor reads TXT natively, making it the most interoperable file format in existence.

Plain Text | text/plain | No binary header | ASCII 1963 | UTF-8 1993 | Universal
By FileDex
Not convertible

Plain text has no binary structure to convert. Content interpretation depends on context, not format.

Common questions

How do I find out what encoding a TXT file uses?

Run file --mime-encoding filename.txt on Linux or macOS. The command guesses the encoding from byte patterns, so treat the result as a strong hint rather than proof. On Windows, Notepad++ shows the detected encoding in the status bar. If the file contains mojibake (garbled characters such as Ã© where é was expected), it was likely saved in one encoding and opened in another.

Why does my TXT file show weird characters when I open it?

An encoding mismatch is the usual cause. Typically the file was saved in Windows-1252 or ISO-8859-1 but your editor is reading it as UTF-8, or vice versa. Use iconv to convert the file to UTF-8, or select the correct encoding manually in your text editor.
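The mismatch described above can be reproduced in a few lines. This is a minimal sketch, assuming the two encodings involved are UTF-8 and Windows-1252 (Python codec name cp1252); the function names garble and repair are illustrative, not part of any library.

```python
def garble(text: str) -> str:
    """Simulate a file saved as UTF-8 but opened as Windows-1252."""
    # Each byte of a multi-byte UTF-8 sequence is shown as its own character.
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Undo garble(): recover the original bytes, then decode them as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


def read_cp1252_as_utf8(text: str) -> str:
    """Simulate the opposite mistake: saved as Windows-1252, opened as UTF-8."""
    # errors="replace" substitutes U+FFFD for bytes that are not valid UTF-8.
    return text.encode("cp1252").decode("utf-8", errors="replace")


print(garble("café"))               # cafÃ©
print(repair(garble("café")))       # café
print(read_cp1252_as_utf8("café"))  # caf followed by the U+FFFD replacement character
```

The repair step only works while the garbled text has not been edited or re-saved with lossy substitutions; once U+FFFD appears, the original byte is gone.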

What is the difference between LF and CRLF line endings?

LF (0x0A) is the line terminator on Unix, Linux, and macOS. CRLF (0x0D 0x0A) is the line terminator on Windows. A stray CR shows up as ^M in many Unix tools and can break shell scripts, for example by making the shebang line name an interpreter that does not exist. Git's core.autocrlf setting can convert line endings automatically on checkout and commit.
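To check whether a file mixes conventions, it is enough to count the byte patterns. The following is a small sketch working on raw bytes; count_line_endings and has_mixed_endings are hypothetical helper names chosen for this example.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LF bytes, and bare CR bytes in raw file content."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF bytes that are not part of a CRLF pair
    cr = data.count(b"\r") - crlf   # CR bytes that are not part of a CRLF pair
    return {"CRLF": crlf, "LF": lf, "CR": cr}


def has_mixed_endings(data: bytes) -> bool:
    """True when more than one line-ending convention appears in the data."""
    return sum(1 for n in count_line_endings(data).values() if n > 0) > 1


sample = b"first\r\nsecond\nthird\n"
print(count_line_endings(sample))   # {'CRLF': 1, 'LF': 2, 'CR': 0}
print(has_mixed_endings(sample))    # True
```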

Should I add a UTF-8 BOM to my text files?

No, unless a specific tool requires it. The UTF-8 BOM (EF BB BF) is unnecessary because UTF-8 has no byte-order ambiguity. A BOM can break JSON parsing (RFC 8259 says implementations must not add one to JSON text sent over a network), prevents a shell script's shebang line from being recognized, and can confuse CSV header detection. Windows Notepad historically wrote a BOM when saving as UTF-8; current versions default to UTF-8 without one.
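The JSON failure is easy to demonstrate with Python's standard json module. This sketch assumes the input arrives as raw bytes and is decoded by the caller; parse_naive and parse_bom_tolerant are names invented for the example.

```python
import codecs
import json

# A JSON document that some tool saved with a leading UTF-8 BOM (EF BB BF).
payload = codecs.BOM_UTF8 + b'{"ok": true}'


def parse_naive(raw: bytes):
    """Decode as plain UTF-8: the BOM survives as U+FEFF and json.loads rejects it."""
    return json.loads(raw.decode("utf-8"))


def parse_bom_tolerant(raw: bytes):
    """Decode with 'utf-8-sig', which drops a leading BOM if one is present."""
    return json.loads(raw.decode("utf-8-sig"))


try:
    parse_naive(payload)
except json.JSONDecodeError as exc:
    print("naive parse failed:", exc.msg)

print(parse_bom_tolerant(payload))  # {'ok': True}
```

The utf-8-sig codec is also safe for input without a BOM, so it is a reasonable default when reading files of unknown origin.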

What makes .TXT special

What is a TXT file?

TXT (Plain Text) is the simplest and most universal text format. It contains only raw text characters with no formatting, styling, or embedded objects. TXT files can be read and edited by virtually any application on any platform.


How to open TXT files

  • Notepad (Windows) — Built-in
  • TextEdit (macOS) — Built-in
  • nano / vim (Linux) — Terminal editors
  • VS Code (Windows, macOS, Linux) — Free editor
  • Any web browser — Direct rendering

Technical specifications

Property: Value
Encoding: ASCII, UTF-8, UTF-16, etc.
Formatting: None (plain text only)
Line Endings: LF (Unix), CRLF (Windows), CR (old Mac)
Max Size: Limited only by filesystem
Metadata: None

Programs that open TXT files

  • Notepad / Notepad++ — Windows text editors
  • VS Code — Cross-platform code editor
  • Sublime Text — Fast, feature-rich editor
  • TextEdit — macOS built-in editor
  • Any text editor — Universal support

Common use cases

  • Notes: Quick notes and reminders
  • Configuration: Config files (often .conf or .cfg)
  • Logs: Application and system logs
  • README files: Project documentation
  • Data exchange: Simple data interchange

.TXT compared to alternatives

  • .TXT vs .RTF, formatting support: RTF wins. RTF supports bold, italic, fonts, colors, and tables through control words. TXT carries zero formatting; what you see is the raw character data.
  • .TXT vs .RTF, interoperability: TXT wins. TXT opens in every application on every platform without a parser. RTF requires a word processor or RTF-aware editor to render formatting correctly.
  • .TXT vs .CSV, tabular data: CSV wins. CSV has a defined delimiter and quoting convention (RFC 4180) for structured rows and columns. TXT has no built-in column semantics, so tabular data in TXT requires ad-hoc parsing.
  • .TXT vs .MARKDOWN, readability vs formatting: Markdown wins. Markdown adds lightweight formatting (headings, links, code blocks) while remaining human-readable as plain text. TXT has no formatting syntax at all.

Technical reference

MIME Type: text/plain
Developer: N/A (universal standard)
Year Introduced: 1963
Open Standard: Yes

Binary Structure

TXT files have no binary structure. The file is a raw stream of encoded characters in which lines end with an OS-dependent convention: LF (0x0A) on Unix, Linux, and macOS; CRLF (0x0D 0x0A) on Windows; or CR (0x0D) on classic Mac OS (version 9 and earlier).

An optional Byte Order Mark (BOM) may appear at byte 0: EF BB BF indicates UTF-8, FF FE indicates UTF-16LE, FE FF indicates UTF-16BE, FF FE 00 00 indicates UTF-32LE, and 00 00 FE FF indicates UTF-32BE. Because the UTF-32LE mark begins with the same two bytes as the UTF-16LE mark, a detector has to test the four-byte patterns first. Plain text does not require a BOM, and many tools strip or ignore it. Without a BOM, encoding detection relies on heuristics or external metadata (HTTP Content-Type charset, XML declaration, locale defaults).

Null bytes (0x00) are invalid in most plain text contexts and indicate either binary contamination or UTF-16/UTF-32 encoding. The file has no header, no footer, no index, and no length field; the end of the file is determined entirely by the size the filesystem records.
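The BOM rules above can be sketched as a lookup over the first few bytes of a file. This is an illustration rather than a complete detector: sniff_bom is a hypothetical helper, and the BOM constants come from Python's standard codecs module.

```python
import codecs
from typing import Optional, Tuple

# Longest marks first: the UTF-32LE BOM (FF FE 00 00) starts with the
# UTF-16LE BOM (FF FE), so the 4-byte patterns must be tested before it.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(head: bytes) -> Tuple[Optional[str], int]:
    """Return (encoding, bom_length) for a leading BOM, or (None, 0) if absent."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name, len(bom)
    return None, 0


print(sniff_bom(b"\xef\xbb\xbfhello"))   # ('utf-8', 3)
print(sniff_bom(b"\xff\xfeh\x00i\x00"))  # ('utf-16-le', 2)
print(sniff_bom(b"hello"))               # (None, 0)
```

One ambiguity remains: a UTF-16LE file whose first character after the BOM is U+0000 would be reported as UTF-32LE. That is rare in real text, but it is why a missing or misleading BOM still calls for a heuristic or external metadata.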

1963: ASCII (American Standard Code for Information Interchange) standardized as ASA X3.4-1963 (7-bit, 128 code points)
1967: ASCII revised to essentially the form still used today (ANSI X3.4-1967)
1987: ISO 8859-1 (Latin-1) extends ASCII to 8 bits (256 code points) for Western European languages
1991: Unicode 1.0 published, a universal character set aiming to encode all writing systems
1993: UTF-8 encoding by Ken Thompson and Rob Pike presented publicly (designed in September 1992): variable-width, ASCII-compatible, backward-safe
2003: RFC 3629 restricts UTF-8 to U+0000 through U+10FFFF, matching the range reachable by UTF-16
2008: UTF-8 surpasses ASCII as the most common encoding on the web (W3Techs data)

Detect file encoding
file --mime-encoding input.txt

Uses libmagic to guess the character encoding of a text file by analyzing byte patterns. Returns values such as utf-8, us-ascii, iso-8859-1, or binary.
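For readers without the file utility, the core idea of byte-pattern detection can be approximated in Python. This is a deliberately small sketch and not a reimplementation of libmagic: guess_encoding and its three labels are assumptions made for the example, and it does not try to tell legacy 8-bit encodings apart.

```python
def guess_encoding(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8', or 'unknown' (labels are illustrative)."""
    if data.isascii():
        # Every byte is below 0x80, so the data is valid in ASCII and in UTF-8.
        return "ascii"
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
        return "utf-8"
    except UnicodeDecodeError:
        # Could be Windows-1252, ISO-8859-1, UTF-16, or binary data.
        return "unknown"


print(guess_encoding(b"hello"))                 # ascii
print(guess_encoding("café".encode("utf-8")))   # utf-8
print(guess_encoding(b"caf\xe9"))               # unknown
```

Strict UTF-8 validation is a strong signal because text in a legacy 8-bit encoding rarely happens to form valid multi-byte UTF-8 sequences. Telling the legacy encodings apart needs statistical detection, which is what libraries such as chardet attempt.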

Convert Windows-1252 to UTF-8
iconv -f WINDOWS-1252 -t UTF-8 input.txt > output.txt

iconv transcodes text between any two charsets it supports. -f specifies the source encoding, -t the target. The command fails on byte sequences that are invalid in the source encoding.
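The same decode-then-encode step can be written in Python. A minimal sketch, assuming the source really is Windows-1252; transcode is a hypothetical helper name, and like iconv it stops on bytes the source encoding does not define.

```python
def transcode(data: bytes, source: str = "cp1252", target: str = "utf-8") -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    # errors="strict" raises UnicodeDecodeError on bytes the source codec
    # does not define, instead of silently substituting characters.
    return data.decode(source, errors="strict").encode(target, errors="strict")


print(transcode(b"caf\xe9"))   # b'caf\xc3\xa9'

try:
    transcode(b"\x81")         # 0x81 is unassigned in Windows-1252
except UnicodeDecodeError as exc:
    print("cannot transcode:", exc.reason)
```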

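Convert Windows CRLF line endings to Unix LF
sed -i 's/\r$//' input.txt

Strips the trailing carriage return (0x0D) from each line, converting CRLF to LF. The -i flag edits the file in place. Essential when running Windows-edited scripts on Linux. The command as written assumes GNU sed; the sed shipped with macOS and the BSDs requires an argument after -i and may not interpret \r.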
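Where sed is unavailable or behaves differently, the conversion is a single byte-level replacement. This sketch uses a hypothetical helper named crlf_to_lf and works on raw bytes so that the text encoding does not matter.

```python
from pathlib import Path


def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with LF; bare CR bytes are left untouched."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> None:
    """Rewrite a file in place with LF line endings (loads the whole file)."""
    p = Path(path)
    p.write_bytes(crlf_to_lf(p.read_bytes()))


print(crlf_to_lf(b"#!/bin/sh\r\necho hi\r\n"))   # b'#!/bin/sh\necho hi\n'
```

convert_file reads the entire file into memory, which is fine for scripts and configuration files but not for multi-gigabyte logs.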

Strip UTF-8 BOM from a file
sed -i '1s/^\xEF\xBB\xBF//' input.txt

Removes the 3-byte UTF-8 BOM (EF BB BF) from the first line of a file. A BOM can break JSON parsers, shell script shebangs, and CSV column header detection. The \xHH escapes are a GNU sed extension.
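A portable alternative is to drop the three bytes directly. The sketch below uses a hypothetical helper named strip_utf8_bom and shows why the BOM matters for shebang lines: the file no longer starts with the two bytes "#!".

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM (EF BB BF) if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


script = codecs.BOM_UTF8 + b"#!/bin/sh\necho hi\n"
print(script.startswith(b"#!"))                  # False: the BOM hides the shebang
print(strip_utf8_bom(script).startswith(b"#!"))  # True
```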

Count lines, words, and characters
wc -lwm input.txt

wc reports line count (-l), word count (-w), and character count (-m, multibyte-aware). Use -c instead of -m for raw byte count.
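The difference between -m and -c is the difference between decoded characters and raw bytes, which a short Python sketch makes concrete. text_stats is a hypothetical helper; it assumes UTF-8 input, and its word splitting uses Python's whitespace rules, which may differ slightly from wc in some locales.

```python
def text_stats(data: bytes, encoding: str = "utf-8") -> dict:
    """Approximate wc: newline count, whitespace-separated words, characters, bytes."""
    text = data.decode(encoding)
    return {
        "lines": data.count(b"\n"),   # like wc -l, counts newline bytes
        "words": len(text.split()),   # like wc -w
        "chars": len(text),           # like wc -m (code points after decoding)
        "bytes": len(data),           # like wc -c
    }


print(text_stats("héllo wörld\n".encode("utf-8")))
# {'lines': 1, 'words': 2, 'chars': 12, 'bytes': 14}
```

The two accented letters take two bytes each in UTF-8, which is why the byte count exceeds the character count by two.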

TXT to PDF (render, lossless): PDF preserves exact page layout and typography for printing, archival, or sharing documents where the recipient should not modify content. Converting TXT to PDF fixes font, margins, and page breaks that plain text cannot express.
TXT to CSV (render, lossless): Tab-delimited or space-delimited TXT log files can be restructured into CSV for import into spreadsheets and data analysis tools like pandas or Excel.
TXT to JSON (render, lossless): Line-delimited or key-value TXT config files can be parsed into structured JSON for consumption by APIs or web applications that expect typed data.
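The CSV and JSON conversions above amount to imposing a structure on the text. The sketch below shows one possible reading of each, using only Python's standard library. Both function names are hypothetical, and the input conventions (tab as delimiter, key=value pairs, # for comments) are assumptions, since TXT itself defines none of them.

```python
import csv
import io
import json


def tab_delimited_to_csv(text: str) -> str:
    """Turn tab-delimited lines into RFC 4180 style CSV with quoting where needed."""
    out = io.StringIO()
    writer = csv.writer(out)  # default dialect: comma delimiter, CRLF row endings
    for line in text.splitlines():
        writer.writerow(line.split("\t"))
    return out.getvalue()


def keyvalue_to_json(text: str) -> str:
    """Parse key=value lines into a JSON object; blank lines and # comments are skipped."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line; ignored in this sketch
        result[key.strip()] = value.strip()  # values stay strings
    return json.dumps(result, indent=2)


print(tab_delimited_to_csv("id\tmessage\n1\thello, world\n"))
print(keyvalue_to_json("# settings\nname = demo\nport=8080\n"))
```

Every value in the JSON output is a string. Producing typed data (numbers, booleans) would need an explicit conversion rule that the text file does not carry.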
Risk level: LOW

Attack Vectors

  • Homoglyph substitution: visually identical Unicode characters (Cyrillic а vs Latin a) used in phishing URLs or spoofed filenames embedded in text
  • Bidirectional text override (U+202E): the right-to-left override character reverses the display order of the characters that follow it, disguising executable extensions (e.g., a file named report, then U+202E, then txt.exe is displayed as reportexe.txt)
  • Oversized file denial of service: a multi-gigabyte TXT file exhausts memory in editors that load the entire file into RAM

Mitigation: TXT files contain no executable code, no macros, and no embedded objects. FileDex processes text files entirely in the browser with no server upload. The primary risk vector is encoding-based deception, not code execution.
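The two deception techniques listed above can be flagged with a simple scan. This is a rough sketch, not a complete defense: the function names are invented for the example, and taking the first word of a character's Unicode name as its script is an approximation.

```python
import unicodedata

# Explicit bidirectional formatting characters (embeddings, overrides, isolates).
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def find_bidi_controls(text: str) -> list:
    """Return (index, code point) pairs for bidirectional control characters."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in BIDI_CONTROLS]


def scripts_in_word(word: str) -> set:
    """Approximate the scripts used by a word's letters via their Unicode names."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split(" ")[0])  # e.g. "LATIN" or "CYRILLIC"
    return scripts


print(find_bidi_controls("report\u202etxt.exe"))  # [(6, 'U+202E')]
print(sorted(scripts_in_word("p\u0430ypal")))     # ['CYRILLIC', 'LATIN']
```

A word that mixes scripts is not always malicious, so a result like this is best treated as a prompt for review rather than a verdict.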

  • Notepad++ (tool): Windows text editor with encoding detection, conversion, and EOL visualization
  • Cross-platform editor with built-in encoding picker and EOL toggle
  • iconv (tool): POSIX CLI for character encoding conversion between supported charsets
  • chardet (library): Python library for automatic character encoding detection
  • dos2unix (tool): CLI tool to convert CRLF line endings to LF and vice versa