.TXT Plain Text

TXT is a raw character stream with no formatting, no metadata, and no binary structure. Virtually every operating system, programming language, and text editor reads TXT natively, which makes it one of the most interoperable file formats in existence.

Plain Text | text/plain | No Binary Header | ASCII 1963 | UTF-8 1993 | Universal
By FileDex
Not convertible

Plain text has no binary structure to convert. Content interpretation depends on context, not format.

Frequently asked questions

How do I find out what encoding a TXT file uses?

Run file --mime-encoding filename.txt on Linux/macOS to detect encoding via byte pattern analysis. On Windows, Notepad++ shows the encoding in the status bar. If the file contains mojibake (garbled characters), it was likely saved in one encoding and opened in another.

Why does my TXT file show weird characters when I open it?

An encoding mismatch is the usual cause. Typically the file was saved in Windows-1252 or ISO-8859-1 and your editor is reading it as UTF-8, or vice versa. Use iconv to convert the file to UTF-8, or manually select the correct encoding in your text editor.
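As a rough illustration of the mismatch described above, the following Python sketch encodes a string as UTF-8, decodes it as Windows-1252 to produce typical mojibake, and then reverses the mistake. The sample string and the helper name repair_mojibake are illustrative assumptions, and the round trip only works when no bytes were lost or replaced along the way.

```python
# Sketch: how mojibake arises and how it can sometimes be reversed.
original = "café"

# UTF-8 stores "é" as two bytes (C3 A9); Windows-1252 displays them as "Ã©".
garbled = original.encode("utf-8").decode("cp1252")


def repair_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-Windows-1252 mistake (hypothetical helper)."""
    return text.encode("cp1252").decode("utf-8")


print(garbled)                   # cafÃ©
print(repair_mojibake(garbled))  # café
```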

What is the difference between LF and CRLF line endings?

LF (0x0A) is the Unix/Linux/macOS line terminator. CRLF (0x0D 0x0A) is the Windows line terminator. Mixing them in one file causes visible ^M characters in Unix terminals and can break shell scripts. Git's core.autocrlf setting handles conversion automatically across platforms.
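To check whether a file mixes conventions, one option is to count each terminator in the raw bytes. The sketch below assumes a byte-oriented encoding such as ASCII or UTF-8 (it would miscount UTF-16 data), and the function name count_line_endings is made up for this example.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR terminators in raw bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LF bytes not preceded by CR
    cr = data.count(b"\r") - crlf   # CR bytes not followed by LF
    return {"CRLF": crlf, "LF": lf, "CR": cr}


sample = b"one\r\ntwo\nthree\r\nfour\n"
print(count_line_endings(sample))  # {'CRLF': 2, 'LF': 2, 'CR': 0}
```

A result with more than one non-zero count suggests the file has mixed line endings.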

Should I add a UTF-8 BOM to my text files?

No, unless a specific tool requires it. A UTF-8 BOM (EF BB BF) is unnecessary because UTF-8 has no byte-order ambiguity. A BOM can break JSON parsing (RFC 8259 forbids generators from adding one to JSON text), can stop a shell script's shebang line from being recognized, and can confuse CSV header detection. Windows Notepad historically added a BOM when saving as UTF-8; current versions save UTF-8 without one by default.
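The JSON problem can be reproduced with Python's standard library. This is a small sketch, with a made-up JSON payload, showing that a BOM left in the decoded string is rejected by json.loads, while the utf-8-sig codec removes it during decoding.

```python
import json

# Hypothetical file content: a UTF-8 BOM followed by a JSON object.
raw = b"\xef\xbb\xbf" + b'{"name": "example"}'

# Decoding as plain UTF-8 keeps the BOM as U+FEFF, which json.loads rejects.
try:
    json.loads(raw.decode("utf-8"))
    bom_rejected = False
except json.JSONDecodeError:
    bom_rejected = True

# The "utf-8-sig" codec drops a leading BOM if one is present.
parsed = json.loads(raw.decode("utf-8-sig"))

print(bom_rejected)  # True
print(parsed)        # {'name': 'example'}
```

Reading with utf-8-sig is also safe for files that have no BOM, since the codec only strips the mark when it is there.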

What sets .TXT apart

What is a TXT file?

TXT (Plain Text) is the simplest and most universal text format. It contains only raw text characters with no formatting, styling, or embedded objects. TXT files can be read and edited by virtually any application on any platform.


How to open TXT files

  • Notepad (Windows) — Built-in
  • TextEdit (macOS) — Built-in
  • nano / vim (Linux) — Terminal editors
  • VS Code (Windows, macOS, Linux) — Free editor
  • Any web browser — Direct rendering

Technical specifications

Property      Value
Encoding      ASCII, UTF-8, UTF-16, etc.
Formatting    None (plain text only)
Line Endings  LF (Unix), CRLF (Windows), CR (old Mac)
Max Size      Limited only by filesystem
Metadata      None

Programs that open TXT files

  • Notepad / Notepad++ — Windows text editors
  • VS Code — Cross-platform code editor
  • Sublime Text — Fast, feature-rich editor
  • TextEdit — macOS built-in editor
  • Any text editor — Universal support

Common use cases

  • Notes: Quick notes and reminders
  • Configuration: Config files (often .conf or .cfg)
  • Logs: Application and system logs
  • README files: Project documentation
  • Data exchange: Simple data interchange

Technical reference

MIME type        text/plain
Developer        N/A (universal standard)
Year introduced  1963
Open standard    Yes

Binary structure

TXT files have no binary structure. The file is a raw stream of encoded characters, with lines terminated by an OS-dependent convention: LF (0x0A) on Unix, Linux, and macOS; CRLF (0x0D 0x0A) on Windows; or CR (0x0D) on classic Mac OS (version 9 and earlier).

An optional Byte Order Mark (BOM) may appear at byte 0: EF BB BF indicates UTF-8, FF FE indicates UTF-16LE, FE FF indicates UTF-16BE, and FF FE 00 00 indicates UTF-32LE. Because the UTF-32LE mark begins with the same two bytes as the UTF-16LE mark, a detector has to test the longer sequence first. No specification requires a BOM, and many tools strip or ignore it. Without a BOM, encoding detection relies on heuristics or external metadata (HTTP Content-Type charset, XML declaration, locale defaults).

Null bytes (0x00) are invalid in most plain text contexts and indicate either binary contamination or UTF-16/UTF-32 encoding. The file has no header, no footer, no index, and no length field; EOF is determined entirely by the filesystem.
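The BOM signatures listed above can be checked with a few lines of Python. This is a minimal sketch rather than a full encoding detector: the function name sniff_bom and the returned codec labels are choices made for this example, and a file without a BOM still needs heuristics or external metadata.

```python
import codecs

# Longer signatures come first: the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) when no BOM is present."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


print(sniff_bom(b"\xef\xbb\xbfhello"))              # ('utf-8', 3)
print(sniff_bom(b"\xff\xfe\x00\x00h\x00\x00\x00"))  # ('utf-32-le', 4)
print(sniff_bom(b"\xff\xfeh\x00"))                  # ('utf-16-le', 2)
print(sniff_bom(b"hello"))                          # (None, 0)
```

Only the first four bytes matter, so in practice it is enough to read a short prefix of the file before calling the function.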

1963  ASCII (American Standard Code for Information Interchange) standardized as ASA X3.4-1963: 7-bit, 128 code points
1967  ASCII revised (USAS X3.4-1967) to essentially the form still used today
1987  ISO 8859-1 (Latin-1) extends ASCII to 8 bits (256 code points) for Western European languages
1991  Unicode 1.0 published: a universal character set aiming to encode all writing systems
1993  UTF-8, designed by Ken Thompson and Rob Pike in September 1992, presented at USENIX: variable-width and backward-compatible with ASCII
2003  RFC 3629 restricts UTF-8 to U+0000 through U+10FFFF
2008  UTF-8 surpasses ASCII as the most common encoding on the web (W3Techs data)
Detect file encoding
file --mime-encoding input.txt

Uses libmagic to guess the character encoding of a text file by analyzing byte patterns. Returns values like utf-8, us-ascii, iso-8859-1, or binary. The result is a heuristic, not a guarantee.

Convert Windows-1252 to UTF-8
iconv -f WINDOWS-1252 -t UTF-8 input.txt > output.txt

iconv transcodes text between any two charsets it supports. -f specifies the source encoding, -t the target. It exits with an error on byte sequences that are invalid in the source encoding.
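The same decode-then-encode idea can be sketched in Python for cases where iconv is not available. The function name transcode is hypothetical, and cp1252 is Python's codec name for Windows-1252; the strict error handling shown here is Python's default and is only loosely comparable to iconv's behaviour.

```python
def transcode(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Decode bytes from `source` and re-encode them as `target`."""
    # The default errors="strict" raises on bytes the source charset
    # does not define, much as iconv stops with an error.
    return data.decode(source).encode(target)


# 0x93 and 0x94 are curly double quotes in Windows-1252.
print(transcode(b"\x93hi\x94", "cp1252"))  # b'\xe2\x80\x9chi\xe2\x80\x9d'

try:
    transcode(b"\x81", "cp1252")  # 0x81 is unassigned in Windows-1252
except UnicodeDecodeError as exc:
    print("invalid input:", exc.reason)
```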

Convert Windows CRLF line endings to Unix LF
sed -i 's/\r$//' input.txt

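Strips the trailing carriage return (0x0D) from each line, converting CRLF to LF. The -i flag edits the file in place; the form shown is GNU sed syntax, and BSD/macOS sed expects a backup-suffix argument after -i (for example -i ''). Useful when running Windows-edited scripts on Linux.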
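Where sed is not an option, a byte-level replacement in Python gives roughly the same result. This is a sketch with a made-up function name; it works on bytes so that Python's own newline translation does not interfere, and unlike the sed command it leaves a lone CR with no following LF untouched.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; lone CR bytes are kept."""
    return data.replace(b"\r\n", b"\n")


print(crlf_to_lf(b"#!/bin/sh\r\necho hi\r\n"))  # b'#!/bin/sh\necho hi\n'
```

To apply it to a file, read and write in binary mode ("rb" and "wb") so the bytes pass through unchanged apart from the replacement.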

Strip UTF-8 BOM from a file
sed -i '1s/^\xEF\xBB\xBF//' input.txt

Removes the 3-byte UTF-8 BOM (EF BB BF) from the start of the first line of a file. The \xHH escapes are a GNU sed extension. A BOM can break JSON parsers, shell script shebangs, and CSV column header detection.

Count lines, words, and characters
wc -lwm input.txt

wc reports line count (-l), word count (-w), and character count (-m, multibyte-aware). Use -c instead of -m for raw byte count.
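The difference between -m and -c only shows up with multibyte text. The Python sketch below approximates the four counts for a UTF-8 sample; the function name wc and the sample string are illustrative, and word splitting follows Python's whitespace rules, which may differ slightly from wc in some locales.

```python
def wc(data: bytes, encoding: str = "utf-8"):
    """Return (lines, words, chars, bytes), roughly like wc -l -w -m -c."""
    text = data.decode(encoding)
    lines = data.count(b"\n")   # wc -l counts newline characters
    words = len(text.split())   # whitespace-separated tokens
    return lines, words, len(text), len(data)


# "ï" and "é" each take two bytes in UTF-8, so bytes exceed characters by 2.
print(wc("naïve café\n".encode("utf-8")))  # (1, 2, 11, 13)
```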

TXT to PDF (render, lossless): PDF preserves exact page layout and typography for printing, archival, or sharing documents where the recipient should not modify content. Converting TXT to PDF fixes font, margins, and page breaks that plain text cannot express.
TXT to CSV (render, lossless): Tab-delimited or space-delimited TXT log files can be restructured into CSV for import into spreadsheets and data analysis tools like pandas or Excel.
TXT to JSON (render, lossless): Line-delimited or key-value TXT config files can be parsed into structured JSON for consumption by APIs or web applications that expect typed data.
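The CSV and JSON conversions described above can be sketched with Python's standard library. Both functions are illustrative: the names are made up, the tab delimiter, the key=value syntax, and the # comment marker are assumptions about the input, and all JSON values are left as strings rather than being typed.

```python
import csv
import io
import json


def tsv_to_csv(text: str) -> str:
    """Re-emit tab-delimited lines as CSV, quoting fields where needed."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        writer.writerow(line.split("\t"))
    return out.getvalue()


def keyvalue_to_json(text: str) -> str:
    """Parse key=value lines into a JSON object; values stay strings."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no "="
            result[key.strip()] = value.strip()
    return json.dumps(result)


print(tsv_to_csv("time\tmessage\n12:00\thello, world\n"))
print(keyvalue_to_json("# settings\nhost = localhost\nport = 8080\n"))
```

The csv writer quotes the "hello, world" field because it contains the output delimiter, which is the kind of detail a plain search-and-replace of tabs with commas would miss.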
Risk level: Low

Vulnerabilities

  • Homoglyph substitution — visually identical Unicode characters (Cyrillic а vs Latin a) used in phishing URLs or spoofed filenames embedded in text
  • Bidirectional text override (U+202E) — right-to-left override character reverses displayed filename to disguise executable extensions (e.g., txt.exe appears as exe.txt)
  • Oversized file denial of service — multi-gigabyte TXT file exhausts memory in editors that load entire file into RAM
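The first two items in the list above can be screened for with Python's unicodedata module. This is a heuristic sketch, not a complete defence: the function names are hypothetical, the set of bidirectional control characters covers the explicit embedding, override, and isolate characters, and the script check simply reads the first word of each letter's Unicode name.

```python
import unicodedata

# Explicit bidirectional formatting characters (embeddings, overrides, isolates).
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def find_bidi_controls(text: str):
    """Return (index, character name) for each bidi control in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


def scripts_used(word: str) -> set:
    """Rough script guess: first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in word
        if ch.isalpha()
    }


print(find_bidi_controls("photo\u202etxt.exe"))  # [(5, 'RIGHT-TO-LEFT OVERRIDE')]
print(scripts_used("p\u0430ypal"))  # contains both LATIN and CYRILLIC
```

A word that mixes scripts is not necessarily malicious, so a result like this is better treated as a prompt for review than as proof of spoofing.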

Protection: TXT files contain no executable code, no macros, and no embedded objects. FileDex processes text files entirely in the browser with no server upload. The primary risk vector is encoding-based deception, not code execution.

Notepad++ (tool)
Windows text editor with encoding detection, conversion, and EOL visualization
Cross-platform editor with built-in encoding picker and EOL toggle
iconv (tool)
POSIX CLI for character encoding conversion between supported charsets
chardet (library)
Python library for automatic character encoding detection
dos2unix (tool)
CLI tool to convert CRLF line endings to LF and vice versa