What is an HTML file?
HTML (HyperText Markup Language) is the standard markup language for documents displayed in web browsers. It defines the structure and content of web pages using elements, written as tags, such as headings, paragraphs, links, images, and forms. Tim Berners-Lee first described HTML publicly in 1991, and it is now maintained by WHATWG as a Living Standard, meaning it evolves continuously rather than in numbered releases.
Every page on the web is ultimately an HTML document. When you visit a URL, your browser receives an HTML file and renders it visually. CSS controls appearance, and JavaScript adds behavior — but HTML is the foundation that makes a document a webpage.
How to open HTML files
- Any web browser (Chrome, Firefox, Edge, Safari) — Double-click to render as a web page
- VS Code (Windows, macOS, Linux) — Code editing with live preview via extensions
- Notepad++ (Windows) — Syntax-highlighted editing
- Sublime Text (Windows, macOS, Linux) — Fast code editor
Technical specifications
| Property | Value |
|---|---|
| Current Version | HTML Living Standard (commonly called HTML5) |
| Encoding | UTF-8 (recommended) |
| Type | Markup language |
| Standard | WHATWG Living Standard |
| MIME type | text/html |
| Related | CSS (styling), JavaScript (behavior) |
Common use cases
- Web pages: Every website is built on HTML
- Email templates: HTML-formatted emails with rich formatting
- Documentation: Technical docs, help files, and manuals
- Web applications: Single-page applications (SPAs) use a single HTML shell
- Progressive Web Apps (PWAs): Installable apps built on web technologies
HTML document structure
A minimal valid HTML5 document:
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Page Title</title>
</head>
<body>
  <h1>Hello, World</h1>
  <p>This is a paragraph.</p>
</body>
</html>
```
The <!DOCTYPE html> declaration tells the browser to use standards mode. The <head> holds metadata that is not rendered in the page itself (title, character set, stylesheet links), and <body> contains the visible content.
Semantic HTML
HTML5 introduced semantic elements that describe their content’s meaning to both browsers and search engines:
- <header>, <footer>, <main>, <nav>, <aside>: Page structure
- <article>, <section>: Content grouping
- <figure>, <figcaption>: Images with captions
- <time datetime="2024-01-15">: Machine-readable dates
Semantic HTML improves accessibility (screen readers understand the page structure) and SEO (Google better understands content hierarchy).
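As a rough illustration of how these elements fit together, the sketch below lays out a simple blog page. The site name, link targets, image file, and text are all hypothetical placeholders, and this is only one reasonable arrangement, not a required layout.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Example Blog</title>
</head>
<body>
  <!-- Site-wide banner and primary navigation -->
  <header>
    <h1>Example Blog</h1>
    <nav>
      <a href="/">Home</a>
      <a href="/archive">Archive</a>
    </nav>
  </header>

  <!-- The main content of this page; use one visible main per page -->
  <main>
    <!-- A self-contained piece of content -->
    <article>
      <h2>First post</h2>
      <p>Published <time datetime="2024-01-15">January 15, 2024</time></p>

      <!-- A thematic grouping inside the article -->
      <section>
        <h3>Background</h3>
        <p>Body text goes here.</p>
      </section>

      <!-- An image paired with its caption -->
      <figure>
        <img src="diagram.png" alt="Diagram of the page layout">
        <figcaption>Figure 1: Page layout.</figcaption>
      </figure>
    </article>
  </main>

  <!-- Content related to, but separate from, the main content -->
  <aside>
    <h2>Related links</h2>
    <p><a href="/about">About this blog</a></p>
  </aside>

  <footer>
    <p>Example Blog, 2024</p>
  </footer>
</body>
</html>
```

The same page could be built entirely from <div> elements and look identical, but assistive technology and crawlers would then have no structural landmarks to work with.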
HTML and SEO
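Search engines read HTML directly. Key elements that affect how a page is indexed and presented in results:
- <title>: Shown in search result titles
- <meta name="description">: Often used as the search snippet text
- <h1>–<h6> heading hierarchy: Signals content structure
- alt attributes on images: Enables image indexing
- <link rel="canonical">: Identifies the preferred URL when the same content is reachable at several addresses, which avoids duplicate-content problems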
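The elements listed above might appear together as in the following sketch. The page topic, the example.com URL, and the image file name are hypothetical, and how a given search engine uses each element is ultimately up to that engine.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <!-- Typically shown as the clickable title in search results -->
  <title>How to Brew Pour-Over Coffee | Example Site</title>
  <!-- Candidate text for the search result snippet -->
  <meta name="description" content="A step-by-step guide to brewing pour-over coffee at home.">
  <!-- Preferred URL for this content (hypothetical address) -->
  <link rel="canonical" href="https://example.com/guides/pour-over-coffee">
</head>
<body>
  <!-- One h1 for the page topic, h2 for its subsections -->
  <h1>How to Brew Pour-Over Coffee</h1>

  <h2>Equipment</h2>
  <!-- The alt text describes the image for crawlers and screen readers -->
  <img src="kettle.jpg" alt="Gooseneck kettle pouring water over a paper filter">

  <h2>Steps</h2>
  <p>Heat the water, rinse the filter, then pour in slow circles.</p>
</body>
</html>
```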
Accessibility
Well-written HTML is accessible by default, because native elements carry built-in semantics and keyboard behavior. Use alt text on all images, <label> elements for form inputs, logical heading order (h1 before h2, with no skipped levels), and ARIA attributes (role, aria-label) for custom interactive components. Pages that meet WCAG 2.1 Level AA work better for all users, including those using screen readers or keyboard-only navigation.
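A minimal sketch of these practices follows. The form action, field names, and image file are hypothetical, and the custom control at the end is included only to show where ARIA attributes come in.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Newsletter signup</title>
</head>
<body>
  <h1>Newsletter</h1>

  <h2>Sign up</h2>
  <form action="/subscribe" method="post">
    <!-- The for attribute ties the label to the input with the matching id -->
    <label for="email">Email address</label>
    <input type="email" id="email" name="email" autocomplete="email" required>

    <!-- A native button is focusable and keyboard-operable without extra work -->
    <button type="submit">Subscribe</button>
  </form>

  <h2>Growth</h2>
  <!-- The alt text conveys what the image communicates -->
  <img src="chart.png" alt="Bar chart showing signups doubling from March to April">

  <!-- Custom control built from a div: role, tabindex, and aria-label
       supply what a native button would provide automatically -->
  <div role="button" tabindex="0" aria-label="Close dialog">×</div>
</body>
</html>
```

The <div role="button"> would still need JavaScript to respond to the Enter and Space keys, which is why a native <button> is usually the simpler choice when one fits.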