.MP4 MPEG-4 Video

MP4 is a container (ISO 14496-14), not a codec — it wraps H.264, H.265, or AV1 video with AAC audio in ISOBMFF boxes. Where the moov atom sits determines instant or stalled playback. Convert, trim, or compress MP4 in your browser with FileDex — no upload.

Container structure
ftyp brand
moov metadata · tracks
mdat video + audio data
Video · Audio · Streaming · ISO 14496 · 2001
By FileDex

Your files never leave your device

Common questions

Is converting MP4 files safe and private?

FileDex converts MP4 files entirely inside your browser using WebAssembly. Your video never leaves your device and is never uploaded to any server. Processing happens in a Web Worker sandbox with no network access.

How do I convert MP4 to MP3 without installing software?

Drop your MP4 file into FileDex, select MP3 as the output format, and click Convert. The audio track is extracted and downloaded as an MP3 file in seconds. No installation or account required.

Why is my MP4 file not playing on iPhone or Safari?

Safari requires H.264 with YUV 4:2:0 chroma subsampling. MP4 files encoded with H.265 need iOS 11+, and 4:4:4 or high-bit-depth streams fail silently on older WebKit builds. Re-encode to H.264 baseline or main profile with -pix_fmt yuv420p for maximum compatibility.

Can I convert MP4 to WebM without losing quality?

WebM uses VP9 or AV1 codecs, so a transcode is required (not a remux). At high quality settings the output is visually transparent, but the conversion is technically lossy. For a lossless container swap, choose MKV instead — it accepts the same H.264 stream without re-encoding.

What is the difference between MP4 and MOV?

Both use the same box (atom) structure, since ISOBMFF was derived from QuickTime's MOV format, and they share the common delivery codecs (H.264, H.265, AAC). MP4 is the ISO standard for cross-platform delivery. MOV adds Apple-specific features such as ProRes support and the edit lists used by Final Cut Pro. Stream copying H.264/AAC between them is zero-loss.

How do I convert MP4 to MP3?

Drop your MP4 file into the converter above, select .mp3 as the output format, then click Convert. The audio track is extracted in your browser and downloaded as an MP3 file in seconds, with no installation needed.

What is the best free MP4 converter online?

FileDex is a free, private, browser-based MP4 converter. It supports conversion to WEBM, AVI, MOV, MKV, GIF, MP3, AAC, and M4A with no file size limits imposed by server uploads.

Why is my MP4 file not playing?

MP4 files may not play if the codec inside (H.265, AV1) is not supported by your player. Drop the file into FileDex's converter and select H.264 MP4 for maximum compatibility.

What makes .MP4 special

Since 2001
Plays on literally everything
Every browser, phone, smart TV, streaming platform. No other video format has this level of support. iPhone made it the default in 2007.
WhatsApp fact
Your 4K video becomes 16MB
WhatsApp re-compresses every video to max 16MB. A 3-minute clip loses visible detail on flagship screens.
Streaming
YouTube, TikTok, Netflix — all MP4
Fragmented MP4 (fMP4) powers both DASH and HLS adaptive streaming. It is the video backbone of the internet.
Smart trick
moov position = instant playback
Moving the metadata block to the file start lets video play before fully downloading. That is -movflags +faststart in FFmpeg.

MP4 is a container, not a codec — a distinction that trips up even experienced developers. The file itself holds no video encoding logic. It wraps one or more codec-compressed streams (H.264, H.265, AV1 for video; AAC, Opus, AC-3 for audio) inside a standardized box structure defined by ISO/IEC 14496-14, part of the MPEG-4 family published in 2003.


ISOBMFF atom structure

Every MP4 file is a tree of nested boxes (historically called "atoms"). Three top-level boxes matter most:

Box | Role
ftyp | File type declaration: tells parsers which brand (isom, mp41, mp42) and compatible standards apply
moov | Metadata container: holds track definitions, sample tables, timing info, codec parameters
mdat | Raw encoded media samples: the actual compressed video and audio bytes

Inside moov, each track gets a trak box containing the nested chain mdia → minf → stbl. The sample table (stbl) maps every frame to its byte offset in mdat, its decode timestamp, and whether it's a keyframe. This indirection is what makes random-access seeking possible without scanning the entire file.
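The offset lookup that stbl makes possible can be sketched in a few lines. This is a minimal illustration, assuming the stsc, stco, and stsz tables have already been decoded into plain Python lists. The function name sample_offset and the simplified table layouts are hypothetical: real stsc entries also carry a sample description index, which is omitted here.

```python
def sample_offset(sample_index, stsc, stco, stsz):
    """Return the byte offset of a 0-based sample index.

    stsc: list of (first_chunk, samples_per_chunk), first_chunk is 1-based
          and entries are sorted by first_chunk (simplified, hypothetical layout)
    stco: list of chunk byte offsets into the file
    stsz: list of sample sizes in bytes
    """
    first_sample = 0  # index of the first sample stored in the current chunk
    for chunk_no in range(1, len(stco) + 1):
        # The applicable stsc entry is the last one whose first_chunk <= chunk_no.
        per_chunk = [n for first, n in stsc if first <= chunk_no][-1]
        if sample_index < first_sample + per_chunk:
            # Samples are stored back to back inside a chunk, so add the
            # sizes of the samples that precede this one in the same chunk.
            preceding = sum(stsz[first_sample:sample_index])
            return stco[chunk_no - 1] + preceding
        first_sample += per_chunk
    raise IndexError("sample index beyond the last chunk")


# Illustrative tables: chunks 1-2 hold two samples each, chunk 3 holds one.
stsc = [(1, 2), (3, 1)]
stco = [100, 500, 900]
stsz = [10, 20, 30, 40, 50]
print(sample_offset(3, stsc, stco, stsz))
```

A seek then amounts to finding the nearest keyframe in stss, resolving its offset this way, and issuing one read at that position.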

Why moov position matters

When a browser requests an MP4 over HTTP, it needs the moov box before it can decode anything. If moov sits at the end of the file (the default output of most encoders), the player must either download the entire file or issue a second range request to fetch the tail. Both add latency.

The fix: relocate moov before mdat. FFmpeg's -movflags +faststart flag does this in a post-processing pass — it encodes normally, then rewrites the file with moov shifted forward and all stbl offsets adjusted. The cost is one extra sequential read/write of the file. FileDex applies this flag automatically on every MP4 output.
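Whether a given file already has moov ahead of mdat can be checked by walking the top-level boxes. The sketch below assumes a seekable binary stream; the helper names top_level_boxes and is_faststart are illustrative and do not come from FFmpeg or FileDex.

```python
import io
import struct


def top_level_boxes(stream):
    """Yield (type, offset, size) for each top-level box in a seekable stream."""
    stream.seek(0, io.SEEK_END)
    eof = stream.tell()
    pos = 0
    while pos + 8 <= eof:
        stream.seek(pos)
        size, box_type = struct.unpack(">I4s", stream.read(8))
        min_size = 8
        if size == 1:
            # A size of 1 means a 64-bit largesize follows the type field.
            raw = stream.read(8)
            if len(raw) < 8:
                break
            size = struct.unpack(">Q", raw)[0]
            min_size = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = eof - pos
        if size < min_size:
            break  # malformed size; stop rather than loop forever
        yield box_type.decode("ascii", errors="replace"), pos, size
        pos += size


def is_faststart(stream):
    """True if moov precedes mdat, False if it follows, None if either is missing."""
    order = [box_type for box_type, _, _ in top_level_boxes(stream)]
    if "moov" not in order or "mdat" not in order:
        return None
    return order.index("moov") < order.index("mdat")
```

Typical use would be to open a file in binary mode and pass the handle to is_faststart; a False result means the file would benefit from a faststart rewrite.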

Fragmented MP4 for adaptive streaming

Standard MP4 stores all metadata in a single moov box, which means the entire file must be finalized before playback begins. Fragmented MP4 (fMP4) solves this by splitting the file into independent fragments, each with its own moof (movie fragment) and mdat pair.

This structure powers both MPEG-DASH and modern HLS (fMP4 segments are specified in RFC 8216). Codec configuration travels once in a short initialization segment, and each media segment carries its own moof index, enabling adaptive bitrate (ABR) switching at segment boundaries without re-downloading metadata. Older HLS used MPEG-TS segments, which carry per-segment overhead from repeated PAT/PMT tables; fMP4 segments are leaner, and the same segments can serve both DASH and HLS players.
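The moof+mdat pairing is easy to observe from the top-level box order. The following sketch works on an in-memory buffer and counts fragments. The function names are hypothetical, and the approach assumes the whole file (or segment) fits in memory.

```python
import struct


def top_level_types(data: bytes):
    """List top-level box types in file order from an in-memory buffer."""
    types, pos = [], 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        if size == 1 and pos + 16 <= len(data):
            # 64-bit largesize stored after the type field
            size = struct.unpack_from(">Q", data, pos + 8)[0]
        elif size == 0:
            # box extends to the end of the buffer
            size = len(data) - pos
        if size < 8:
            break  # malformed or truncated box
        types.append(box_type.decode("ascii", errors="replace"))
        pos += size
    return types


def count_fragments(data: bytes):
    """Count moof boxes that are immediately followed by an mdat box."""
    types = top_level_types(data)
    return sum(1 for a, b in zip(types, types[1:]) if a == "moof" and b == "mdat")
```

A result of zero suggests a conventional (non-fragmented) MP4; a positive count gives the number of fragments in the buffer.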

Rate control and file size

Constant Rate Factor (CRF) is the most common quality-targeting mode for H.264 and H.265 encoding. Lower CRF means higher quality and larger files. The scale runs 0–51 for H.264 (0 = lossless, 23 = default, 51 = worst).

As a rough, content-dependent figure, 1080p H.264 at CRF 23 produces about 12 MB per minute of video, and CRF 18 lands at roughly 20 to 30 MB per minute. The relationship is exponential rather than linear: as a rule of thumb, each 6-point CRF decrease roughly doubles the bitrate. For archival work, CRF 18 is a common target. For web delivery where bandwidth matters, CRF 23–28 balances visual quality against download speed.
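The six-point rule of thumb can be turned into a quick back-of-envelope estimate. The helper below is a sketch only: scale_size is a hypothetical name, and the halving step of 6 is the rule of thumb quoted above, not a property guaranteed by any encoder. Real encodes vary with content, so measured sizes can differ noticeably from this estimate.

```python
def scale_size(size_mb_per_min, crf_from, crf_to, halving_step=6.0):
    """Estimate size at crf_to from a known size at crf_from.

    Assumes bitrate doubles for every `halving_step` points of CRF decrease
    (and halves for every such increase). This is a rule of thumb only.
    """
    return size_mb_per_min * 2 ** ((crf_from - crf_to) / halving_step)


# Starting from an assumed 12 MB per minute at CRF 23:
for crf in (17, 18, 23, 28, 29):
    print(crf, round(scale_size(12, 23, crf), 1))
```

Measuring a short sample of the actual source at two CRF values and comparing against this estimate is a quick way to see how far a particular clip departs from the rule.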

Two-pass encoding offers tighter bitrate control for streaming platforms that enforce bitrate caps. First pass analyzes scene complexity; second pass allocates bits accordingly. CRF is simpler and generally preferred for single-output workflows.

Codec comparison inside the MP4 container

H.264 (AVC) remains the universal baseline. Every browser, phone, smart TV, and set-top box manufactured since 2010 decodes it natively. Hardware encoding is available on every modern GPU. Patent licensing through MPEG-LA is well-established, and the cost is absorbed into device prices.

H.265 (HEVC) achieves roughly 40% bitrate reduction at equivalent visual quality. iPhone has recorded HEVC by default since iOS 11 (2017). The catch: fragmented patent pools (MPEG-LA, HEVC Advance, Velos Media) create licensing uncertainty, and browser support is uneven. Safari decodes HEVC natively; Chrome and Firefox generally depend on a hardware decoder being available, and support varies by version and platform; Edge supports it on Windows with HEVC Video Extensions installed.

AV1 matches or exceeds HEVC compression while carrying zero royalty obligations. YouTube, Netflix, and Twitch use it for server-side delivery. Hardware decode support arrived with Intel 11th-gen (Tiger Lake), NVIDIA RTX 30, AMD RX 6000-series, and Apple M3; hardware encode followed with Intel Arc, NVIDIA RTX 40, and AMD RX 7000. Software encoding is slow: libaom at speed 4 runs roughly 10x slower than x264 at medium preset. SVT-AV1 and rav1e close the gap but still trail H.264 encoders significantly.

Audio track options

AAC-LC is the default audio codec paired with MP4. It delivers transparent quality at 128–192 kbps stereo and enjoys universal hardware decode support. HE-AAC v2 extends this to low-bitrate scenarios (32–64 kbps) using spectral band replication and parametric stereo.

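Opus inside MP4 is technically valid (the codec itself is defined in RFC 6716, and a separate Opus-in-ISOBMFF encapsulation specification covers the container mapping), but support varies. Browsers handle it well; iOS native players do not. For maximum compatibility, AAC-LC at 128 kbps remains the safe default.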

AC-3 (Dolby Digital) and E-AC-3 fit inside MP4 for broadcast and streaming workflows requiring 5.1 surround. Apple TV+ and Disney+ deliver Atmos via E-AC-3 in fMP4 segments.

Metadata and chapters

The udta (user data) box inside moov holds freeform metadata — title, artist, comment, copyright. iTunes-style metadata uses a nested ilst box with well-known keys (©nam for title, ©ART for artist, covr for artwork).

Chapter support works through two mechanisms. QuickTime-style chapter tracks, the form Apple players read, define a text track with timed entries. Nero-style chapters store a chpl atom inside udta. Most desktop players read both; web players typically ignore chapters unless the application layer (like a custom JS player) parses them.

Gotchas and limitations

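moov at end breaks streaming. Files encoded without faststart force the browser to issue an extra range request for the tail, or to download the whole file, before playback begins. The format_tags output of ffprobe does not reveal box order; verify moov position with ffprobe -v trace input.mp4, whose log lists each top-level box with its size, or with tools like mp4info.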

iPhone HEVC files confuse older software. iOS records HEVC in a MOV container (not MP4), and some files use 10-bit color. Windows machines without HEVC Video Extensions ($0.99 on Microsoft Store) show a black frame. Converting to H.264 MP4 eliminates the compatibility issue.

B-frame reordering inflates latency. H.264 High profile uses bidirectional frames that require decode reordering. For real-time applications (video conferencing, live preview), Baseline profile with zero B-frames drops decode latency at the cost of ~15% larger files.

Edit lists (elst) cause A/V sync issues. Some encoders write edit list atoms to handle initial decoder delay. FFmpeg handles these correctly; many hardware decoders and web players do not, producing audio that starts 20–80 ms early. Encoding with -avoid_negative_ts make_zero sidesteps the problem.

Patent exposure persists. Many H.264 patents have already expired, with the last ones expected to lapse around 2027 or later, but HEVC patent pools extend through the 2030s. AV1 is designed to be royalty-free (it is still covered by patents, but they are licensed without fees), making it the safest long-term bet for products that need to avoid per-unit licensing.

.MP4 compared to alternatives

Formats | Criteria | Details | Winner
.MP4 vs .WEBM | Device compatibility | MP4 with H.264 plays on every browser, OS, smart TV, and mobile device. WebM VP9 lacks native support on Safari/iOS older than 14 and most legacy smart TVs. | MP4 wins
.MP4 vs .WEBM | Compression efficiency | VP9 and AV1 codecs in WebM deliver 30-50% smaller files at equivalent visual quality compared to H.264 in MP4. | WEBM wins
.MP4 vs .MKV | Codec flexibility | MKV supports virtually every codec including TrueHD, DTS-HD MA, FLAC, and ASS/SSA subtitles. MP4 is limited to a defined set of codecs per the ISOBMFF spec. | MKV wins
.MP4 vs .MKV | Platform delivery | Social platforms, CDNs, and mobile apps universally accept MP4. MKV is rejected by YouTube, Instagram, TikTok, and most upload endpoints. | MP4 wins
.MP4 vs .MOV | Cross-platform playback | MP4 plays natively everywhere. MOV requires QuickTime or compatible decoders on Windows/Linux and is primarily an Apple ecosystem format. | MP4 wins
.MP4 vs .AVI | Streaming support | MP4 supports progressive download, DASH, and HLS via fMP4 segments. AVI has no native streaming capability and requires full download before playback. | MP4 wins

Convert .MP4 to...

MP4 to WEBM (transcode, lossy): VP9/AV1 inside WebM achieves 30-50% better compression at equivalent visual quality compared to H.264 through larger prediction block sizes and more efficient entropy coding. Web-first projects benefit from reduced bandwidth costs without visible quality degradation.
MP4 to MKV (remux, lossless): Matroska accepts every codec MP4 supports plus TrueHD, FLAC, DTS-HD, and advanced subtitle formats (ASS/SSA) absent from MP4. Stream copy transfers all tracks without re-encoding, making this ideal for media server archival on Plex, Jellyfin, or Kodi.
MP4 to MOV (remux, lossless): MOV shares ISOBMFF container ancestry with MP4. Stream copying H.264/AAC between them is bit-identical. MOV is required by Final Cut Pro and Motion for native timeline editing with ProRes codec support and Apple-specific edit lists.
MP4 to GIF (transcode, lossy): GIF uses LZW compression on a 256-color indexed palette and requires no video player dependency. Email clients, Markdown renderers, and chat platforms display GIF inline without plugin or codec negotiation.
MP4 to MP3 (export, lossy): Extracts the audio track from a video for standalone playback on music players, podcast apps, or offline listening. If the source contains AAC, transcoding to MP3 provides broader device compatibility at the cost of one lossy generation.
MP4 to M4A (remux, lossless): M4A is MP4 with an audio-only ftyp brand. Demuxing AAC from MP4 into M4A is a zero-loss stream copy. M4A retains full AAC metadata and is natively recognized by iTunes, Apple Music, and iOS media frameworks.

Technical reference

MIME Type: video/mp4
Magic Bytes: 00 00 00 xx 66 74 79 70 (bytes 4-7 spell ftyp; bytes 0-3 hold the box size and vary)
Developer: ISO / Moving Picture Experts Group
Year Introduced: 2001
Open Standard: Yes

Binary Structure

MP4 files use the ISO Base Media File Format (ISOBMFF) box model. Every byte in the file belongs to a box (also called an atom). Each box starts with a 4-byte big-endian size field followed by a 4-byte ASCII type code; a size of 1 signals that a 64-bit size follows the type. The file begins with an ftyp box declaring the brand and compatible specs. The moov box contains all metadata: track definitions (trak), media headers (mdia), sample tables (stbl with stts, stsc, stsz, stco/co64, stss), and codec configuration (avcC for H.264, hvcC for H.265). The mdat box holds raw encoded samples referenced by byte offsets in stco/co64. For streaming, moov must precede mdat so the player can build the sample index before downloading media data. Fragmented MP4 (fMP4) keeps only a short initialization moov and replaces the single large mdat with repeated moof+mdat fragment pairs, each moof carrying a track fragment header (tfhd) and run table (trun) for that segment.

Offset | Length | Field | Example | Description
0x00 | 4 bytes | ftyp Box Size | 00 00 00 20 (32 bytes) | Total size of the ftyp box including this field. Value of 1 signals 64-bit largesize in next 8 bytes.
0x04 | 4 bytes | Box Type | 66 74 79 70 (ftyp) | ASCII box type identifier. Must be 'ftyp' for a valid ISOBMFF file.
0x08 | 4 bytes | Major Brand | 69 73 6F 6D (isom) | Primary spec this file conforms to. MP4 brands: isom, mp41, mp42. QuickTime: qt (space)(space).
0x0C | 4 bytes | Minor Version | 00 00 02 00 | Sub-version of the major brand. Informational only; parsers should not reject based on this value.
0x10 | variable | Compatible Brands | 69 73 6F 6D 69 73 6F 32 (isom, iso2) | Array of 4-byte brand codes listing all specs this file conforms to. Extends to ftyp box boundary.
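The field layout above maps directly onto a few struct calls. The sketch below parses an ftyp box from the first bytes of a file; parse_ftyp is a hypothetical helper name, and it assumes the ftyp box uses an ordinary 32-bit size and fits entirely inside the buffer passed in.

```python
import struct


def parse_ftyp(data: bytes):
    """Parse an ftyp box at the start of `data`; return None if it is not there."""
    if len(data) < 16:
        return None
    size, box_type = struct.unpack_from(">I4s", data, 0)
    # Reject anything that is not a complete ftyp box with a 32-bit size.
    if box_type != b"ftyp" or size < 16 or size > len(data):
        return None
    major, minor = struct.unpack_from(">4sI", data, 8)
    # Compatible brands are 4-byte codes running to the end of the box.
    brands = [data[i:i + 4].decode("ascii", errors="replace")
              for i in range(16, size - 3, 4)]
    return {
        "major_brand": major.decode("ascii", errors="replace"),
        "minor_version": minor,
        "compatible_brands": brands,
    }


# A 24-byte ftyp box: major brand isom, minor version 0x200, brands isom and iso2.
example = bytes.fromhex("000000186674797069736f6d0000020069736f6d69736f32")
print(parse_ftyp(example))
```

In practice, reading the first 64 bytes or so of a file is usually enough for this check, though a file with a long compatible-brands list would need a larger read.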
1998: MPEG-4 Part 1 (Systems) published by ISO/IEC
2001: MPEG-4 Part 12 (ISOBMFF) finalized; MP4 derived from Apple QuickTime MOV container
2003: ISO/IEC 14496-14 formally published; .mp4 extension standardized
2008: H.264/AVC becomes dominant codec inside MP4; Flash Video begins industry decline
2013: HEVC/H.265 added as standardized codec option for MP4 containers
2018: AV1 codec ratified by Alliance for Open Media
2020: CMAF (Common Media Application Format) standardizes fMP4 for unified HLS/DASH streaming

Convert to H.264 MP4 with streaming optimisation (ffmpeg)
ffmpeg -i input.mp4 -c:v libx264 -crf 23 -preset slow -c:a aac -b:a 128k -movflags +faststart output.mp4

Re-encode to H.264/AAC with CRF quality control and the moov atom relocated to the head of the file for progressive streaming. -crf 23 sets quality (0=lossless, 51=worst). -preset slow trades CPU time for roughly 10% smaller output at the same quality. -movflags +faststart moves moov ahead of mdat, directly after ftyp, for HTTP progressive download.

Extract AAC audio (zero-loss stream copy) (ffmpeg)
ffmpeg -i input.mp4 -vn -c:a copy output.aac

Demux AAC audio track without transcoding. -vn suppresses all video output. -c:a copy performs stream copy with zero quality loss; because nothing is re-encoded, the run is limited mainly by disk I/O and typically finishes in seconds. The .aac output is a raw ADTS stream.

Trim without re-encode (keyframe-boundary cut) (ffmpeg)
ffmpeg -ss 00:00:10 -to 00:00:40 -i input.mp4 -c copy output.mp4

Extract a 30-second segment using stream copy. -ss before -i performs input seeking using the keyframe index (fast, but with -c copy the cut starts at a keyframe at or before the requested time rather than exactly on it). -c copy avoids re-encoding, preserving original quality.

Compress for web (target 1080p, 4 Mbps ceiling) (ffmpeg)
ffmpeg -i input.mp4 -c:v libx264 -crf 23 -vf scale=-2:1080 -c:a aac -b:a 128k -maxrate 4M -bufsize 8M -movflags +faststart web_output.mp4

Constrain output to 1080p height with a VBR ceiling to prevent bitrate spikes. scale=-2:1080 auto-calculates width (divisible by 2). -maxrate 4M -bufsize 8M sets VBV rate control (a bufsize of 2x maxrate is a common rule of thumb for streaming delivery).

Parse ISOBMFF box structure (Python)
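import struct, sys

def parse_boxes(f, end, depth=0):
    """Print the box tree between the current position and `end`."""
    while f.tell() < end:
        start = f.tell()
        header = f.read(8)
        if len(header) < 8:
            break
        size, box_type = struct.unpack('>I4s', header)
        box_type = box_type.decode('ascii', errors='replace')
        header_len = 8
        if size == 1:
            # 64-bit largesize follows the type field (16-byte header)
            ext = f.read(8)
            if len(ext) < 8:
                break
            size = struct.unpack('>Q', ext)[0]
            header_len = 16
        elif size == 0:
            # box extends to the end of the enclosing container
            size = end - start
        if size < header_len:
            break  # malformed size; stop rather than loop forever
        print('  ' * depth + f"{box_type}  ({size} bytes)")
        box_end = min(start + size, end)
        if box_type in ('moov', 'trak', 'mdia', 'minf', 'stbl', 'udta'):
            parse_boxes(f, box_end, depth + 1)
        # Always continue from the end of this box, whatever the header length
        f.seek(box_end)

with open(sys.argv[1], 'rb') as f:
    f.seek(0, 2); eof = f.tell(); f.seek(0)
    parse_boxes(f, eof)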

Python script that recursively parses the ISOBMFF box hierarchy of an MP4 file. Each box has a 4-byte size + 4-byte type header. Container boxes (moov, trak, mdia, minf, stbl, udta) are expanded recursively to reveal the full metadata tree.

Risk level: MEDIUM

Attack Vectors

  • moov atom heap overflow
  • MP4/3GP polyglot file bypass
  • H.264 avcC box decoder exploit
  • Fragmented MP4 (fMP4) infinite loop

Mitigation: FileDex processes MP4 files entirely in-browser via FFmpeg WASM inside a Web Worker sandbox. No file data leaves the device. The WASM runtime operates within browser memory limits, preventing moov-based memory exhaustion. Atom size validation occurs during FFmpeg demuxing before any codec initialization.

Cross-platform media player with native MP4/H.264/HEVC decoding
FFmpeg (tool): CLI tool for MP4 muxing, transcoding, and stream manipulation
YouTube (service): Video platform accepting MP4 as canonical upload format
HandBrake (tool): Open-source video transcoder with H.264/H.265 MP4 output
Adobe Premiere (tool): Professional NLE with full MP4 import/export and codec support
DaVinci Resolve (tool): Professional color grading and editing suite with MP4 delivery
mp4box.js (library): JavaScript ISOBMFF parser and MP4 box inspector
CLI for MP4 muxing, DASH packaging, and box-level inspection