.MP3 MP3 Audio

Invented at Fraunhofer IIS in Erlangen using Suzanne Vega's "Tom's Diner" as the test track, MP3 became patent-free in 2017. Convert MP3 to WAV, FLAC, OGG, or AAC in your browser — no upload, no server. FileDex decodes and re-encodes locally via FFmpeg WebAssembly.

Audio structure: ID3v2 tags (including cover art), MPEG audio frames, ID3v1 legacy tags
Lossy · Audio · ISO 11172 · 1993
By FileDex

Your files never leave your device

Common questions

Does converting MP3 to WAV improve audio quality?

No. Decoding MP3 to WAV does not restore audio data discarded during MP3 encoding. The WAV file is larger but contains identical audio fidelity. WAV conversion is useful for compatibility with audio editors that require uncompressed PCM input.

What is the best MP3 bitrate for music?

192 kbps VBR using the LAME encoder is widely considered transparent — indistinguishable from CD audio in double-blind tests for most listeners. 320 kbps CBR is the maximum and is used for archival delivery. Below 96 kbps, frequency masking artifacts become audible on high-frequency content.

Can I convert MP3 to FLAC without losing quality?

The conversion itself is lossless — FLAC perfectly preserves the decoded MP3 audio. However, MP3 is already lossy, so the FLAC file will not recover original quality discarded during MP3 encoding. The result is a lossless container around lossy audio data.

Why does my MP3 have a gap or click between tracks?

MP3 frames are fixed at 1,152 samples, so encoder delay and padding create silence at track boundaries. LAME writes an Info/Xing header with exact sample counts for gapless-capable players. Players that ignore this header insert brief silence or clicks.

Are MP3 files still patent-encumbered?

No. Fraunhofer IIS terminated the MP3 patent licensing program in April 2017. All key patents have expired globally. MP3 encoding and decoding is now fully patent-free for any use case.

What is the difference between CBR, VBR, and ABR?

CBR (Constant Bit Rate) uses the same bitrate for every frame, giving predictable file sizes; some hardware decoders require it. VBR (Variable Bit Rate) allocates more bits to complex passages and fewer to silence, giving better quality per kilobyte. ABR (Average Bit Rate) targets an average bitrate across the file as a middle ground. LAME -V 0 (VBR quality 0) is typically indistinguishable from CBR 320 kbps while producing smaller files.

What programs create and edit ID3 tags on MP3 files?

Mp3tag (Windows, macOS) is the most popular dedicated tag editor. MusicBrainz Picard uses acoustic fingerprinting to automatically tag files from the MusicBrainz database. foobar2000 and VLC also support tag editing. Command-line: id3v2 -t 'Title' -a 'Artist' input.mp3

What makes .MP3 special

Patent-free
All MP3 patents expired in 2017
Fraunhofer held the key patents for decades. Since 2017, MP3 is fully free to encode and decode without licensing fees.
Psychoacoustics
It hears what you cannot
MP3 uses frequency and temporal masking to discard sounds your ear would never perceive. A loud tone hides quieter ones nearby.
128 kbps ceiling
Frequencies above 16 kHz are gone
At 128 kbps, MP3 cuts everything above 16 kHz. That is why cymbals sound dull. At 320 kbps, the cutoff reaches 20 kHz.
Gapless problem
Frames always contain 1152 samples
The encoder pads the final frame with silence, creating audible gaps between tracks. LAME solves this with a Xing header that trims the padding.

The MPEG-1 Audio Layer III codec compresses audio by exploiting the human ear's inability to hear certain frequencies when louder sounds are present nearby. This psychoacoustic model is the engine behind every MP3 file — it decides what to keep and what to discard, producing files roughly 10x smaller than uncompressed PCM at acceptable quality.

The psychoacoustic model

Two masking phenomena drive MP3's compression. Frequency masking (simultaneous masking) occurs when a loud tone renders nearby quieter tones inaudible — a 1 kHz tone at 80 dB masks everything within a critical band around it below roughly 60 dB. Temporal masking suppresses perception for 5–20 ms after a loud transient and 2–5 ms before one (pre-masking). The encoder calculates masking thresholds across 32 subbands for every frame, then allocates bits only to signal components that exceed those thresholds.

Encoding pipeline

MP3 encoding proceeds through four stages:

  1. Polyphase subband filter: splits the input PCM into 32 equal-width subbands (each about 689 Hz wide at a 44.1 kHz sample rate, i.e. 22.05 kHz / 32)
  2. MDCT (Modified Discrete Cosine Transform): transforms each subband into frequency-domain coefficients with finer resolution (576 frequency lines per granule)
  3. Quantization: scales coefficients based on the psychoacoustic model's masking thresholds, discarding inaudible detail to meet the target bitrate
  4. Huffman coding: entropy-encodes the quantized values using lookup tables optimized for typical audio spectral shapes
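
The figures in the first two stages follow directly from the sample rate. The sketch below is only an arithmetic check, not part of any encoder; the function names are illustrative, and the value of 18 MDCT lines per subband is the long-block case (32 x 18 = 576).

```python
SUBBANDS = 32                 # outputs of the polyphase filter bank
MDCT_LINES_PER_SUBBAND = 18   # long-block MDCT lines per subband (assumed case)


def subband_width_hz(sample_rate_hz):
    """Width of one subband: the 0..Nyquist range split into 32 equal parts."""
    return (sample_rate_hz / 2) / SUBBANDS


def lines_per_granule():
    """Total frequency lines after the MDCT stage."""
    return SUBBANDS * MDCT_LINES_PER_SUBBAND


def line_resolution_hz(sample_rate_hz):
    """Approximate spacing between adjacent MDCT frequency lines."""
    return (sample_rate_hz / 2) / lines_per_granule()


if __name__ == "__main__":
    for rate in (32000, 44100, 48000):
        print(rate, subband_width_hz(rate), line_resolution_hz(rate))
```

At 44.1 kHz this gives subbands of roughly 689 Hz and MDCT lines spaced roughly 38 Hz apart, which is the "finer resolution" the second stage refers to.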

The encoder iterates quantization parameters (scale factors and global gain) in an inner/outer loop until distortion stays below masking thresholds while hitting the bitrate target. This iterative process is why MP3 encoding is asymmetric — decoding is a single-pass operation roughly 10x faster.

Bitrate tiers and frequency cutoffs

At 128 kbps, MP3 encoding discards frequencies above approximately 16 kHz; at 320 kbps the cutoff reaches roughly 20 kHz. Between these extremes, each step trades bandwidth for fidelity:

Bitrate  | Frequency ceiling | Typical use                       | File size/min (1 MB = 10^6 bytes)
128 kbps | ~16 kHz           | Podcasts, voice, casual listening | 0.96 MB
192 kbps | ~18 kHz           | General music streaming           | 1.44 MB
256 kbps | ~19.5 kHz         | High-quality streaming            | 1.92 MB
320 kbps | ~20 kHz           | Archival, critical listening      | 2.40 MB
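
The per-minute sizes can be reproduced from the bitrate alone. This is a minimal sketch assuming constant bitrate, decimal megabytes (1 MB = 1,000,000 bytes), and no tag or header overhead; the function names are illustrative.

```python
def bytes_per_minute(bitrate_kbps):
    """Audio payload for 60 seconds at a constant bitrate (kbps = 1000 bits/s)."""
    return bitrate_kbps * 1000 * 60 // 8


def megabytes_per_minute(bitrate_kbps):
    """Same figure in decimal megabytes."""
    return bytes_per_minute(bitrate_kbps) / 1_000_000


if __name__ == "__main__":
    for kbps in (128, 192, 256, 320):
        print(f"{kbps} kbps: {megabytes_per_minute(kbps):.2f} MB per minute")
```

Because the bitrate covers all channels together, the result is the same for mono and stereo files at a given rate.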

The 16 kHz cutoff at 128 kbps explains why cymbals and sibilants sound dull at that rate: high-frequency harmonic content simply isn't encoded. Sensitivity above 16 kHz also declines with age for most listeners, which is part of why 128 kbps podcasts sound acceptable for speech.

VBR, CBR, and ABR modes

Constant Bitrate (CBR) assigns the same number of bits to every frame regardless of complexity. Silence wastes bits; dense orchestral passages starve. CBR's advantage is predictable file size and guaranteed compatibility with older hardware players.

Variable Bitrate (VBR) lets the encoder allocate bits per frame based on signal complexity. LAME's VBR quality scale (V0 through V9) targets perceptual quality rather than a fixed rate. V0 averages around 245 kbps and is considered transparent for most material. V5 averages ~130 kbps. VBR generally produces better quality-per-byte than CBR, because bits are not spent on silence or simple passages.

Average Bitrate (ABR) is a hybrid — it varies per-frame like VBR but constrains the running average to a target. Useful when you need approximately predictable file sizes without CBR's quality compromises.

Frame structure

An MP3 file is a sequence of independent frames. Each frame contains:

  • Sync word: 12 bits of all 1s (0xFFF) marking the frame start
  • Header: 20 bits encoding MPEG version, layer, bitrate index, sample rate, padding, channel mode
  • Side information: 32 bytes (stereo) or 17 bytes (mono) in MPEG-1; 17 bytes (stereo) or 9 bytes (mono) in MPEG-2 and MPEG-2.5. It specifies scale factor partitioning and Huffman table selections
  • Main data: the Huffman-coded spectral coefficients, potentially borrowing bytes from previous frames (bit reservoir)

The bit reservoir is a critical mechanism. Frames encoding simple passages may not need their full byte allocation, so unused bytes carry forward for complex passages to borrow. This means a single frame's encoded data can physically reside in a previous frame's byte range — which is why cutting MP3 files on arbitrary frame boundaries can corrupt audio.
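
The byte length of each frame follows from its bitrate and sample rate, which is how a parser steps from one sync word to the next. The sketch below uses the commonly cited MPEG-1 Layer III relation (144 x bitrate / sample rate, plus one byte when the padding bit is set); the function names are illustrative, and MPEG-2/2.5 frames are not covered.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III


def frame_length_bytes(bitrate_bps, sample_rate_hz, padding=0):
    """Frame size in bytes, header included. 1152 samples / 8 bits = 144."""
    return (SAMPLES_PER_FRAME // 8) * bitrate_bps // sample_rate_hz + padding


def frame_duration_ms(sample_rate_hz):
    """Playback time covered by one frame."""
    return 1000 * SAMPLES_PER_FRAME / sample_rate_hz


if __name__ == "__main__":
    print(frame_length_bytes(128_000, 44_100))             # unpadded frame
    print(frame_length_bytes(128_000, 44_100, padding=1))  # padded frame
    print(round(frame_duration_ms(44_100), 2), "ms per frame")
```

At 44.1 kHz the division is not exact, so encoders alternate padded and unpadded frames to hold the average bitrate. Note that this is the frame's physical size; because of the bit reservoir, the audio data decoded for that frame may begin in an earlier frame.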

ID3 tags and metadata

ID3v1 appends a fixed 128-byte block at the file's end. It's limited: 30 characters per field, a genre byte index, no album art. ID3v2 prepends a variable-length header before the first audio frame, supporting:

  • APIC — embedded album artwork (JPEG or PNG, no size limit in spec, but players struggle above 500 KB)
  • USLT — unsynchronized lyrics with language code
  • CHAP — chapter markers with start/end timestamps and optional embedded images
  • TXXX — arbitrary key-value text pairs for custom metadata

ID3v2.4 supports UTF-8 natively. ID3v2.3 (more widely supported) defaults to UTF-16 for non-Latin text.
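
Because ID3v1 is a fixed 128-byte block, it can be read with a single struct layout. The sketch below assumes the classic field order (title, artist, album, year, comment, genre) and Latin-1 text; parse_id3v1 and read_id3v1 are illustrative names, not part of any library.

```python
import struct

# "TAG" + title(30) + artist(30) + album(30) + year(4) + comment(30) + genre(1)
ID3V1_FORMAT = ">3s30s30s30s4s30sB"
ID3V1_SIZE = struct.calcsize(ID3V1_FORMAT)  # 128 bytes


def _text(raw):
    """Cut at the first NUL, decode as Latin-1, drop trailing spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()


def parse_id3v1(block):
    """Return the tag fields as a dict, or None if the block is not an ID3v1 tag."""
    if len(block) != ID3V1_SIZE or block[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = struct.unpack(ID3V1_FORMAT, block)
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "comment": _text(comment),
        "genre_index": genre,  # index into the ID3v1 genre list
    }


def read_id3v1(path):
    """Read the last 128 bytes of a file and try to parse them as ID3v1."""
    with open(path, "rb") as f:
        f.seek(0, 2)
        if f.tell() < ID3V1_SIZE:
            return None
        f.seek(-ID3V1_SIZE, 2)
        return parse_id3v1(f.read(ID3V1_SIZE))
```

The fixed layout is what makes ID3v1 easy to parse, and also what limits it: there is no room for artwork or for fields longer than 30 bytes.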

The gapless playback problem

MP3 frames always contain 1152 samples. When the source audio length isn't a perfect multiple of 1152 samples, the encoder pads the final frame with silence. Additionally, the encoder introduces a priming delay (typically 576 samples for LAME) at the start. This padding creates audible gaps between consecutive tracks — a problem for live albums, classical suites, and DJ mixes.

LAME solves this by writing delay and padding values into a Xing/LAME header stored in the first frame. Decoders that read this header (iTunes, foobar2000, mpv) trim the padding for seamless playback. Decoders that ignore it (many car stereos, cheap Bluetooth speakers) insert a brief silence between tracks.
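
The arithmetic behind the gap can be sketched as follows. This is a simplified model: it assumes a 576-sample encoder delay and pads only to the next frame boundary, while real encoders and decoders add further delay that is ignored here. The function names are illustrative.

```python
import math

FRAME_SAMPLES = 1152


def frame_padding(source_samples, encoder_delay=576):
    """Return (frame_count, end_padding) for a track of the given length."""
    total = source_samples + encoder_delay
    frames = math.ceil(total / FRAME_SAMPLES)
    end_padding = frames * FRAME_SAMPLES - total
    return frames, end_padding


def trim(decoded, encoder_delay, end_padding):
    """What a gapless-aware player does: drop the delay and the trailing padding."""
    return decoded[encoder_delay:len(decoded) - end_padding]


if __name__ == "__main__":
    three_minutes = 44_100 * 180
    frames, padding = frame_padding(three_minutes)
    print(frames, "frames,", padding, "padding samples")
```

A player that skips the trim step plays the delay and padding samples as silence, which is the gap heard between tracks.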

Joint stereo

MP3 supports four channel modes: stereo, joint stereo, dual channel, and mono. Joint stereo is the default in most encoders because it exploits inter-channel redundancy. It operates in two sub-modes:

Mid/Side stereo encodes the sum (L+R) and difference (L-R) channels instead of left and right independently. When both channels carry similar content (centered vocals, for instance), the difference signal is near-zero and compresses efficiently, freeing bits for the mid channel.
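
The mid/side idea can be shown on plain sample lists. This is a minimal sketch: the 1/sqrt(2) scaling is an assumption taken from common descriptions of the Layer III transform (the text above only specifies sum and difference), and a real encoder applies it to spectral coefficients rather than to time-domain samples.

```python
import math

SQRT2 = math.sqrt(2)


def to_mid_side(left, right):
    """Convert left/right values to mid (sum) and side (difference)."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Invert the transform to recover left and right."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    centered = [0.5, -0.25, 1.0]      # identical content in both channels
    mid, side = to_mid_side(centered, centered)
    print(side)                       # all zeros: nothing to spend bits on
```

For a centered source the side channel is exactly zero, so nearly all of the frame's bits can go to the mid channel.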

Intensity stereo (used only at low bitrates) preserves the spectral envelope of high-frequency content while encoding only the combined energy, relying on the ear's poor spatial resolution above ~2 kHz. At 128 kbps and above, LAME disables intensity stereo and uses only mid/side.

Limitations

Generation loss compounds. Each decode-edit-reencode cycle applies quantization noise. After 3–4 generations at 128 kbps, artifacts become obvious. Always edit from the source WAV or FLAC, not from an MP3.

No multichannel support. The MPEG-1 spec defines only mono and stereo. 5.1 surround requires MPEG-2 Layer III extensions (the .mp3 extension is technically overloaded), and virtually no consumer player supports it.

No lossless mode. MP3 is inherently lossy. For archival, FLAC or ALAC preserve every sample while still compressing 40–60%.

No native ReplayGain. Volume normalization relies on non-standard tags (ID3v2 TXXX fields or APEv2 tags). Not all players honor them.

When to choose MP3 over alternatives

MP3 wins on one axis: universal decode support. Every device manufactured in the last 20 years plays MP3 natively. If your audience includes firmware-limited hardware (car head units, elevator speakers, embedded PA systems), MP3 at V0 or 320 kbps is the safe choice.

For everything else, alternatives outperform it. AAC-LC at 128 kbps matches MP3 at 192 kbps in listening tests. Opus at 96 kbps rivals MP3 at 256 kbps, especially for speech. FLAC provides lossless compression at roughly 60% of WAV size. MP3's technical ceiling was set in 1993 — newer codecs have three decades of psychoacoustic research built on top of it.

.MP3 compared to alternatives

Formats | Criteria | Winner

.MP3 vs .AAC | Audio quality at 128 kbps | AAC wins
AAC's improved MDCT windowing and stereo coding produce 20-30% better perceived quality than MP3 at equivalent bitrates, particularly below 128 kbps where MP3's subband filter bank introduces audible pre-echo artifacts.

.MP3 vs .FLAC | Audio fidelity | FLAC wins
FLAC is lossless: bit-perfect reproduction of the original audio. MP3 discards masked frequencies and applies lossy quantization. FLAC files are 4-5x larger per minute of audio.

.MP3 vs .OGG VORBIS | Hardware compatibility | MP3 wins
MP3 plays on every audio device manufactured since 2000, including car stereos, DAPs, and budget Bluetooth speakers. OGG Vorbis support is limited to software players and some modern hardware.

.MP3 vs .OPUS | Compression efficiency | OPUS wins
Opus outperforms MP3 at all bitrates, achieving transparent quality at 96-128 kbps where MP3 requires 192-256 kbps. Opus also handles voice and music in a single codec with adaptive switching.

Technical reference

MIME Type: audio/mpeg
Magic Bytes: FF FB (frame sync; also FF F3, FF F2). Files with an ID3 tag start with 49 44 33.
Developer: Fraunhofer Society / ISO
Year Introduced: 1993
Open Standard: Yes

Binary Structure

MP3 is a frame-based format with no global header or container index. Files optionally begin with an ID3v2 tag block (magic: 49 44 33 / 'ID3'), followed by a sequence of independent audio frames. Each frame starts with a 4-byte header containing a 12-bit sync word (0xFFF), MPEG version, layer, protection bit, bitrate index, sample rate index, padding, channel mode, and mode extension. Frame payloads contain Huffman-coded MDCT coefficients. An optional Xing/Info header in the first audio frame stores VBR metadata (total frames, total bytes, seek table) for duration calculation and seeking. ID3v1 tags (128 bytes, magic: 54 41 47 / 'TAG') may appear at the file tail.

Offset    | Length    | Field                 | Example          | Description
0x00      | 3 bytes   | ID3v2 Magic           | 49 44 33 (ID3)   | Present only if file has ID3v2 tags. If absent, audio frames begin at byte 0.
0x03      | 1 byte    | ID3v2 Version         | 04 (ID3v2.4)     | Major version: 03 = ID3v2.3, 04 = ID3v2.4. ID3v2.4 adds native UTF-8 support.
0x06      | 4 bytes   | ID3v2 Tag Size        | Syncsafe integer | Tag body size in syncsafe encoding (7 bits per byte). Excludes the 10-byte header itself.
after ID3 | 2 bytes   | Frame Sync Word       | FF FB            | FF FB = MPEG-1 Layer III, no CRC. FF FA = with CRC. FF F3 = MPEG-2 Layer III.
sync+2    | 1 byte    | Bitrate / Sample Rate | 90               | Upper 4 bits = bitrate index, next 2 bits = sample rate index, then padding and private bits.
EOF-128   | 128 bytes | ID3v1 Tag             | 54 41 47 (TAG)   | Optional legacy metadata block. Fixed Latin-1 encoding, 30-char fields. Deprecated.
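
The byte at sync+2 can be unpacked with a few shifts and two lookup tables. The sketch below assumes MPEG-1 Layer III tables only (other versions and layers use different values); the function name is illustrative.

```python
# Index 0 means "free format" and index 15 is invalid; both map to None here.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]  # index 3 is reserved


def decode_rate_byte(byte):
    """Decode the third header byte of an MPEG-1 Layer III frame."""
    return {
        "bitrate_kbps": BITRATES_KBPS[(byte >> 4) & 0xF],      # upper 4 bits
        "sample_rate_hz": SAMPLE_RATES_HZ[(byte >> 2) & 0x3],  # next 2 bits
        "padding": bool((byte >> 1) & 0x1),                    # 1 extra byte in frame
        "private": bool(byte & 0x1),
    }


if __name__ == "__main__":
    print(decode_rate_byte(0x90))  # the example value from the table
```

For the example value 90, the bitrate index is 9 and the sample rate index is 0, which corresponds to 128 kbps at 44.1 kHz with no padding.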

1987: Fraunhofer IIS begins EUREKA project EU147; psychoacoustic compression research starts
1991: MPEG-1 Audio Layer III codec finalized by ISO/IEC MPEG working group
1993: ISO/IEC 11172-3 published; MP3 officially standardized as part of MPEG-1 Audio
1997: Winamp 1.0 released, triggering mass consumer adoption of MP3 for digital music
1999: Napster launches peer-to-peer MP3 sharing; reaches 80 million users by 2001
2001: Apple iPod launches with native MP3 support; iTunes establishes MP3 import pipeline
2017: Fraunhofer IIS terminates MP3 patent licensing; all key patents expired globally

Convert MP3 to WAV (PCM uncompressed)
ffmpeg -i input.mp3 -c:a pcm_s16le -ar 44100 output.wav

-c:a pcm_s16le selects signed 16-bit little-endian PCM. -ar 44100 resamples to CD-standard 44.1 kHz. Omit -ar to preserve the source sample rate.

Convert MP3 to AAC at 192 kbps
ffmpeg -i input.mp3 -c:a aac -b:a 192k -movflags +faststart output.m4a

-c:a aac uses FFmpeg's built-in AAC encoder. -b:a 192k sets target bitrate. -movflags +faststart moves the moov atom to the file start for streaming.

Convert MP3 to high-quality OGG Vorbis
ffmpeg -i input.mp3 -c:a libvorbis -q:a 6 output.ogg

-c:a libvorbis selects the Vorbis encoder. -q:a 6 sets VBR quality level 6, targeting approximately 192 kbps with dynamic bitrate allocation per frame.

Convert MP3 to FLAC (lossless archive)
ffmpeg -i input.mp3 -c:a flac -compression_level 8 output.flac

-c:a flac selects the FLAC encoder. -compression_level 8 uses high compression (range 0-12); level 8 balances size reduction with encoding speed.

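Read MP3 frame headers and ID3 tags (Python)
import sys

MPEG_VERSIONS = ["2.5", "?", "2", "1"]  # 2-bit version field; 01 is reserved
LAYERS = ["?", "III", "II", "I"]        # 2-bit layer field; 00 is reserved


def read_id3v2(f):
    """Skip an ID3v2 tag at the file start; return its body size, or None."""
    header = f.read(10)
    if len(header) < 10 or header[:3] != b'ID3':
        f.seek(0)  # no tag: audio frames begin at byte 0
        return None
    major, flags = header[3], header[5]
    # Syncsafe integer: only the low 7 bits of each byte carry data
    size = 0
    for b in header[6:10]:
        size = (size << 7) | (b & 0x7F)
    print(f'ID3v2.{major} tag: {size} bytes')
    # ID3v2.4 may append a 10-byte footer that the size field does not count
    footer = 10 if (major == 4 and flags & 0x10) else 0
    f.seek(size + footer, 1)  # skip past the ID3 block
    return size


def read_mp3_frame(f):
    """Parse the 4-byte frame header at the current file position."""
    header = f.read(4)
    # Frame sync: 0xFF followed by a byte whose top 3 bits are set
    if len(header) < 4 or header[0] != 0xFF or (header[1] & 0xE0) != 0xE0:
        return False
    mpeg_ver = (header[1] >> 3) & 0x3
    layer = (header[1] >> 1) & 0x3
    bitrate_idx = (header[2] >> 4) & 0xF
    print(f'MP3 frame: MPEG{MPEG_VERSIONS[mpeg_ver]} Layer {LAYERS[layer]} '
          f'bitrate_index={bitrate_idx} sync_word=0x{header[:2].hex().upper()}')
    return True


if __name__ == '__main__':
    with open(sys.argv[1], 'rb') as f:
        read_id3v2(f)
        if not read_mp3_frame(f):
            print('No MPEG frame sync found at the expected position')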

Reads the 10-byte ID3v2 header using syncsafe integer decoding, then locates the first audio frame by checking for the 0xFF sync byte and parsing MPEG version and layer from the second byte's bitfields.

MP3 to WAV (transcode, lossless): DAWs and audio editors (Pro Tools, Logic Pro, Audacity) require uncompressed PCM input. Decoding MP3 to WAV produces a lossless representation of the decoded audio with no further quality loss during conversion.
MP3 to FLAC (transcode, lossless): FLAC perfectly preserves the decoded MP3 audio with zero additional loss, making it the preferred archival container for existing MP3 collections where the original lossless source is unavailable.
MP3 to OGG (transcode, lossy): Vorbis in OGG is fully patent-free with no licensing restrictions. At equivalent perceptual quality, Vorbis achieves 15-20% smaller files than MP3 at bitrates above 128 kbps, which is useful for open-source game audio and web applications.
MP3 to AAC (transcode, lossy): AAC delivers equivalent perceived quality at 70-80% of the MP3 bitrate due to improved psychoacoustic modeling. AAC is the native codec for Apple devices, YouTube, and most streaming platforms.
MP3 to M4A (transcode, lossy): M4A wraps AAC audio in an MPEG-4 container, adding iTunes/Apple Music compatibility, chapter support, and high-resolution embedded artwork, features that raw .aac files lack.
Risk level: LOW

Attack Vectors

  • ID3 tag buffer overflow
  • MP3 frame header bitfield exploit
  • Malicious ID3 artwork payload

Mitigation:

Reference MP3 encoder producing the highest-quality VBR and CBR output
FFmpeg tool
Universal media framework with libmp3lame encoding and native MP3 decoding
Cross-platform media player with native MP3 decoding and ID3 tag display
foobar2000 tool
Lightweight Windows audio player with gapless MP3 playback and ReplayGain
Audacity tool
Free, open-source audio editor supporting MP3 import/export via LAME
ID3.org spec
Official specification for ID3v1 and ID3v2 metadata tags in MP3 files