Internal structure of an ID3 v2.3 MP3 File

These diagrams are my attempt to visualise the internal structure of an mp3 file using ID3 version 2.3 tags. They are based on what I have read on the Internet. I confess that I did not always fully understand what I was reading, so should you find that I have anything wrong, please contact me and I will happily correct it.

To fully understand what is going on you may need to do some homework. You will need to know about these topics. (Or you can just use the dll without worrying too much about what is happening inside.)

  • Little-endian and big-endian numbers
  • Synchsafe integers
  • Text encoding and BOMs (byte order marks)
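For readers who prefer code to definitions, here is a minimal Python sketch of the three topics above. The helper name decode_synchsafe and the sample byte values are made up for illustration. As far as I understand the v2.3 specification, only the tag size in the main header is stored as a synchsafe integer, while frame sizes are ordinary big-endian numbers.

```python
import struct


def decode_synchsafe(data: bytes) -> int:
    """Decode a 4-byte synchsafe integer (7 useful bits per byte, top bit 0)."""
    if len(data) != 4 or any(b & 0x80 for b in data):
        raise ValueError("not a 4-byte synchsafe integer")
    value = 0
    for b in data:
        value = (value << 7) | b
    return value


# The same four bytes read as big-endian and as little-endian.
raw = bytes([0x00, 0x00, 0x01, 0x00])
big = struct.unpack(">I", raw)[0]      # 256
little = struct.unpack("<I", raw)[0]   # 65536

# A synchsafe value: 2 * 128 + 1 = 257.
synch = decode_synchsafe(bytes([0x00, 0x00, 0x02, 0x01]))

# UTF-16 text with a byte order mark (FF FE means little-endian).
# Python's "utf-16" codec reads the BOM and removes it from the result.
text = (b"\xff\xfe" + "Prince".encode("utf-16-le")).decode("utf-16")
```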

Overall Structure

This diagram shows the overall structure of an MP3 file using ID3 version 2.3 metadata. In this context 'Tag' refers to the block of the file containing all of the v2.3 metadata. Beware: this is potentially confusing, as normally 'tag' refers to a single item of metadata, for example the artist's name. But not here.

Diagram of overall structure of MP3 file goes here.
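As a rough illustration of this layout, the sketch below splits a file's bytes into the Tag and everything that follows it. It assumes the Tag sits at the very start of the file and ignores anything that may be appended at the end; split_mp3 is a hypothetical name, not a function from any library.

```python
def split_mp3(data: bytes):
    """Return (tag_bytes, remaining_bytes) for a file that starts with an ID3v2 Tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return b"", data               # no Tag found at the start
    size = 0
    for b in data[6:10]:               # synchsafe size, 7 bits per byte
        size = (size << 7) | (b & 0x7F)
    end = 10 + size                    # the size excludes the 10-byte header
    return data[:end], data[end:]


# Tiny fake file: header declaring a 4-byte Tag body, then some "audio" bytes.
sample = b"ID3\x03\x00\x00\x00\x00\x00\x04" + b"\x00" * 4 + b"\xff\xfbAUDIO"
tag, audio = split_mp3(sample)
```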

Structure of the Header Information Block

There is the main header, and, optionally, an extended header. (None of the files in my collection had an extended header.)

Diagram of header information block goes here.
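The main header can be read with a few lines of Python. This is a sketch based on my reading of the v2.3 specification: ten bytes made up of the marker 'ID3', two version bytes, one flags byte and a four-byte synchsafe size. The function name and the dictionary keys are my own choices.

```python
def parse_id3v2_header(data: bytes) -> dict:
    """Parse the 10-byte main header at the start of an ID3v2 Tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        raise ValueError("no ID3v2 header found")
    major, revision, flags = data[3], data[4], data[5]
    size_bytes = data[6:10]
    if any(b & 0x80 for b in size_bytes):
        raise ValueError("size is not a synchsafe integer")
    size = 0
    for b in size_bytes:
        size = (size << 7) | b
    return {
        "version": (major, revision),            # (3, 0) for ID3 v2.3
        "unsynchronisation": bool(flags & 0x80),
        "extended_header": bool(flags & 0x40),   # optional extended header follows
        "experimental": bool(flags & 0x20),
        "tag_size": size,                        # bytes after this 10-byte header
    }


header = parse_id3v2_header(b"ID3\x03\x00\x40\x00\x00\x02\x01")
```

In this made-up example the flags byte 0x40 says an extended header is present, and the size bytes decode to 257.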

Structure of the Tag

Remember that in an mp3 file the tag is a block of the file holding all the metadata, i.e. all of the things we commonly refer to as tags: artist, title, etc.

The tag is made up of frames, plus, optionally, padding. We are most interested in text frames, as these hold the information about our music, one frame per item. There are other types of frame, most of which we can ignore.

An MP3 Tag.
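Walking through the frames might look something like the sketch below. It assumes there is no extended header, that each frame starts with a ten-byte header (four-byte ID, four-byte big-endian size, two flag bytes), and that padding is made up of zero bytes. iter_frames is a hypothetical helper name.

```python
import struct


def iter_frames(tag_body: bytes):
    """Yield (frame_id, frame_body) pairs until padding or the end of the Tag."""
    pos = 0
    while pos + 10 <= len(tag_body):
        frame_id = tag_body[pos:pos + 4]
        if frame_id[0] == 0:
            break                                  # zero byte: padding has started
        size = struct.unpack(">I", tag_body[pos + 4:pos + 8])[0]
        start = pos + 10                           # skip ID, size and two flag bytes
        end = start + size
        if end > len(tag_body):
            break                                  # damaged or truncated frame
        yield frame_id.decode("latin-1"), tag_body[start:end]
        pos = end


# One text frame followed by 20 bytes of padding.
body = b"TIT2" + struct.pack(">I", 7) + b"\x00\x00" + b"\x00Prince" + b"\x00" * 20
frames = list(iter_frames(body))
```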

Structure of a v2.3 ID3 MP3 Frame

A frame holds a single piece of information about the file.

An MP3 Frame.
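The frame header itself can be unpacked as below. This is a sketch from my reading of the v2.3 specification, where the size is a plain big-endian number (not synchsafe) and the two flag bytes carry three status flags and three format flags. The function name and dictionary keys are assumptions of mine.

```python
import struct


def parse_frame_header(data: bytes) -> dict:
    """Parse the 10-byte header at the start of a v2.3 frame."""
    if len(data) < 10:
        raise ValueError("frame header needs 10 bytes")
    frame_id, size, status, fmt = struct.unpack(">4sIBB", data[:10])
    return {
        "id": frame_id.decode("latin-1"),   # e.g. 'TPE1' or 'COMM'
        "size": size,                       # length of the body, header excluded
        "tag_alter_discard": bool(status & 0x80),
        "file_alter_discard": bool(status & 0x40),
        "read_only": bool(status & 0x20),
        "compressed": bool(fmt & 0x80),
        "encrypted": bool(fmt & 0x40),
        "grouped": bool(fmt & 0x20),
    }


hdr = parse_frame_header(b"TPE1" + b"\x00\x00\x01\x00" + b"\x00\x80")
```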

Structure of a Text Frame

These are the frames that hold the textual data describing the track: artist='Prince', track='Purple Rain' etc.

Diagram of a text frame goes here.
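Decoding the body of a text frame could be sketched like this. It assumes the first byte of the body names the encoding (0 for ISO-8859-1, 1 for UTF-16 with a BOM) and that anything after a terminating null character can be dropped. decode_text_frame is a hypothetical name.

```python
def decode_text_frame(body: bytes) -> str:
    """Decode the body of a v2.3 text frame (everything after the frame header)."""
    if not body:
        return ""
    encoding, raw = body[0], body[1:]
    if encoding == 0:
        text = raw.decode("latin-1")       # ISO-8859-1, one byte per character
    elif encoding == 1:
        text = raw.decode("utf-16")        # the BOM selects the byte order
    else:
        raise ValueError(f"unexpected encoding byte {encoding}")
    return text.split("\x00", 1)[0]        # ignore a terminator and anything after it


artist = decode_text_frame(b"\x00Prince")
title = decode_text_frame(
    b"\x01\xff\xfe" + "Purple Rain".encode("utf-16-le") + b"\x00\x00"
)
```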

Structure of a COMM Frame

We may also be interested in COMMENT frames. These may hold proprietary binary data, for example added by iTunes, or simple textual comments. I have chosen to process textual comments, whilst ignoring binary comments.

I could not find an online explanation of the internal format of these that I was fully able to understand and that corresponded to what I saw in the handful of test files that I examined. The diagram below is my best guess, but take it with a 'pinch of salt'.

Diagram of a COMM frame goes here.
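In the same hedged spirit, here is a sketch of how a textual COMM body might be decoded. It assumes the layout given in the v2.3 specification: one encoding byte, a three-letter language code, a short description ended by a null terminator (one zero byte for ISO-8859-1, two for UTF-16), then the comment text. It has only been tried on the made-up bytes shown, so treat it with the same pinch of salt. decode_comm_frame is a hypothetical name.

```python
def decode_comm_frame(body: bytes):
    """Return (language, description, text) from the body of a COMM frame."""
    if len(body) < 4:
        raise ValueError("COMM body too short")
    encoding = body[0]
    language = body[1:4].decode("latin-1")
    rest = body[4:]
    if encoding == 0:
        codec = "latin-1"
        desc_raw, _, text_raw = rest.partition(b"\x00")
    elif encoding == 1:
        codec = "utf-16"
        end = len(rest)
        # Look for the two-byte terminator on a character boundary.
        for i in range(0, len(rest) - 1, 2):
            if rest[i:i + 2] == b"\x00\x00":
                end = i
                break
        desc_raw, text_raw = rest[:end], rest[end + 2:]
    else:
        raise ValueError(f"unexpected encoding byte {encoding}")
    description = desc_raw.decode(codec)
    text = text_raw.decode(codec).split("\x00", 1)[0]
    return language, description, text


plain = decode_comm_frame(b"\x00eng\x00Nice track")
wide = decode_comm_frame(
    b"\x01eng"
    + b"\xff\xfe" + "note".encode("utf-16-le") + b"\x00\x00"
    + b"\xff\xfe" + "Hi".encode("utf-16-le")
)
```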

I hope these diagrams help.

If you know more than I do, and are aware that any of these diagrams is incorrect, please contact me and I will amend it.