UTF-16 surrogate pairs not supported in MP3 ID3 tags #396

@clem-dickey

Exiftool does not properly handle surrogate pairs in MP3 ID3 tag values.

ID3.pm asks Exiftool to decode 2-byte Unicode values as 'UCS2', but Charset.pm recognizes surrogate pairs only in Unicode values labeled 'UTF16'. ID3v2.2 added support for 2-byte Unicode and calls out Unicode 2.0, which includes surrogate pairs. ID3.pm should therefore ask Exiftool to decode 2-byte Unicode as 'UTF16', which enables surrogate pair handling. (A more aggressive solution would be to change Charset.pm to recognize surrogate pairs in UCS2, but this could break code that expects surrogate code points to be invalid in UCS2, as was true for Unicode 1.0.)
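
To illustrate the difference, here is a minimal Python sketch. It is not Exiftool's Perl code, and the function names `decode_ucs2` and `decode_utf16` are hypothetical. It assumes little-endian input with no byte order mark, and shows how the same four bytes come out as two surrogate code points under a UCS-2 style decode but as one supplementary code point once surrogate pairs are combined.

```python
import struct


def decode_ucs2(data):
    """Treat every 16-bit little-endian unit as one code point (no surrogate handling)."""
    count = len(data) // 2
    return list(struct.unpack("<%dH" % count, data[: count * 2]))


def decode_utf16(data):
    """Same as decode_ucs2, but combine each high/low surrogate pair into one code point."""
    units = decode_ucs2(data)
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Standard UTF-16 pair arithmetic: 10 bits from each surrogate.
            out.append(0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP code point, or an unpaired surrogate passed through unchanged.
            out.append(unit)
            i += 1
    return out


# U+1F3B5 (MUSICAL NOTE) lies outside the BMP, so UTF-16 stores it as a surrogate pair.
sample = "\U0001F3B5".encode("utf-16-le")
print([hex(c) for c in decode_ucs2(sample)])   # ['0xd83c', '0xdfb5']
print([hex(c) for c in decode_utf16(sample)])  # ['0x1f3b5']
```

The first result is what a tag value looks like when the pair is left uncombined; the second is the code point the tag writer presumably intended.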
