gmail get: ISO-2022-JP encoded emails show as garbled text (U+FFFD) #126

@tomochang

Description

gog gmail get <messageId> produces garbled output (U+FFFD replacement characters) for emails encoded in ISO-2022-JP. This is a common encoding for Japanese business emails.

Reproduction

# Any email with Content-Type: text/plain; charset=iso-2022-jp
gog gmail get <messageId>

Expected: Readable Japanese text
Actual: 水野様���������

Analysis

  • The raw email (via --format raw) contains correct base64-encoded data
  • Python's email library decodes it correctly when using the charset from the MIME headers
  • The issue is that gogcli's body extraction doesn't handle charset conversion from ISO-2022-JP to UTF-8
  • JSON output (--json) has the same problem — the garbled text is already in the body field
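
The second point above can be checked without a real mailbox. The following is a minimal sketch, not gogcli's implementation: it builds a synthetic ISO-2022-JP message (the sample text and the helper name `extract_text_plain` are assumptions made for illustration), wraps it in base64url as the raw field is assumed to be, and decodes the body using the charset declared in the MIME headers.

```python
import base64
import email

def extract_text_plain(raw_bytes: bytes) -> str:
    """Hypothetical helper: return the first text/plain part, decoded
    with the charset named in its Content-Type header."""
    msg = email.message_from_bytes(raw_bytes)
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
            charset = part.get_content_charset() or "utf-8"  # fallback is an assumption
            return payload.decode(charset, errors="replace")
    return ""

# Synthetic message standing in for a real ISO-2022-JP email.
original = "水野様 お世話になっております。"
body_b64 = base64.b64encode(original.encode("iso-2022-jp"))
raw_message = (
    b"Content-Type: text/plain; charset=iso-2022-jp\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n" + body_b64 + b"\r\n"
)

# Simulate the base64url wrapping of the raw field, then reverse it.
raw_b64 = base64.urlsafe_b64encode(raw_message).decode("ascii")
decoded = extract_text_plain(base64.urlsafe_b64decode(raw_b64))
print(decoded)
```

If the charset step is skipped, the same payload bytes are only ESC sequences and 7-bit data, so a readable result depends on honoring the declared charset.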

Workaround

Fetch with --format raw --json, then decode the message.raw field externally:

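import base64, email

# raw_b64 is the base64url string taken from the message.raw field of the JSON output
raw_bytes = base64.urlsafe_b64decode(raw_b64)
msg = email.message_from_bytes(raw_bytes)
for part in msg.walk():
    if part.get_content_type() == 'text/plain':
        # Use the charset declared in the MIME headers (e.g. iso-2022-jp)
        charset = part.get_content_charset() or 'utf-8'
        body = part.get_payload(decode=True).decode(charset, errors='replace')
        print(body)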

Environment

  • gogcli: v0.9.0 (99d9575 2026-01-22)
  • macOS (Apple Silicon)
  • Email source: Gmail via Google Workspace
