Fix Haskell lexer: handle escape sequences in character literals by mvanhorn · Pull Request #3069 · pygments/pygments

mvanhorn · 2026-03-26T07:43:46Z

Summary

Fixes incorrect tokenization of Haskell escape character literals such as '\n', '\t', and '\\'.

Why this matters

The root-state regex '[^\\]' only matched character literals containing a single non-backslash character. Escape sequences like '\n' were therefore split across two tokens: '\ became Keyword.Type and n' became Name, producing wrong highlighting.
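To make the failure concrete, here is a minimal check using only Python's standard `re` module rather than Pygments itself. The `samples` list and the `results` dictionary are illustrative names, not part of the PR.

```python
import re

# The original root-state rule for character literals, as quoted above:
# a quote, exactly one non-backslash character, and a closing quote.
SIMPLE_CHAR = re.compile(r"'[^\\]'")

# Haskell source fragments (raw strings, so the backslash is kept literally).
samples = ["'a'", "'A'", r"'\n'", r"'\t'", r"'\\'"]

# Record whether the original rule matches each fragment from its first character.
results = {source: SIMPLE_CHAR.match(source) is not None for source in samples}

for source, matched in results.items():
    print(f"{source:6} matched={matched}")
```

Only 'a' and 'A' match. Every escape literal fails at the backslash, so the lexer falls through to later rules.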

Changes

pygments/lexers/haskell.py line 57: Added pattern '\\.' to match escape character literals. Placed after the existing non-escape pattern so simple chars like 'a' still match first.
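The effect of the added rule can be sketched with a toy first-match tokenizer. This is a simplified stand-in under stated assumptions, not the actual `HaskellLexer`: the rule list is heavily reduced, the operator and identifier patterns are approximations, token types are plain strings, and `tokenize`, `RULES`, and `rules_before` are hypothetical names.

```python
import re

# Reduced, approximate stand-ins for a few root-state rules (assumption:
# the real lexer has many more rules and uses Pygments token types).
RULES = [
    (re.compile(r"'[^\\]'"), "String.Char"),                     # existing rule: 'a'
    (re.compile(r"'\\.'"), "String.Char"),                       # added rule: '\n', '\t', '\\'
    (re.compile(r"'[:!#$%&*+.\\/<=>?@^|~-]+"), "Keyword.Type"),  # quote followed by operator chars
    (re.compile(r"'?[_a-z][\w']*"), "Name"),                     # lowercase identifier
    (re.compile(r"\s+"), "Whitespace"),
]


def tokenize(text, rules=RULES):
    """Scan text left to right, emitting the first rule that matches at each position."""
    pos, tokens = 0, []
    while pos < len(text):
        for pattern, kind in rules:
            match = pattern.match(text, pos)
            if match:
                tokens.append((kind, match.group()))
                pos = match.end()
                break
        else:
            # No rule matched: emit one character as an error token.
            tokens.append(("Error", text[pos]))
            pos += 1
    return tokens


# Rule list without the added escape rule, to reproduce the old behaviour.
rules_before = RULES[:1] + RULES[2:]

before = tokenize(r"'\n'", rules_before)
after = tokenize(r"'\n'")
print(before)  # [('Keyword.Type', "'\\"), ('Name', "n'")]
print(after)   # [('String.Char', "'\\n'")]
```

In this sketch the two character rules can never match the same input, so their relative order does not change the result. What matters is that both are tried before the rules that would otherwise consume the opening quote.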

Testing

Verified all common cases tokenize as Token.Literal.String.Char:

'\n' (newline), '\t' (tab), '\\' (backslash), 'a' (simple), 'A' (uppercase)

Fixes #1795

This contribution was developed with AI assistance (Claude Code).

The root-state pattern for character literals only matched single non-backslash characters like 'a'. Escape sequences like '\n', '\t', '\\' were incorrectly tokenized as Keyword.Type + Name fragments. Added a pattern for '\.' (backslash + any char) to match escape character literals, placed after the existing non-escape pattern.

Fixes pygments#1795

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

birkenfeld · 2026-03-26T19:24:33Z

Thanks for the PR, can you add a test case?

mvanhorn · 2026-03-26T22:43:47Z

Added in 0088b19 - snippet test covering '\n', '\t', '\\', and 'a' all tokenizing as Literal.String.Char.

birkenfeld · 2026-03-27T11:00:29Z

Looks like outputs for the existing tests need to be adjusted as well.

mvanhorn · 2026-03-27T18:54:12Z

Regenerated the golden outputs for example.hs and Sudoku.lhs in 35edbc8. All Haskell tests passing locally.

birkenfeld · 2026-03-28T06:15:34Z

LGTM now, thanks!

mvanhorn · 2026-03-30T16:38:12Z

Thanks for the merge!

test: add snippet test for escape character literals

0088b19

Covers '\n', '\t', '\\', and 'a' tokenization as Literal.String.Char.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: regenerate Haskell and LHS golden outputs

35edbc8

Update expected token outputs for example.hs and Sudoku.lhs to reflect the new escape character literal tokenization.

birkenfeld merged commit e3a3c54 into pygments:master Mar 28, 2026
15 checks passed

Anteru added this to the 2.20.0 milestone Mar 29, 2026

Anteru added the "A-lexing" label (area: changes to individual lexers) Mar 29, 2026

birkenfeld merged 3 commits into pygments:master from mvanhorn:osc/1795-fix-haskell-char-escape
