I haven't quite figured out what's happening here, so please excuse the vague title, but the output of KotlinLexer.get_tokendefs() certainly seems to be broken.
Reproducible script.
Run the following code (for example with uv run main.py, or however you manage your dependencies):
# /// script
# dependencies = [
# "pygments==2.19.2",
# ]
# ///
from pygments.lexers.jvm import KotlinLexer
print(KotlinLexer.get_tokendefs())
It prints what looks like binary data (a screenshot is attached because GitHub won't accept the invalid UTF-8 bytes as text):
As you can see, everything looks fine up to the highlighted Token.Literal.Number, but what follows looks like random bytes.
The corresponding line in the KotlinLexer would be the parsing of identifiers:
# Identifiers
(r'' + kt_id + r'((\?[^.])?)', Name) # additionally handle nullable types
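To narrow down which rules carry the odd-looking bytes, something like the sketch below might help. It assumes the output of get_tokendefs() is a dict mapping state names to lists of rules, where ordinary rules are tuples whose first element is the regex string. The helper name find_non_ascii_rules and the sample dict are made up for illustration; they are not part of Pygments. It also rests on a guess I have not confirmed: that the strange output is literal non-ASCII characters inside the pattern rather than actual corruption.

```python
def find_non_ascii_rules(tokendefs):
    """Return (state, rule_index, non_ascii_count, escaped_pattern) tuples.

    Hypothetical helper: assumes tokendefs maps state names to lists of rules.
    """
    hits = []
    for state, rules in tokendefs.items():
        for index, rule in enumerate(rules):
            # Only tuples whose first element is a string pattern are checked;
            # any other kind of rule entry is skipped.
            if not (isinstance(rule, tuple) and rule and isinstance(rule[0], str)):
                continue
            pattern = rule[0]
            count = sum(1 for ch in pattern if ord(ch) > 127)
            if count:
                # Escape the pattern so it can be printed or pasted as plain ASCII.
                escaped = pattern.encode("unicode_escape").decode("ascii")
                hits.append((state, index, count, escaped))
    return hits


# Made-up stand-in for real lexer output, so the sketch runs without Pygments.
sample = {
    "root": [
        (r"[a-z]+", "Name"),
        ("[\u00c0-\u00d6]+", "Name"),
        "not-a-tuple",
    ],
}

for hit in find_non_ascii_rules(sample):
    print(hit)
```

With Pygments installed, passing KotlinLexer.get_tokendefs() instead of the sample dict should list the affected states and rule positions in a form that can be pasted here as text.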