[core] fix jadx.plugins.input.java.data.ConstPoolReader.parseString()…#2654

Merged
skylot merged 2 commits into skylot:master from
wech71:fix_for_ConstPoolReader.parseString_EncodedStrings_jvm4.4.7
Oct 11, 2025

Conversation

Contributor

wech71 commented Oct 11, 2025

Description

Fixes the "TODO: parse modified UTF-8" in ConstPoolReader.parseString so that it follows the jvms-4.4.7 rules for encoding annotation strings in class files.

(This allows the Kotlin annotation plugin to work with the Protobuf byte array that is encoded as a String in the Kotlin Metadata of Kotlin 2.0 or newer.)

(I apologize if this is incorrect now as you said, I should not merge to the master branch, but I could not find or push to a new branch in skylot/jadx. I do not have much experience with GitHub.)

wech71 and others added 2 commits October 7, 2025 10:27
… error with Kotlin Annotation of byte array as a String to follow jvms-4.4.7 rules for encoding annotation strings in class files
Owner

skylot commented Oct 11, 2025

@wech71 thanks 👍
I moved the method into another class to simplify testing and fixed the code style with the autoformatter.
However, I was not able to trigger the branch that decodes a 6-byte char; it looks like my sample chars were split into two 3-byte chars. It is not urgent, so we may add a test for that case later.
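
For a later test of the 6-byte case, a sample input can be produced with the JDK itself. The sketch below assumes that `DataOutputStream.writeUTF` is an acceptable source of modified UTF-8 bytes (it writes a 2-byte length prefix, which is stripped here); the class name `SixByteSample` and the helper `modifiedUtf8` are hypothetical and not part of jadx.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

class SixByteSample {
    /** Modified UTF-8 bytes of s, without the 2-byte length prefix added by writeUTF. */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        // U+1F600 lies above U+FFFF, so it is stored as a surrogate pair.
        String s = new String(Character.toChars(0x1F600));
        StringBuilder hex = new StringBuilder();
        for (byte b : modifiedUtf8(s)) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim()); // prints "ED A0 BD ED B8 80"
    }
}
```

Decoding each 3-byte half on its own yields the two UTF-16 surrogates, which a Java String combines into the same code point, so a test that only compares the resulting String may pass through either branch. A test for the 6-byte branch would probably need to check which branch was taken, not just the output.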

Owner

skylot commented Oct 11, 2025

I apologize if this is incorrect now as you said, I should not merge to the master branch

You did everything right. I was trying to say that the branch in your fork shouldn't be master, because on GitHub the master branch is write-protected, and I would not be able to add fix commits like I did here.

skylot merged commit 0f495af into skylot:master Oct 11, 2025
4 checks passed
Contributor Author

wech71 commented Oct 12, 2025

thanks for accepting my changes :-)

looks like my sample chars were split into two 3-byte chars.

Hm, I just had another look at the spec, and it seems I made a small mistake. The check for the 6-byte encoding according to 4.12 must be done before the check for the 3-byte encoding 4.8, because both use the same upper nibble in the first byte and the same top two bits in bytes 2 and 3.

As a result, for a 6-byte sequence the code currently always enters the 3-byte branch, which is incorrect.

Either the order of the checks for the 6-byte and 3-byte encodings has to be changed, or an easy fix would be to change the line

if( (x & 0xF0) == 0xE0 && (y & 0xC0) == 0x80 && (z & 0xC0) == 0x80) {

to

if( (x & 0xF0) == 0xE0 && (y & 0xC0) == 0x80 && (z & 0xC0) == 0x80
     && (x != 0xED || (y & 0xF0) != 0xA0) ) {

so that the 3-byte special case falls through to the 6-byte check.
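
The other option, checking the 6-byte form before the 3-byte form, could look roughly like the sketch below. This is not the merged jadx code: the class `ModifiedUtf8Sketch` and its helpers are hypothetical, and the code point formula is the one given in jvms-4.4.7 for the two-times-three-byte form.

```java
class ModifiedUtf8Sketch {
    /** True if b[i] exists and is a continuation byte (10xxxxxx). */
    static boolean isCont(byte[] b, int i) {
        return i < b.length && (b[i] & 0xC0) == 0x80;
    }

    /** True if a 6-byte form (0xED 1010xxxx 10xxxxxx 0xED 1011xxxx 10xxxxxx) starts at i. */
    static boolean isSixByteForm(byte[] b, int i) {
        return i + 5 < b.length
                && (b[i] & 0xFF) == 0xED && (b[i + 1] & 0xF0) == 0xA0 && isCont(b, i + 2)
                && (b[i + 3] & 0xFF) == 0xED && (b[i + 4] & 0xF0) == 0xB0 && isCont(b, i + 5);
    }

    static String decode(byte[] b) {
        StringBuilder sb = new StringBuilder(b.length);
        int i = 0;
        while (i < b.length) {
            int x = b[i] & 0xFF; // mask so comparisons such as x == 0xED work
            if ((x & 0x80) == 0) {
                // 1-byte form: 0xxxxxxx
                sb.append((char) x);
                i += 1;
            } else if ((x & 0xE0) == 0xC0 && isCont(b, i + 1)) {
                // 2-byte form, including 0xC0 0x80 for the null char
                sb.append((char) (((x & 0x1F) << 6) | (b[i + 1] & 0x3F)));
                i += 2;
            } else if (isSixByteForm(b, i)) {
                // 6-byte form, tested BEFORE the 3-byte form because its
                // first three bytes also match the 3-byte pattern
                int v = b[i + 1], w = b[i + 2], y = b[i + 4], z = b[i + 5];
                int cp = 0x10000 + ((v & 0x0F) << 16) + ((w & 0x3F) << 10)
                        + ((y & 0x0F) << 6) + (z & 0x3F);
                sb.appendCodePoint(cp);
                i += 6;
            } else if ((x & 0xF0) == 0xE0 && isCont(b, i + 1) && isCont(b, i + 2)) {
                // 3-byte form: 1110xxxx 10xxxxxx 10xxxxxx
                sb.append((char) (((x & 0x0F) << 12) | ((b[i + 1] & 0x3F) << 6) | (b[i + 2] & 0x3F)));
                i += 3;
            } else {
                throw new IllegalArgumentException("malformed modified UTF-8 at offset " + i);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] six = {(byte) 0xED, (byte) 0xA0, (byte) 0xBD, (byte) 0xED, (byte) 0xB8, (byte) 0x80};
        System.out.println(Integer.toHexString(decode(six).codePointAt(0))); // prints "1f600"
    }
}
```

One thing to watch with the one-line fix as well: if `x` is a raw `byte` in the real code, `x != 0xED` is always true in Java because a signed byte never equals 237, so the value would have to be masked with `& 0xFF` first, as done for `x` in this sketch.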
