@masakielastic
Contributor

mbstring miscalculates the code point when converting a string from GB18030 to UTF-32. This pull request adds the missing parentheses.

// http://icu-project.org/docs/papers/gb18030.html#h7
// uFirst = 0x10000;
// bFirst = [0x90, 0x30, 0x81, 0x30];

int linear(byte bytes[4]) {
    return ((bytes[0]*10+bytes[1])*126+bytes[2])*10+bytes[3];
}

int mapToUnicode(byte bytes[4]) {
    int lin=linear(bytes);
    for each range {
        if(linear(bFirst)<=lin && lin<=linear(bLast)) {
            // range found
            return uFirst+(lin-linear(bFirst));
        }
    }
    // the byte sequence is not in any known range
    return error;
}
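
To make the role of the parentheses concrete, here is a minimal Python sketch of the pseudocode above, restricted to the single supplementary range that starts at `uFirst = 0x10000`. The names `map_to_unicode` and `linear_missing_parens` are illustrative only, and `linear_missing_parens` is a hypothetical mistake chosen to show the effect of dropping one pair of parentheses; it is not claimed to be the exact expression that was wrong in mbstring. The end of the range, `E3 32 9A 35` for U+10FFFF, is assumed here and is not stated in the snippet above.

```python
# Sketch of the ICU-style linear mapping for GB18030 four-byte sequences.
# Only the supplementary range (U+10000..U+10FFFF) is handled.
U_FIRST = 0x10000
B_FIRST = bytes([0x90, 0x30, 0x81, 0x30])
B_LAST = bytes([0xE3, 0x32, 0x9A, 0x35])  # assumed end of range: U+10FFFF


def linear(b):
    """Linearize four bytes; every multiplication applies to the whole prefix."""
    return ((b[0] * 10 + b[1]) * 126 + b[2]) * 10 + b[3]


def linear_missing_parens(b):
    """Hypothetical bug: the first pair of parentheses is dropped,
    so only b[1] is multiplied by 126 instead of (b[0] * 10 + b[1])."""
    return (b[0] * 10 + b[1] * 126 + b[2]) * 10 + b[3]


def map_to_unicode(b):
    """Map a four-byte GB18030 sequence in the supplementary range to a code point."""
    lin = linear(b)
    if linear(B_FIRST) <= lin <= linear(B_LAST):
        return U_FIRST + (lin - linear(B_FIRST))
    raise ValueError("byte sequence is not in the supported range")


if __name__ == "__main__":
    sample = bytes([0x95, 0x32, 0x82, 0x36])
    # Cross-check against Python's built-in gb18030 codec.
    assert map_to_unicode(sample) == ord(sample.decode("gb18030"))
    good = linear(sample) - linear(B_FIRST)
    bad = linear_missing_parens(sample) - linear_missing_parens(B_FIRST)
    print(hex(U_FIRST + good), hex(U_FIRST + bad))
```

Because the result is computed as an offset from `linear(bFirst)`, any slip in operator grouping changes the weight of the lead bytes and shifts every code point in the range, which is why a mistake of this kind shows up only for four-byte sequences.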

@jpauli added the Bug label Feb 19, 2015
@smalyshev
Contributor

merged

@smalyshev closed this Mar 9, 2015