add string case conversion functions: lower and upper by luadebug · Pull Request #7235 · xmake-io/xmake

luadebug · 2026-01-16T23:33:33Z

Resolves: #7234

gemini-code-assist · 2026-01-16T23:33:48Z

Summary of Changes

Hello @luadebug, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the string manipulation capabilities of the system by integrating new lower and upper functions. These additions provide a standardized way to change the case of strings, with a focus on internationalization by correctly handling various character encodings, thereby improving the flexibility and global compatibility of string operations.

Highlights

New String Case Conversion Functions: Introduced lower and upper functions for string objects, allowing conversion of strings to their lowercase and uppercase equivalents.
Unicode Support: The new case conversion functions are implemented with robust support for multi-byte character sets, specifically handling UTF-8 input by converting to wide characters (UTF-16 on Windows, UTF-32 on other systems) for conversion, and then back to UTF-8.
New Source File: A new C source file, core/src/xmake/string/case.c, has been added to encapsulate the logic for these string case conversions.
Unit Tests: Comprehensive unit tests have been added in tests/modules/string/test.lua to verify the correctness of the lower and upper functions, including tests for both ASCII and non-ASCII (Cyrillic) characters.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces string.lower() and string.upper() functions for case conversion, with support for Unicode characters. The implementation correctly handles character set conversions.

My review focuses on improving robustness and maintainability. I've identified potential integer overflow vulnerabilities in memory allocation calculations that could lead to security issues with very large input strings. I've also suggested refactoring the main logic to simplify error handling and improve code readability.

The added tests are good and cover both ASCII and non-ASCII strings.

core/src/xmake/string/case.c

luadebug · 2026-01-16T23:35:48Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces lower and upper string case conversion functions, which is a valuable addition. The implementation correctly handles Unicode by converting strings to a wide-character format before processing, and this is validated with tests for both ASCII and non-ASCII character sets. I have provided one suggestion to refactor the core logic in case.c to improve its readability, maintainability, and memory efficiency. Overall, the changes are well-implemented.

core/src/xmake/string/case.c

luadebug · 2026-01-17T03:09:14Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces string.lower() and string.upper() functions for case conversion of strings, with full UTF-8 support. This is achieved by adding the utf8proc library as a new dependency. The implementation is solid, correctly handling UTF-8 characters and even preserving invalid byte sequences. The integration into the build system and the addition of comprehensive test cases are well done. I have one minor suggestion to improve code hygiene in the new implementation file.

core/src/xmake/string/case.c

waruqi · 2026-01-17T04:06:17Z

Please do not include third-party dependencies. xmake strictly controls the inclusion of third-party libraries and may gradually remove some existing third-party library dependencies in the future. tbox has built-in encoding handling and cross-platform support; we should improve tbox as much as possible to improve and implement these features.

luadebug · 2026-01-17T04:09:02Z

Please do not include third-party dependencies. xmake strictly controls the inclusion of third-party libraries and may gradually remove some existing third-party library dependencies in the future. tbox has built-in encoding handling and cross-platform support; we should improve tbox as much as possible to improve and implement these features.

OK. What is a solution for this situation?

waruqi · 2026-01-17T04:28:35Z

Supporting full character set conversion seems like a huge undertaking, potentially increasing the xmake binary by over 1-2MB.

https://github.com/JuliaStrings/utf8proc/blob/e5e799221b45bbb90f5fdc5c69b6b8dfbf017e78/utf8proc_data.c (2.25MB)

We should improve the charset module of tbox to provide tolower/toupper interfaces.

Then, for some system platforms, we can use existing system APIs to support it (e.g., Windows CharUpperBuffW/CharLowerBuffW, Apple NSString uppercaseString/lowercaseString, CFStringLowercase). If no existing system API support is available, we can fall back to our own simplified implementation, only providing conversions for some commonly used language character sets.

This might be quite complex and time-consuming. If you can't implement it in a short time, you can leave it for now. I'll try to support it when I have more time later.

luadebug · 2026-01-17T17:42:56Z

/gemini review

gemini-code-assist

Code Review

This pull request adds string.lower() and string.upper() functions with Unicode support. The implementation is well-designed, with an optimization for ASCII strings and correct handling of Unicode characters via wide-character conversion. The approach to managing locales across different platforms is thorough. However, I've found a potential bug in the setlocale-based implementation where a failure to save the current locale could lead to it not being restored, which could have side effects. I've included a suggestion to make this more robust. The new tests provide good coverage for various character sets.

core/src/xmake/string/case.c

luadebug · 2026-01-17T18:01:20Z

wait for this patch. tboox/tbox#300

then we can use tb_charset_utf8_tolower/tb_charset_utf8_toupper, it will call tb_wcslwr/tb_wcsupr to do it.

It makes some optimizations. Since 90% of the time the strings are ASCII strings, executing the Unicode tolower/toupper completely every time would significantly impact performance, so we don't need to do that.

tb_long_t tb_charset_utf8_tolower(tb_char_t* s, tb_size_t n)
{
    // check
    tb_assert_and_check_return_val(s, -1);

    // try ascii tolower first
    tb_char_t* p = s;
    tb_char_t* e = s + n;
    while (p < e && *p)
    {
        if ((*p) & 0x80) break; 
        *p = tb_tolower(*p);
        p++;
    }
    if (p == e || !*p) return p - s;

    // convert the suffix to wchar_t
    tb_long_t   r = -1;
    tb_size_t   wn = e - p + 1;
    tb_wchar_t  wb[256];
    tb_wchar_t* w = (wn <= 256)? wb : (tb_wchar_t*)tb_malloc(wn * sizeof(tb_wchar_t));
    if (w)
    {
        // to lower
        if (tb_mbstowcs(w, p, wn) != -1)
        {
            tb_wcslwr(w);
            r = tb_wcstombs(p, w, wn);
            if (r != -1) r += (p - s);
        }

        // free it
        if (w != wb) tb_free(w);
    }
    return r;
}

Looks cool, would be nice to have it clean without windows.h or some appleframework?

SirLynix · 2026-01-17T18:21:25Z

I don't know if tb_wchar_t maps to wchar_t but beware on Windows wchar_t is UTF-16 and has surrogate pairs, which means a single wchar_t may be half a character.

waruqi · 2026-01-18T01:45:51Z

I don't know if tb_wchar_t maps to wchar_t but beware on Windows wchar_t is UTF-16 and has surrogate pairs, which means a single wchar_t may be half a character.

Therefore, I would prioritize using the CharUpperW API to implement complete string conversion, as is the case on Apple's platform. A while-loop + towupper is the fallback solution. @luadebug

luadebug · 2026-01-18T01:59:15Z

I don't know if tb_wchar_t maps to wchar_t but beware on Windows wchar_t is UTF-16 and has surrogate pairs, which means a single wchar_t may be half a character.

Therefore, I would prioritize using the CharUpperW API to implement complete string conversion, as is the case on Apple's platform. A while-loop + towupper is the fallback solution. @luadebug

Well sounds good as we have them included at any case.

waruqi · 2026-01-18T03:02:18Z

We also need to update tbox.config.h for each platform to add detection results such as wcsupr and towlower. Usually only towlower and towupper exist.

waruqi · 2026-01-18T03:13:52Z

wait for this patch #7237

waruqi · 2026-01-18T04:03:42Z

rebase dev and try it again.

waruqi · 2026-01-18T04:07:13Z

core/src/xmake/string/case.c

+    tb_size_t i;
+    for (i = 0; i < size; i++) {
+        // Stop at first non-ascii char
+        if ((tb_byte_t)str[i] >= 0x80) break;


we can use tb_charset_utf8_toupper directly https://github.com/tboox/tbox/blob/73028bf9165aece5ec9863124fda67500a005e24/src/tbox/charset/utf8.c#L258

waruqi · 2026-01-18T04:09:18Z

core/src/xmake/string/case.c

+/* //////////////////////////////////////////////////////////////////////////////////////
+ * implementation
+ */
+tb_int_t xm_string_lower(lua_State* lua) {


please move to lower.c and upper.c file

waruqi · 2026-01-18T04:24:40Z

like this

/* string.upper
 *
 * @param str       the string
 *
 * @code
 *      local result = string.upper(str)
 * @endcode
 */
tb_int_t xm_string_toupper(lua_State *lua) {
    // check
    tb_assert_and_check_return_val(lua, 0);

    // get string
    size_t           size = 0;
    tb_char_t const* cstr = luaL_checklstring(lua, 1, &size);
    tb_check_return_val(cstr, 0);

    // empty?
    if (!size) {
        lua_pushstring(lua, "");
        return 1;
    }

    // copy string to buffer
    tb_char_t* buffer = tb_malloc_bytes(size + 1);
    if (buffer) {
        tb_memcpy(buffer, cstr, size);
        buffer[size] = '\0';

        // to upper
        tb_long_t real_size = tb_charset_utf8_toupper(buffer, size);
        
        // push result
        if (real_size >= 0) {
            lua_pushlstring(lua, buffer, real_size);
        } else {
            lua_pushlstring(lua, cstr, size);
        }
        tb_free(buffer);
    } else {
        lua_pushnil(lua);
    }
    return 1;
}

add string case conversion functions: lower and upper

3cf298b

luadebug mentioned this pull request Jan 16, 2026

Lowercase non-English characters via xmake l string.lower? #7234

Closed

gemini-code-assist bot reviewed Jan 16, 2026

View reviewed changes

core/src/xmake/string/case.c Outdated Show resolved Hide resolved

gemini-code-assist bot reviewed Jan 16, 2026

View reviewed changes

core/src/xmake/string/case.c Outdated Show resolved Hide resolved

windows issue?

5af9fbb

luadebug marked this pull request as draft January 17, 2026 00:24

luadebug added 11 commits January 17, 2026 03:26

Assume we can skip EMOJIS?

424d7de

assume we check char

d83c306

try use setlocale

b5f93e0

refactor: replace setlocale with tb_setlocale for locale initialization

e7ab18b

test macos

4fcbf73

fix: adjust character range check in xm_string_case function

501ca61

try use utf8proc

bd366e3

fixup

6be7cac

feat: add support for fetching utf8proc compiler and linker flags

453362a

add utf8proc submod

c570d73

Fixup

09a3ad6

luadebug marked this pull request as ready for review January 17, 2026 02:46

re

2d5b9f7

gemini-code-assist bot reviewed Jan 17, 2026

View reviewed changes

core/src/xmake/string/case.c Outdated Show resolved Hide resolved

cleaning

389070f

luadebug added 3 commits January 17, 2026 07:36

cls

5e34b6a

cls

5f03e06

fixup

ee2f9ea

luadebug added 8 commits January 17, 2026 18:24

Check ascii

69c0e35

Try to prioritize "en_US.UTF-8"

6ace7f7

try build netbsd

769cd6b

fixup

e075b6c

fixup

33bc0b8

fixup

a264107

re

77c48eb

fixup

b11183e

gemini-code-assist bot reviewed Jan 17, 2026

View reviewed changes

core/src/xmake/string/case.c Outdated Show resolved Hide resolved

luadebug added 2 commits January 18, 2026 05:00

test commit

3946b4e

test

197a78f

luadebug added 2 commits January 18, 2026 07:03

Merge branch 'xmake-io:dev' into case

1a1bd32

Update case.c

e50e95c

waruqi reviewed Jan 18, 2026

View reviewed changes

luadebug added 4 commits January 18, 2026 07:29

try

92a5909

fixup

119b525

Update engine.c

60b461f

re

a1c0eae

waruqi added this to the v3.0.7 milestone Jan 18, 2026

waruqi merged commit d570511 into xmake-io:dev Jan 18, 2026
35 of 36 checks passed

Uh oh!

Conversation

luadebug commented Jan 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Jan 16, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

luadebug commented Jan 16, 2026

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

luadebug commented Jan 17, 2026

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

waruqi commented Jan 17, 2026

Uh oh!

luadebug commented Jan 17, 2026

Uh oh!

waruqi commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

luadebug commented Jan 17, 2026

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

luadebug commented Jan 17, 2026

Uh oh!

SirLynix commented Jan 17, 2026

Uh oh!

waruqi commented Jan 18, 2026

Uh oh!

luadebug commented Jan 18, 2026

Uh oh!

waruqi commented Jan 18, 2026

Uh oh!

waruqi commented Jan 18, 2026

Uh oh!

waruqi commented Jan 18, 2026

Uh oh!

waruqi Jan 18, 2026

Choose a reason for hiding this comment

Uh oh!

waruqi Jan 18, 2026

Choose a reason for hiding this comment

Uh oh!

waruqi commented Jan 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

luadebug commented Jan 16, 2026 •

edited

Loading

waruqi commented Jan 17, 2026 •

edited

Loading