Skip to content

reuse rand32#6908

Merged
waruqi merged 3 commits intodevfrom
hash
Oct 9, 2025
Merged

reuse rand32#6908
waruqi merged 3 commits intodevfrom
hash

Conversation

@waruqi
Copy link
Member

@waruqi waruqi commented Oct 9, 2025

  • Before adding new features and new modules, please go to issues to submit the relevant feature description first.
  • Write good commit messages and use the same coding conventions as the rest of the project.
  • Please commit code to dev branch and we will merge into master branch in feature
  • Ensure your edited codes with four spaces instead of TAB.

  • 增加新特性和新模块之前,请先到issues提交相关特性说明,经过讨论评估确认后,再进行相应的代码提交,避免做无用工作。
  • 编写友好可读的提交信息,并使用与工程代码相同的代码规范,代码请用4个空格字符代替tab缩进。
  • 请提交代码到dev分支,如果通过,我们会在特定时间合并到master分支上。
  • 为了规范化提交日志的格式,commit消息,不要用中文,请用英文描述。

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @waruqi, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This PR replaces hash.rand64() with hash.rand32() in package build directory generation, CMakeLists key generation, and unity build file unique ID creation, likely to optimize for smaller random numbers where 32-bit is sufficient.

Highlights

  • Intent: This pull request aims to replace the usage of hash.rand64() with hash.rand32() in several parts of the codebase. The likely intent is to use a smaller, 32-bit random number generator where a 64-bit one is not strictly necessary, potentially for performance or resource efficiency, or simply because 32-bit randomness provides sufficient uniqueness for the given contexts.
  • Changes: The changes consistently replace hash.rand64() with hash.rand32() across three files:
  • xmake/core/package/package.lua: The builddir generation for both local and non-local packages now uses hash.rand32() instead of hash.rand64() to create unique build directory names.
  • xmake/plugins/project/cmake/cmakelists.lua: The key generation for target_objectfiles_%s and output_%s in CMakeLists output is updated to use hash.rand32().
  • xmake/rules/c++/unity_build/unity_build.lua: The unique ID generated for unity build files (e.g., unity_ prefix) now utilizes hash.rand32().
  • Reviewer Activity: No specific reviewer activity has been recorded or provided for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request replaces hash.rand64() with hash.rand32() in several places to generate unique identifiers. While this might seem like a minor change, it significantly increases the risk of hash collisions, which can lead to subtle and hard-to-debug build failures. The hash.lua file itself contains a warning for rand32(): "it is easy to trigger hash conflicts".

My review focuses on the correctness and robustness implications of this change. I've identified several areas where this increased collision risk is problematic, especially in concurrent build environments and for large projects. I've marked one of the issues as critical as it directly impacts the correctness of unity builds.

I recommend reverting these changes to use hash.rand64() unless there is a compelling and documented performance reason for this switch that outweighs the risk to build stability.

sourcefile = path.relative(sourcefile, path.directory(sourcefile_unity))
if uniqueid then
unityfile:print("#define %s %s", uniqueid, "unity_" .. hash.rand64())
unityfile:print("#define %s %s", uniqueid, "unity_" .. hash.rand32())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

This change is particularly risky. In a unity build, this unique ID is generated for each source file to prevent symbol collisions. For large projects with thousands of source files, the probability of a 32-bit hash collision becomes significant (see the Birthday Problem).

A collision here would lead to subtle and difficult-to-debug compilation errors (e.g., redefinition of static symbols). Given the explicit warning in hash.lua about rand32 being prone to conflicts, using hash.rand64() is much safer and necessary for correctness in this context.

                unityfile:print("#define %s %s", uniqueid, "unity_" .. hash.rand64())

local name = self:name():lower():gsub("::", "_")
local rootdir = path.join(config.builddir({absolute = true}), ".packages", name:sub(1, 1):lower(), name, self:version_str())
builddir = path.join(rootdir, "cache", "build_" .. hash.rand64())
builddir = path.join(rootdir, "cache", "build_" .. hash.rand32())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Switching from hash.rand64() to hash.rand32() for generating build directory names significantly increases the probability of hash collisions. The hash.lua file itself notes for rand32: "it is easy to trigger hash conflicts".

This could lead to intermittent build failures in environments with high concurrency (like CI/CD systems) where multiple builds for the same package might occur simultaneously. A 64-bit random number provides a much stronger guarantee of uniqueness. The same concern applies to line 835.

Was there a specific performance reason for this change? If not, it would be safer to stick with rand64() to avoid potential race conditions.

            builddir = path.join(rootdir, "cache", "build_" .. hash.rand64())


local has_links = #target:objectfiles() > objectfiles_set:size()
local key = target:name() .. "_" .. hash.rand64()
local key = target:name() .. "_" .. hash.rand32()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using hash.rand32() instead of hash.rand64() to generate a unique key for CMake's add_library increases the risk of collisions. While the key is prefixed with the target name, a collision is still possible and could lead to confusing build errors in the generated project. Given the warning about rand32 in hash.lua ("it is easy to trigger hash conflicts"), hash.rand64() is a much safer choice for ensuring uniqueness.

    local key = target:name() .. "_" .. hash.rand64()

-- @see https://gitlab.kitware.com/cmake/cmake/-/issues/17802
--
local key = target:name() .. "_" .. hash.rand64()
local key = target:name() .. "_" .. hash.rand32()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

As with the other changes, using hash.rand32() for this custom command's output key introduces an unnecessary risk of collision. A key collision in the generated CMake script could cause hard-to-diagnose build issues. To ensure robustness, hash.rand64() should be used here as well.

        local key = target:name() .. "_" .. hash.rand64()

@waruqi waruqi added this to the v3.0.4 milestone Oct 9, 2025
@waruqi waruqi merged commit 3a2c668 into dev Oct 9, 2025
46 of 50 checks passed
@waruqi waruqi deleted the hash branch October 9, 2025 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant