ci: Use ccache to cache build objects for speeding up building#6104
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## develop #6104 +/- ##
=========================================
- Coverage 79.1% 79.1% -0.0%
=========================================
Files 836 836
Lines 71257 71257
Branches 8306 8303 -3
=========================================
- Hits 56395 56389 -6
- Misses 14862 14868 +6
|
That's not true that ccache dir is outside of the working directory |
|
Also, please take a look at Clio's implementation; many of the features can be implemented here as well:
|
Logging into our macOS GitHub runner shows that the ccache dir is the user's |
I reluctantly moved the ccache CMake changes back into a separate file, simply for similarity with Clio. However, I don't see how this keeps things more modular: in my view it just increases the number of files in the directory (albeit by only one), and it moves five lines of compiler-related statements out of the file where all the other compiler-related definitions are stored.
I further looked at the ccache setup in Clio and noticed that upload and download are always enabled together, and are generally not enabled for pushes into the develop and release branches or for the nightly run. I simplified this by enabling ccache only for PR commits, which also avoids having to pass variables between multiple levels of actions.
Note that I use the Also note that sanitizers are not yet enabled, but from the sanitizers PR I recall that they will also be part of the config name. However, @mathbunnyru, I noticed you completely disabled ccache for sanitizers - what's the motivation for doing so?
Finally, what's the motivation for not using ccache for commits into the develop or release branches? Is it just to be extra safe, or is there an actual known risk of ccache not producing the correct build objects?
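For readers unfamiliar with how a handful of compiler-related statements can switch ccache on: a minimal sketch that does the equivalent from the command line, using CMake's `CMAKE_<LANG>_COMPILER_LAUNCHER` variables. This is not the CMake file from this PR; `compiler_launcher_args` is a hypothetical helper name, and the `./build` directory in the usage comment is an assumption.

```shell
# Hypothetical helper (not from this PR): prints the CMake cache arguments
# that route C and C++ compiler invocations through a launcher such as ccache.
# $1: path to the launcher binary; empty means "do not use a launcher".
compiler_launcher_args() {
  launcher="${1:-}"
  if [ -z "$launcher" ]; then
    # No launcher found: emit nothing so CMake compiles as usual.
    return 0
  fi
  printf '%s\n' \
    "-DCMAKE_C_COMPILER_LAUNCHER=$launcher" \
    "-DCMAKE_CXX_COMPILER_LAUNCHER=$launcher"
}

# Usage (assumes cmake is installed and the build directory is ./build):
#   cmake -S . -B build $(compiler_launcher_args "$(command -v ccache)")
```

If ccache is not installed, `command -v ccache` prints nothing and the configure step falls back to a regular compilation.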
Alright, I see now that the |
| # However, we do not enable ccache for events targeting the master or a | ||
| # release branch, to protect against the rare case that the output | ||
| # produced by ccache is not identical to a regular compilation. |
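The policy described in this comment can be sketched as a small decision function. This is an illustration only, not the workflow logic from this PR: `should_enable_ccache` is a hypothetical name, the `release*` branch patterns are assumptions, and the inputs are expected to come from the standard `GITHUB_EVENT_NAME` and `GITHUB_BASE_REF` variables.

```shell
# Hypothetical helper (not from this PR): prints "true" if ccache should be
# enabled for a run, otherwise "false".
# $1: event name, e.g. the value of GITHUB_EVENT_NAME
# $2: target branch, e.g. the value of GITHUB_BASE_REF for pull requests
should_enable_ccache() {
  event_name="${1:-}"
  target_branch="${2:-}"
  # Only PR commits use the cache; pushes and scheduled runs build from scratch.
  if [ "$event_name" != "pull_request" ]; then
    echo "false"
    return 0
  fi
  # Assumed branch naming: "master" plus anything starting with "release".
  case "$target_branch" in
    master|release|release-*|release/*) echo "false" ;;
    *) echo "true" ;;
  esac
}

# Usage inside a workflow step (both variables are set by GitHub Actions):
#   should_enable_ccache "$GITHUB_EVENT_NAME" "$GITHUB_BASE_REF"
```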
There is another, and in my opinion more important, reason to disable ccache for any important executions in this implementation.
Correct me if I'm wrong, but anyone who is able to run something in our VPC can have write access to the cache.
This includes all heavy runners in all repos/forks, and it includes former employees who once set up runners and still have them.
Attack: poison the Redis ccache by replacing a correct object file with one that contains malicious code.
And you can even do it in a private fork, and it will be difficult to trace where the attack came from (and will probably be too late).
I don't know how PRs are set up for external contributors, but if they run automatically, then an attacker will be able to poison the cache from a PR without setting up runners at all.
Note: actions/cache doesn't have this issue, because it only has write access when the GITHUB_TOKEN has write access.
An important implication would be: binaries built/tested in PRs can't be trusted at all.
And if someone deploys from a branch, that can't be trusted either.
What do you think about this?
I understand this problem is not exactly about these lines and is more about the approach in general, but GitHub doesn't allow replying to general comments in a thread-like fashion, so I left it as a comment on these lines.
Yes, the risk of cache poisoning is why I disable the cache completely for all release-related commits. Indeed, a ramification of this is that no one should ever deploy from a non-release branch, as you pointed out.
The way the deploy pipelines are set up (currently in GitLab based on a mirror, but they will soon be moved to this repo) is that release packages are only built from the release branch. So if someone wants to do the wrong thing - build from a non-release branch - then they have to make more effort to manually create and then deploy these packages. The risk & ability to do so, however, will not meaningfully increase as a result of this change, since it requires an insider who already has the necessary permissions to do damage right now. Note that as part of moving the packaging pipeline over to here I'm planning to leverage "environments" that require an additional approval before a package is released and/or deployed - giving another chance to sanity check.
As for external bad actors who intentionally try to poison the cache via a PR, the pipelines are configured to require a maintainer to approve them to run. A maintainer should never let them run without checking the code first. If a maintainer lets these pipelines run, I suspect that the risk of cache poisoning might be similar when using the GitHub cache, since the GITHUB_TOKEN presumably would get the maintainer's permissions. There's a risk of approved contributors with write access whose pipelines will automatically run for PRs, but as long as releases are not built with the cache enabled we should be fine.
> If a maintainer lets these pipelines run, I suspect that the risk of cache poisoning might be similar when using the GitHub cache, since the GITHUB_TOKEN presumably would get the maintainer's permissions.
I don't think it works this way.
There is a restriction that GitHub implements when running from PR: https://docs.github.com/en/actions/reference/workflows-and-actions/dependency-caching#restrictions-for-accessing-a-cache
So, the PR will only be able to write to the cache scope of the PR, so no one else outside of this PR can be affected.
With this solution, however, there is no such restriction.
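To make the scoping rule from the linked page concrete, here is a simplified model of it. It is a sketch of the documented behaviour, not GitHub's implementation, and `can_restore_cache` is a hypothetical name: a run may restore a cache created on its own ref, on the default branch, or (for pull request runs) on the base branch, while caches written by a PR run are created for that PR's merge ref.

```shell
# Simplified model of the documented cache access rule (hypothetical helper).
# $1: ref that created the cache
# $2: ref of the run that tries to restore it
# $3: default branch ref
# $4: base branch ref (only for pull request runs; may be empty)
can_restore_cache() {
  created_by="${1:-}"
  reader="${2:-}"
  default_ref="${3:-}"
  base_ref="${4:-}"
  if [ "$created_by" = "$reader" ] || [ "$created_by" = "$default_ref" ]; then
    echo "yes"
  elif [ -n "$base_ref" ] && [ "$created_by" = "$base_ref" ]; then
    echo "yes"
  else
    echo "no"
  fi
}

# A cache written by PR 1 is not visible to the develop branch or to PR 2:
#   can_restore_cache refs/pull/1/merge refs/heads/develop refs/heads/develop
#   can_restore_cache refs/pull/1/merge refs/pull/2/merge refs/heads/develop refs/heads/develop
```

A shared remote ccache has no equivalent of this check: any client that can reach the server can write entries that every other client may later read.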
Ack, I recognize the differences. The question is whether the risk is acceptable. I confirmed with our platform team that critical nodes (UNL validator, hubs) run the binary from packages produced from the release branch. The XRPL is designed to handle malicious nodes, so even if a regular node is updated with a tampered package it won't affect the stability of the network. I'll discuss it further with them and you offline to see if there are other risk vectors.
Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com>
mathbunnyru
left a comment
For some reason, ccache didn't work for ubuntu-jammy-gcc-12-arm64-debug: https://github.com/XRPLF/rippled/actions/runs/20467375572/job/59486838790?pr=6104
Could you please check why?
|
@mathbunnyru the updated ccache version has resulted in both the coverage and ubuntu-jammy builds successfully using the cache 😄 |
| description: "The CMake target to build." | ||
| type: string | ||
| required: true | ||
| type: string |
In the future, I highly suggest making refactoring changes in a separate PR, especially when they touch completely unrelated code.
This:
- allows fixing it in all places and making it consistent everywhere
- keeps the current PR more focused
- makes it easier to review, and the refactoring change can be merged much faster
|
Could you please check why |
I have no obvious explanation yet, but I'll keep digging. If you're ok with it, we can go ahead and merge this PR, and I'll figure it out afterwards. I also suspect there may be additional configs with the same issue that we won't discover until the nightly run kicks them off.
Let's update images to the one with ccache built from source (it might help, but might not). |
Ah, I think this solution brought a few problems back:
- Who is responsible for cleaning the cache? If no one does it, won't it grow without limit, exhausting the disk space?
- Stats won't show what matters, because they will always add up. If you build, say, 150 files, you want to see how many of these files hit/missed the cache. Instead, I think it would show how many files hit/missed the cache across all runs of this configuration in all commits and PRs so far (and this information is not useful)
I'm sorry to bring this up late; I've just thought about it.
Ccache is configured to limit the cache size of each namespace to 5GB (the default), and I configured the remote HTTP cache (based on Bazel Remote) to have a maximum size slightly below the disk size I provided it (~1TB). Together they limit the amount of data that can be cached, with the least recently used data being removed. In practice I don't think we'll hit the remote cache disk size, because we are running fewer than 200 configurations. We can chat further about which stats are useful to show, and whether resetting them first, as you've done in the Clio pipelines, gives us more valuable info. However, let's get this merged first so we can start taking advantage of faster build times.
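On the stats question, a minimal sketch of what per-run statistics could look like, assuming ccache is on the PATH. `build_with_fresh_stats` is a hypothetical wrapper and not part of this PR; `--zero-stats` and `--show-stats` are standard ccache options.

```shell
# Hypothetical wrapper (not from this PR): runs a build command and reports
# ccache statistics that cover only that run.
# Arguments: the build command and its arguments.
build_with_fresh_stats() {
  if ! command -v ccache >/dev/null 2>&1; then
    # ccache is not installed: run the build command unchanged.
    "$@"
    return $?
  fi
  # Reset the counters so earlier runs do not add up into this report.
  ccache --zero-stats >/dev/null 2>&1 || true
  "$@"
  build_status=$?
  # Hits and misses shown here belong to this invocation only.
  ccache --show-stats || true
  # Preserve the exit status of the build command itself.
  return "$build_status"
}

# Usage (the build directory name is an assumption):
#   build_with_fresh_stats cmake --build build
```

With the counters zeroed first, a build of 150 files would report hits and misses for those 150 compilations rather than a running total.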
Nice, thanks 👍
Ok, let's go ahead and merge this |
refactor: Update Conan dependencies: protobuf and grpc (#5589) This PR updates protobuf and grpc to their latest versions. The latest protobuf version no longer requires patches, so we can use it directly from the official Conan Center Index, while the latest grpc still needed a patch, which was added to our own Conan Center Index fork in XRPLF/conan-center-index#8. cleanup docs: Infer version of Conan dependency to export (#6112) This change updates a script in the documentation to automatically infer the version of a patched Conan dependency from the conan.lock file. chore: Use updated secp256k1 recipe (#6118) This change updates the secp256k1 recipe that defines the SECP256K1_STATIC, so it no longer needs to be defined in the code here. Running the Conan update script also updated two other recipes in the lock file. chore: Clean up .gitignore and .gitattributes (#6001) The .gitignore and .gitattributes files contain references to files and directories that the current build no longer produces, so this change removes obsolete entries in these files, and does some general reorganizing of the remaining entries. chore: Fix docs readme and cmake (#6122) This change removes the unused `with_docs` option and fixes the README instructions on how to build the `docs` target. removed amendment changes added limit to reply size Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> minor clean-up Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> refactor: clean up `RPCHelpers` (#5684) This PR cleans up `RPCHelpers.h` and `RPCHelpers.cpp`. It splits out all the fetch-ledger functions to a new set of files, `RPCLedgerHelpers.h`/`RPCLedgerHelpers.cpp`, and moves the general-API functions to `ApiVersion.h`. There is no functionality change. refactor: rename `LedgerInfo` to `LedgerHeader` (#6136) This PR renames `LedgerInfo` to `LedgerHeader`. 
Namely, `LedgerInfo` was already an alias for `LedgerHeader`, and the comments next to the alias suggested that it would make sense to rename it, since that makes it clearer what it is. refactor: rename info() to header() (#6138) This change renames all the `info()` functions to `header()`, since they return `LedgerHeader` structs. It also renames the underlying variables from `info_` to `header_`. refactor: Rename `rippled` binary to `xrpld` (#5983) Per [XLS-0095](https://xls.xrpl.org/xls/XLS-0095-rename-rippled-to-xrpld.html), we are taking steps to rename ripple(d) to xrpl(d). This change modifies the binary name from `rippled` to `xrpld`, and creates a symlink named `rippled` that points to the `xrpld` binary. Note that #5975 renamed any references to `rippled` in the CMake files and their contents, but explicitly maintained the `rippled` binary name by adding an exception. This change now undoes this exception and adds an explicit symlink instead. subscription test was failing, so trying with longer timeout, Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> Update src/xrpld/overlay/detail/PeerImp.cpp update comparison operator. Co-authored-by: Ed Hennis <ed@ripple.com> Update src/xrpld/overlay/detail/PeerImp.cpp combine strings Co-authored-by: Ed Hennis <ed@ripple.com> refactor: Move JobQueue and related classes into xrpl.core module (#6121) refactor: Rename `ripple` namespace to `xrpl` (#5982) This change renames all occurrences of `namespace ripple` and `ripple::` to `namespace xrpl` and `xrpl::`, respectively, as well as the names of test suites. It also provides a script to allow developers to replicate the changes in their local branch or fork to avoid conflicts. chore: Fix some typos in comments (#6082) ci: Update shared actions (#6147) The latest update to `cleanup-workspace`, `get-nproc`, and `prepare-runner` moved the action to the repository root directory, and also includes some ccache changes. 
In response, this change updates the various shared actions to the latest commit hash. refactor: remove `Json::Object` and related files/classes (#5894) `Json::Object` and related objects are not used at all, so this change removes `include/xrpl/json/Object.h` and all downstream files. There are a number of minor downstream changes as well. Full list of deleted classes and functions: * `Json::Collections` * `Json::Object` * `Json::Array` * `Json::WriterObject` * `Json::setArray` * `Json::addObject` * `Json::appendArray` * `Json::appendObject` The last helper function, `copyFrom`, seemed a bit more complex and was actually used in a few places, so it was moved to `LedgerToJson.h` instead of deleting it. Set version to 3.2.0-b0 (#6153) ci: Remove superfluous build directory creation (#6159) This change modifies the build directory structure from `build/build/xxx` or `.build/build/xxx` to just `build/xxx`. Namely, the `conanfile.py` has the CMake generators build directory hardcoded to `build/generators`. We may as well leverage the top-level build directory without introducing another layer of directory nesting. fix: Remove cryptographic libs from libxrpl Conan package (#6163) * fix: rm crypto libs and fix protobuf path * update/rm comments chore: Pin ruamel.yaml<0.19 in pre-commit-hooks (#6166) See pre-commit/pre-commit-hooks#1229 for more details. Revert "chore: Pin ruamel.yaml<0.19 in pre-commit-hooks (#6166)" (#6167) This reverts commit 0f23ad8. refactor: Rename `rippled.cfg` to `xrpld.cfg` (#6098) This change renames all occurrences of `rippled.cfg` to `xrpld.cfg`. It also provides a script to allow developers to replicate the changes in their local branch or fork to avoid conflicts. For the time being it maintains support for `rippled.cfg` as config file, if `xrpld.cfg` does not exist. test: add more tests for `ledger_entry` RPC (#5858) This change adds some basic tests for all the `ledger_entry` helper functions, so each ledger entry type is covered. 
There are further some minor refactors in `parseAMM` to provide better error messages. Finally, to improve readability, alphabetization was applied in the helper functions. ci: Use ccache to cache build objects for speeding up building (#6104) Right now, each pipeline invocation builds the source code from scratch. Although compiled Conan dependencies are cached in a remote server, the source build objects are not. We are able to further speed up our builds by leveraging `ccache`. This change enables caching of build objects using `ccache` on Linux, macOS, and Windows. ci: Move variable into right place (#6179) This change moves the `enable_ccache` variable in the `on-trigger.yml` file to the correct location. refactor: Fix typos in comments, configure cspell (#6164) This change sets up a `cspell `configuration and fixes lots of typos in comments. There are no other code changes. refactor: Fix spelling issues in private/local variables and functions (#6182) This change fixes several typos in private/local variables and private functions. There is no functionality change. refactor: Fix spelling issues in all variables/functions (#6184) This change fixes many typos in comments, variables, and public functions. There is no functionality change. refactor: Remove unused credentials signature hash prefix (#6186) This change removes the unused credentials signature hash prefix from `HashPrefix.h`. fix: Reorder Batch Preflight Errors (#6176) This change fixes #6058. refactor: Fix typos, enable cspell pre-commit (#5719) This change fixes the last of the spelling issues, and enables the pre-commit (and CI) check for spelling. There are no functionality changes, but it does rename some enum values. ci: Use updated prepare-runner in actions and worfklows (#6188) This change updates the XRPLF pre-commit workflow and prepare-runner action to their latest versions. 
For naming consistency the prepare-runner action changed the disable_ccache variable into enable_ccache, which matches our naming. docs: Fix minor spelling issues in comments (#6194) fix: Truncate thread name to 15 chars on Linux (#5758) This change: * Truncates thread names if more than 15 chars with `snprintf`. * Adds warnings for truncated thread names if `-DTRUNCATED_THREAD_NAME_LOGS=ON`. * Add a static assert for string literals to stop compiling if > 15 chars. * Shortens `Resource::Manager` to `Resource::Mngr` to fix the static assert failure. * Updates `CurrentThreadName_test` unit test specifically for Linux to verify truncation. VaultClawback: Burn shares of an empty vault (#6120) - Adds a mechanism for the vault owner to burn user shares when the vault is stuck. If the Vault has 0 AssetsAvailable and Total, the owner may submit a VaultClawback to reclaim the worthless fees, and thus allow the Vault to be deleted. The Amount must be left off (unless the owner is the asset issuer), specified as 0 Shares, or specified as the number of Shares held. chore: Change `/Zi` to `/Z7` for ccache, remove debug symbols in CI (#6198) As the `/Zi` compiler flag is unsupported by ccache, this change switches it to `/Z7` instead. For CI runs all debug info is omitted. fix: Inner batch transactions never have valid signatures (#6069) - Introduces amendment `fixBatchInnerSigs` - Update Batch unit tests - Fix all the Env instantiations to _use_ the "features" parameter. - testInnerSubmitRPC runs with Batch enabled and disabled. - Add a test to testInnerSubmitRPC for a correctly signed tx incorrectly using the tfInnerBatchTxn flag. - Generalize the submitAndValidate lambda in testInnerSubmitRPC. - With the fix amendment, a transaction never reaches the transaction engine (Transactor and derived classes.) - Test submitting a pseudo-transaction. Stopped before reaching the transaction engine, but with different errors. 
- The tests verify that without the amendment, a transaction with tfInnerBatchTxn is immediately rejected. Without the amendment, things are safe. The amendment just makes things safer and more future-proof. chore: Pin pre-commit hooks to commit hashes (#6205) This change updates and pins the Black and CSpell pre-commit hooks. refactor: Remove unnecessary version number and options in cmake find_package (#6169) This change removes unnecessary version numbers in the OpenSSL and Boost `find_package` CMake statements. An unnecessary OpenSSL definition is removed, while Conan options for SSL are updated to disable insecure ciphers. Moreover, the statements are now ordered alphabetically and more logically. ci: Update actions/images to use cmake 4.2.1 and conan 2.24.0 (#6209) fix: Update Conan lock file with changed OpenSSL recipe (#6211) This change updates the `conan.lock` file with a changed OpenSSL recipe that contains a fix regarding options passed to the compiler Improve and fix bugs in Lending Protocol (#6102) - Spec: XLS-66 Fix overpayment asserts (#6084) MPTTester::operator() parameter should be std::int64_t - Originally defined as uint64_t, but the testIssuerLoan() test called it with a negative number, causing an overflow to a very large number that in some circumstances could be silently cast back to an int64_t, but might not be. I believe this is UB, and we don't want to rely on that. Review feedback from @Tapanito: overpayment value change - In overpayment results, the management fee was being calculated twice: once as part of the value change, and as part of the fees paid. Exclude it from the value change. Fix Overpayment Calculation (#6087) - Adds additional unit tests to cover math calculations. - Removes unused methods. Review feedback from @shawnxie999: even more rounding - Round the initial total value computation upward, unless there is 0-interest. - Rename getVaultScale to getAssetsTotalScale, and convert one incorrect computation to use it. 
- Use adjustImpreciseNumber for LossUnrealized. - Add some logging to computeLoanProperties. Fix LoanBrokerSet debtMaximum limits (#6116) Fix some minor bugs in Lending Protocol (#6101) - add nodiscard to unimpairLoan, and check result in LoanPay - add a check to verify that issuer exists - improve LoanManage error code for dust amounts Check permissions in LoanSet and LoanPay (#6108) Disallow pseudo accounts to be Destination for LoanBrokerCoverWithdraw (#6106) Ensure vault asset cap is not exceeded (#6124) Fix Overpayment ValueChange calculation in Lending Protocol (#6114) - Adds loan state to LoanProperties. - Cleans up computeLoanProperties. - Fixes missing management fee from overpayment. fix: Enable LP Deposits when the broker is the asset issuer (#6119) * Replace accountHolds with accountSpendable when checking for account funds in VaultDeposit and LoanBrokerCoverDeposit Add a few minor changes (#6158) - Updates or fixes a couple of things I noticed while reviewing changes to the spec. - Rename sfPreviousPaymentDate to sfPreviousPaymentDueDate. - Make the vault asset cap check added in #6124 a little more robust: 1. Check in preflight if the vault is _already_ over the limit. 2. Prevent overflow when checking with the loan value. (Subtract instead of adding, in case the values are near maxint. Both return the same result. Also add a unit test so each case is covered. Add minimum grace period validation (#6133) Fix bugs: frozen pseudo-account, and FLC cutoff (#6170) refactor: Rename raw state to theoretical state (#6187) Check if a withdrawal amount exceeds any applicable receiving limit. 
(#6117) Fix overpayment result calculation (#6195) Address review feedback from Lending Protocol re-review (#6161) --------- Co-authored-by: Gregory Tsipenyuk <gregtatcam@users.noreply.github.com> Co-authored-by: Bronek Kozicki <brok@incorrekt.com> Co-authored-by: Vito Tumas <5780819+Tapanito@users.noreply.github.com> Co-authored-by: Shawn Xie <35279399+shawnxie999@users.noreply.github.com> Co-authored-by: Jingchen <a1q123456@users.noreply.github.com> Expand Number to support the full integer range (#6025) - Refactor Number internals away from int64 to uint64 & a sign flag - ctors and accessors use `rep`. Very few things expose `internalrep`. - An exception is "unchecked" and the new "normalized", which explicitly take an internalrep. But with those special control flags, it's easier to distinguish and control when they are used. - For now, skip the larger mantissas in AMM transactions and tests - Remove trailing zeros from scientific notation Number strings - Update tests. This has the happy side effect of making some of the string representations _more_ consistent between the small and large mantissa ranges. - Add semi-automatic rounding of STNumbers based on Asset types - Create a new SField metadata enum, sMD_NeedsAsset, which indicates the field should be associated with an Asset so it can be rounded. - Add a new STTakesAsset intermediate class to handle the Asset association to a derived ST class. Currently only used in STNumber, but could be used by other types in the future. - Add "associateAsset" which takes an SLE and an Asset, finds the sMD_NeedsAsset fields, and associates the Asset to them. In the case of STNumber, that both stores the Asset, and rounds the value immediately. - Transactors only need to add a call to associateAsset _after_ all of the STNumbers have been set. Unfortunately, the inner workings of STObject do not do the association correctly with uninitialized fields. 
- When serializing an STNumber that has an Asset, round it before serializing. - Add an override of roundToAsset, which rounds a Number value in place to an Asset, but without any additional scale. - Update and fix a bunch of Loan-related tests to accommodate the expanded Number class. --------- Co-authored-by: Vito <5780819+Tapanito@users.noreply.github.com> Change LendingProtocol feature and dependencies to supported (#6146) minor code review changes Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> test: Replace `failed` string in Vault test case (#6214) The word `failed` in the test case makes it hard to search through the test logs when an actual test failure occurs, so this change renames the word to just `fail` instead. test: Suppress "parse failed" message in Batch tests (#6207) test: Use gtest instead of doctest (#6216) This change switches over the doctest framework to the gtest framework. ci: Add sanitizers to CI builds (#5996) This change adds support for sanitizer build options in CI builds workflow. Currently `asan+ubsan` is enabled, while `tsan+ubsan` is left disabled as more changes are required. added unit test Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> updated levelization Signed-off-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> Improve ledger_entry lookups for fee, amendments, NUNL, and hashes (#5644) These "fixed location" objects can be found in multiple ways: 1. The lookup parameters use the same format as other ledger objects, but the only valid value is true or the valid index of the object: - Amendments: "amendments" : true - FeeSettings: "fee" : true - NegativeUNL: "nunl" : true - LedgerHashes: "hashes" : true (For the "short" list. See below.) 2. With RPC API >= 3, using special case values to "index", such as "index" : "amendments". Uses the same names as above. 
Note that for "hashes", this option will only return the recent ledger hashes / "short" skip list. 3. LedgerHashes has two types: "short", which stores recent ledger hashes, and "long", which stores the flag ledger hashes for a particular ledger range. - To find a "long" LedgerHashes object, request '"hashes" : <ledger sequence>'. <ledger sequence> must be a number that evaluates to an unsigned integer. - To find the "short" LedgerHashes object, request "hashes": true as with the other fixed objects. The following queries are all functionally equivalent: - "amendments" : true - "index" : "amendments" (API >=3 only) - "amendments" : "7DB0788C020F02780A673DC74757F23823FA3014C1866E72CC4CD8B226CD6EF4" - "index" : "7DB0788C020F02780A673DC74757F23823FA3014C1866E72CC4CD8B226CD6EF4" Finally, whether the object is found or not, if a valid index is computed, that index will be returned. This can be used to confirm the query was valid, or to save the index for future use. ci: remove 'master' branch as a trigger (#6234) This change removes the `master` branch as a trigger for the CI pipelines, and updates comments accordingly. It also fixes the pre-commit workflow, so it will run on all release branches.
High Level Overview of Change
This change enables caching of build objects using `ccache` on Linux, macOS, and Windows.
Context of Change
Right now, each pipeline invocation builds the source code from scratch. Although compiled Conan dependencies are cached in a remote server, the source build objects are not. We are able to further speed up our builds by leveraging `ccache`.
Type of Change
.gitignore, formatting, dropping support for older tooling)