Skip to content

在1s内多次操作accesskey问题 #5483#5490

Merged
spaceluke merged 1 commit intoapolloconfig:masterfrom
yyfyfyang:fix/issue_5483
Nov 6, 2025
Merged

在1s内多次操作accesskey问题 #5483#5490
spaceluke merged 1 commit intoapolloconfig:masterfrom
yyfyfyang:fix/issue_5483

Conversation

@yyfyfyang
Copy link
Copy Markdown
Contributor

@yyfyfyang yyfyfyang commented Nov 5, 2025

What's the purpose of this PR

解决在1s内有多个accesskey发生变更,或者在同一秒创建并操作accesskey,会因为数据库精度只到秒导致缓存中的值不更新的问题

Which issue(s) this PR fixes:

Fixes #5483

Brief changelog

  1. 对于缓存更新中accesskey的dataChangeLastModifiedTime的判断由大于改为了大于等于
  2. 对于是否删除缓存中的accesskey的判断由新加入的accesskey的dataChangeLastModifiedTime是否大于当前的dataChangeLastModifiedTime改为了新加入的accesskey的dataChangeLastModifiedTime是否大于等于当前的dataChangeLastModifiedTime
  3. 在hasmore的时候手动增加lastTimeScanned 1s,来规避可能出现的无限循环

Summary by CodeRabbit

  • Bug Fixes

    • Improved access-key synchronization to avoid missing records during rapid updates by using bounded time-range scans, adding time-drift protections, and ensuring cache updates treat equal-or-newer timestamps as changes.
  • Tests

    • Added and updated tests covering date-range based access-key retrieval and scan behavior; simplified some test flows to focus on initial scan expectations.
  • Documentation

    • Changelog entry noting the fix for operating an AccessKey multiple times within one second.

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 5, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Nov 5, 2025

Walkthrough

The PR adds a bounded, ordered repository query for AccessKey modifications, updates the cache-scanning logic to use inclusive-exclusive time windows with drift handling and late-chunk processing, changes cache-merge to treat equal timestamps as updates, adds/updates tests, and adds a changelog entry.

Changes

Cohort / File(s) Summary
Repository: new query + test
apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java, apollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java
Added findFirst500ByDataChangeLastModifiedTimeGreaterThanEqualAndDataChangeLastModifiedTimeLessThanOrderByDataChangeLastModifiedTimeAsc(Date start, Date end) and an integration test validating a 1-second inclusive-exclusive range and expected result.
Service: cache scanning & merge logic
apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java, apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java
Replaced unbounded scan with bounded [lastTimeScanned, currentTime) queries, added currentTime capture, applied drift-adjustment to lastTimeScanned, added late-chunk fetch for same-timestamp entries and +1000ms advancement, and changed merge comparison to use >= (include equal timestamps). Test simplified to match new query.
Changelog
CHANGES.md
Added entry: "Fix: Operate the AccessKey multiple times within one second".

Sequence Diagram(s)

sequenceDiagram
    participant CacheSvc as AccessKeyServiceWithCache
    participant Repo as AccessKeyRepository

    rect rgb(240,230,210)
    Note over CacheSvc,Repo: Bounded scan using [lastTimeScanned, currentTime)
    CacheSvc->>CacheSvc: currentTime = now()
    CacheSvc->>Repo: findFirst500ByDataChangeLastModifiedTimeGreaterThanEqualAndDataChangeLastModifiedTimeLessThanOrderBy...(lastTimeScanned, currentTime)
    Repo-->>CacheSvc: batch [t1...tn]
    alt batch non-empty and possibly more at tn
        CacheSvc->>CacheSvc: lastTimeScanned = tn + 1000ms
        CacheSvc->>Repo: findFirst500ByDataChangeLastModifiedTimeEqualOrderBy...(tn)
        Repo-->>CacheSvc: extra batch (same timestamp tn)
        CacheSvc->>CacheSvc: lastTimeScanned = tn + 1000ms
    else no more results in window
        CacheSvc->>CacheSvc: lastTimeScanned = currentTime
    end
    Note right of CacheSvc: mergeAccessKeys uses compareTo(...) >= 0 to update cache
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Focus areas:
    • Inclusive-exclusive window correctness and test time boundaries.
    • lastTimeScanned drift-adjustment and +1000ms advancement for same-timestamp batches.
    • mergeAccessKeys change to >=: verify it doesn't overwrite newer entries.
    • Simplified test coverage: ensure edge cases (many entries with same timestamp) remain tested.

Poem

🐰 I hopped through times, one-second bounds I kept,
Caught same ticks twice so no secrets leapt,
Equal moments now wake caches bright,
Windows snug, no midnight flight,
A little hop, and all is kept. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title references issue #5483 and uses Chinese to indicate it addresses multiple AccessKey operations within 1 second, directly corresponding to the main problem being fixed.
Linked Issues check ✅ Passed The PR implements all objectives from #5483: changes cache update condition from > to >= to handle same-second modifications, updates deletion logic similarly, and adds 1-second increments to prevent infinite loops when fetching AccessKey records with identical timestamps.
Out of Scope Changes check ✅ Passed All changes directly address the cache synchronization issue in #5483 caused by database time precision at the second level, with no unrelated modifications detected.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java (1)

84-92: Consider expanding test coverage for the race condition fix.

The test now only verifies initialization with an empty cache. Given that this PR fixes a critical race condition involving same-second operations, consider adding test cases for:

  • Multiple AccessKey updates within the same second
  • Batch boundary scenarios (exactly 500 records)
  • Cache merge behavior when timestamps are equal
  • Time window advancement and lastTimeScanned progression
  • Behavior when records exist at the currentTime boundary
apollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java (1)

87-107: Test data correctly aligns with expected boundaries.

Verification confirms the test data is properly set up:

  • Record ID 4 exists at exactly 2019-12-19 13:44:21 with matching appId "100004458" and secret "4003c4d7783443dc9870932bebf3b7fe"
  • No other records fall within the [2019-12-19 13:44:21, 2019-12-19 13:44:22) range (ID 3 at 13:44:19 is before; ID 2 at 14:46:35 is after)
  • The inclusive/exclusive boundary semantics are properly validated

The optional suggestions for additional test cases remain valid:

  • Test cases for empty result sets
  • Test cases for multiple records within the range
  • Explicit boundary timestamp validation (e.g., records at exactly the exclusive end boundary)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3086b57 and 20b0493.

📒 Files selected for processing (4)
  • apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java (1 hunks)
  • apollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java (1 hunks)
  • apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java (3 hunks)
  • apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build (8)
🔇 Additional comments (5)
apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java (1)

36-37: LGTM! Date range query method correctly implements inclusive-start, exclusive-end semantics.

The new method enables proper time-window queries with GreaterThanEqual for the start boundary and LessThan for the end boundary, which is essential for fixing the same-second race condition.

apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java (4)

149-149: Good fix: Bounding the scan window prevents race conditions.

Introducing currentTime at the start of the scan ensures that the query window [lastTimeScanned, currentTime) remains fixed throughout the batch processing loop, preventing records from being missed if they're modified during the scan.


153-154: Critical fix: Inclusive start boundary prevents missing same-second updates.

The change from GreaterThan to GreaterThanEqual for the start boundary is essential for fixing the race condition. This ensures that when multiple AccessKey operations occur within the same second, subsequent scans using that timestamp as lastTimeScanned will still retrieve those records.


188-193: Key fix: Treating equal timestamps as updates resolves the same-second race condition.

Changing from after() to compareTo(...) >= 0 is essential for the fix. Previously, when an AccessKey was modified within the same second as a previous modification, the cache wouldn't update because timestamps were equal. Now, equal timestamps trigger cache updates, ensuring consistency.

The logic correctly:

  1. Adds the new AccessKey to cache (line 186)
  2. Removes the old entry if timestamps are equal or newer (line 190)
  3. Logs the change for debugging (line 191)

167-177: Timestamp precision is correctly handled; the +1 second advancement is safe.

The review concern is addressed by the codebase design:

  • Database schema: DataChange_LastTime uses MySQL/H2 timestamp type, which has second-level precision (no fractional seconds)
  • Java entity: The dataChangeLastModifiedTime field is java.util.Date, retrieved from the timestamp column with milliseconds zeroed
  • Record lifecycle: All timestamps are created via new Date() in entity lifecycle hooks (@PrePersist, @PreUpdate), stored as whole-second values
  • Result: Records can only have whole-second timestamps; advancing by 1000ms (line 174) safely moves to the next valid timestamp boundary without skipping records

The PR description's assumption ("数据库时间精度仅到秒") is verified and correct. The code logic is sound.

lastTimeScanned);
.findFirst500ByDataChangeLastModifiedTimeGreaterThanEqualAndDataChangeLastModifiedTimeLessThanOrderByDataChangeLastModifiedTimeAsc(
lastTimeScanned, currentTime);
if (CollectionUtils.isEmpty(accessKeys)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

如果下面有对hasMore的判断的话,这边是不是不需要这个判空逻辑了

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

正好一起更新lastTimeScanned = currentTime;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

是的这里的判空可以去掉了

@@ -146,11 +146,12 @@ private void rebuildAccessKeyCache() {

private void loadNewAndUpdatedAccessKeys() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我担心分布式部署可能存在时间戳漂移的现象,可能出现向前写入的情况,每次进入这个方法能不能够将lastTimeScanned手动-1s,来提高系统容错呢

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

想了想这个地方是可以的,是应该这样做,这样也是不会出现死循环的

@spaceluke
Copy link
Copy Markdown
Member

spaceluke commented Nov 5, 2025

记得在 CHANGES.md 加一下这个改动 :)

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java (1)

67-92: Restore test coverage for the timing-related fixes.

The test now only verifies the initial empty state and no longer exercises the core scenarios this PR aims to fix: multiple AccessKey operations within one second, drift handling, and the updated cache merge logic (>= comparison).

Consider restoring or adding tests that verify:

  1. Multiple AccessKey updates within the same second are properly cached
  2. Drift handling works correctly (the -1 second adjustment)
  3. The >= comparison logic in mergeAccessKeys handles same-timestamp updates
  4. The hasMore scenario with +1000ms advancement

These scenarios are critical given the bug was timing-related and intermittent.

🧹 Nitpick comments (1)
apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java (1)

166-181: Clarify edge case handling when lastTimeScanned exceeds currentTime.

After line 177, lastTimeScanned may exceed currentTime (e.g., if currentTime is mid-second like 13:44:21.5, and lastTimeScanned advances to 13:44:22). The next iteration would query an invalid range [lastTimeScanned, currentTime) where start > end, returning 0 results and causing the loop to exit with lastTimeScanned set to currentTime.

While this is self-correcting, consider adding a comment or an explicit check after line 177 to make this edge case handling clearer:

       lastTimeScanned = new Date(lastTimeScanned.getTime() + 1000);
+      // If advanced past currentTime, next iteration will return empty and exit loop
     } else {
       lastTimeScanned = currentTime;
     }

Alternatively, add an explicit bounds check:

       lastTimeScanned = new Date(lastTimeScanned.getTime() + 1000);
+      if (lastTimeScanned.after(currentTime)) {
+        lastTimeScanned = currentTime;
+        hasMore = false;
+      }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 603cdb4 and 69ece43.

📒 Files selected for processing (5)
  • CHANGES.md (1 hunks)
  • apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java (1 hunks)
  • apollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java (1 hunks)
  • apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java (2 hunks)
  • apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java (1 hunks)
🔇 Additional comments (5)
CHANGES.md (1)

24-24: LGTM!

The changelog entry clearly describes the fix and follows the established format.

apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java (1)

36-37: LGTM!

The new repository method correctly implements a bounded time range query with start-inclusive and end-exclusive semantics, which properly supports the scanning algorithm's need to process AccessKeys in controlled time windows.

apollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java (1)

87-107: LGTM!

The integration test effectively validates the new bounded range query method with a precise 1-second window, correctly asserting both the count and content of results within the inclusive-exclusive range.

apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java (2)

149-154: Verify performance impact of drift handling.

The drift handling creates a 1-second overlap on every scan (except the first), meaning AccessKeys modified in the previous ~1 second will be re-processed each time. While this correctly handles distributed clock skew, it could impact performance if AccessKey operations are frequent.

Consider monitoring the performance impact in production, especially tracking:

  1. Number of AccessKeys re-processed per scan
  2. Scan duration when there are many operations
  3. Cache churn from repeated merges

The approach is correct for correctness, but performance characteristics should be validated.


191-195: LGTM! This is the key fix for same-second updates.

Changing the comparison from strictly greater than to greater-or-equal ensures that when multiple AccessKey operations occur within the same second (resulting in identical database timestamps), the later-scanned update properly replaces the cached entry. This directly addresses the reported issue where AccessKey changes within 1 second were not properly cached.

Instant instantEnd =
LocalDateTime.of(2019, 12, 19, 13, 44, 22).atZone(ZoneId.systemDefault()).toInstant();
Date dateEnd = Date.from(instantEnd);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

还存在一些code style上的问题,可以运行 mvn spotless:apply 来对代码仓库进行格式化

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

好的,我本地执行下

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P.S. 可能需要jdk17的环境才能运行这个插件(它的版本兼容做的太差了...)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我用jdk11执行完以后没有报错,但是也没有任何更新

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😂ok,这说明你的代码是符合 我们项目的 code-style 的,那就没问题啦~

我检查了下我们项目的 code-style 配置,目前确实没有对换行做一些管控:
<setting id="org.eclipse.jdt.core.formatter.number_of_empty_lines_to_preserve" value="3"/>

Copy link
Copy Markdown
Member

@spaceluke spaceluke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 6, 2025
@spaceluke spaceluke merged commit d683dad into apolloconfig:master Nov 6, 2025
8 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Nov 6, 2025
@nobodyiam nobodyiam added this to the 2.5.0 milestone Feb 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

lgtm This PR has been approved by a maintainer size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

App开启AccessKey后不带Accesskey也可以访问Apollo配置

3 participants