在1s内多次操作accesskey问题 #5483#5490
Conversation
WalkthroughThe PR adds a bounded, ordered repository query for AccessKey modifications, updates the cache-scanning logic to use inclusive-exclusive time windows with drift handling and late-chunk processing, changes cache-merge to treat equal timestamps as updates, adds/updates tests, and adds a changelog entry. Changes
Sequence Diagram(s)sequenceDiagram
participant CacheSvc as AccessKeyServiceWithCache
participant Repo as AccessKeyRepository
rect rgb(240,230,210)
Note over CacheSvc,Repo: Bounded scan using [lastTimeScanned, currentTime)
CacheSvc->>CacheSvc: currentTime = now()
CacheSvc->>Repo: findFirst500ByDataChangeLastModifiedTimeGreaterThanEqualAndDataChangeLastModifiedTimeLessThanOrderBy...(lastTimeScanned, currentTime)
Repo-->>CacheSvc: batch [t1...tn]
alt batch non-empty and possibly more at tn
CacheSvc->>CacheSvc: lastTimeScanned = tn + 1000ms
CacheSvc->>Repo: findFirst500ByDataChangeLastModifiedTimeEqualOrderBy...(tn)
Repo-->>CacheSvc: extra batch (same timestamp tn)
CacheSvc->>CacheSvc: lastTimeScanned = tn + 1000ms
else no more results in window
CacheSvc->>CacheSvc: lastTimeScanned = currentTime
end
Note right of CacheSvc: mergeAccessKeys uses compareTo(...) >= 0 to update cache
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java (1)
84-92: Consider expanding test coverage for the race condition fix.The test now only verifies initialization with an empty cache. Given that this PR fixes a critical race condition involving same-second operations, consider adding test cases for:
- Multiple AccessKey updates within the same second
- Batch boundary scenarios (exactly 500 records)
- Cache merge behavior when timestamps are equal
- Time window advancement and
lastTimeScannedprogression- Behavior when records exist at the
currentTimeboundaryapollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java (1)
87-107: Test data correctly aligns with expected boundaries.Verification confirms the test data is properly set up:
- Record ID 4 exists at exactly
2019-12-19 13:44:21with matchingappId"100004458" andsecret"4003c4d7783443dc9870932bebf3b7fe"- No other records fall within the
[2019-12-19 13:44:21, 2019-12-19 13:44:22)range (ID 3 at 13:44:19 is before; ID 2 at 14:46:35 is after)- The inclusive/exclusive boundary semantics are properly validated
The optional suggestions for additional test cases remain valid:
- Test cases for empty result sets
- Test cases for multiple records within the range
- Explicit boundary timestamp validation (e.g., records at exactly the exclusive end boundary)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java(1 hunks)apollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java(1 hunks)apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java(3 hunks)apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build (8)
🔇 Additional comments (5)
apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java (1)
36-37: LGTM! Date range query method correctly implements inclusive-start, exclusive-end semantics.The new method enables proper time-window queries with
GreaterThanEqualfor the start boundary andLessThanfor the end boundary, which is essential for fixing the same-second race condition.apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java (4)
149-149: Good fix: Bounding the scan window prevents race conditions.Introducing
currentTimeat the start of the scan ensures that the query window[lastTimeScanned, currentTime)remains fixed throughout the batch processing loop, preventing records from being missed if they're modified during the scan.
153-154: Critical fix: Inclusive start boundary prevents missing same-second updates.The change from
GreaterThantoGreaterThanEqualfor the start boundary is essential for fixing the race condition. This ensures that when multiple AccessKey operations occur within the same second, subsequent scans using that timestamp aslastTimeScannedwill still retrieve those records.
188-193: Key fix: Treating equal timestamps as updates resolves the same-second race condition.Changing from
after()tocompareTo(...) >= 0is essential for the fix. Previously, when an AccessKey was modified within the same second as a previous modification, the cache wouldn't update because timestamps were equal. Now, equal timestamps trigger cache updates, ensuring consistency.The logic correctly:
- Adds the new AccessKey to cache (line 186)
- Removes the old entry if timestamps are equal or newer (line 190)
- Logs the change for debugging (line 191)
167-177: Timestamp precision is correctly handled; the +1 second advancement is safe.The review concern is addressed by the codebase design:
- Database schema:
DataChange_LastTimeuses MySQL/H2timestamptype, which has second-level precision (no fractional seconds)- Java entity: The
dataChangeLastModifiedTimefield isjava.util.Date, retrieved from thetimestampcolumn with milliseconds zeroed- Record lifecycle: All timestamps are created via
new Date()in entity lifecycle hooks (@PrePersist, @PreUpdate), stored as whole-second values- Result: Records can only have whole-second timestamps; advancing by 1000ms (line 174) safely moves to the next valid timestamp boundary without skipping records
The PR description's assumption ("数据库时间精度仅到秒") is verified and correct. The code logic is sound.
| lastTimeScanned); | ||
| .findFirst500ByDataChangeLastModifiedTimeGreaterThanEqualAndDataChangeLastModifiedTimeLessThanOrderByDataChangeLastModifiedTimeAsc( | ||
| lastTimeScanned, currentTime); | ||
| if (CollectionUtils.isEmpty(accessKeys)) { |
There was a problem hiding this comment.
如果下面有对hasMore的判断的话,这边是不是不需要这个判空逻辑了
There was a problem hiding this comment.
正好一起更新lastTimeScanned = currentTime;
| @@ -146,11 +146,12 @@ private void rebuildAccessKeyCache() { | |||
|
|
|||
| private void loadNewAndUpdatedAccessKeys() { | |||
There was a problem hiding this comment.
我担心分布式部署可能存在时间戳漂移的现象,可能出现向前写入的情况,每次进入这个方法能不能够将lastTimeScanned手动-1s,来提高系统容错呢
There was a problem hiding this comment.
想了想这个地方是可以的,是应该这样做,这样也是不会出现死循环的
|
记得在 CHANGES.md 加一下这个改动 :) |
603cdb4 to
69ece43
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java (1)
67-92: Restore test coverage for the timing-related fixes.The test now only verifies the initial empty state and no longer exercises the core scenarios this PR aims to fix: multiple AccessKey operations within one second, drift handling, and the updated cache merge logic (>= comparison).
Consider restoring or adding tests that verify:
- Multiple AccessKey updates within the same second are properly cached
- Drift handling works correctly (the -1 second adjustment)
- The >= comparison logic in mergeAccessKeys handles same-timestamp updates
- The hasMore scenario with +1000ms advancement
These scenarios are critical given the bug was timing-related and intermittent.
🧹 Nitpick comments (1)
apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java (1)
166-181: Clarify edge case handling when lastTimeScanned exceeds currentTime.After line 177,
lastTimeScannedmay exceedcurrentTime(e.g., if currentTime is mid-second like 13:44:21.5, and lastTimeScanned advances to 13:44:22). The next iteration would query an invalid range [lastTimeScanned, currentTime) where start > end, returning 0 results and causing the loop to exit withlastTimeScannedset tocurrentTime.While this is self-correcting, consider adding a comment or an explicit check after line 177 to make this edge case handling clearer:
lastTimeScanned = new Date(lastTimeScanned.getTime() + 1000); + // If advanced past currentTime, next iteration will return empty and exit loop } else { lastTimeScanned = currentTime; }Alternatively, add an explicit bounds check:
lastTimeScanned = new Date(lastTimeScanned.getTime() + 1000); + if (lastTimeScanned.after(currentTime)) { + lastTimeScanned = currentTime; + hasMore = false; + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGES.md(1 hunks)apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java(1 hunks)apollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java(1 hunks)apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java(2 hunks)apollo-configservice/src/test/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCacheTest.java(1 hunks)
🔇 Additional comments (5)
CHANGES.md (1)
24-24: LGTM!The changelog entry clearly describes the fix and follows the established format.
apollo-biz/src/main/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepository.java (1)
36-37: LGTM!The new repository method correctly implements a bounded time range query with start-inclusive and end-exclusive semantics, which properly supports the scanning algorithm's need to process AccessKeys in controlled time windows.
apollo-biz/src/test/java/com/ctrip/framework/apollo/biz/repository/AccessKeyRepositoryTest.java (1)
87-107: LGTM!The integration test effectively validates the new bounded range query method with a precise 1-second window, correctly asserting both the count and content of results within the inclusive-exclusive range.
apollo-configservice/src/main/java/com/ctrip/framework/apollo/configservice/service/AccessKeyServiceWithCache.java (2)
149-154: Verify performance impact of drift handling.The drift handling creates a 1-second overlap on every scan (except the first), meaning AccessKeys modified in the previous ~1 second will be re-processed each time. While this correctly handles distributed clock skew, it could impact performance if AccessKey operations are frequent.
Consider monitoring the performance impact in production, especially tracking:
- Number of AccessKeys re-processed per scan
- Scan duration when there are many operations
- Cache churn from repeated merges
The approach is correct for correctness, but performance characteristics should be validated.
191-195: LGTM! This is the key fix for same-second updates.Changing the comparison from strictly greater than to greater-or-equal ensures that when multiple AccessKey operations occur within the same second (resulting in identical database timestamps), the later-scanned update properly replaces the cached entry. This directly addresses the reported issue where AccessKey changes within 1 second were not properly cached.
| Instant instantEnd = | ||
| LocalDateTime.of(2019, 12, 19, 13, 44, 22).atZone(ZoneId.systemDefault()).toInstant(); | ||
| Date dateEnd = Date.from(instantEnd); | ||
|
|
There was a problem hiding this comment.
还存在一些code style上的问题,可以运行 mvn spotless:apply 来对代码仓库进行格式化
There was a problem hiding this comment.
P.S. 可能需要jdk17的环境才能运行这个插件(它的版本兼容做的太差了...)
There was a problem hiding this comment.
我用jdk11执行完以后没有报错,但是也没有任何更新
There was a problem hiding this comment.
😂ok,这说明你的代码是符合 我们项目的 code-style 的,那就没问题啦~
我检查了下我们项目的 code-style 配置,目前确实没有对换行做一些管控:
<setting id="org.eclipse.jdt.core.formatter.number_of_empty_lines_to_preserve" value="3"/>

What's the purpose of this PR
解决在1s内有多个accesskey发生变更,或者在同一秒创建并操作accesskey,会因为数据库精度只到秒导致缓存中的值不更新的问题
Which issue(s) this PR fixes:
Fixes #5483
Brief changelog
Summary by CodeRabbit
Bug Fixes
Tests
Documentation