fix: db.Iterate can miss records, can return records twice #9657

Merged
mfenniak merged 1 commit from mfenniak/forgejo:fix-iterate into forgejo 2025-10-12 21:47:32 +02:00
Member

Fixes #9644.

Rewrites db.Iterate so that it performs DB queries in this format:

  • First: SELECT ...columns... FROM table ORDER BY id LIMIT ...buffer-size...
  • Subsequent buffer fills: adding a WHERE id > ...last-id-from-previous...

This approach:

  • Prevents records from being missed or returned twice
  • Returns records in a predictable order
  • Should be faster, by virtue of using database indexes on the primary key to perform the query
  • Doesn't rely on any unpredictable database behaviour when using LIMIT and OFFSET without an ORDER BY
  • (Downside: does require reflection to read field values off Go structures for the primary key value)

Expands the automated tests to include the predicted failure case identified in #9644, which verified the previous broken behaviour, as well as verifying that the cond parameter is applied which was previously not covered by test automation.

Checklist

The contributor guide contains information that will be helpful to first time contributors. There also are a few conditions for merging Pull Requests in Forgejo repositories. You are also welcome to join the Forgejo development chatroom.

Tests

  • I added test coverage for Go changes...
    • in their respective *_test.go for unit tests.
    • in the tests/integration directory if it involves interactions with a live Forgejo server.
  • I added test coverage for JavaScript changes...

Documentation

  • I created a pull request to the documentation to explain to Forgejo users how to use this change.
  • I did not document these changes and I do not expect someone else to do it.

Release notes

  • I do not want this change to show in the release notes.
  • I want the title to show in the release notes with a link to this pull request.
  • I want the content of the release-notes/<pull request number>.md to be be used for the release notes instead of the title.

Release notes

  • Bug fixes
    • PR: db.Iterate can miss records, can return records twice
Fixes #9644. Rewrites `db.Iterate` so that it performs DB queries in this format: - First: `SELECT ...columns... FROM table ORDER BY id LIMIT ...buffer-size...` - Subsequent buffer fills: adding a `WHERE id > ...last-id-from-previous...` This approach: - Prevents records from being missed or returned twice - Returns records in a predictable order - Should be faster, by virtue of using database indexes on the primary key to perform the query - Doesn't rely on any unpredictable database behaviour when using `LIMIT` and `OFFSET` without an `ORDER BY` - (Downside: does require reflection to read field values off Go structures for the primary key value) Expands the automated tests to include the predicted failure case identified in #9644, which verified the previous broken behaviour, as well as verifying that the `cond` parameter is applied which was previously not covered by test automation. ## Checklist The [contributor guide](https://forgejo.org/docs/next/contributor/) contains information that will be helpful to first time contributors. There also are a few [conditions for merging Pull Requests in Forgejo repositories](https://codeberg.org/forgejo/governance/src/branch/main/PullRequestsAgreement.md). You are also welcome to join the [Forgejo development chatroom](https://matrix.to/#/#forgejo-development:matrix.org). ### Tests - I added test coverage for Go changes... - [x] in their respective `*_test.go` for unit tests. - [ ] in the `tests/integration` directory if it involves interactions with a live Forgejo server. - I added test coverage for JavaScript changes... - [ ] in `web_src/js/*.test.js` if it can be unit tested. - [ ] in `tests/e2e/*.test.e2e.js` if it requires interactions with a live Forgejo server (see also the [developer guide for JavaScript testing](https://codeberg.org/forgejo/forgejo/src/branch/forgejo/tests/e2e/README.md#end-to-end-tests)). ### Documentation - [ ] I created a pull request [to the documentation](https://codeberg.org/forgejo/docs) to explain to Forgejo users how to use this change. - [x] I did not document these changes and I do not expect someone else to do it. ### Release notes - [ ] I do not want this change to show in the release notes. - [x] I want the title to show in the release notes with a link to this pull request. - [ ] I want the content of the `release-notes/<pull request number>.md` to be be used for the release notes instead of the title. <!--start release-notes-assistant--> ## Release notes <!--URL:https://codeberg.org/forgejo/forgejo--> - Bug fixes - [PR](https://codeberg.org/forgejo/forgejo/pulls/9657): <!--number 9657 --><!--line 0 --><!--description ZGIuSXRlcmF0ZSBjYW4gbWlzcyByZWNvcmRzLCBjYW4gcmV0dXJuIHJlY29yZHMgdHdpY2U=-->db.Iterate can miss records, can return records twice<!--description--> <!--end release-notes-assistant-->
mfenniak force-pushed fix-iterate from f336ebd824
Some checks failed
issue-labels / backporting (pull_request_target) Has been skipped
issue-labels / cascade (pull_request_target) Has been skipped
testing / frontend-checks (pull_request) Successful in 1m1s
issue-labels / release-notes (pull_request_target) Has been skipped
requirements / merge-conditions (pull_request) Successful in 3s
testing / backend-checks (pull_request) Failing after 1m41s
testing / test-unit (pull_request) Has been skipped
testing / test-e2e (pull_request) Has been skipped
testing / test-mysql (pull_request) Has been skipped
testing / test-pgsql (pull_request) Has been skipped
testing / test-sqlite (pull_request) Has been skipped
testing / test-remote-cacher (redis) (pull_request) Has been skipped
testing / test-remote-cacher (valkey) (pull_request) Has been skipped
testing / test-remote-cacher (garnet) (pull_request) Has been skipped
testing / test-remote-cacher (redict) (pull_request) Has been skipped
testing / security-check (pull_request) Has been skipped
to 9db0aace82
Some checks failed
requirements / merge-conditions (pull_request) Successful in 1s
issue-labels / release-notes (pull_request_target) Has been skipped
testing / frontend-checks (pull_request) Successful in 1m7s
testing / backend-checks (pull_request) Failing after 3m2s
testing / test-unit (pull_request) Has been skipped
testing / test-e2e (pull_request) Has been skipped
testing / test-mysql (pull_request) Has been skipped
testing / test-pgsql (pull_request) Has been skipped
testing / test-sqlite (pull_request) Has been skipped
testing / test-remote-cacher (redis) (pull_request) Has been skipped
testing / test-remote-cacher (valkey) (pull_request) Has been skipped
testing / test-remote-cacher (garnet) (pull_request) Has been skipped
testing / test-remote-cacher (redict) (pull_request) Has been skipped
testing / security-check (pull_request) Has been skipped
2025-10-12 19:20:21 +02:00
Compare
mfenniak force-pushed fix-iterate from 9db0aace82
Some checks failed
requirements / merge-conditions (pull_request) Successful in 1s
issue-labels / release-notes (pull_request_target) Has been skipped
testing / frontend-checks (pull_request) Successful in 1m7s
testing / backend-checks (pull_request) Failing after 3m2s
testing / test-unit (pull_request) Has been skipped
testing / test-e2e (pull_request) Has been skipped
testing / test-mysql (pull_request) Has been skipped
testing / test-pgsql (pull_request) Has been skipped
testing / test-sqlite (pull_request) Has been skipped
testing / test-remote-cacher (redis) (pull_request) Has been skipped
testing / test-remote-cacher (valkey) (pull_request) Has been skipped
testing / test-remote-cacher (garnet) (pull_request) Has been skipped
testing / test-remote-cacher (redict) (pull_request) Has been skipped
testing / security-check (pull_request) Has been skipped
to 4893d468c1
All checks were successful
requirements / merge-conditions (pull_request) Successful in 1s
issue-labels / release-notes (pull_request_target) Has been skipped
testing / frontend-checks (pull_request) Successful in 1m9s
testing / backend-checks (pull_request) Successful in 3m5s
testing / test-unit (pull_request) Successful in 6m44s
testing / test-remote-cacher (redis) (pull_request) Successful in 2m38s
testing / test-remote-cacher (redict) (pull_request) Successful in 2m37s
testing / test-remote-cacher (garnet) (pull_request) Successful in 2m38s
testing / test-remote-cacher (valkey) (pull_request) Successful in 2m42s
testing / test-mysql (pull_request) Successful in 19m15s
testing / test-sqlite (pull_request) Successful in 24m13s
testing / test-pgsql (pull_request) Successful in 26m49s
testing / security-check (pull_request) Successful in 1m1s
testing / test-e2e (pull_request) Successful in 18m40s
2025-10-12 19:29:22 +02:00
Compare
earl-warren left a comment

Perfect.

Perfect.
@ -17,2 +19,3 @@
batchSize := setting.Database.IterateBufferSize
sess := GetEngine(ctx)
table, err := TableInfo(&dummy)
Contributor

Note to self: I had doubts about TableInfo being sensitive to the kind of database used but it is just about introspecting the struct.

Note to self: I had doubts about TableInfo being sensitive to the kind of database used but it is just about introspecting the struct.
earl-warren marked this conversation as resolved
@ -19,0 +36,4 @@
}
}
if pkStructFieldName == "" {
return fmt.Errorf("iterate unable to identify struct field for PK %s", pkDbName)
Contributor

s/PK/primary key/

s/PK/primary key/
Author
Member

Yes, good catch. 4893d468c1..85177276ed

Yes, good catch. https://codeberg.org/forgejo/forgejo/compare/4893d468c18e4ae3abec411d8e832b5a7d3129f1..85177276edb89061c26734ac13d759ff9533905f
mfenniak marked this conversation as resolved
mfenniak force-pushed fix-iterate from 4893d468c1
All checks were successful
requirements / merge-conditions (pull_request) Successful in 1s
issue-labels / release-notes (pull_request_target) Has been skipped
testing / frontend-checks (pull_request) Successful in 1m9s
testing / backend-checks (pull_request) Successful in 3m5s
testing / test-unit (pull_request) Successful in 6m44s
testing / test-remote-cacher (redis) (pull_request) Successful in 2m38s
testing / test-remote-cacher (redict) (pull_request) Successful in 2m37s
testing / test-remote-cacher (garnet) (pull_request) Successful in 2m38s
testing / test-remote-cacher (valkey) (pull_request) Successful in 2m42s
testing / test-mysql (pull_request) Successful in 19m15s
testing / test-sqlite (pull_request) Successful in 24m13s
testing / test-pgsql (pull_request) Successful in 26m49s
testing / security-check (pull_request) Successful in 1m1s
testing / test-e2e (pull_request) Successful in 18m40s
to 85177276ed
All checks were successful
testing / frontend-checks (pull_request) Successful in 1m5s
testing / backend-checks (pull_request) Successful in 3m15s
testing / test-unit (pull_request) Successful in 6m27s
testing / test-remote-cacher (redis) (pull_request) Successful in 2m24s
testing / test-remote-cacher (valkey) (pull_request) Successful in 2m22s
testing / test-remote-cacher (garnet) (pull_request) Successful in 2m23s
testing / test-remote-cacher (redict) (pull_request) Successful in 2m23s
testing / test-mysql (pull_request) Successful in 18m5s
testing / test-sqlite (pull_request) Successful in 23m30s
testing / test-e2e (pull_request) Successful in 18m54s
testing / test-pgsql (pull_request) Successful in 27m1s
testing / security-check (pull_request) Successful in 2m13s
milestone / set (pull_request_target) Successful in 4s
requirements / merge-conditions (pull_request) Successful in 2s
issue-labels / cascade (pull_request_target) Has been skipped
issue-labels / backporting (pull_request_target) Successful in 29s
issue-labels / release-notes (pull_request_target) Successful in 1m10s
2025-10-12 20:39:40 +02:00
Compare
mfenniak merged commit 79f6f8e508 into forgejo 2025-10-12 21:47:32 +02:00
mfenniak deleted branch fix-iterate 2025-10-12 21:47:38 +02:00
Where does that come from? The following is a preview of the release notes for this pull request, as they will appear in the upcoming release. They are derived from the content of the `release-notes/9657.md` file, if it exists, or the title of the pull request. They were also added at the bottom of the description of this pull request for easier reference.

This message and the release notes originate from a call to the release-notes-assistant.

@@ -37,2 +37,10 @@
 - [x] I want the title to show in the release notes with a link to this pull request.
 - [ ] I want the content of the `release-notes/<pull request number>.md` to be be used for the release notes instead of the title.
+
+<!--start release-notes-assistant-->
+
+## Release notes
+<!--URL:https://codeberg.org/forgejo/forgejo-->
+- Bug fixes
+  - [PR](https://codeberg.org/forgejo/forgejo/pulls/9657): <!--number 9657 --><!--line 0 --><!--description ZGIuSXRlcmF0ZSBjYW4gbWlzcyByZWNvcmRzLCBjYW4gcmV0dXJuIHJlY29yZHMgdHdpY2U=-->db.Iterate can miss records, can return records twice<!--description-->
+<!--end release-notes-assistant-->

Release notes

  • Bug fixes
    • PR: db.Iterate can miss records, can return records twice
<details> <summary>Where does that come from?</summary> The following is a preview of the release notes for this pull request, as they will appear in the upcoming release. They are derived from the content of the `release-notes/9657.md` file, if it exists, or the title of the pull request. They were also added at the bottom of the description of this pull request for easier reference. This message and the release notes originate from a call to the [release-notes-assistant](https://code.forgejo.org/forgejo/release-notes-assistant). ```diff @@ -37,2 +37,10 @@ - [x] I want the title to show in the release notes with a link to this pull request. - [ ] I want the content of the `release-notes/<pull request number>.md` to be be used for the release notes instead of the title. + +<!--start release-notes-assistant--> + +## Release notes +<!--URL:https://codeberg.org/forgejo/forgejo--> +- Bug fixes + - [PR](https://codeberg.org/forgejo/forgejo/pulls/9657): <!--number 9657 --><!--line 0 --><!--description ZGIuSXRlcmF0ZSBjYW4gbWlzcyByZWNvcmRzLCBjYW4gcmV0dXJuIHJlY29yZHMgdHdpY2U=-->db.Iterate can miss records, can return records twice<!--description--> +<!--end release-notes-assistant--> ``` </details> <!--start release-notes-assistant--> ## Release notes <!--URL:https://codeberg.org/forgejo/forgejo--> - Bug fixes - [PR](https://codeberg.org/forgejo/forgejo/pulls/9657): <!--number 9657 --><!--line 0 --><!--description ZGIuSXRlcmF0ZSBjYW4gbWlzcyByZWNvcmRzLCBjYW4gcmV0dXJuIHJlY29yZHMgdHdpY2U=-->db.Iterate can miss records, can return records twice<!--description--> <!--end release-notes-assistant-->
jasewolf referenced this pull request from a commit 2025-11-16 14:02:47 +01:00
Sign in to join this conversation.
No reviewers
No labels
arch
riscv64
backport/v1.19
backport/v1.20
backport/v1.21/forgejo
backport/v10.0/forgejo
backport/v11.0/forgejo
backport/v12.0/forgejo
backport/v13.0/forgejo
backport/v14.0/forgejo
backport/v15.0/forgejo
backport/v7.0/forgejo
backport/v8.0/forgejo
backport/v9.0/forgejo
breaking
bug
bug
confirmed
bug
duplicate
bug
needs-more-info
bug
new-report
bug
reported-upstream
code/actions
code/api
code/auth
code/auth/faidp
code/auth/farp
code/email
code/federation
code/git
code/migrations
code/packages
code/wiki
database
MySQL
database
PostgreSQL
database
SQLite
dependency-upgrade
dependency
Chi
dependency
Chroma
dependency
F3
dependency
ForgeFed
dependency
garage
dependency
Gitea
dependency
Golang
Discussion
duplicate
enhancement/feature
forgejo/accessibility
forgejo/branding
forgejo/ci
forgejo/commit-graph
forgejo/documentation
forgejo/furnace cleanup
forgejo/i18n
forgejo/interop
forgejo/moderation
forgejo/privacy
forgejo/release
forgejo/scaling
forgejo/security
forgejo/ui
Gain
High
Gain
Nice to have
Gain
Undefined
Gain
Very High
good first issue
i18n/backport-stable
impact
large
impact
medium
impact
small
impact
unknown
Incompatible license
issue
closed
issue
do-not-exist-yet
issue
open
manual test
Manually tested during feature freeze
OS
FreeBSD
OS
Linux
OS
macOS
OS
Windows
problem
QA
regression
release blocker
Release Cycle
Feature Freeze
release-blocker
v7.0
release-blocker
v7.0.1
release-blocker
v7.0.2
release-blocker
v7.0.3
release-blocker
v7.0.4
release-blocker
v8.0.0
release-blocker/v9.0.0
run-all-playwright-tests
run-end-to-end-tests
stage
2-research
stage
3-design
stage
4-implementation
test
manual
test
needed
test
needs-help
test
not-needed
test
present
untested
User research - time-tracker
valuable code
worth a release-note
User research - Accessibility
User research - Blocked
User research - Community
User research - Config (instance)
User research - Errors
User research - Filters
User research - Future backlog
User research - Git workflow
User research - Labels
User research - Moderation
User research - Needs input
User research - Notifications/Dashboard
User research - Rendering
User research - Repo creation
User research - Repo units
User research - Security
User research - Settings (in-app)
No milestone
No project
No assignees
3 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
forgejo/forgejo!9657
No description provided.