model/query/submissions: avoid RETURNING * where possible by alxndrsn · Pull Request #1544 · getodk/central-backend

alxndrsn · 2025-07-11T07:33:23Z

INSERT...RETURNING * is a very convenient SQL structure, but it's often used to return values which are already available to the calling code. When this happens, there is unnecessary overhead to database communications.

There is risk to introducing this change, as:

1. new columns may be added to an INSERT/UPDATE statement without noticing that they might also be excluded from the generated list of RETURNING columns
2. a BEFORE INSERT trigger might change supplied values when returned
3. a supplied value might be coerced into a different type when returned
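
As a rough illustration of the idea (not the code in this PR), the caller can keep the values it already holds and only take database-generated values from the `RETURNING` row. The helper name `mergeReturned` and the sample values are hypothetical:

```javascript
// Hypothetical helper: combine the values the caller supplied with the
// database-generated values returned by e.g. `returning id, "createdAt"`.
// Returned values win if a key appears in both objects.
const mergeReturned = (supplied, returned) => ({ ...supplied, ...returned });

// Values the calling code already has in memory (sample data).
const supplied = { submissionId: 42, instanceId: 'uuid:abc', current: true };

// Stand-in for the row the database would send back (sample data).
const returned = { id: 7, createdAt: new Date('2025-07-11T07:33:23Z') };

const def = mergeReturned(supplied, returned);
console.log(def.id, def.submissionId, def.createdAt.toISOString());
```

Note that a merge like this only sees what the caller sent: if a trigger rewrites a supplied value, or the database coerces it to another type, the merged object will not reflect that unless the column is also listed in `RETURNING`.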

Related: getodk/central#1170

What has been done to verify that this works as intended?

Why is this the best possible solution? Were any other approaches considered?

How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?

Does this change require updates to the API documentation? If so, please update docs/api.yaml as part of this PR.

Before submitting this PR, please make sure you have:

run make test and confirmed all checks still pass OR confirm CircleCI build passes
verified that any code from external sources are properly credited in comments or that everything is internally sourced


sadiqkhoja · 2025-07-30T02:01:18Z

lib/model/query/submissions.js

 const _defInsert = (id, partial, formDefId, actorId, root, deviceId, userAgent) => sql`insert into submission_defs ("submissionId", xml, "formDefId", "instanceId", "instanceName", "submitterId", "localKey", "encDataAttachmentName", "signature", "createdAt", root, current, "deviceId", "userAgent")
  values (${id}, ${sql.binary(partial.xml)}, ${formDefId}, ${partial.instanceId}, ${partial.def.instanceName}, ${actorId}, ${partial.def.localKey}, ${partial.def.encDataAttachmentName}, ${partial.def.signature}, clock_timestamp(), ${root}, true, ${deviceId}, ${userAgent})
-  returning *`;
+  returning ${_defInsertReturnFields}`;


how about we just return submissionId and clock_timestamp - only values that are generated by database.

Generally database only generates surrogate keys using pg_get_serial_sequence and timestamp through the application

alxndrsn · 2025-07-30

> how about we just return submissionId and clock_timestamp

That's what's happening, just in a less direct way.
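
A minimal sketch of the two approaches being compared, deriving the return list by exclusion versus declaring it directly. The definition of `_defInsertReturnFields` is not shown in this thread, so everything below is an assumption: the column names are taken from the insert statement above plus `id`, the real table may have more columns, and `quoteIdent` and `returningClause` are hypothetical helpers.

```javascript
// Assumed column list for submission_defs: `id` plus the columns named in
// the insert statement. The real table may differ.
const allColumns = [
  'id', 'submissionId', 'xml', 'formDefId', 'instanceId', 'instanceName',
  'submitterId', 'localKey', 'encDataAttachmentName', 'signature',
  'createdAt', 'root', 'current', 'deviceId', 'userAgent',
];

// Columns whose values the calling code already knows: everything it passed
// in, plus `current`, which the statement hard-codes to true.
const knownToCaller = [
  'submissionId', 'xml', 'formDefId', 'instanceId', 'instanceName',
  'submitterId', 'localKey', 'encDataAttachmentName', 'signature',
  'root', 'current', 'deviceId', 'userAgent',
];

// Approach 1: return everything the caller does NOT already have.
const byExclusion = allColumns.filter((c) => !knownToCaller.includes(c));

// Approach 2: name the database-generated columns directly.
const byInclusion = ['id', 'createdAt'];

// Quote identifiers so mixed-case names such as "createdAt" survive.
const quoteIdent = (c) => `"${c.replace(/"/g, '""')}"`;
const returningClause = (cols) => `returning ${cols.map(quoteIdent).join(', ')}`;

console.log(returningClause(byExclusion));
console.log(returningClause(byInclusion));
```

Under these assumptions both approaches produce the same clause. They differ in what happens when the table gains a column: the exclusion list returns it automatically, while the inclusion list ignores it until someone adds it.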

Do you think the code would be simpler if we declared fields we do want returned rather than fields we don't? It would certainly make this code shorter. Maybe a neater approach would be to mark fields as autogenerated when declaring Frames? E.g.

https://github.com/getodk/central-backend/blob/master/lib/model/frames/submission.js#L89-L100

Could become:

table('submission_defs', 'def'), 'id', autogenerated, 'submissionId', 'formDefId', 'submitterId', readable, 'localKey', 'encDataAttachmentName', 'signature', 'createdAt', readable, autogenerated,
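
One way such a marker could be read, as a self-contained sketch rather than the actual Frame implementation: marker symbols apply to the field name that precedes them, and the list of fields to return is derived from the `autogenerated` marker. `parseFields` and `autogeneratedFields` are hypothetical names, and the `readable` symbol here is a local stand-in.

```javascript
// Local stand-ins for the marker symbols; `autogenerated` is the proposed one.
const readable = Symbol('readable');
const autogenerated = Symbol('autogenerated');

// Parse a flat definition list: each string starts a new field, and any
// symbols that follow it are recorded as markers on that field.
const parseFields = (...parts) => {
  const fields = [];
  for (const part of parts) {
    if (typeof part === 'string') {
      fields.push({ name: part, markers: new Set() });
    } else if (fields.length === 0) {
      throw new Error('marker given before any field name');
    } else {
      fields[fields.length - 1].markers.add(part);
    }
  }
  return fields;
};

// Names of the fields the database fills in, i.e. the ones worth returning.
const autogeneratedFields = (fields) =>
  fields.filter((f) => f.markers.has(autogenerated)).map((f) => f.name);

const defFields = parseFields(
  'id', autogenerated,
  'submissionId', 'formDefId', 'submitterId', readable,
  'localKey', 'encDataAttachmentName', 'signature',
  'createdAt', readable, autogenerated,
);

console.log(autogeneratedFields(defFields));
```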

sadiqkhoja · 2025-07-31

I love the idea of an autogenerated attribute in the Frame.


brontolosone · 2025-07-30T11:34:41Z

> When this happens, there is unnecessary overhead to database communications.

That's true, but I was hoping to find it measurably true as well, so I set out to measure a performance difference on the submission path.
But for small submissions (1.7 KB) I can't detect a difference, and neither can I for larger submissions (215 KB).
In that light, is it still worth the risk to change this pattern? Or do we need better perf testing data than what comes rolling out of my ad-hoc perf testing strategy? Do we need to see a tangible improvement or would we also merge this purely for efficiency-aesthetics?

alxndrsn · 2025-07-30T13:53:23Z

> Do we need to see a tangible improvement or would we also merge this purely for efficiency-aesthetics?

I'd rather there's a demonstrable benefit, as it complicates the code. Maybe we should discuss scale.

sadiqkhoja · 2025-07-31T19:11:37Z

It should be demonstrable quite easily if application and database are running on different machines as it is for all our cloud servers.

yanokwa · 2025-08-12T13:09:58Z

@brontolosone Can you test on dev to see if there was a demonstrable benefit?

brontolosone · 2025-09-02T05:14:53Z

On the benchmarks I've shown you a couple of weeks ago in the pure-sql submission insertion path discussion, this branch showed reduced peak memory usage (roughly halved) inside PostgreSQL compared to mainline, but no throughput speedups. So there's a demonstrable benefit albeit in a benchmark which wasn't particularly designed to measure the impact of this exact thing, and with an N of only 2. We could wait for more & better benchmarks but... the benchmark results make logical sense (smaller resultsets to be kept in buffers in PostgreSQL -> lower memory usage) and it's generally a good idea not to do useless work and it's not a very invasive change, so... I'd be happy to approve it? But it's drafted currently.

yanokwa · 2025-09-02T16:26:26Z

@alxndrsn Let's get this ready for review and merge. Less memory usage is a good win.

alxndrsn · 2025-09-03T10:19:33Z

> @alxndrsn Let's get this ready for review and merge. Less memory usage is a good win.

👍 I think ideally there would be a less-manual notation required for this - see #1544 (comment)

alxndrsn · 2025-09-05T17:55:03Z

Closed for #1606; re-adding reviewers there.

alxndrsn and others added 7 commits July 11, 2025 07:28

Merge branch 'master' into retunring-no-stars

b0d78a1

Merge branch 'master' into retunring-no-stars

7d3d8ad

manual unjoiner

a5ad682

pass thru xml

76c5976

repass xml

2b020d5

fix

69a548c

alxndrsn changed the title ~~model/query: avoid RETURNING * where possible~~ model/query/submissions: avoid RETURNING * where possible Jul 29, 2025

sadiqkhoja reviewed Jul 30, 2025


Merge branch 'master' into retunring-no-stars

c6eccb6

brontolosone self-requested a review September 2, 2025 05:13

alxndrsn mentioned this pull request Sep 5, 2025

Submissions.newVersion(): reduce repeated data #1606

Merged

2 tasks

alxndrsn closed this Sep 5, 2025

alxndrsn wants to merge 8 commits into getodk:master from alxndrsn:retunring-no-stars
