OpenRosa attachment update auditing#1646
Conversation
- Make inserting a blob which already exists a true no-op
(saves DB churn; the sha update of the replaced approach caused a row copy)
- Remove comment related to force-a-write-so-we-can-return-the-id
(as we do it differently now)
- Remove comment about avoiding a TOCTOU issue by sending the file over
(potentially unnecessarily).
1. There was almost a TOCTOU issue anyway, but it was elided by
touching every used row, which makes concurrent deleters/updaters
of that row block until our transaction commit.
2. Avoiding sending the file over *and* avoiding a TOCTOU should be
doable with a row lock or advisory lock.
…sequent POSTs (continuations of attachment uploads). Fixes getodk/central#1426
b9cf49b to
b240423
Compare
|
Just checking in about this PR with an eye toward regression testing coming up. @ktuite, do you think you'll have time to review this? No worries if not, we could also find a different reviewer. |
| ) | ||
| )); | ||
| }); | ||
|
|
There was a problem hiding this comment.
This test currently fails because it has 3 events rather than 2, and in the extra event, blobNow is null.
it('should handle resubmitting partial submission with all attachments', testService((service) =>
service.login('alice', (asAlice) =>
asAlice.post('/v1/projects/1/forms?publish=true')
.set('Content-Type', 'application/xml')
.send(testData.forms.binaryType)
.expect(200)
.then(() => asAlice.post('/v1/projects/1/submission')
.set('X-OpenRosa-Version', '1.0')
.attach('xml_submission_file', Buffer.from(testData.instances.binaryType.both), { filename: 'data.xml' })
.attach('my_file1.mp4', Buffer.from('this is test file one'), { filename: 'my_file1.mp4' })
.expect(201)
.then(() => asAlice.get('/v1/audits?action=submission.attachment.update')
.expect(200))
.then(({ body }) => {
body[0].details.name.should.equal('my_file1.mp4');
})
)
.then(() => asAlice.post('/v1/projects/1/submission')
.set('X-OpenRosa-Version', '1.0')
.attach('xml_submission_file', Buffer.from(testData.instances.binaryType.both), { filename: 'data.xml' })
.attach('my_file1.mp4', Buffer.from('this is test file one'), { filename: 'my_file1.mp4' })
.attach('here_is_file2.jpg', Buffer.from('this is test file two'), { filename: 'here_is_file2.jpg' })
.expect(201)
.then(() => asAlice.get('/v1/audits?action=submission.attachment.update')
.expect(200))
.then(({ body }) => {
body.length.should.equal(2);
body[0].details.name.should.equal('here_is_file2.jpg');
body[1].details.name.should.equal('my_file1.mp4');
})
)
)
));
After poking around, was inspired to write another test:
it.only('should handle resubmitting partial submission with contents of attachment changed', testService((service) =>
service.login('alice', (asAlice) =>
asAlice.post('/v1/projects/1/forms?publish=true')
.set('Content-Type', 'application/xml')
.send(testData.forms.binaryType)
.expect(200)
.then(() => asAlice.post('/v1/projects/1/submission')
.set('X-OpenRosa-Version', '1.0')
.attach('xml_submission_file', Buffer.from(testData.instances.binaryType.both), { filename: 'data.xml' })
.attach('my_file1.mp4', Buffer.from('this is test file one'), { filename: 'my_file1.mp4' })
.expect(201)
.then(() => asAlice.get('/v1/audits?action=submission.attachment.update')
.expect(200))
.then(({ body }) => {
body[0].details.name.should.equal('my_file1.mp4');
})
)
.then(() => asAlice.post('/v1/projects/1/submission')
.set('X-OpenRosa-Version', '1.0')
.attach('xml_submission_file', Buffer.from(testData.instances.binaryType.both), { filename: 'data.xml' })
.attach('my_file1.mp4', Buffer.from('file one contents have changed'), { filename: 'my_file1.mp4' })
.attach('here_is_file2.jpg', Buffer.from('this is test file two'), { filename: 'here_is_file2.jpg' })
.expect(201)
.then(() => asAlice.get('/v1/audits?action=submission.attachment.update')
.expect(200))
.then(({ body }) => {
body.length.should.equal(3);
body[0].details.name.should.equal('here_is_file2.jpg');
body[1].details.name.should.equal('my_file1.mp4');
body[2].details.name.should.equal('my_file1.mp4');
body[1].details.newBlobId.should.not.equal(body[2].details.newBlobId);
})
)
)
));
| ).then(linkResults => | ||
| // auditlog the updates (only those which were added or of which the content changed) | ||
| linkResults | ||
| .filter(el => el.blobBefore !== el.blobNow) |
There was a problem hiding this comment.
If you keep the query in attach() the same, then blobs that you tried to attach but were already set will have a null blobNow, which you could then skip in the logging. Or you could tweak the query.
| .filter(el => el.blobBefore !== el.blobNow) | |
| .filter(el => el.blobNow && el.blobBefore !== el.blobNow) |
|
Very nice, thank you for taking a look! ✨
I also think it's always been that way, but it's definitely a bit funny. There's a related issue filed at getodk/central#706. |


Closes getodk/central#1436
In OpenRosa one can make a submission with 0 or more of its attachments attached in a single POST. In subsequent POSTs one can add the remaining attachments.
In contrast to what the API documentation states, the attachment additions borne from the first submission were actually auditlogged. But, subsequent POSTs' attachment updates were not auditlogged.
This PR makes it so that subsequent OR attachment set additions are also logged.
Additionally, it avoids a bit of needless DB churn that was taking place when binary content was already present.
What has been done to verify that this works as intended?
Tests. The first commit is a test that fails, the later commits prevent the failure.
Why is this the best possible solution? Were any other approaches considered?
The least labour intensive alternative would be to just correct the docs to reflect the current behaviour, rather than amending the behaviour. But it just so happens that I need to have attachment update audit events from OpenRosa subsequent-submissions, and it's generally a good thing that those are logged just the same as attachment additions taking place through the REST API.
How does this change affect users? Describe intentional changes to behavior and behavior that could have accidentally been affected by code changes. In other words, what are the regression risks?
I think everyone will be happy.
Does this change require updates to the API documentation? If so, please update docs/api.yaml as part of this PR.
Done.
Before submitting this PR, please make sure you have:
make testand confirmed all checks still pass OR confirm CircleCI build passes