Conversation

@sliverc
Contributor

@sliverc sliverc commented Nov 18, 2016

The goal is that while a write happens on the API, reading should still be possible, and multiple processes should be able to run simultaneously.

To accomplish this, the following needed to change:

  • the collection factory needs to be created for each routine (no more global access to it)
  • removal of the read and write mutexes
  • making database changes atomic per resource using batches
  • per-resource locking (see the sketch below)
  • a new task API which supports async calls
  • all lengthy processes are async and handled through the task API
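
A minimal sketch of the per-resource locking idea (purely illustrative; the names `ResourcesManager`, `Lock` and `Unlock` are not the actual identifiers used in this PR):

```go
package task

import (
	"sort"
	"sync"
)

// ResourcesManager hands out locks keyed by resource name, e.g.
// "local_repo/myrepo" or "snapshot/mysnap", so that two tasks touching
// different resources never block each other.
type ResourcesManager struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func NewResourcesManager() *ResourcesManager {
	return &ResourcesManager{locks: map[string]*sync.Mutex{}}
}

// Lock acquires the locks for all resources a task declared it needs.
// Sorting gives a deterministic lock order and avoids deadlocks between tasks.
func (r *ResourcesManager) Lock(resources []string) {
	sort.Strings(resources)
	for _, name := range resources {
		r.lockFor(name).Lock()
	}
}

// Unlock releases the locks again once the task has finished.
func (r *ResourcesManager) Unlock(resources []string) {
	for _, name := range resources {
		r.lockFor(name).Unlock()
	}
}

// lockFor lazily creates the mutex for a resource name.
func (r *ResourcesManager) lockFor(name string) *sync.Mutex {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.locks[name]; !ok {
		r.locks[name] = &sync.Mutex{}
	}
	return r.locks[name]
}
```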

Example of how the new task API works:

(1) POST /api/repos/:name/snapshots ...
Snapshot a repository. This returns status code 202 to
indicate that the process has been accepted but is not finished yet.
Additionally, a task id is returned to follow the status of the
running task.

(2) GET /api/tasks/.../output
Get the log of the currently running task

(3) GET /api/tasks/.../wait
Waits until the task is finished
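
For illustration, a rough client-side sketch of that flow (the JSON field name `ID`, the integer task id, and the snapshot request body are assumptions, not taken from the actual implementation; here the client simply waits first and then reads the full output):

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
)

type task struct {
	ID int `json:"ID"` // assumed field name for the returned task id
}

func main() {
	base := "http://localhost:8080"

	// (1) Trigger the snapshot; the server answers 202 plus a task id.
	resp, err := http.Post(base+"/api/repos/myrepo/snapshots",
		"application/json", strings.NewReader(`{"Name": "myrepo-snap"}`))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var t task
	if err := json.NewDecoder(resp.Body).Decode(&t); err != nil {
		panic(err)
	}

	// (2) Block until the task has finished.
	wait, err := http.Get(fmt.Sprintf("%s/api/tasks/%d/wait", base, t.ID))
	if err != nil {
		panic(err)
	}
	wait.Body.Close()

	// (3) Fetch the task's output log.
	out, err := http.Get(fmt.Sprintf("%s/api/tasks/%d/output", base, t.ID))
	if err != nil {
		panic(err)
	}
	defer out.Body.Close()
	log, _ := io.ReadAll(out.Body)
	fmt.Println(string(log))
}
```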

This also solves the problem with mirror API integration, as tasks can run in the background (as mentioned in #166).

For mirror API integration based on this PR, see #573.

As this is a lengthy PR, I will be happy to update the documentation as well once its core concept is agreed upon.

Checklist

  • unit-test added (if change is algorithm)
  • functional test added/updated (if change is functional)
  • man page updated (if applicable)
  • bash completion updated (if applicable)
  • documentation updated
  • author name in AUTHORS

@smira
Contributor

smira commented Mar 16, 2017

I think allowing concurrent DB access is not enough - we need to make sure the package pool and published repositories are safe to be used concurrently (which is not the case today); today the DB lock implicitly protects the package pool and published repos.

@sliverc
Contributor Author

sliverc commented Mar 17, 2017

Yes, this alone is not enough. Our approach to this has actually developed over time. Now we do locking on a per-task basis, and each task can define which resources it requires.

I haven't gotten around to updating the PR yet... I will do so when I find time, but if you want to have a look, see the master branch of our fork:
https://github.com/adfinis-forks/aptly/

@smira
Contributor

smira commented Mar 17, 2017

That sounds interesting, and I think locking at the per-resource level might be the way to go here...

@sliverc sliverc changed the title from "Non Locking API" to "Non locking and async API" May 22, 2017
@sliverc
Contributor Author

sliverc commented May 22, 2017

@smira
I have finally gotten around to updating this PR. I have updated the description to better explain what this PR does.

In general, there are two core concepts:

  1. locking per resource
  2. task api for async job support

Let me know what you think.

@sliverc sliverc mentioned this pull request May 22, 2017
@smira
Contributor

smira commented May 22, 2017

Thanks, I'll schedule it for post-1.1.0, as this looks scary enough for the planned June release (a lot of changes are already queued).

@sliverc
Contributor Author

sliverc commented May 23, 2017

I agree, it is probably better to postpone it and polish it a bit more. What I want to note, though, is that we already use this code in production and it has proved to be stable.
If you have any comments on the implementation, let me know so I can have a look.

@sliverc sliverc force-pushed the rm_api_locking branch 2 times, most recently from c05f09b to e46568c Compare July 7, 2017 07:27
@sliverc
Contributor Author

sliverc commented Aug 10, 2017

@smira I have updated the PR so it should cleanly apply to master again. As version 1.1.0 is now released, it would be great if you could comment on this PR with what you think and how we should move forward to possibly integrate it into the next version. I am happy to adjust the code as needed.

@sliverc sliverc force-pushed the rm_api_locking branch 2 times, most recently from 40dc02b to 4a249e5 Compare October 9, 2017 10:35
@sliverc sliverc mentioned this pull request Nov 20, 2017
@sliverc sliverc force-pushed the rm_api_locking branch 3 times, most recently from 5a34d47 to 9d7ef2c Compare December 1, 2017 08:20
@hsitter
Contributor

hsitter commented Apr 30, 2018

I love the idea.

Considering this changes the return values of existing endpoints, I'd say this needs some compatibility adjustments.

Specifically, everything should get an additional new route /api/v2/...; the existing routes in /api/... internally call the new v2 handlers, but instead of returning the task they wait for it to finish and return the result. This means POST /api/repos/:name/snapshots continues to behave the way it does right now in production, while sending to the new v2 endpoint POST /api/v2/repos/:name/snapshots gets you the task id instead. So, old endpoints stay the same, new v2 endpoints switch to tasks.
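
A rough sketch of how such a compatibility shim could look with gin (which aptly's API already uses); `startSnapshotTask` and `waitForTask` are hypothetical helpers standing in for the real task machinery, not names from this PR:

```go
package api

import (
	"net/http"

	"github.com/gin-gonic/gin"
)

// startSnapshotTask stands in for whatever the v2 handler does to enqueue the
// snapshot as a background task and obtain its id (hypothetical helper).
func startSnapshotTask(c *gin.Context) int { return 0 }

// waitForTask stands in for blocking until the task has finished (hypothetical helper).
func waitForTask(id int) (interface{}, error) { return nil, nil }

// New v2 endpoint: return the task id immediately with 202 Accepted.
func apiSnapshotRepoV2(c *gin.Context) {
	c.JSON(http.StatusAccepted, gin.H{"TaskID": startSnapshotTask(c)})
}

// Legacy endpoint: same work internally, but wait for the task and return the
// final result, so existing clients see no behaviour change.
func apiSnapshotRepoLegacy(c *gin.Context) {
	id := startSnapshotTask(c)
	result, err := waitForTask(id)
	if err != nil {
		c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
		return
	}
	c.JSON(http.StatusCreated, result)
}
```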

Food for thought: I am wondering if a websocket isn't the more "modern" and flexible approach to this, i.e. instead of getting a task id the client gets a websocket URI, then the client connects to the socket to get task updates from it. Calling /wait with a super long HTTP GET timeout seems somewhat underwhelming. Also, with a socket it would probably be very easy to implement "streaming" of the task's output log on the client side, whereas right now you either have to GET /output numerous times while the task is running (getting mostly the same data over and over again), or have no output until /wait returns and then get it all at once.
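
Purely as an illustration of that idea, a client-side sketch of streaming task output over a websocket using gorilla/websocket; the /api/tasks/42/ws endpoint is entirely hypothetical:

```go
package main

import (
	"fmt"
	"log"

	"github.com/gorilla/websocket"
)

func main() {
	// Hypothetical endpoint; nothing like this exists in aptly today.
	conn, _, err := websocket.DefaultDialer.Dial("ws://localhost:8080/api/tasks/42/ws", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Each message is assumed to be a chunk of the task's output log;
	// the server would close the connection when the task is done.
	for {
		_, msg, err := conn.ReadMessage()
		if err != nil {
			break
		}
		fmt.Print(string(msg))
	}
}
```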

Taking it a step further, I could imagine this enabling us to put the CLI on top of the API, so you could actually operate a remote aptly instance via your local CLI! Well, one can dream 😸

@sliverc
Contributor Author

sliverc commented Apr 30, 2018

@apachelogger I totally agree that we should make this change in a backwards-compatible fashion, and the way you outlined it should be straightforward.

I love the idea of web sockets... I think, though, that we should make using web sockets on the API only an additional feature and not a requirement by default, so that simple REST client scripts are still possible.

This way we can also get the async API merged and add web sockets in a future release (which would then also let the CLI use the API).

@codecov

codecov bot commented May 17, 2018

Codecov Report

Merging #459 into master will increase coverage by 0.3%.
The diff coverage is 65.66%.


@@            Coverage Diff            @@
##           master     #459     +/-   ##
=========================================
+ Coverage   60.64%   60.94%   +0.3%     
=========================================
  Files          50       54      +4     
  Lines        6276     6422    +146     
=========================================
+ Hits         3806     3914    +108     
- Misses       2033     2081     +48     
+ Partials      437      427     -10
Impacted Files Coverage Δ
deb/changes.go 59.91% <ø> (+0.17%) ⬆️
context/context.go 11.72% <0%> (-0.4%) ⬇️
deb/publish.go 64.45% <100%> (+0.78%) ⬆️
deb/package_collection.go 54.83% <100%> (+3.32%) ⬆️
deb/local.go 92.78% <100%> (+5.65%) ⬆️
task/task.go 100% <100%> (ø)
database/leveldb.go 79.24% <100%> (-2.12%) ⬇️
task/resources.go 28.57% <28.57%> (ø)
task/output.go 51.85% <51.85%> (ø)
task/list.go 75.75% <75.75%> (ø)
... and 7 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9a704de...52d4336.

@sliverc sliverc mentioned this pull request May 22, 2018
@sliverc sliverc modified the milestone: 1.4.0 Jun 20, 2018
Oliver Sauder added 7 commits July 6, 2018 15:06
@smira
Contributor

smira commented Jul 29, 2019

I wonder if it could be HTTP/2 Push events in the end.

I want to look into this more, but I don't see why exactly we can't make the async return optional, so that both sync and async options are available. In the end, in a local network environment there's nothing unusual about setting the HTTP response timeout to, say, 10 minutes and waiting for API completion in one call. Long polling and async responses make more sense when the API is not local and we can't control the timeouts on the way to the aptly service endpoint.
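
One way that optional async behaviour could look is a per-request flag; the sketch below uses a hypothetical `_async` query parameter and reuses the hypothetical `startSnapshotTask`/`waitForTask` helpers from the earlier gin sketch, so none of the names are taken from this PR:

```go
// Hypothetical sketch: one handler that serves both modes, depending on
// whether the client asked for async behaviour.
func apiSnapshotRepo(c *gin.Context) {
	id := startSnapshotTask(c) // hypothetical helper, as in the earlier sketch

	// Async mode: hand back the task id right away.
	if c.Query("_async") == "true" { // parameter name is an assumption
		c.JSON(http.StatusAccepted, gin.H{"TaskID": id})
		return
	}

	// Sync mode (default): behave like today and wait for the result,
	// relying on the client/proxy timeouts being long enough.
	result, err := waitForTask(id) // hypothetical helper, as in the earlier sketch
	if err != nil {
		c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
		return
	}
	c.JSON(http.StatusCreated, result)
}
```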

@smira
Contributor

smira commented Jul 29, 2019

I would like to try to see if it's possible to split this into smaller PRs and commit them one by one, with a thorough review for each one.

smira added a commit to smira/aptly-fork that referenced this pull request Aug 1, 2019
This is spin-off of changes from aptly-dev#459.

Transactions are not being used yet, but batches are updated to work
with the new API.

`database/` package was refactored to split abstract interfaces and
implementation via goleveldb. This should make it easier to implement
new database types.
smira added a commit to smira/aptly-fork that referenced this pull request Aug 1, 2019
This is spin-off of changes from aptly-dev#459.

Transactions are not being used yet, but batches are updated to work
with the new API.

`database/` package was refactored to split abstract interfaces and
implementation via goleveldb. This should make it easier to implement
new database types.
smira added a commit to smira/aptly-fork that referenced this pull request Aug 7, 2019
This is spin-off of changes from aptly-dev#459.

Transactions are not being used yet, but batches are updated to work
with the new API.

`database/` package was refactored to split abstract interfaces and
implementation via goleveldb. This should make it easier to implement
new database types.
smira added a commit that referenced this pull request Aug 8, 2019
This is spin-off of changes from #459.

Transactions are not being used yet, but batches are updated to work
with the new API.

`database/` package was refactored to split abstract interfaces and
implementation via goleveldb. This should make it easier to implement
new database types.
smira added a commit to smira/aptly-fork that referenced this pull request Aug 9, 2019
For any action which is multi-step (requires updating more than 1 DB
key), use transaction to make update atomic.

Also pack big chunks of updates (importing packages for importing and
mirror updates) into single transaction to improve aptly performance and
get some isolation.

Note that still layers up (Collections) provide some level of isolation,
so this is going to shine with the future PRs to remove collection
locks.

Spin-off of aptly-dev#459
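
As an illustration of the approach described in that commit, a minimal sketch of making a multi-key update atomic with a goleveldb transaction; the key names and values are made up:

```go
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
)

func main() {
	db, err := leveldb.OpenFile("/tmp/aptly-example.db", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Open a transaction so either all keys are updated or none are.
	tr, err := db.OpenTransaction()
	if err != nil {
		log.Fatal(err)
	}

	// Two related keys that must stay consistent with each other.
	if err := tr.Put([]byte("repo/myrepo"), []byte("metadata"), nil); err != nil {
		tr.Discard()
		log.Fatal(err)
	}
	if err := tr.Put([]byte("repo/myrepo/packages"), []byte("package refs"), nil); err != nil {
		tr.Discard()
		log.Fatal(err)
	}

	// Commit makes both writes visible atomically.
	if err := tr.Commit(); err != nil {
		log.Fatal(err)
	}
}
```
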
smira added a commit that referenced this pull request Aug 10, 2019
For any action which is multi-step (requires updating more than 1 DB
key), use transaction to make update atomic.

Also pack big chunks of updates (importing packages for importing and
mirror updates) into single transaction to improve aptly performance and
get some isolation.

Note that still layers up (Collections) provide some level of isolation,
so this is going to shine with the future PRs to remove collection
locks.

Spin-off of #459
smira added a commit to smira/aptly-fork that referenced this pull request Sep 3, 2019
Part of PR aptly-dev#459

This prepares for more methods to be exposed via the API.
sliverc pushed a commit that referenced this pull request Sep 3, 2019
Part of PR #459

This prepares for more methods to be exposed via the API.
@lbolla lbolla closed this Jan 28, 2022