Skip to content

Conversation

@cbuto
Copy link
Contributor

@cbuto cbuto commented Mar 4, 2022

Closes #546.

This should resolve the data races seen when running the test suite with the -race flag.

I took some of the work/feedback from #549 (thanks @nerdeveloper!) and added some additional changes to ensure we are locking when access the index file and cache entries.

I still would like to spend more time on the performance aspect to see if adjusting the locks could help.

Overall this change as is has a slight negative effect on median response time and a extremely positive effect the p95 response time. I assume the p95 response without these changes is fairly high due to data races between requests.


Here are the results from the very small load tests I ran locally.

  1. Start chartmuseum locally
rm -rf /tmp/chartmuseum-loadtest-*  
make build-mac && ./bin/darwin/amd64/chartmuseum --storage local --storage-local-rootdir /tmp
  1. Run locust with 10 users/10 spawn rate

Here are the test run branches:

Test 8 = main
Test 9 = fix/cache-race
Test 10 = main
Test 11 = fix/cache-race

Screen Shot 2022-03-04 at 6 12 57 PM

@cbuto cbuto self-assigned this Mar 4, 2022
@cbuto cbuto marked this pull request as ready for review March 6, 2022 10:52
@cbuto cbuto requested review from jdolitsky and scbizu March 6, 2022 10:52
@cbuto cbuto added this to the v0.15.0 milestone Mar 6, 2022
@scbizu
Copy link
Contributor

scbizu commented Mar 7, 2022

wow , looks good .

I will spend some more time to read over this PR 👀

However , it will conflict with #549 ?

@cbuto
Copy link
Contributor Author

cbuto commented Mar 7, 2022

wow , looks good .

I will spend some more time to read over this PR 👀

However , it will conflict with #549 ?

Thanks! No rush at all. But yeah builds off of #549 and the initial work @nerdeveloper did.

@scbizu
Copy link
Contributor

scbizu commented Mar 7, 2022

I read some of this code , and have a question: what we do is actually make the rough-grain lock (RegenerationLock) to be the fine-grain lock (RepoLock & IndexLock) ? But do we need to keep these rough-grain lock after we did this ?

And I see the response time should be reduced literally , and this optimization also related to the #396 .

@cbuto
Copy link
Contributor Author

cbuto commented Mar 8, 2022

I read some of this code , and have a question: what we do is actually make the rough-grain lock (RegenerationLock) to be the fine-grain lock (RepoLock & IndexLock) ? But do we need to keep these rough-grain lock after we did this ?

And I see the response time should be reduced literally , and this optimization also related to the #396 .

good call! I think you are right, I removed the RegenerationLock in 48b75c7 and it looks good. Some quick load test results showed no change but i don't think that lock was necessary anymore!

@scbizu
Copy link
Contributor

scbizu commented Mar 8, 2022

Looks much more comfortable now , I will review these all changes and make some tests on my mac this weekend .

Copy link
Contributor

@scbizu scbizu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here should be really careful for both writing and reviewing XD

@scbizu
Copy link
Contributor

scbizu commented Mar 11, 2022

@jdolitsky hihi , Could you also join the reviewing , I think it is an important fix or refactor PR .

Copy link
Contributor

@scbizu scbizu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cbuto cbuto requested a review from nerdeveloper March 12, 2022 18:04
@nerdeveloper
Copy link
Member

Awesome work there @cbuto

@cbuto
Copy link
Contributor Author

cbuto commented Apr 7, 2022

after rebasing with #551, some new data races showed up while running the tests (it seems to be flakey)

Copy link
Contributor

@jdolitsky jdolitsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fantastic. thanks for addressing this (apologies for delay in review!)

cbuto added 5 commits April 8, 2022 15:05
Signed-off-by: Casey Buto <cbuto@d2iq.com>
Signed-off-by: Casey Buto <cbuto@d2iq.com>
Signed-off-by: Casey Buto <cbuto@d2iq.com>
Signed-off-by: Casey Buto <cbuto@d2iq.com>
Signed-off-by: Casey Buto <cbuto@d2iq.com>
@cbuto
Copy link
Contributor Author

cbuto commented Apr 8, 2022

latest race fixed in 3946435, I ran the tests quite a few times just to be sure and it looks good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix race conditions and add -race flag to test.sh

4 participants