-
Notifications
You must be signed in to change notification settings - Fork 189
pkg/aead: add explicit data race and locking to resolve #77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
katzdm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a unit test we could construct that would repro the data race when run with go test -race (and fail against HEAD), which we could then verify is fixed with this patch?
internal/pkg/aead/aead.go
Outdated
|
|
||
| // Decrypt a value using AES-CMAC-SIV | ||
| func (c *MiscreantCipher) Decrypt(joined []byte) ([]byte, error) { | ||
| aead, err := miscreant.NewAEAD(algorithmType, c.secret, miscreantNonceSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth pointing out that this seems to be a somewhat nontrivial initializer, which performs some block cipher encryption to derive its initial state, on each call.
https://github.com/miscreant/miscreant-go/blob/master/cmac/cmac.go#L45
My assumption is that sso_proxy is not particularly compute-bounded, but it could be interesting to throw this up in stage, and using Datadog to verify that we don't see an undue latency cost from these initializations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose instead of instantiating new objects we could just lock around a single cipher?
I wonder which would be more performant.
|
@katzdm Do you mean something like https://circleci.com/gh/buzzfeed/sso/463 which shows the data race? |
| } | ||
|
|
||
| // NewMiscreantCipher returns a new AES Cipher for encrypting values | ||
| func NewMiscreantCipher(secret []byte) (*MiscreantCipher, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a somewhat minor point, but it feels surprising for this type to be named MiscreantCipher, if the object in question (the Miscreant cipher) is only instantiated at Encrypt/Decrypt-time. Maybe it would be better if we keep the cipher.AEAD object owned by the MiscreantCipher class, and instead initialize a new MiscreantCipher during Marshal/Unmarsal. Might make it easier to reason about the types, in the future.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with changing the type name, but I think whatever we want to expose, we want to be go-routine safe. I would argue that this should include making both Encrypt and Decrypt go-routine safe which currently necessitates the instantiation of new objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think just creating new objects in the Unmarshal and Marshal such that those are the only two methods that would be go-routine safe isn't sufficient for what we want to expose from this API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW the MiscreantCipher is an implementation of the Cipher interface (which all might be poorly named) and has Unmarshal and Marshal as well as Encrypt and Decrypt functions. This probably could be structured differently but we'd just need to reflect those changes in the MockCipher
502ca31 to
48d267d
Compare
| } | ||
| } | ||
|
|
||
| func TestCipherDataRace(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider annotating with a comment pointing to a GitHub issue describing the problem. Six months from now, we'll all forget why we added this test :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
internal/pkg/aead/aead_test.go
Outdated
| // TestCipherDataRace exercises a simple concurrency test for the MicscreantCipher. | ||
| // In https://github.com/buzzfeed/sso/pull/75 we investigated why, on random occasion, | ||
| // unmarshalling session states would fail, triggering users to get kicked out of | ||
| // authenticated states. We narrowed our suspicious to a data race that we uncovered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/suspicious/suspicions
Also, I don't think we should describe the miscreant library as "having a data race"; the library just doesn't make any attempt at thread-safety. It might be more accurate to phrase the situation as something like: "We eventually uncovered that sso implicitly assumed the underlying miscreant APIs to be thread-safe; in truth, they are only thread-compatible and require external synchronization."
bbc1525 to
b1beac2
Compare
|
this LGTM, lets squash + rebase and merge? |
b1beac2 to
63af567
Compare
|
@shrayolacrayon squashed into two commits + rebased. |
This adds an explicit concurrency test case for aead/miscreant to test for potential data races in the underlying cipher implementation.