ADO: Implement user permissions sync#48209
Conversation
95bf1c9 to
d7061d1
Compare
The Azure Dev Ops API expects some custom key-value pairs in the POST body for refreshing the token. This commit adds support for custom key-value pairs so that the azuredevops authz provider can pass them in the token refresher func.
This will be needed to create a new azuredevops authz provider from the site config. Just a new type has been added here. The other changes are alphabetical sorting of the other providers for future readers.
dc20e83 to
b892b26
Compare
In this commit we add some helpers and modify the client to support refreshing expired / close to expiry on the fly as part of the request.Do call path. We also rename azureDevOpsServicesURL to AZURE_DEV_OPS_API_URL because we will need it outside this package. Also we export HTTPError and modify it's Error() method to drop the body from the response because the body is raw HTML which pollutes the logs and error strings. Finally, we also add a helper GetRedirectURL that will be used in both ADO auth and authz providers.
When creating an external account we want to use AZURE_DEV_OPS_API_URL as the ServiceID when storing an entry in the user_external_accounts table because this is the URL used in the external service. We're not able to use s.ServiceID in the assignment because it would incorrectly assign the value of VISUAL_STUDIO_APP_URL to the service ID. The result of this would then be: During the permission sync, the ADO authz provider created uses AZURE_DEV_OPS_API_URL as the service ID. But the entry in the user_external_accounts_table would have the value of VISUAL_STUDIO_APP_URL as the ServiceID. Leading to a mismatch and the permissions syncing code falsely thinking that no external account exists for this authz provider. So why even use the VISUAL_STUDIO_APP_URL as the service ID in the auth provider? This is because that's the URL that is used for all oauth2 requests. While the API for product features like repos, orgs, user profiles is pointed to by the value of AZURE_DEV_OPS_API_URL leading to a deviation in expectations.
Instead of continuing to use the `authorization` pattern similar to other code hosts, it is simpler to add a boolean which has an explicit name and describes what it does better. The end result: we start to break away from the `authorization` pattern (some other code hosts can do that as well, maybe this is the start of it all) with an additional check for this new field in the write path of external_services.Create, where we check to ensure the correct value of `unrestricted` is written to the database. For existing auth providers, this is forwards compatible. While at the same time if they also wanted to get on the `enforcePermissions` way of life in the future, the change to `recalculateFields` will let them be backwards compatible. Also a small tweak to the doc string to make things less confusing.
90610ff to
7ff128d
Compare
This commit implements the permissions syncing for azuredevops code host connections.
7ff128d to
b0773d0
Compare
sashaostrikov
left a comment
There was a problem hiding this comment.
Great work! Left a bunch of comments, but nothing is critical, mostly related to how the code is written and not about the correctness of implementation :)
|
|
||
| l.Debug("skipping org", | ||
| log.String("org", org), | ||
| log.Int("http status code", httpErr.StatusCode), | ||
| ) | ||
|
|
||
| continue |
There was a problem hiding this comment.
I'd make this log a WARN (and change the message a bit), because this errors if not surfaced, should be properly logged at least.
Also, removing unnecessary empty lines
| l.Debug("skipping org", | |
| log.String("org", org), | |
| log.Int("http status code", httpErr.StatusCode), | |
| ) | |
| continue | |
| l.Warn("failed to list organization repositories", | |
| log.String("org", org), | |
| log.Int("http status code", httpErr.StatusCode), | |
| ) | |
| continue |
There was a problem hiding this comment.
same comment applies to below projects piece of logic
There was a problem hiding this comment.
Hmm. I wouldn't log it as a warn since that will show up on more or less every permissions sync - I'm guessing not all users have access to all orgs / projects. I'm keeping it as a DEBUG for times if we have to debug a permissions issue and we want to verify that this org or project is explicitly being skipped because of a 4xx error - that is the user does not have access to it.
There was a problem hiding this comment.
okay, in this case I would make the log message a bit more elaborate. seeing "skipping org" in logs would make me confused :)
There was a problem hiding this comment.
Good idea. I changed it to: user does not have access to this org. What do you think?
Also reduce code duplication.
| "enforcePermissions": { | ||
| "description": "A flag to enforce Azure DevOps repository access permissions", | ||
| "type": "boolean", | ||
| "default": false | ||
| }, |
There was a problem hiding this comment.
Please, add docs for this once you start making docs changes. This is a very big step forward, but also completely different to what we configure for other code hosts to enable permissions.
There was a problem hiding this comment.
Yes absolutely. I am going to add a doc on configuring Azure OAuth provider and will cover this in there. Is that what you were also suggesting?
There was a problem hiding this comment.
So this replaces the authorization: {} thing from the other code host connections, yes?
It kinda makes me nervous that we do this only for one new provider now. Somehow I'd rather have us introduce to every provider at the same time and keep backwards compatibility. But I guess we can still do that.
There was a problem hiding this comment.
I agree why it could make anyone nervous. When I was looking at this - the change turned out to be reasonably simple and easy to implement: https://github.com/sourcegraph/sourcegraph/pull/48209/files#diff-e64c2422ddc2f4da136695fcb608eee988e39ac2fb9ed1975646bf3f9bba7766R1612-R1615
Which gave me a high degree of confidence that this will not break for existing code hosts on the authorization: {} pattern. And it felt weird to continue to add on to this pattern as a result since the end user site-admin experience of configuring Azure DevOps with permisisons sync wouldn't be very intuitive.
Release timing wise this might be a better time to introduce this pattern limited to Azure DevOps and then refactor the existing ones on the new pattern after the upcoming release so we have sufficient time with testing and knowing if anything has gone wrong until the next release.
| // DEBUGGING NOTE: The token refresher (internal/oauthutil/token.go:newTokenRequest) | ||
| // adds some default key-value pairs to the body, some of which are eventually | ||
| // overridden by the values in this map. But some extra arg remain in the body that | ||
| // is sent in the request. This works for now, but if refreshing a token ever stops | ||
| // working for Azure Dev Ops this is a good place to start looking by writing a | ||
| // custom implementation that only sends the key-value pairs that the API expects. |
There was a problem hiding this comment.
Since you are aware of this problem, wouldn't it be a better architecture to actually do what you suggest? E.g. add a function on the context that would provide URL query parameters and default to the original one if it's not present?
There was a problem hiding this comment.
If by context you mean OAuthContext here, then from what I understand you're suggesting:
- Add a method on
OAuthContextlike:func (o *OAuthContext) AddExtraArgs() - Call that instead of reading from
CustomQueryParams?
If it's not a blocker, I would like to keep it as is - for now, since this PR is big enough and because we do a very similar workflow in the OAuth login by passing in extra args as part of the Exchange method call.
There was a problem hiding this comment.
Yes, that's what I had in mind. Not a blocker, just a suggestion for future
There was a problem hiding this comment.
Noted. I can follow up once this is merged. 👍
4d7abe7 to
5616b9c
Compare
|
Codenotify: Notifying subscribers in CODENOTIFY files for diff bb06945...c19633e.
|
sashaostrikov
left a comment
There was a problem hiding this comment.
At least all of my comments are resolved -- LGTM ![]()
kopancek
left a comment
There was a problem hiding this comment.
Approving to unblock, but left some comments for future PR maybe :)
| } | ||
|
|
||
| func (p *Provider) FetchRepoPerms(ctx context.Context, repo *extsvc.Repository, opts authz.FetchPermsOptions) ([]extsvc.AccountID, error) { | ||
| return nil, authz.ErrUnimplemented{Feature: "azuredevops.FetchRepoPerms"} |
There was a problem hiding this comment.
Can you add a comment about it to the code?
| } | ||
|
|
||
| func GetOAuthContext(refreshToken string) (*oauthutil.OAuthContext, error) { | ||
| for _, authProvider := range conf.SiteConfig().AuthProviders { |
There was a problem hiding this comment.
I still think the above ☝️ would be a nice improvement to the code.
| // DEBUGGING NOTE: The token refresher (internal/oauthutil/token.go:newTokenRequest) | ||
| // adds some default key-value pairs to the body, some of which are eventually | ||
| // overridden by the values in this map. But some extra arg remain in the body that | ||
| // is sent in the request. This works for now, but if refreshing a token ever stops | ||
| // working for Azure Dev Ops this is a good place to start looking by writing a | ||
| // custom implementation that only sends the key-value pairs that the API expects. |
There was a problem hiding this comment.
Yes, that's what I had in mind. Not a blocker, just a suggestion for future
| "description": "A flag to enforce Azure DevOps repository access permissions", | ||
| "type": "boolean", | ||
| "default": false | ||
| }, |
There was a problem hiding this comment.
GitHub won't allow me to comment further down, so I'm going to leave the comment here.
We can consider making the "what repos do I sync" part of the config a bit more expressive. Here's the scenario as I see it:
- There are organizations, projects, and repositories
- If you want to sync all repositories from an organization, you only specify that organization. It is redundant to be able to specify
org/projectif you've already specifiedorg - Ditto for
org/project/repository - So, construct the object in such a way that we get an object as follows:
"repositories": [
{
"org": "orgName", // required
"projects": [ // optional
{
"project": "projectName", // required
"repositoryIds": [ "1", "2" ] // optional
}
]
}
]So now you cannot specify redundant fields. You either specify an entire org, or you specify an org and some projects, or you specify an org and some projects and repo IDs for some projects. Or any combination of these.
Not a blocker, but I think "ergonomically" it makes a lot of sense.
There was a problem hiding this comment.
This is a good callout but I'm also not sure if the propsed structure would be less confusing since we nest org and projects under repositories - which feels wrong hierarchically since repos are part of an org and not the other way round.
We could however make our backend smarter to ignore projects if orgs have already been synced.
Let's make this a separate discussion and get others opinions.
There was a problem hiding this comment.
I mean I just called it repositories to give it a name. It's just a JSON key you can name it anything
There was a problem hiding this comment.
But yes, splitting the projects by org and dropping redundant ones is the other option
There was a problem hiding this comment.
Continuing the discussion here: https://github.com/sourcegraph/sourcegraph/discussions/48536
Co-authored-by: Milan Freml <kopancek@users.noreply.github.com>
| for _, c := range conns { | ||
| // The list of orgs and projects may have duplicates if there are multiple Azure DevOps code | ||
| // host connections that have the same project in their config. | ||
| // | ||
| // Add them to a map so that we may filter out duplicates before passing them over to the | ||
| // provider. | ||
| for _, name := range c.Orgs { | ||
| orgs[name] = struct{}{} | ||
| } | ||
|
|
||
| for _, name := range c.Projects { | ||
| projects[name] = struct{}{} | ||
| } | ||
| } |
There was a problem hiding this comment.
One thing I forgot to mention here: you need to make sure that you are only doing this for connections with authorization enabled, otherwise perms syncing will possibly be fetching a whole lot of redundant orgs and projects.
There was a problem hiding this comment.
Just saw this. Will fix in a follow up.
Originally created #48041 for early feedback. I've addressed those and created a new PR because it diverged a lot from the original branch.
Major progress on #47165. There's a little bit of follow up work required which would add more code to this PR - the implementation of
ValidationCheckon the authz provider. I will follow up with a new PR, but the absence of it does not prevent a good configuration of ADO from working.Note to reviewers
Not necessarily a big PR feature wise, ~500 lines is test code. But this does touch multiple modules that would make it hard to piece together for anyone not too familiar with the permissions syncing code. I've broken it down into sequential commits that should make sense when viewed in isolation for the most part. Each commit has a description describing the context and what to expect next. So it might be easier to view each commit at a time and then the PR as a whole.
Also note the usage of
authnandauthzin the commit messages.Test plan