
[Decision Doc] Async Operations for Extensions #2574

@cwperks

This issue provides 4 options for enabling the use-case of running jobs asynchronously for extensions along with their pros and cons. A full solution for extensions long-term could combine a few of the options presented in this document. The 4 options are:

  • Auth Tokens
  • API Keys
  • OAuth Tokens
  • Passing a token upon job invocation via Job Scheduler (Just in Time token)

See each respective section for more details of each solution and the appendix for additional discussion on service account tokens for extensions to interact with their own system indices.

Problem Statement

The first milestone for extensions includes the conversion of the Anomaly Detection (AD) backend from a plugin to an extension. AD has a requirement to run jobs on behalf of a user on a schedule to monitor indices in a cluster for a detector. In the plugin model, AD serializes the user (including roles and backend roles) upon detector creation and saves it as part of the detector metadata. When it comes time to run the detector, the AD plugin performs roles injection to evaluate the permissions of a dedicated user (called plugin) with the roles stored in the detector’s metadata. This will not work for AD running as an extension because an extension will be treated as a third party, and roles information will not be shared with it unless it runs in a legacy plugin compatibility mode. Extensions will have no equivalent of roles injection; instead, an extension will submit requests to the OpenSearch cluster that carry an auth token used to authenticate and authorize each request.

More generally, outside of the Anomaly Detection use-case, the extensions and security teams need to provide guidance for extension developers on how to implement extensions that interact with the OpenSearch cluster asynchronously. When writing software that utilizes an OpenSearch client, developers will typically configure the client with a username and password defined in the internal user list of OpenSearch. This will not work for extensions: the password and other sensitive information will remain internal to OpenSearch and will never be shared with an extension. An alternative approach for asynchronous jobs needs to be developed.

Options

Option 1: Auth Tokens

Auth Tokens are tokens that: 1) confer access to the cluster and are associated with a user, 2) have a defined lifetime (could also be indefinite, but that is discouraged), 3) have authorizations associated with them that are a subset of those held by the creator of the token at the time the token is created, 4) come with a management suite of APIs to Grant, Revoke, List and Search.

Pros:

  • Changes to clients should not be required as it utilizes Bearer authentication, a recognized standard
  • Have scopes which limit their capabilities
  • Can utilize the existing JWT Authentication Backend in the security plugin

Cons:

  • Long-lived tokens are discouraged
  • UX considerations for token expiration scenarios
  • If roles can be designated for the token at time of creation, the user experience could be confusing when an underlying role’s permissions later change, because the ramifications for a previously created token are not obvious.
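
The properties listed for Option 1 (a defined lifetime, and permissions that are a subset of the creator's at grant time) can be illustrated with a minimal sketch. This is not the Security plugin's implementation: the function names, the claim layout, the signing key, and the HMAC-based format are all assumptions made for illustration, and a real deployment would more likely issue a standard JWT.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key; a real cluster would manage and protect this secret.
SECRET = b"demo-signing-key"


def grant_token(user, creator_permissions, requested_permissions, lifetime_seconds, now=None):
    """Issue a signed token whose permissions are a subset of the creator's."""
    requested = set(requested_permissions)
    if not requested <= set(creator_permissions):
        raise PermissionError("token may not exceed the creator's permissions")
    issued_at = time.time() if now is None else now
    claims = {
        "sub": user,
        "permissions": sorted(requested),
        "exp": issued_at + lifetime_seconds,
    }
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token, now=None):
    """Check the signature and expiry, then return the token's claims."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload.encode()))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

A client would send such a token in an `Authorization: Bearer <token>` header. Because the permissions are copied into the token at grant time, a later change to the creator's roles is not reflected in tokens that were already issued, which is the confusion described in the last con above.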

Option 2: API Keys

API Keys: 1) are a mechanism for services to authenticate and authorize, 2) can be indefinite or have a designated lifetime, 3) have authorizations associated with them that are a subset of those held by the creator of the API Key, 4) may have additional restrictions on which APIs permit ApiKey authentication, 5) come with a management suite of APIs to Create, Invalidate, List and Search.

Pros:

  • Do not require rotation for expiry (could be a con as well)
  • Highly requested feature of the Security plugin: API-Key generation API is missing #1504
  • API Keys could be utilized to improve current plugin security by providing an off-ramp for stashing the ThreadContext
  • API Keys are considered simple compared to OAuth tokens

Cons:

  • API Keys are not generally considered secure
  • If compromised, an API Key has an indefinite life and will confer access until explicitly revoked by an administrator
  • API Keys identify projects, not users
  • May require client modification to support ApiKey authentication

Option 3: OAuth Tokens

OAuth Tokens: 1) consist of short-lived access tokens and longer-lived refresh tokens that can be used to act on behalf of a user asynchronously, 2) the access token is only valid for a short window, and the refresh token can be used to obtain a new access token and a new refresh token, 3) if the user grants the extension permission to act on their behalf, the extension can store the refresh token and use it asynchronously to obtain access tokens for the user, 4) the grant is made explicitly by the user, 5) the user needs to re-grant the extension permission to act on their behalf once the absolute lifetime of the tokens expires, 6) an admin has the ability to revoke authorization per extension.

Pros:

  • Allows explicit consent by the user to share data with third-party
  • A refresh_token that changes periodically solves for issues with API Keys having indefinite access
  • Generally recommended as a best practice

Cons:

  • More complicated than API Key security - TODO: expand upon this with scope of authorization_code flow and additional work items
  • Requires client modification to work with OAuth2 token workflows
  • Extension developers need guidance on how to develop with OAuth tokens
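
The refresh-token behavior described for Option 3 can be sketched as a small in-memory token service. This is an illustrative model only, not an OAuth2 implementation and not a proposed API: the class name, method names, and TTL defaults are assumptions, and the user consent step of the authorization_code flow is reduced to a single `grant` call.

```python
import secrets
import time


class TokenService:
    """Toy model of rotating refresh tokens with an absolute grant lifetime."""

    def __init__(self, access_ttl=300, absolute_ttl=86400):
        self.access_ttl = access_ttl      # lifetime of each access token
        self.absolute_ttl = absolute_ttl  # lifetime of the user's grant as a whole
        self._refresh = {}                # refresh token -> (user, extension, grant_expiry)

    def grant(self, user, extension, now=None):
        """Called once the user has consented to the extension acting for them."""
        now = time.time() if now is None else now
        return self._issue(user, extension, now + self.absolute_ttl, now)

    def _issue(self, user, extension, grant_expiry, now):
        refresh = secrets.token_urlsafe(16)
        self._refresh[refresh] = (user, extension, grant_expiry)
        access = {
            "sub": user,
            "ext": extension,
            "exp": min(now + self.access_ttl, grant_expiry),
        }
        return access, refresh

    def refresh(self, refresh_token, now=None):
        """Exchange a refresh token for a new access token and a new refresh token."""
        now = time.time() if now is None else now
        record = self._refresh.pop(refresh_token, None)  # each refresh token is single use
        if record is None:
            raise PermissionError("unknown, revoked, or already used refresh token")
        user, extension, grant_expiry = record
        if now >= grant_expiry:
            raise PermissionError("grant expired; the user must re-authorize")
        return self._issue(user, extension, grant_expiry, now)

    def revoke_extension(self, extension):
        """Admin action: drop every outstanding grant held by one extension."""
        self._refresh = {t: r for t, r in self._refresh.items() if r[1] != extension}
```

An extension running a scheduled job would store only the latest refresh token and call `refresh` before each run. A stolen refresh token stops working as soon as the legitimate holder (or the thief) uses it once, which is how rotation addresses the indefinite-access problem of API Keys.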

Option 4: Passing a short-lived token to an extension through job scheduler invocation (Recommended)

(Needs further analysis) An alternative to all 3 options above would be to generate a token in core and forward the token to the extension when the job is invoked. Job Scheduler is moving to core and there is a possibility that job scheduler will be the mechanism for triggering jobs registered by extensions. If this is the case, then a token can be created when the job is invoked and passed to the extension. Just-in-time tokens will be used for handling REST Requests so this is a similar pattern of issuing a token exactly when it is needed.

Pros:

  • Similar to support for Extension REST Handlers and will allow for code re-use
  • No client modification and consistent behavior with Extension REST Handlers
  • More secure approach as core will determine how long the lifetime of the token forwarded to the extension will be

Cons:

  • Long-living jobs may require longer-lived tokens
  • Still no mechanism for services to connect with cluster outside of client being registered with a username and password
  • Tokens are not revocable
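
The just-in-time pattern in Option 4 can be sketched as follows: core mints a short-lived, signed token at the moment a job fires and hands it to the extension's job handler. All names here (`issue_job_token`, `invoke_job`, the job dictionary layout, the `|`-delimited token format, the signing key) are hypothetical; nothing in this sketch reflects the actual Job Scheduler or Security plugin interfaces.

```python
import hashlib
import hmac
import time

# Hypothetical key known only to core; the extension never sees it.
CORE_KEY = b"core-only-signing-key"


def _sign(message):
    return hmac.new(CORE_KEY, message.encode(), hashlib.sha256).hexdigest()


def issue_job_token(user, job_id, ttl_seconds, now):
    """Mint a signed token that names the user, the job, and an expiry time."""
    expiry = int(now + ttl_seconds)
    # Simplification: assumes user and job_id never contain the "|" delimiter.
    body = f"{user}|{job_id}|{expiry}"
    return f"{body}|{_sign(body)}"


def authenticate(token, now):
    """Core side: validate a token presented by the extension on a request."""
    body, _, signature = token.rpartition("|")
    if not hmac.compare_digest(signature.encode(), _sign(body).encode()):
        raise PermissionError("bad signature")
    user, _job_id, expiry = body.split("|")
    if now >= int(expiry):
        raise PermissionError("token expired")
    return user


def invoke_job(job, now=None):
    """Core side: create the token only when the job is invoked, then pass it on."""
    now = time.time() if now is None else now
    token = issue_job_token(job["user"], job["id"], job["token_ttl"], now)
    return job["handler"](token)
```

Because the token is self-contained and validated purely by signature and expiry, core keeps no record of it, so it cannot be revoked early. The choice of `token_ttl` is where the tension with long-running jobs shows up.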

Appendix

System Index Interaction

One of the challenges with converting plugins to extensions is the use-case of registering and interacting with system indices. Currently, plugins elevate their own privileges by stashing the ThreadContext and creating or writing to their system index in the cleared context. By clearing the context, plugins are able to in effect act as a superuser and bypass checks in the Security plugin. This will not work for extensions as they will initially be run out-of-process.

For extensions, there needs to be a mechanism to allow the extension to interact with the cluster as itself to create and write to its own indices for storage if the extension developer chooses to use OpenSearch for persistence. One solution for this could be Service Accounts and tokens for those accounts where the extension can bear a token that represents itself when making requests to the OpenSearch cluster. This token would give it hyper-localized privileges to only be able to interact with its own system indices and no others.
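
As a rough sketch of what "hyper-localized privileges" could mean, the check below lets a service account token reach only index names matching patterns registered for that extension. The class, the function, and the `.my-extension-*` pattern are hypothetical and chosen only for illustration; how system indices would actually be registered for an extension is not specified in this document.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Tuple


@dataclass(frozen=True)
class ServiceAccountToken:
    """Hypothetical token that represents the extension itself, not a user."""
    extension_id: str
    system_index_patterns: Tuple[str, ...]


def may_access(token, index_name):
    """Allow access only to indices the extension registered as its own."""
    return any(fnmatchcase(index_name, pattern) for pattern in token.system_index_patterns)


token = ServiceAccountToken("my-extension", (".my-extension-*",))
print(may_access(token, ".my-extension-config"))  # own system index
print(may_access(token, "logs-2023"))             # user data, not the extension's own
```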

Additional Considerations

  • Tokens associated with a user should be automatically revoked when the user’s authorizations have changed or if the user’s password has changed. (May be too draconian - needs more thought/configuration)
  • Invalidating a JWT before expiry requires maintaining a list of invalidated tokens and checking every presented token against it; otherwise the token remains valid until expiry.
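
The second consideration can be sketched as a denylist keyed by token ID (the `jti` claim in JWT terms). The class and method names are assumptions for illustration. The point of the sketch is that an entry only needs to be kept until the token would have expired anyway, so the list stays bounded.

```python
class RevocationList:
    """Denylist of token IDs that were invalidated before their expiry."""

    def __init__(self):
        self._revoked = {}  # token id (jti) -> the token's original expiry time

    def revoke(self, token_id, expires_at):
        self._revoked[token_id] = expires_at

    def prune(self, now):
        """Forget entries for tokens that have expired on their own."""
        self._revoked = {jti: exp for jti, exp in self._revoked.items() if exp > now}

    def is_valid(self, token_id, expires_at, now):
        """A token is valid if it has not expired and has not been revoked."""
        if now >= expires_at:
            return False
        return token_id not in self._revoked

    def __len__(self):
        return len(self._revoked)
```

Every node that validates tokens would need a consistent view of this list, which is the cost that stateless tokens otherwise avoid.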

References

  1. API-Key generation API is missing #1504 - Request in the Security backlog for API Keys
