[PROPOSAL] Pair TransportService / ActionListener initialization to allow shutdown/restart hot plugging #174

@dbwiddis

What/Why

What are you proposing?

Establish the capability to:

  • Add a new extension without restarting OpenSearch
  • Remove an active extension (and its dependencies) without restarting OpenSearch
  • Auto-reboot extensions to handle transport failures (or upon request by a user)

What users have asked for this feature?

In PR #172 (based on #171), we pulled test code out of the ExtensionsRunner. This exposed a flaw with our current "initialize and leave it running forever" setup and the need to include TransportService.stop() somewhere in the API.

What problems are you trying to solve?

While adding a stop/start method is easy enough, the bigger question is when it will be called. Testing, as in #172, is only one use case; we can also integrate this with our longer-term goal of hot plugging extensions.
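
To illustrate what a paired start/stop could look like, here is a minimal sketch. The `Transport` interface is a stand-in for the real transport service, and `ExtensionLifecycle` is a hypothetical name; neither is taken from the SDK. It only shows the pairing and idempotence, not the actual wiring.

```java
/** Stand-in for the transport service (assumption, not the real OpenSearch class). */
interface Transport {
    void start();
    void stop();
}

/** Hypothetical wrapper that keeps start and stop calls paired. */
class ExtensionLifecycle {
    private final Transport transport;
    private boolean started = false;

    ExtensionLifecycle(Transport transport) {
        this.transport = transport;
    }

    /** Starts the transport once; returns false if it was already running. */
    synchronized boolean start() {
        if (started) {
            return false;
        }
        transport.start();
        started = true;
        return true;
    }

    /** Stops the transport once; returns false if it was not running. */
    synchronized boolean stop() {
        if (!started) {
            return false;
        }
        transport.stop();
        started = false;
        return true;
    }

    /** Stops (if running) and then starts again. */
    synchronized boolean restart() {
        stop();
        return start();
    }

    synchronized boolean isStarted() {
        return started;
    }
}
```

A test harness could call `start()` in setup and `stop()` in teardown, and a restart request could call `restart()`, so every use case goes through the same pair of methods.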

As part of that effort we are dealing with the sequencing of initializing Extensions, and knowing when they are initialized. See issues #65 and #94, and possible usage in #149 and #151.

During some of my REST handler testing, I hit a bug that caused Transport to fail, and the only fix was to restart the extension (and OpenSearch with it). We want a future where we do not have to restart OpenSearch and can dynamically add or remove extensions on the fly.

What is the developer experience going to be?

  • To add a new extension on the fly, start it up and then make a REST request to connect it: PUT /_extensions/add/uniqueID
  • To remove an active extension on the fly, make a REST request to disconnect it: PUT /_extensions/remove/uniqueID
  • To (attempt to) reconnect to a problematic extension, make a REST request that performs the remove and add steps above and signals the extension to restart itself: PUT /_extensions/restart/uniqueID
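
The three requests above could be dispatched roughly as in the following sketch. `ExtensionRegistry`, its methods, and the status codes are all hypothetical; the sketch does not use the OpenSearch REST classes. It also assumes that a restart means remove followed by add, with the restart signal to the extension left as a comment.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of connected extensions; not part of OpenSearch. */
class ExtensionRegistry {
    private final Set<String> connected = ConcurrentHashMap.newKeySet();

    /** Handles PUT /_extensions/{action}/{uniqueID}; returns an HTTP-style status. */
    int handle(String method, String path) {
        String[] parts = path.split("/");
        // A valid path splits into ["", "_extensions", action, uniqueID].
        if (!"PUT".equals(method) || parts.length != 4 || !"_extensions".equals(parts[1])) {
            return 400;
        }
        String id = parts[3];
        switch (parts[2]) {
            case "add":
                return add(id);
            case "remove":
                return remove(id);
            case "restart":
                int removed = remove(id);
                // A real implementation would signal the extension to restart here.
                return removed == 200 ? add(id) : removed;
            default:
                return 404;
        }
    }

    /** Connects an extension; 409 if it is already connected. */
    private int add(String id) {
        return connected.add(id) ? 200 : 409;
    }

    /** Disconnects an extension; 404 if it is not connected. */
    private int remove(String id) {
        return connected.remove(id) ? 200 : 404;
    }

    boolean isConnected(String id) {
        return connected.contains(id);
    }
}
```

For example, `handle("PUT", "/_extensions/add/ext1")` registers `ext1`, and repeating the same request is rejected rather than silently reconnecting.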

Are there any security considerations?

Extensions should require that any shutdown/reboot request come from the OpenSearch instance they are (or have been) communicating with.
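
One way to picture that check on the extension side is the sketch below. `ShutdownGuard` and the notion of a node ID recorded at handshake time are assumptions made for illustration. Comparing ID strings is only a placeholder: a real check would need an authenticated identity, which this proposal does not specify.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical guard that remembers which OpenSearch nodes the extension has talked to. */
class ShutdownGuard {
    private final Set<String> knownNodeIds = ConcurrentHashMap.newKeySet();

    /** Called when the extension completes initialization with an OpenSearch node. */
    void recordHandshake(String nodeId) {
        knownNodeIds.add(nodeId);
    }

    /** Returns true only if the shutdown/reboot request comes from a known node. */
    boolean mayControl(String requestingNodeId) {
        return requestingNodeId != null && knownNodeIds.contains(requestingNodeId);
    }
}
```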

What is the user experience going to be?

A user can spin up a new extension on a new node (or same node) and issue a single REST request to activate it (and its dependencies).

A user can remove an extension, or restart it to upgrade to a new software version.

Why should it be built? Any reason not to?

There may be other ways to handle this functionality, but connecting REST handlers to the transport service start/stop seems a great way to enable it.

What will it take to execute?

We'll need to create a design/sequence diagram that clearly identifies the complete set of initialization steps and what is required to "uninitialize".
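
As a rough sketch of how initialization steps and their "uninitialize" counterparts might be kept in sync, each step could register its own undo action, with teardown running the undo actions in reverse order. `InitSequence` is a hypothetical helper, and the actual steps are still to be identified by the design work described above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper that pairs each initialization step with its undo action. */
class InitSequence {
    private final Deque<Runnable> undoStack = new ArrayDeque<>();

    /** Runs an initialization step and remembers how to undo it. */
    void step(Runnable init, Runnable undo) {
        init.run();
        undoStack.push(undo);
    }

    /** Undoes all completed steps, most recent first. */
    void uninitialize() {
        while (!undoStack.isEmpty()) {
            undoStack.pop().run();
        }
    }
}
```

Because the undo action is pushed only after its step succeeds, a failure partway through initialization leaves the stack holding exactly the steps that need to be unwound.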

Any remaining open questions?

Dependencies are a big unknown here. This proposal primarily deals with extensions without dependencies.
