Conversation
Build break is fixed in #750. Orchestra runner apparently doesn't have runner_parameters.
* master: Not all runners (e.g. orchestra) have runner_parameters
cognifloyd
left a comment
Here are some suggestions to make this flow a bit better. 👍
docs/source/upgrade_notes.rst
Outdated
|st2| v2.9
----------

* |st2| timers used to be run as part of ``st2rulesengine`` process until versions older than ``v2.9``.
s/until/in/
"until versions older than" feels awkward to me. "in versions older than" would flow a bit better.
Maybe reword the first couple sentences (everything is the same after ``st2timersengine`` is the new...):
* |st2| timers moved from the ``st2rulesengine`` to the ``st2timersengine`` in ``v2.9``. Moving timers
out of the rules engine allows scaling rules and timers independently. ``st2timersengine`` is the new
process that schedules all the user timers. Please note that when upgrading from older versions, you
will need to carefully accept changes to ``st2.conf`` file. Otherwise, you risk losing access to
``st2`` database in MongoDB.
docs/source/upgrade_notes.rst
Outdated
    local_timezone = America/Los_Angeles
    logging = conf/logging.timersengine.conf

Though ``timer`` section in config is supported for backward compatibility, it is recommended to
Possible alternate wording for this section:
We recommend renaming the ``timer`` config section to ``timersengine``. Though deprecated, using the
``timer`` section is still supported for backwards compatibility. In a future release, support for
the ``timer`` section will be removed and ``timersengine`` will be the only way to configure timers.
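To make the suggested rename concrete, here is a sketch of the relevant ``st2.conf`` fragment. The section contents are taken from the snippet quoted above; exact keys and paths may differ in a given deployment:

```ini
# Deprecated: the [timer] section still works for backwards compatibility,
# but support for it will be removed in a future release.
# [timer]
# local_timezone = America/Los_Angeles
# logging = conf/logging.timersengine.conf

# Recommended: configure timers under [timersengine] instead.
[timersengine]
local_timezone = America/Los_Angeles
logging = conf/logging.timersengine.conf
```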
@cognifloyd bdf4561. I dropped some "the" because it sounded like there is a specific one, but in HA there are multiple rules engines. So the definite article seemed unnecessary. LMK what you think.
* master: (37 commits) Update roadmap with 2.8 release Fix typo. Fix invalid syntax. Generate winrm runner parameters tables. Use include instead of copy and paste. Update version to 2.9dev Add some docs on listing differently scoped datastore items. Update version info for release - 2.8.0 Some rewording and clarification. Clarify remote_user and remote_addr need to come in as CGI environment values and not as headers. Also add a note on the Upgrades page. Fix syntax. Add info on verifying that service has been started. Add a link. Add upgrade notes section for v2.8 release. Add the tags fields for actions And more info about the timezone format for core.st2.CronTimer trigger Replaced 'st2' to the macro that replaces to the product name of this as with others Fixed a minor typo in the webhooks page Update mistral.rst ...
docs/source/reference/ha.rst
Outdated
``st2timersengine`` is responsible for scheduling all user specified timers. See
:ref:`timers <ref-rule-timers>` for the specifics on setting up timers via rules.

You have to have one active ``st2timersengine`` process running to schedule all timers. This is trivial to setup in Kubernetes so there is exactly one active container running ``st2timersengine`` process. Failover is handled natively by Kubernetes. In non Kubernetes deployments, external monitoring needs to setup and a new ``st2timersengine`` process needs to be spun up to address failover.
In HA doc I don't think we should mention any Kubernetes specifics, but have general descriptions as we do for all other services.
Yeah, I thought about it but then we need to say how to handle failover. Should we just leave out the k8s part?
This sounds like we don't have any way to promise timers in HA. I mean, we can't run at least 2+ instances.
Failover in K8s when running a single node is the same as running one timersengine service under systemd, which will restart the process on failure.
While Kubernetes can "guarantee something", obviously the container/process could be killed for whatever reason, and there is no other timersengine that will keep running.
What happens if no timersengine is available at the moment (say it's restarting), will the missed events be rescheduled or lost?
I think it's worth mentioning what happens in the scenario where timersengine is not available, if we can't guarantee HA for it. Additionally, what's needed for timersengine to run properly (DB, MQ, anything else), as is described for other services. Are there any other services that rely on the timers functionality?
Per my understanding, the only purpose is that by extracting timers into a separate singleton service we can run 2+ instances of st2rulesengine?
While Kubernetes can "guarantee something", obviously the container/process could be killed for whatever reason, and there is no other timersengine that will keep running.
Won't this be taken care of by Kubernetes though? Isn't that the whole point of using it?
This sounds like we don't have any way to promise timers in HA. I mean, we can't run at least 2+ instances.
Correct, this is what we decided to do with timers. If we decide to solve this, then we need to look at leader election which we intentionally decided to avoid. See https://github.com/StackStorm/discussions/issues/305
I think it's worth mentioning what happens in the scenario where timersengine is not available, if we can't guarantee HA for it.
There is no A at that point, let alone HA :). It goes without saying IMO, but I'll make that explicit.
Additionally, what's needed for timersengine to run properly (DB, MQ, anything else), as it's described for other services. Are there any other services that rely on timers functionality?
+1
Pod/container failure will lead to a reschedule/restart. What happens in between is downtime.
Related to that, we'll need to add /status for each st2 service https://github.com/StackStorm/k8s-st2/issues/5 so K8s can control the rescheduling and know whether the service is really alive or just sitting there in some manner of deadlock, spinning cycles while actually non-responsive. See StackStorm/st2#4020
With no big A in HA, we still can haz High 😃
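The systemd comparison above can be sketched as a unit file. This is illustrative only: the real ``st2timersengine.service`` shipped by st2 packages may differ, and the paths below are assumptions:

```ini
# Hypothetical unit for illustration; paths and dependencies are assumed.
[Unit]
Description=StackStorm timers engine (singleton)
After=network.target mongodb.service rabbitmq-server.service

[Service]
ExecStart=/opt/stackstorm/st2/bin/st2timersengine --config-file /etc/st2/st2.conf
# Restart on failure gives the same "reschedule on death" behavior
# that Kubernetes provides for a single-replica pod.
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
```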
Didn't know we have a doc for this. Good to find it 👍 Closes #766
You have to have exactly one active ``st2timersengine`` process running to schedule all timers.
Having more than one active ``st2timersengine`` will result in duplicate timer events and therefore
duplicate rule evaluations leading to duplicate workflows or actions.
Very important detail here was just documented 👍
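On the Kubernetes side, the "exactly one active process" requirement amounts to pinning the replica count. A hedged sketch of a Deployment (image name and labels are hypothetical, not from the official chart):

```yaml
# Illustrative only: image and labels are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: st2timersengine
spec:
  replicas: 1          # exactly one, or timers fire in duplicate
  strategy:
    type: Recreate     # avoid two live instances overlapping during a rollout
  selector:
    matchLabels:
      app: st2timersengine
  template:
    metadata:
      labels:
        app: st2timersengine
    spec:
      containers:
      - name: st2timersengine
        image: stackstorm/st2timersengine:latest  # hypothetical image
```

``Recreate`` rather than the default ``RollingUpdate`` matters here: a rolling update briefly runs old and new pods side by side, which would violate the single-instance constraint.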
@Kami I guess you wanted to merge it, but closed instead?
DO NOT MERGE UNTIL StackStorm/st2#4180 is merged.
Related: StackStorm/st2-packages#564
Closes #766