Skip to content

Issue 25323: Fixed error catch and route handling v2#44800

Merged
igoristic merged 14 commits intoelastic:masterfrom
igoristic:ccs_check
Sep 16, 2019
Merged

Issue 25323: Fixed error catch and route handling v2#44800
igoristic merged 14 commits intoelastic:masterfrom
igoristic:ccs_check

Conversation

@igoristic
Copy link
Copy Markdown
Contributor

This is the second attempt, since my first iteration #43338 touched too many different aspects of the app.


Resolves #25323

In some cases we weren't catching monitoringClusters failures which resulted in a blank error screen:

I also removed some on route resolve triggers, since their controllers were already doing that same loading/logic


Steps to cause monitoringClusters() to timeout (on master):

  1. Start with a fresh ES/Kibana stack from master
  2. Enable Monitoring in Stack Monitoring
  3. Start a 2nd/independent ES instance (without security) eg:
bin/elasticsearch -E xpack.license.self_generated.type=trial -E xpack.monitoring.collection.enabled=true -E xpack.security.enabled=false
  1. Go to Dev Tools and add your second cluster and add the 2nd cluster (https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-remote-clusters.html)) something like:
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster_two": {
          "seeds": [
            "127.0.0.1:9201"
          ],
          "skip_unavailable": true
        }
      }
    }
  }
}
  1. Go to Stack Monitoring and notice that Kibana will go to /no-data for about 30 seconds with a completely blank screen. After that you'll see a toast error that'll go away, but the blank screen will still be there

@igoristic igoristic added bug Fixes for quality problems that affect the customer experience review release_note:fix Team:Monitoring Stack Monitoring team v8.0.0 v7.4.0 labels Sep 4, 2019
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/stack-monitoring

@elasticmachine
Copy link
Copy Markdown
Contributor

💔 Build Failed

Copy link
Copy Markdown
Contributor

@chrisronline chrisronline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this is great! I really like how you decided to convert the no-data page to use the standard BaseController logic!

if (catchReason) {
this.reason = catchReason;
} else if (!this.isCollectionEnabledUpdating
&& !this.isCollectionEnabledUpdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I was working through the process of enabling monitoring while having some misconfigurations in my cluster settings. For example, I had monitoring disabled, but I also had a local exporter configured that was also disabled (a local exporter disabled will show an error on this page). When I clicked the blue button to enable monitoring, the spinner never left, but after some debugging I realized it was because this.isCollectionEnabledUpdated was set to true and never reset.

I don't know if the right change is to remove this line from the conditional, or maybe look into how it's mutated to understand if a change is necessary there, but I think we can do a little more to make a more seamless UX.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's only set in the catch but never reseted:

I will set it to false before the async function here:

this.updateModel({ isCollectionEnabledUpdating: true });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this solution quite works. I'm not seeing a change in behavior on my side. It's still stuck on true which is preventing the next startChecks() call

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we're testing different scenarios. What steps are taking to reproduce the sticky loading phase?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, so start off with a cluster that has no monitoring data and does not have monitoring enabled. Go to the dev tools and configure a local or remote exporter that is disabled, something like:

PUT _cluster/settings
{
  "transient": {
    "xpack.monitoring.exporters": { 
      "my_remote": {
        "type": "http",
        "enabled": false
      }
    }
  }
}

Go to the monitoring UI, see the blue enable button, and click it. My experience is it just keeps spinning and never shows me the next error (which is that all my exporters are disabled).

This is where I found the list of scenarios that would cause an error message: https://github.com/elastic/kibana/blob/master/x-pack/legacy/plugins/monitoring/server/lib/elasticsearch_settings/find_reason.js

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think the best solution would be to remove that if condition, and just let the checks happen

@chrisronline What do you think?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me as long as it solves this problem and doesn't introduce any additional issues, 👍

@elasticmachine
Copy link
Copy Markdown
Contributor

💔 Build Failed

Copy link
Copy Markdown
Contributor

@chrisronline chrisronline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing an issue with the one of the earlier pieces of feedback. See #44800 (comment)

@elasticmachine
Copy link
Copy Markdown
Contributor

💔 Build Failed

Copy link
Copy Markdown
Contributor

@chrisronline chrisronline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great! I had a couple of questions and one request. Thanks!


import Boom from 'boom';

export async function verifyCcsAvailability(req) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test for this functionality? Verify it works as expected, verify it works despite the value of skip_unavailable?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this test will be very useful, since we don't have any skip_unavailable conditions/logic here.

The only test that makes sense to me is around getData in https://github.com/elastic/kibana/pull/44800/files#diff-87ec11d43f1e502792408786cf726eb6R32 and different states/reasons it could yield. WDYT?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I was thinking a test for this because I had the thought to add a .skip_unavailable check, which you correctly mentioned was not necessary and would actually cause a problem. I don't feel strongly about it either way - just a thought

Copy link
Copy Markdown
Contributor

@chrisronline chrisronline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Nice work!

@elasticmachine
Copy link
Copy Markdown
Contributor

💚 Build Succeeded

@igoristic igoristic added v7.5.0 and removed review labels Sep 14, 2019
@elasticmachine
Copy link
Copy Markdown
Contributor

💚 Build Succeeded

@igoristic igoristic merged commit e7e42b7 into elastic:master Sep 16, 2019
igoristic added a commit to igoristic/kibana that referenced this pull request Sep 16, 2019
* Sample version

* Add to this page too

* Added base class and fixed some race conditions

* Addressed code review feedback

* Fixed sticky loading state

* Fixed unit tests
@igoristic
Copy link
Copy Markdown
Contributor Author

Backport:
7.x: f6096a6
7.4: 4bcd442

igoristic added a commit that referenced this pull request Sep 17, 2019
* Sample version

* Add to this page too

* Added base class and fixed some race conditions

* Addressed code review feedback

* Fixed sticky loading state

* Fixed unit tests
@igoristic igoristic deleted the ccs_check branch September 17, 2019 01:52
igoristic added a commit that referenced this pull request Sep 17, 2019
* Sample version

* Add to this page too

* Added base class and fixed some race conditions

* Addressed code review feedback

* Fixed sticky loading state

* Fixed unit tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Fixes for quality problems that affect the customer experience release_note:fix Team:Monitoring Stack Monitoring team v7.4.0 v7.5.0 v8.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Monitoring] Check for timeouts to ES requests

3 participants