Issue 25323: Fixed error catch and route handling #43338
igoristic wants to merge 5 commits into elastic:master
Conversation
Pinging @elastic/stack-monitoring

💔 Build Failed

After reading the description of this PR, I'm not sure I understand what the new behavior will be, so it's hard to test this. Could you please update the description to further explain the change?

💔 Build Failed

@cachedout I added step-by-step instructions and clarified the current behavior vs. the fixed/expected behavior. Hope it helps. Let me know if you have any questions; I'm happy to walk you through it.
```js
initSetupModeState($scope, $injector);
// …
const setupMode = getSetupModeState();
```
The `/elasticsearch/nodes` route will do its own `monitoringClusters()` call through `routeInitProvider`, and if that fails it will be stuck at a blank screen again. There is no point in sending the user to the nodes route if the clusters call fails.
We are calling `monitoringClusters()` twice this way, but at least there is less chance of it failing (right after it has just succeeded). I figured this is fine for now, since it will go away once the setup mode becomes obsolete.
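The point above can be sketched roughly as follows. This is a simplified stand-in, not the actual Kibana code: `monitoringClusters`, `resolveEntryRoute`, and the route shapes are all placeholder names.

```javascript
// Sketch: only route to the nodes page once the clusters call has
// succeeded; on failure, go to the error page instead of leaving the
// user on a blank screen. All names here are illustrative stand-ins.

function monitoringClusters() {
  // Pretend the clusters API call fails (e.g. a remote-cluster timeout).
  throw new Error('Request timed out');
}

function resolveEntryRoute() {
  try {
    const clusters = monitoringClusters();
    // Safe to continue: the nodes route's own call is much less likely
    // to fail right after this one just succeeded.
    return { path: '/elasticsearch/nodes', clusters };
  } catch (err) {
    // Don't send the user to the nodes route just to fail again there.
    return { path: '/no-data', error: err.message };
  }
}
```

With the failing stub above, `resolveEntryRoute()` lands on `/no-data` with the error message attached, rather than forwarding a doomed request to the nodes route.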
```js
uiRoutes
  .when('/no-data', {
    template,
    resolve: {
```
What's the thinking behind this page? I'm guessing it's to avoid doing the check that might fail, but we still should be doing this check so we can do the redirect, if necessary.
The redirect logic will still happen here: https://github.com/elastic/kibana/pull/43338/files#diff-87ec11d43f1e502792408786cf726eb6R96
We should avoid doing any kind of request in `resolve`, since it provides no template in which we can indicate a loading state to the user.
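The pattern being argued for here can be sketched like this. Callbacks stand in for the real async Angular plumbing, and every name (`loadingController`, `fetchClusters`, `render`, the view names) is illustrative only:

```javascript
// Sketch: render a loading template immediately, fetch from the
// controller, and swap the view when data (or an error) arrives --
// instead of fetching inside `resolve`, which blocks routing with
// nothing on screen.

function loadingController(fetchClusters, render) {
  render('loading'); // the user sees a spinner right away
  fetchClusters(
    (clusters) => render('overview', clusters),
    (err) => render('no-data', { error: err.message })
  );
}
```

With a failing fetch, the rendered views are `loading` followed by `no-data` — at no point is the user staring at a blank page.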
@chrisronline I agree 💯 Unfortunately there is no way to clear it without refreshing the state again. This is why I wanted to create something that is not coupled with URL state. I was actually thinking of extending …
So I was rereading the original issue, and I'm thinking we might be able to simplify and improve this fix. Instead of needing to incur even the first 30s of timeout, I think we might be able to use the `cluster.remoteInfo` API to do a sort of "pre check" in the server-side code and then throw an appropriate error (which should result in a toast showing). This feels more closely aligned with some other checks we're doing on the server side, and the amount of code is much less; plus we can avoid the first timeout altogether! Do you mind opening a new PR and trying to get this approach working? I think it'll be nice to compare both solutions.
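A rough sketch of that suggestion, with assumptions labeled: `remoteInfo` here is a stand-in for the Elasticsearch `cluster.remoteInfo` client call (which reports configured remote clusters and their connection state), and `verifyRemoteCluster` is a hypothetical helper name:

```javascript
// Sketch: a server-side "pre check" that verifies a configured remote
// cluster is actually reachable before any data fetching, so the failure
// becomes a thrown error instead of a 30-second timeout.

function verifyRemoteCluster(remoteInfo, clusterName) {
  const remotes = remoteInfo(); // e.g. { my_remote: { connected: false } }
  const remote = remotes[clusterName];
  if (!remote || !remote.connected) {
    // Throwing in server code surfaces as an error toast on the client,
    // the same way the auth checks behave.
    throw new Error(`Remote cluster "${clusterName}" is not connected`);
  }
  return remote;
}
```

The appeal of this shape is that the error path is immediate and reuses the existing server-error-to-toast convention, rather than adding client-side routing machinery.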
@igoristic Could you please comment on the plan for this? Are you going to open a new PR as @chrisronline requested above, or are you waiting on a review from me on this?
@cachedout Sorry, should have mentioned. Yes, I'm implementing it. It's working pretty well; I just need to check a couple more cases. Should have it up soon.
💔 Build Failed
This seems like a lot of code to solve this problem. I'm wondering if we can do something even simpler. What if we add a verify check like we do for auth, and if there is a remote cluster that is not found, we throw an error like the auth logic does? Any errors thrown in the server code will result in a toast on the client side. We might be able to get away with only changing a couple of files if this approach works. I think you made some efforts to remove some data fetching in … WDYT?
@chrisronline This might seem like a trivial issue with a "quick fix", but it's a little more complex and tightly coupled with the majority of our app. I like your suggestion, but I don't think it'll yield less code (or a better-quality fix), just one chain less in … The majority of the problem also came from route responses not catching any errors (thus the blank white screen), so it might be inefficient to add … I might be missing your point though. Happy to zoom and discuss anytime 😄
@igoristic Check out this draft PR I just put up: #44297. I think this approach might also work, and it might be easier to reason about since it follows a more common convention in our monitoring code base. Maybe we can bring in @cachedout for thoughts here too.
@chrisronline Thank you, I really appreciate the draft PR! Let me poke at it a little before I make a decision.
This is an old iteration, please check out the new PR at: #44800
💔 Build Failed

EDIT: This is an old iteration, please check out the new PR at: #44800

Resolves #25323

In some cases we weren't catching `monitoringClusters` failures, which resulted in a blank error screen:



The default route was set to `/no-data`, which didn't have any loading state, so I rerouted all the entry points to `/loading`. I also removed some on-route `resolve` triggers, since their controllers were already doing that same loading logic.

Steps to cause `monitoringClusters()` to time out (on `master`):

… `/no-data` for about 30 seconds with a completely blank screen. After that you'll see a toast error that'll go away, but the blank screen will still be there.

The fixed code will go to the `/loading` page instead (so now there is a loading indicator) and then it will go to `/no-data` (with the relevant error details) after the timeout.

TODO:

- `cluster.remoteInfo` pre-check
- … `AppState` and added a temp state to `kbnUrl`
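The rerouting described above can be simulated in miniature. Everything here is a stand-in for the Angular routing code (`routeAfterClustersCall`, `callClusters`, and the route names are placeholders):

```javascript
// Sketch: entry points now start at /loading, which shows an indicator
// while the clusters call runs, and only fall through to /no-data with
// the error details attached if the call fails.

function routeAfterClustersCall(callClusters) {
  const history = ['/loading']; // entry points start here, not /no-data
  try {
    callClusters();
    history.push('/overview');
  } catch (err) {
    history.push(`/no-data?err=${encodeURIComponent(err.message)}`);
  }
  return history;
}
```

On success the user moves from the loading page to the app; on failure they land on `/no-data` carrying the error details, instead of sitting on a blank screen for the duration of the timeout.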