ui/db-console: surface more job metrics around reverting and retrying in the DBConsole Jobs Overview page#72291
Conversation
|
Some side notes:
2. Translating statuses (@vy-ton and @kevin-v-ngo)
4. Storybook (@Annebirzin and @kevin-v-ngo) Happy to help with any installation trouble! 5. Minor retry badge UI stuff (@Annebirzin) The separate line does make the row taller than other normal rows. Not sure if you would prefer to break with other badge styles, and have a squished retry badge that's smaller than the others? |
|
Run The snippet Andrew provided to write to jobs table as |
maryliag
left a comment
There was a problem hiding this comment.
Reviewed 11 of 14 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jocrl)
pkg/ui/workspaces/db-console/src/views/jobs/jobStatus.tsx, line 71 at r1 (raw file):
/> <div className="jobs-table__align-progress-bar"> {jobIsRetrying && <RetryingStatusBadge />}
you probably want the jib is running check for the div component, not just the div content. No need to create this div if it will be empty
{jobIsRetrying && (<div className....)}
matthewtodd
left a comment
There was a problem hiding this comment.
This looks nice! I'm not sure what to do about 2., but I've done something like it before, flattening a couple of dimensions into one for display purposes. In that context, it became a little icky to work with, but I don't know if it will here, since we're just building a read-only view. IIUC, we do have enough fields in the API response to calculate the "retrying" answer on the client side, yes? (Except we would be depending on the browser's sense of time, which isn't great.)
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jocrl)
Azhng
left a comment
There was a problem hiding this comment.
Just trying to wrap my head around the job CTE query here, is my understanding correct that this CTE finds the schema change jobs on foo table and set its status to reverting?
Few things that I don't quite understand here:
- Why do we need to remove
finishedMicrosandfinalResumeErrorfield? - if a job is running at the reverting stage, why do we need to set the
claim_{session, instance}_idto NULL ?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jocrl)
pkg/server/admin.go, line 1855 at r1 (raw file):
case when ` + retryRunningCondition + `then 'retry-running' when ` + retryRevertingCondition + ` then 'retry-reverting' else status end as status,
super minor nit: is it possible we can format the case operator as
case
when ... then ...
when ... then ...
else ...
end for better readability ?
pkg/server/admin.go, line 2735 at r1 (raw file):
*d = &val case *[]string:
quick question, which column has the destination type *[]string ?
pkg/ui/workspaces/db-console/src/views/jobs/duration.tsx, line 37 at r1 (raw file):
if (isRunning(job.status)) { const fractionCompleted = job.fraction_completed; if (fractionCompleted > 0) {
I think we have other long running jobs like CDC that does not terminate. In that case I think fraction_completed is omitted and it returns a high_water timestamp instead.
So what would happen here if we have a CDC job ?
pkg/ui/workspaces/db-console/src/views/jobs/JobTable.stories.tsx, line 1 at r1 (raw file):
// Copyright 2020 The Cockroach Authors.
nit: s/2020/2021/
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jocrl)
-- commits, line 26 at r1:
link the issue being addressed, with a Fixes #issuenumber or Partially Adresses #issuenumber
3e51673 to
026bc64
Compare
jocrl
left a comment
There was a problem hiding this comment.
@matthewtodd thanks, good to know there's precedent for that!
IIUC, we do have enough fields in the API response to calculate the "retrying" answer on the client side, yes? (Except we would be depending on the browser's sense of time, which isn't great.)
Yup, that's correct. The downside is that the filtering is done back end (https://github.com/cockroachdb/cockroach/pull/72291/files#diff-1402bca7aa2ae9e5b623ab3d1e93d918e0f8d652c689c42d644e46611151c9f6R1853). So there are two places where the logic is used,
- when deciding what badge to show
- when deciding whether a job filtered by query params should be part of the result
We have the fields, but independently calculating 1. on the client side would mean that the definition of "retrying" would be duplicated in the front end and back end code, and risk getting out of sync.
@Azhng so... Andrew Werner gave me the job CTE query; I don't understand it xD. This was the message if you would like to ask him? https://cockroachlabs.slack.com/archives/C02DSDS9TM1/p1635345519023200
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, and @maryliag)
Previously, maryliag (Marylia Gutierrez) wrote…
link the issue being addressed, with a Fixes #issuenumber or Partially Adresses #issuenumber
Done.
pkg/server/admin.go, line 1855 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
super minor nit: is it possible we can format the
caseoperator ascase when ... then ... when ... then ... else ... endfor better readability ?
Done.
pkg/server/admin.go, line 2735 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
quick question, which column has the destination type
*[]string?
Thanks for the catch. This should've been for a different issue, adding execution_errors which is *[]string. I've removed this.
pkg/ui/workspaces/db-console/src/views/jobs/duration.tsx, line 37 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
I think we have other long running jobs like CDC that does not terminate. In that case I think
fraction_completedis omitted and it returns ahigh_watertimestamp instead.
So what would happen here if we have a CDC job ?
The pre-existing code has it that if there is a highwater time stamp, it doesn't show a status badge at all (doesn't go into jobStatus.tsx and doesn't show duration). https://github.com/cockroachdb/cockroach/blob/master/pkg/ui/workspaces/db-console/src/views/jobs/jobStatusCell.tsx#L28
pkg/ui/workspaces/db-console/src/views/jobs/JobTable.stories.tsx, line 1 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
nit:
s/2020/2021/
done
pkg/ui/workspaces/db-console/src/views/jobs/jobStatus.tsx, line 71 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
you probably want the jib is running check for the div component, not just the div content. No need to create this div if it will be empty
{jobIsRetrying && (<div className....)}
Done.
|
@jocrl Looking good! I had one suggestion for better alignment of the text/badges (Figma link):
Also, I was wondering if we could make a few color tweaks (can break this out in a separate issue if you want)
|
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @maryliag)
pkg/server/admin.go, line 2735 at r1 (raw file):
Previously, jocrl (Josephine) wrote…
Thanks for the catch. This should've been for a different issue, adding
execution_errorswhich is *[]string. I've removed this.
👍
pkg/ui/workspaces/db-console/src/views/jobs/duration.tsx, line 37 at r1 (raw file):
Previously, jocrl (Josephine) wrote…
The pre-existing code has it that if there is a highwater time stamp, it doesn't show a status badge at all (doesn't go into
jobStatus.tsxand doesn't show duration). https://github.com/cockroachdb/cockroach/blob/master/pkg/ui/workspaces/db-console/src/views/jobs/jobStatusCell.tsx#L28
Ah nice, TIL 👍
026bc64 to
6bd9bee
Compare
jocrl
left a comment
There was a problem hiding this comment.
@Annebirzin , how does this look for alignment?
The color change turns out to be more complicated to do nicely (https://cockroachlabs.slack.com/archives/C0182T8NUH0/p1636579523023600 if anyone is interested in the implementation details). I'll break it into a separate issue!
@cockroachdb/sql-observability I've added tests, made a minor UI change, and made the fixtures prettier. Reviewable also seems to detect changes in full.md in the latest push that I'm not seeing on github; does anyone have an idea what's going on?
@laurenbarker, could I get your review for index.spec.tsx? The test for this PR doesn't require any redux stuff, so I've left out the testHelper.tsx port from this PR and will work on that later.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @maryliag)
pkg/ui/workspaces/db-console/src/views/jobs/index.spec.tsx, line 19 at r3 (raw file):
allJobsFixture, retryRunningJobFixture, } from "src/views/jobs/jobTable.fixture";
What do people think about sharing these fixtures between storybook and tests?
Pros:
- The fixture for what a running job looks like already exists, so might as well reuse it. It reduces boilerplate and having to re-specify all the fields in the test
- Having the storybook fixtures used in test makes it less likely that the storybook fixtures will diverge and break
Cons:
- It is not obvious that the fixtures are used in both. One might imagine wanting to change the storybook fixture just because, and being surprised that it affects tests
|
@jocrl alignment looks great, thanks for the update! |
6bd9bee to
a6a9776
Compare
fa7d3d1 to
af2515c
Compare
maryliag
left a comment
There was a problem hiding this comment.
Reviewed 1 of 4 files at r2, 11 of 14 files at r3, 2 of 2 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jocrl, @laurenbarker, and @maryliag)
pkg/ui/workspaces/db-console/src/views/jobs/index.spec.tsx, line 19 at r3 (raw file):
Previously, jocrl (Josephine) wrote…
What do people think about sharing these fixtures between storybook and tests?
Pros:
- The fixture for what a running job looks like already exists, so might as well reuse it. It reduces boilerplate and having to re-specify all the fields in the test
- Having the storybook fixtures used in test makes it less likely that the storybook fixtures will diverge and break
Cons:
- It is not obvious that the fixtures are used in both. One might imagine wanting to change the storybook fixture just because, and being surprised that it affects tests
I think it makes sense to use the same. If a change is made with storybook in mind and that breaks a test, that's good, because we caught that. The ui and tests should be aligned anyway
jocrl
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @laurenbarker, and @maryliag)
pkg/ui/workspaces/db-console/src/views/jobs/index.spec.tsx, line 19 at r3 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
I think it makes sense to use the same. If a change is made with storybook in mind and that breaks a test, that's good, because we caught that. The ui and tests should be aligned anyway
Mmm, yeah I'm thinking there are two types of "storybook change breaks a test", one which is good to catch and one which we don't care about and could be annoying.
E.g., let's say there's a tooltip that if you hover will show the job description, in this case "automatic SQL Stats compaction". The test is like
// pseudocode
renderTheComponent()
hoverTheComponent()
verifyThatThisTextAppears("automatic SQL Stats compaction")
A storybook change could break verifyThatThisTextAppears in two ways
- Something about the new storybook inputs makes it so that the tooltip no longer appears. This is good to catch 😃
- The storybook changes the description to
"BACKUP DATABASE bank TO 'gs://acme-co-backup/database-bank-2017-03-29-nightly' AS OF SYSTEM TIME '-10s' INCREMENTAL FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly', 'gs://acme-co-backup/database-bank-2017-03-28-nightly' WITH revision_history", or some other different text, to be able to front end develop on whether the page works with a really long tooltip. The tooltip still works, but the test breaks because the text changes.
So yeah, I think 1) is legit and useful. 2) I'm concerned that it constrains the ability of storybook to be a sandbox and causes flaky test churn
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jocrl, and @laurenbarker)
pkg/ui/workspaces/db-console/src/views/jobs/index.spec.tsx, line 19 at r3 (raw file):
Previously, jocrl (Josephine) wrote…
Mmm, yeah I'm thinking there are two types of "storybook change breaks a test", one which is good to catch and one which we don't care about and could be annoying.
E.g., let's say there's a tooltip that if you hover will show the job description, in this case "automatic SQL Stats compaction". The test is like
// pseudocode renderTheComponent() hoverTheComponent() verifyThatThisTextAppears("automatic SQL Stats compaction")A storybook change could break
verifyThatThisTextAppearsin two ways
- Something about the new storybook inputs makes it so that the tooltip no longer appears. This is good to catch 😃
- The storybook changes the description to
"BACKUP DATABASE bank TO 'gs://acme-co-backup/database-bank-2017-03-29-nightly' AS OF SYSTEM TIME '-10s' INCREMENTAL FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly', 'gs://acme-co-backup/database-bank-2017-03-28-nightly' WITH revision_history", or some other different text, to be able to front end develop on whether the page works with a really long tooltip. The tooltip still works, but the test breaks because the text changes.So yeah, I think 1) is legit and useful. 2) I'm concerned that it constrains the ability of storybook to be a sandbox and causes flaky test churn
I see, in that case we don't want to create new points of failures for tests. Let keep as is and think for future improvements we can make on this area, but that will not be on the scope of this PR
2c22b4f to
9883c4d
Compare
|
Updated images on description to reflect the latest state |
|
Hi @laurenbarker, friendly bump if I could get your review for |
a41047f to
e502248
Compare
… in the DBConsole Jobs Overview page Fixes cockroachdb#68179 This commit surfaces the status `reverting`, annotates existing `running` and `reverting` statuses UI with "retrying" where applicable, and adds the "Last Execution Time (UTC)" and "Execution Count" columns to the jobs overview table in db console. "Retrying" is defined as `status IN ('running', 'reverting') AND next_run > now() AND num_runs > 1`. Hovering a retrying status shows the next execution time. The "Status" column was also moved left to the second column. Filtering using the dropdown by `Status: Running` or `Status: Reverting` will include those that are also "retrying". Users can also filter by `Status: Retrying`. The `/jobs` endpoint was modified to add the `last_run`, `next_run`, and `num_runs` fields required for the UI change. Jobs with status `running` or `reverting` and are also "retrying" have their statuses sent as `retry-running` and `retry-reverting` respectively. The endpoint was also modified to support the value `retrying` for the `status` query parameter. This commit also adds a storybook story for the jobs table, which showcases the different possible statuses in permutations of information that could be present for the `running` status. Release note (ui change): The jobs overview table in DBConsole now shows when jobs have the status "reverting", and shows the badge "retrying" when running or reverting jobs are also retrying. Hovering the status for a "retrying" job will show the "Next execution time" in UTC. Two new columns, "Last Execution Time (UTC)" and "Execution Count", were also added to the jobs overview table in DBConsole, and the "Status" column was moved left to the second column in the table. The `status` query parameter in the `/jobs` endpoint now supports the values `reverting` and `retrying`.
e502248 to
f0d0467
Compare
laurenbarker
left a comment
There was a problem hiding this comment.
Apologize for the delay! I need to setup my notifications for this repo. The test looks good to me, I just had one question about the necessity of pulling out the helper.
My approval is for the test as I didn't review the rest of the PR :)
Reviewed 1 of 14 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jocrl, and @maryliag)
pkg/ui/workspaces/db-console/src/test-utils/tooltip.ts, line 13 at r8 (raw file):
import { expect } from "chai"; export const expectPopperTooltipActivated = () =>
Are you planning on using this in other tests? Based on only this PR is seems like an early optimization.
pkg/ui/workspaces/db-console/src/views/jobs/index.spec.tsx, line 91 at r8 (raw file):
await waitFor(expectPopperTooltipActivated); userEvent.hover(getByText("retrying"));
Glad this works! We've had inconsistent behavior for triggering hover events in tests depending on the tooltip library.
|
pkg/ui/workspaces/db-console/src/test-utils/tooltip.ts, line 13 at r8 (raw file): Previously, laurenbarker (Lauren Barker) wrote…
Not currently planning, but yeah it's an early optimization on the basis that the tooltip is a reusable component. Would you suggest not making that optimization without a current use case? |
maryliag
left a comment
There was a problem hiding this comment.
Can you also add the backport 21.2 label to this? We will want to backport this :)
Reviewed 1 of 14 files at r3, 3 of 4 files at r6, 5 of 8 files at r7, 3 of 3 files at r9, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @laurenbarker)
|
bors r+ |
|
Build succeeded: |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from f0d0467 to blathers/backport-release-21.2-72291: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
|
I'm now realizing that this is also
@maryliag, should I correct these somehow? |
|



Fixes #68179
This commit surfaces the status
reverting, annotates existingrunningandrevertingstatuses UI with "retrying" where applicable, and adds the "LastExecution Time (UTC)" and "Execution Count" columns to the jobs overview table
in db console. "Retrying" is defined as
status IN ('running', 'reverting') AND next_run > now() AND num_runs > 1.Hovering a retrying status shows the next execution time. The "Status" column
was also moved left to the second column. Filtering using the dropdown by
Status: RunningorStatus: Revertingwill include those that arealso "retrying". Users can also filter by
Status: Retrying.The
/jobsendpoint was modified to add thelast_run,next_run, andnum_runsfields required for the UI change. Jobs with statusrunningorrevertingand are also "retrying" have their statuses sent asretry-runningand
retry-revertingrespectively. The endpoint was also modified to supportthe value
retryingfor thestatusquery parameter.This commit also adds a storybook story for the jobs table, which showcases the
different possible statuses in permutations of information that could be
present for the
runningstatus.Release note (ui change): The jobs overview table in DBConsole now shows when
jobs have the status "reverting", and shows the badge "retrying" when running
or reverting jobs are also retrying. Hovering the status for a "retrying" job
will show the "Next execution time" in UTC. Two new columns, "Last Execution
Time (UTC)" and "Execution Count", were also added to the jobs overview table
in DBConsole, and the "Status" column was moved left to the second column in
the table.
The
statusquery parameter in the/jobsendpoint now supports the valuesrevertingandretrying.Jobs table:

Filter and hover:
https://user-images.githubusercontent.com/91907326/141375153-2cf2641a-33a1-4bfb-a900-a187dc5579a1.mov
Permutations of running jobs with present/absent combinations of time remaining, running message, or retrying:
