sql: clean up mutable not-null columns hack #74922

Merged
craig[bot] merged 1 commit into cockroachdb:master from RaduBerinde:remove-readable-hack
Jan 19, 2022

@RaduBerinde RaduBerinde commented Jan 18, 2022

Mutation columns in some cases need to be scanned even if they haven't
been backfilled yet, which means that we may retrieve NULL values even
if they are marked as not-nullable.

We currently have a hack in the table descriptor which changes the
nullable flags in the column descriptors when ReadableColumns() is
used. It is very surprising that we can get different descriptors for
a given ColumnID depending on whether we look for it in ReadableColumns() or
in AllColumns() (e.g. via FindColumnWithID).

This commit cleans this up, changing the scanning code to check for
Public() instead.

Release note: None
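
To illustrate the idea in the description, here is a minimal sketch of deciding at scan time whether a column may yield NULL by checking whether it is public, rather than by rewriting the nullable flag in the descriptor. The `Column` type and the `MayContainNull` helper are hypothetical simplifications for illustration; they are not the actual `catalog` interfaces touched by this PR.

```go
package main

import "fmt"

// Column is a hypothetical, simplified stand-in for a catalog column.
type Column struct {
	ID       int
	Name     string
	nullable bool
	public   bool
}

// IsNullable reports the nullability declared in the descriptor. It is the
// same no matter how the column was looked up.
func (c Column) IsNullable() bool { return c.nullable }

// Public reports whether the column is fully public, i.e. not a mutation
// column that may still be waiting for its backfill.
func (c Column) Public() bool { return c.public }

// MayContainNull reports whether a scan must be prepared to see a NULL value
// for the column: either the column is declared nullable, or it is not public
// yet and therefore may not have been backfilled.
func MayContainNull(c Column) bool {
	return c.IsNullable() || !c.Public()
}

func main() {
	cols := []Column{
		{ID: 1, Name: "k", nullable: false, public: true},
		{ID: 2, Name: "v", nullable: false, public: false}, // being added
	}
	for _, c := range cols {
		fmt.Printf("%s: declared nullable=%t, scan may see NULL=%t\n",
			c.Name, c.IsNullable(), MayContainNull(c))
	}
}
```

With this shape the descriptor for a given column ID stays the same everywhere, and only the scanning code carries the extra check.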

@RaduBerinde RaduBerinde requested review from a team, postamar and yuzefovich January 18, 2022 02:16
@cockroach-teamcity

This change is Reviewable

@yuzefovich yuzefovich left a comment

LGTM, but I'll defer to @postamar for approval.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar and @RaduBerinde)


-- commits, line 6 at r1:
nit: s/market/marked/.


-- commits, line 12 at r1:
nit: missing closing parenthesis.


pkg/sql/catalog/tabledesc/column.go, line 318 at r1 (raw file):

		c.nonDrop = c.public
	} else {
		//readableDescs := make([]descpb.ColumnDescriptor, 0, numMutations)

Should this be now removed?


@RaduBerinde RaduBerinde left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar and @yuzefovich)


pkg/sql/catalog/tabledesc/column.go, line 318 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Should this be now removed?

Oops, done.

@fqazi fqazi left a comment

LGTM. Much cleaner than the original hacky code.

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @yuzefovich)

@fqazi fqazi left a comment

:lgtm:

Much cleaner than the original hacky code.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @yuzefovich)

@RaduBerinde

TFTR!

bors r+

@craig

craig bot commented Jan 19, 2022

Build succeeded:

@craig craig bot merged commit eb26eb3 into cockroachdb:master Jan 19, 2022
@RaduBerinde RaduBerinde deleted the remove-readable-hack branch January 19, 2022 20:48
@postamar

Thanks for doing this!

craig bot pushed a commit that referenced this pull request Jan 21, 2022
74318: tracing: add /debug/tracez rendering the active spans  r=andreimatei a=andreimatei

`/debug/tracez` lets users take a snapshot of the active spans registry
and render the new snapshot, or one of the previously taken snapshots.
The Tracer can hold up to 10 snapshots in memory.
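
As a rough illustration of the "up to 10 snapshots in memory" behavior, the sketch below keeps a bounded list of snapshots and evicts the oldest one when a new snapshot would exceed the limit. The `SnapshotStore` and `Snapshot` types are hypothetical and assume oldest-first eviction; the actual Tracer data structures may differ.

```go
package main

import "fmt"

// maxSnapshots is the limit stated in the PR description.
const maxSnapshots = 10

// Snapshot is a hypothetical, simplified capture of the active spans.
type Snapshot struct {
	ID    int
	Spans []string
}

// SnapshotStore keeps at most maxSnapshots snapshots, oldest first.
type SnapshotStore struct {
	nextID int
	snaps  []Snapshot
}

// Take records a new snapshot and returns its ID. If the store is full, the
// oldest snapshot is dropped to make room.
func (s *SnapshotStore) Take(spans []string) int {
	s.nextID++
	// Copy the spans so later changes by the caller do not alter the snapshot.
	snap := Snapshot{ID: s.nextID, Spans: append([]string(nil), spans...)}
	if len(s.snaps) == maxSnapshots {
		copy(s.snaps, s.snaps[1:])
		s.snaps = s.snaps[:maxSnapshots-1]
	}
	s.snaps = append(s.snaps, snap)
	return snap.ID
}

// Get returns a previously taken snapshot if it is still retained.
func (s *SnapshotStore) Get(id int) (Snapshot, bool) {
	for _, snap := range s.snaps {
		if snap.ID == id {
			return snap, true
		}
	}
	return Snapshot{}, false
}

func main() {
	var store SnapshotStore
	for i := 0; i < 12; i++ {
		store.Take([]string{fmt.Sprintf("span-%d", i)})
	}
	_, oldKept := store.Get(1)
	_, newKept := store.Get(12)
	fmt.Println("snapshot 1 retained:", oldKept, "snapshot 12 retained:", newKept)
}
```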

It looks like this:
![Screenshot from 2022-01-04 19-03-39](https://user-images.githubusercontent.com/377201/148140272-306658d5-5b9c-4f2a-b59c-28df9c5ed10c.png)


When visualizing a snapshot, the page lets you do a number of things:
1. List all the spans.
2. See the (current) stack trace for each span's goroutine (if the
   goroutine was still running at the time when the snapshot was
   captured). Stack traces can be toggled visible/hidden.
3. Sort the spans by name or start time.
4. Filter the spans with a text search. The search works across
   the name and stack trace.
5. Go from a span to the full trace containing that span.

For the JavaScript that provides table sorting and filtering, this patch
embeds the library from https://listjs.com/ .

Limitations:
- for now, only the registry of the local node is snapshotted. In the
  future I'll collect info from all nodes.
- for now, the relationships between different spans are not represented
  in any way. I'll work on the ability to go from a span to the whole
  trace that the span is part of.
- for now, tags and structured and unstructured log messages that a span
  might have are not displayed in any way.

At the moment, span creation is not enabled in production by default
(i.e. the Tracer is put in `TracingModeOnDemand` by default, instead of
the required `TracingModeActiveSpansRegistry`). This patch does not change
that, so in order to benefit from /debug/tracez in all its glory, one
has to run with `COCKROACH_REAL_SPANS=1` for now. Not for long, though.

Release note: None

74867: sql: Support CREATE DATABASE WITH OWNER r=Fenil-P a=Fenil-P

fixes #67817

Release note (sql change): Allow users to specify the owner when creating a database,
similar to PostgreSQL: CREATE DATABASE name [ [ WITH ] [ OWNER [=] user_name ] ]



74871: sql: add a tracing tag with the txn ID r=andreimatei a=andreimatei

This patch adds the txn's ID as a tag to the tracing span representing a
SQL txn. I'm creating a UI to explore the current spans, and this ID
will make it easy to navigate between a query/request blocking on a lock
held by some other txn, and the activity of that other txn.

Release note: None

75114: sql: directly specify columns in TableReader r=RaduBerinde a=RaduBerinde

~Note: the first commit is #74922.~

The internal columns of the TableReader (as well as the row fetcher)
are all the columns of the table, with only a subset of values
actually produced. This design decision has been carried over way past
the point where it makes sense (I admit, it's questionable whether it
ever made sense). For one, "all the columns" is ambiguous (does it
contain non-public columns? does it include system columns?) leading
to various flags and inherent fragility. Second, it relies on the
execution engine to figure out (based on the PostProcessSpec) which
columns are actually needed, which the optimizer already figures out
for us now.

This commit changes the TableReader spec and the interface of
row.Fetcher to always produce a given specific set of column IDs. The
diagram for table readers now specifies the columns by name.

The JoinReader, InvertedJoiner, ZigzagJoiner are not changed in this
commit (but they should be cleaned up as well).

Release note: None
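
A minimal sketch of the "produce a given specific set of column IDs" contract follows. The `Row` map and `FetchColumns` function are hypothetical stand-ins used only to show the shape of the interface; they are not the actual `row.Fetcher` API or the TableReader spec.

```go
package main

import "fmt"

// ColumnID identifies a column in the table.
type ColumnID int

// Row is a hypothetical stored row, keyed by column ID.
type Row map[ColumnID]interface{}

// FetchColumns returns the values for exactly the requested column IDs, in
// the requested order. The caller never sees "all the columns" of the table,
// so there is no ambiguity about non-public or system columns. A column that
// has no stored value yields nil.
func FetchColumns(row Row, wanted []ColumnID) []interface{} {
	out := make([]interface{}, len(wanted))
	for i, id := range wanted {
		out[i] = row[id]
	}
	return out
}

func main() {
	row := Row{1: "a", 2: 7, 3: true}
	// Only columns 3 and 1 are needed, in that order.
	fmt.Println(FetchColumns(row, []ColumnID{3, 1}))
}
```

In this shape the set of needed columns is decided by the caller (the optimizer, per the description above) and the fetcher does not have to infer it from a post-processing spec.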


75175: colfetcher: fix the bytes read statistic collection r=yuzefovich a=yuzefovich

During the 21.2 release we adjusted the `cFetcher` to be `Close`d eagerly
when it is returning the zero-length batch. This was done in order to
release some references so that the memory could be GCed sooner;
additionally, the `cFetcher` started being used for the index join where
the fetcher is restarted from scratch for every batch of spans, so it
seemed reasonable to close it automatically.

However, that eager closure broke "bytes read" statistic collection
since the `row.KVFetcher` was responsible for providing it, and we were
zeroing it out. This commit fixes this problem by having the `cFetcher`
remember, in `Close`, the number of bytes it has read. Some care needs
to be taken to not double-count the bytes read in the index join, so
a couple of helper methods have been introduced.

Additionally this commit applies the same eager-close optimization to
the `cFetcher` when the last batch is returned (which makes it so that
if we've just exhausted all KVs, we close the fetcher - previously, we
would set the zero length on the batch and might never get into
`stateFinished`).

Fixes: #75128.

Release note (bug fix): Previously, CockroachDB could incorrectly report
the `KV bytes read` statistic in `EXPLAIN ANALYZE` output. The bug is
present only in 21.2.x versions.
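
The fix described above can be sketched roughly as follows: the fetcher saves the byte count in `Close` before dropping the lower-level fetcher, and reports the saved total plus whatever the currently open fetcher has read. The `fetcher` and `kvFetcher` types here are hypothetical simplifications, and the accumulate-across-restarts approach is an assumption for illustration; the actual commit uses its own helper methods to avoid double counting in the index join.

```go
package main

import "fmt"

// kvFetcher is a hypothetical stand-in for the lower-level fetcher that
// counts the bytes it reads.
type kvFetcher struct {
	bytesRead int64
}

// fetcher is a hypothetical stand-in for a fetcher that is closed eagerly
// and may be restarted for each batch of spans.
type fetcher struct {
	kv *kvFetcher
	// bytesReadAtClose holds the bytes read by kv fetchers that have
	// already been released in Close.
	bytesReadAtClose int64
}

// Start begins a new scan with a fresh lower-level fetcher.
func (f *fetcher) Start() { f.kv = &kvFetcher{} }

// Read simulates reading n bytes, starting a scan if none is open.
func (f *fetcher) Read(n int64) {
	if f.kv == nil {
		f.Start()
	}
	f.kv.bytesRead += n
}

// Close releases the lower-level fetcher but first remembers how many bytes
// it read, so the statistic survives the eager close. Calling Close twice
// does not count the bytes twice.
func (f *fetcher) Close() {
	if f.kv != nil {
		f.bytesReadAtClose += f.kv.bytesRead
		f.kv = nil
	}
}

// BytesRead returns the total bytes read across closed and open scans.
func (f *fetcher) BytesRead() int64 {
	total := f.bytesReadAtClose
	if f.kv != nil {
		total += f.kv.bytesRead
	}
	return total
}

func main() {
	f := &fetcher{}
	f.Start()
	f.Read(100)
	f.Close() // eager close; the count must not be lost
	f.Start() // restarted for the next batch of spans
	f.Read(50)
	fmt.Println("bytes read:", f.BytesRead())
}
```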

75215: cmd/github-post: fix Pebble metamorphic reproduction command r=jbowens a=jbowens

When posting a GitHub issue for a Pebble metamorphic test failure, include the
correct `-ops` flag.

Discovered because the reproduction command in cockroachdb/pebble#1459
specified too few ops to reproduce the issue.

Release note: none

75228: logictestccl: skip flaky TestCCLLogic/fakedist-metadata/partitioning_enum r=mgartner a=mgartner

Informs #75227

Release note: None

75237: cli,rpc: don't check the active cluster version in the CLI r=andreimatei a=knz

This commit removes a code path that would tickle an assertion failure
if we were to later fix the context propagation in the RPC heartbeat
method (see PR #71243): there's no "active cluster version" in the CLI
and so we can't compare it in a client interceptor.

Release note: None

75254: scripts: add `dev generate --mirror` to `bump-pebble.sh` script r=jbowens a=nicktrav

CI now expects that dependencies are mirrored to cloud storage and will
fail if the TODO for mirroring the repo is left unaddressed in the
`DEPS.bzl` file.

Add a mirroring step to the `bump-pebble.sh` script.

Release note: none

Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Co-authored-by: Fenil Patel <fenil.patel@cockroachlabs.com>
Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Nick Travers <travers@cockroachlabs.com>