Skip to content

Only use updated schema if auto update is enabled#26752

Merged
reuvenlax merged 1 commit intoapache:masterfrom
bvolpato:flush-updated-schema
May 19, 2023
Merged

Only use updated schema if auto update is enabled#26752
reuvenlax merged 1 commit intoapache:masterfrom
bvolpato:flush-updated-schema

Conversation

@bvolpato
Copy link
Copy Markdown
Contributor

@bvolpato bvolpato commented May 18, 2023

Updating the schema on-the-fly may cause problems and we need to improve.

This change avoids that problem in case autoSchemaUpdate is disabled, as it shouldn't be considering new schemas anyways.

@github-actions
Copy link
Copy Markdown
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @kennknowles for label java.
R: @pabloem for label io.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@bvolpato bvolpato changed the base branch from release-2.47.0 to master May 18, 2023 05:18
@bvolpato bvolpato force-pushed the flush-updated-schema branch from 2b6dda4 to 11de442 Compare May 18, 2023 05:18
@riteshghorse riteshghorse added this to the 2.48.0 Release milestone May 18, 2023
@Abacn
Copy link
Copy Markdown
Contributor

Abacn commented May 18, 2023

Run Java_GCP_IO_Direct PreCommit

@@ -691,7 +691,7 @@ void postFlush() {
@Nullable
TableSchema updatedTableSchema =
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, we actually can only request for updatedTableSchema if autoUpdateSchema, that is include

          TableSchema updatedTableSchema =
              (streamAppendClient != null) ? streamAppendClient.getUpdatedSchema() : null;
          if (updatedTableSchema != null) {
            invalidateWriteStream();
            appendClientInfo =
                Preconditions.checkStateNotNull(getAppendClientInfo(false, updatedTableSchema));
          }

in an if block.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

This would make it clearer, but not needed here for now -- streamAppendClient.getUpdatedSchema() does not make external calls, it's pulling a reference that already exists, so there's no practical difference for changing this.

We'll be addressing the bug when the update has to actually happen as part of another PR, and we can consider this path there. WDYT?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly refer to StorageApiWritesShardedRecords, beside this code path, it used a if clause in the outside:

would be nice to keep Sharded/Unsharded have similar logic.

there are some other code path involve autoUpdateSchema, while this class member is completely unused in StorageApiWritesUnShardedRecords

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants