Fix problems when altering a column from `binary` to `varbinary` by jorendorff · Pull Request #1628 · github/gh-ost

jorendorff · 2026-02-09T23:10:47Z

Description

The issue is that binary values are incorrectly read from the binlog. This PR fixes it by changing a previously-implemented fix to apply to all binary columns, not only primary keys.

Two new tests are added: a unit test for the function being changed, and a localtest. (Oddly, the existing test in localtests/varbinary doesn't seem to actually involve any varbinary columns.)

contributed code is using same conventions as original code
script/cibuild returns with no formatting errors, build errors or unit test errors.

~~I wasn't able to figure out how to run script/build or script/cibuild. They both try to download and use a vendored Go installation, but this doesn't work for me.~~ Figured it out, it was trivial - my version of Go was too new.

MySQL's binlog strips trailing 0x00 bytes from binary(N) columns. PR github#915 fixed this for unique key columns only, but the same issue affects all binary columns in INSERT/UPDATE operations. Remove the isUniqueKeyColumn condition so all binary(N) columns are padded to their declared length. Fixes a variation of github#909 where the affected column is not a primary key.

Copilot

Pull request overview

This PR addresses incorrect handling of BINARY(N) values read from MySQL binlog events during migrations that alter a column to VARBINARY(M), where trailing 0x00 bytes can be stripped and thus written incorrectly to the ghost table. The fix generalizes the existing padding logic so it applies to all binary columns, not just unique-key columns.

Changes:

Apply trailing-zero padding for all BinaryColumnType columns during argument conversion (not only unique key columns).
Add unit tests covering binary padding behavior in convertArg.
Add a new local integration test (localtests/binary-to-varbinary) to reproduce the binary -> varbinary binlog truncation scenario.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
`go/sql/types.go`	Expands binary padding logic to apply to all binary columns.
`go/sql/types_test.go`	Adds unit tests verifying padding behavior for truncated/full binary values.
`localtests/binary-to-varbinary/create.sql`	Creates a repro scenario with binlog-driven inserts/updates involving trailing-zero binary values.
`localtests/binary-to-varbinary/extra_args`	Supplies an `--alter` that changes `data` from `binary(20)` to `varbinary(32)`.

Comments suppressed due to low confidence (1)

go/sql/types.go:87

The padding logic builds a bytes.Buffer directly from arg2Bytes. Since bytes.NewBuffer can reuse the provided slice as its backing store, Write may mutate the input byte slice in-place (depending on capacity). Now that this code runs for all binary columns, it would be safer to avoid mutating the input by copying into a new buffer/slice before padding (and ideally keep the return type as []byte to avoid unnecessary string conversions).

				buf := bytes.NewBuffer(arg2Bytes)
				for i := uint(0); i < (this.BinaryOctetLength - uint(size)); i++ {
					buf.Write([]byte{0})
				}
				arg = buf.String()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

go/sql/types_test.go

go/sql/types.go

In this case, the input is binary, and the column type is `binary`. So the output should be binary, not text.

meiji163 · 2026-02-10T17:41:35Z

Thanks for the fix. I do wonder why the author of #915 chose to apply the padding only to the unique key 🤔

look · 2026-02-10T23:42:55Z

Thanks for the fix. I do wonder why the author of #915 chose to apply the padding only to the unique key 🤔

I think this comment has a clue:

Filling a high number zero bytes may affect processing efficiency. after thinking about it, we can fix it more lighter as below, other scenarios will not cause data consistency problems. 40bc5ae

only handle the binary column in the unique key

only handle update/delete dml event

Guess that wasn't quite true...

* Fix binary column trailing zero stripping for non-key columns MySQL's binlog strips trailing 0x00 bytes from binary(N) columns. PR #915 fixed this for unique key columns only, but the same issue affects all binary columns in INSERT/UPDATE operations. Remove the isUniqueKeyColumn condition so all binary(N) columns are padded to their declared length. Fixes a variation of #909 where the affected column is not a primary key. * Simplify by removing isUniqueKeyColumn now that it's no longer used. * In convertArg, don't convert binary data to strings. In this case, the input is binary, and the column type is `binary`. So the output should be binary, not text. * fix a lint

* Execute hook on every batch insert retry Co-authored-by: Bastian Bartmann <bastian.bartmann@shopify.com> * Expose the last error message to the onBatchCopyRetry hook Co-authored-by: Bastian Bartmann <bastian.bartmann@shopify.com> * Remove double retries CalculateNextIterationRangeEndValues needs to be recomputed on every retry in case of configuration (e.g. chunk-size) changes were made by onBatchCopyRetry hooks. * include dev.yml (temp for Shopify) * Update doc/hooks.md * Remove dev.yml * Fix retry issue where MigrationIterationRangeMinValues advances before insert completes - extract MigrationContext.SetNextIterationRangeValues outside of applyCopyRowsFunc, so that it doesn't run on retries - add an integration test for Migrator with retry hooks Co-authored-by: Bastian Bartmann <bastian.bartmann@shopify.com> * Add localtest that expects gh-ost to fail on exhausted retries * Rename method * fmt and lint * gofmt * Fix problems when altering a column from `binary` to `varbinary` (#1628) * Fix binary column trailing zero stripping for non-key columns MySQL's binlog strips trailing 0x00 bytes from binary(N) columns. PR #915 fixed this for unique key columns only, but the same issue affects all binary columns in INSERT/UPDATE operations. Remove the isUniqueKeyColumn condition so all binary(N) columns are padded to their declared length. Fixes a variation of #909 where the affected column is not a primary key. * Simplify by removing isUniqueKeyColumn now that it's no longer used. * In convertArg, don't convert binary data to strings. In this case, the input is binary, and the column type is `binary`. So the output should be binary, not text. * fix a lint * Fix 4 trigger handling bugs (#1626) * fix: remove double-transformation in trigger length validation ValidateGhostTriggerLengthBelowMaxLength was calling GetGhostTriggerName on an already-transformed name, adding the suffix twice. This caused valid trigger names (ghost name <= 64 chars) to be falsely rejected. The caller in inspect.go:627 already transforms the name via GetGhostTriggerName before passing it, so the validation function should check the length as-is. Unit tests updated to reflect the correct call pattern: transform first with GetGhostTriggerName, then validate the result. Added boundary tests for exactly 64 and 65 char names. * fix: return error from trigger creation during atomic cut-over During atomic cut-over, if CreateTriggersOnGhost failed, the error was logged but not returned. The migration continued and completed without triggers, silently losing them. The two-step cut-over (line 793) already correctly returns the error. This aligns the atomic cut-over to do the same. * fix: check trigger name uniqueness per schema, not per table validateGhostTriggersDontExist was filtering by event_object_table, only checking if the ghost trigger name existed on the original table. MySQL trigger names are unique per schema, so a trigger with the same name on any other table would block CREATE TRIGGER but pass validation. Remove the event_object_table filter to check trigger_name + trigger_schema only, matching MySQL's uniqueness constraint. * fix: use parameterized query in GetTriggers to prevent SQL injection GetTriggers used fmt.Sprintf with string interpolation for database and table names, causing SQL syntax errors with special characters and potential SQL injection. Switched to parameterized query with ? placeholders, matching the safe pattern already used in inspect.go:553-559. * test: add regression tests for trigger handling bugs Add two integration tests: - trigger-long-name-validation: verifies 60-char trigger names (64-char ghost name) are not falsely rejected by double-transform - trigger-ghost-name-conflict: verifies validation detects ghost trigger name conflicts on other tables in the same schema * style: gofmt context_test.go --------- Co-authored-by: Yakir Gibraltar <yakir.g@taboola.com> Co-authored-by: meiji163 <meiji163@github.com> * fix update of LastIterationRange values --------- Co-authored-by: Jan Grodowski <jan.grodowski@shopify.com>

jorendorff requested review from meiji163, rashiq and timvaillancourt as code owners February 9, 2026 23:10

Copilot AI review requested due to automatic review settings February 9, 2026 23:10

Copilot started reviewing on behalf of jorendorff February 9, 2026 23:11 View session

Copilot AI reviewed Feb 9, 2026

View reviewed changes

go/sql/types_test.go Outdated Show resolved Hide resolved

go/sql/types.go Show resolved Hide resolved

jorendorff added 3 commits February 9, 2026 18:45

Simplify by removing isUniqueKeyColumn now that it's no longer used.

698ae27

In convertArg, don't convert binary data to strings.

480d897

In this case, the input is binary, and the column type is `binary`. So the output should be binary, not text.

fix a lint

ca2118a

meiji163 approved these changes Feb 10, 2026

View reviewed changes

meiji163 merged commit aadbb79 into github:master Feb 10, 2026
9 checks passed

jorendorff deleted the fix-binary-column-padding branch February 10, 2026 18:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix problems when altering a column from `binary` to `varbinary`#1628

Fix problems when altering a column from `binary` to `varbinary`#1628
meiji163 merged 4 commits intogithub:masterfrom
jorendorff:fix-binary-column-padding

jorendorff commented Feb 9, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

meiji163 commented Feb 10, 2026 •

edited

Loading

Uh oh!

Uh oh!

look commented Feb 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

jorendorff commented Feb 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

meiji163 commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

look commented Feb 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jorendorff commented Feb 9, 2026 •

edited

Loading

meiji163 commented Feb 10, 2026 •

edited

Loading