importccl: Preserve '\r\n' during CSV import#28181
importccl: Preserve '\r\n' during CSV import#28181craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
neeral
left a comment
There was a problem hiding this comment.
some of the lint rules are failing because of imported code. e.g. you don't need to check for errors on Write or Flush because you call Error immediately afterwards. This is the semantics of how the csv writer is designed. How can ask the linter to ignore these?
Reviewable status:
complete! 0 of 0 LGTMs obtained
|
Can you post the output of the failed lint rule? I'm not seeing it in the failed TeamCity job. |
neeral
left a comment
There was a problem hiding this comment.
@benesh this is the link to TC lint job:
https://teamcity.cockroachdb.com/viewLog.html?buildTypeId=Cockroach_UnitTests_Check&buildId=810996&branch_Cockroach_UnitTests_Check=28181
[/TestUnused] lint_test.go:879: /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/csv/reader.go:85:2: var ErrTrailingComma is unused (U1000)
lint_test.go:879: /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/csv/reader.go:134:2: field TrailingComma is unused (U1000)
[23:49:08][TestLint] /TestErrCheck
[23:49:08]
[/TestErrCheck] lint_test.go:1145: pkg/util/encoding/csv/example_test.go:121:12: w.WriteAll(records) // calls Flush internally <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer.go:103:11: w.w.Flush() <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer_test.go:69:9: f.Write([]string{"abc"}) <- unchecked error
lint_test.go:1145: pkg/util/encoding/csv/writer_test.go:78:9: f.Write([]string{"abc"}) <- unchecked error
[/TestForbiddenImports] lint_test.go:1031:
github.com/cockroachdb/cockroach/pkg/util/encoding/csv: log <- please use "util/log" instead of "log"
The Unused variables is easy to remove.
The ErrCheck rule conflicts with the semantics of write/flush.
Use of log package means adding contexts.
I don't want to change this stdlib code more than I have to, as it will make it hard to maintain. I'm open to suggestions on how you normally handle this.
Reviewable status:
complete! 0 of 0 LGTMs obtained
neeral
left a comment
There was a problem hiding this comment.
The job ran on revision1 of this PR - in case the line numbers don't align with the current version. I tried to fix some of these but did a bad job.
Reviewable status:
complete! 0 of 0 LGTMs obtained
See cockroachdb#25344. It appears this is being caused by the behaviour of Golang's encoding/csv library, which folds \r\n into \n when reading. This was fixed in golang/go#21201 but then reverted golang/go#22746. It appears based on that second issue that Go is unlikely to change that behavior. Check in the stdlib `encoding/csv` into `pkg/util` with golang/go#22746 reverted. Release note: `\r\n` characters in CSV files were silently converted into `\n`. This causes imported data to be different. This is now fixed.
neeral
left a comment
There was a problem hiding this comment.
I've added test case for this in pkg/ccl/importccl/import_stmt_test.go
Reviewable status:
complete! 0 of 0 LGTMs obtained
|
To suppress the errcheck warnings, I'd be OK with you adding cockroach/pkg/testutils/lint/lint_test.go Line 1133 in 2e4952d If you do so, please add a comment indicating why it's OK. |
|
I too would be fine with exempting from linters to minimize diff from stdlib, but also LGTM as-is. Thanks! |
benesch
left a comment
There was a problem hiding this comment.
I’d rather not list (*bufio.Writer).Flush in errcheck_excludes. It’s not in general safe to ignore its error. It’s only safe here because encoding/csv knows that it can find that error later by calling w.Write(nil). Which actually seems like an undocumented implementation detail to rely on; but whatever, parity with the stdlib definitely wins here.
neeral
left a comment
There was a problem hiding this comment.
-ignorepkg doesn't do what we thought. -ignorepkg ./pkg/util/encoding/csv doesn't work - need to do -ignorepkg bufio which defeats the purpose of what we want to do.
https://github.com/kisielk/errcheck
The -ignorepkg flag takes a comma-separated list of package import paths to ignore
It's not which packages to ignore when running the checks.
If you are both happy with this, could @benesh or @dt give an LGTM and trigger bors?
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/ccl/importccl/import_stmt_test.go, line 179 at r2 (raw file):
create: `t text`, typ: "CSV", data: "\"hello\r\nworld\"\n\"friend\nfoe\"\n\"mr\rmrs\"",
Is this the correct way to handle this?
|
D’oh, sorry to lead you astray. Your approach is definitely the lesser poison. Leaving to @dt to answer your other question and merge. |
dt
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/ccl/importccl/import_stmt_test.go, line 179 at r2 (raw file):
Previously, neeral (Neeral Dodhia) wrote…
Is this the correct way to handle this?
Yeah, looks right to me! It might be a little easier to read using backticks rather than escaping all the quotes, but not enough to justify another CI cycle.
|
bors r+ |
|
👎 Rejected by code reviews |
neeral
left a comment
There was a problem hiding this comment.
@dt could you give an LGTM, bors is complaining no one has approved this PR
Reviewable status:
complete! 0 of 0 LGTMs obtained
|
Sorry, my review was in the way. bors r+ |
|
Thanks for the fix, @neeral! |
Build failed (retrying...) |
Build succeeded |
27868: backport-2.0: storage: prevent unbounded raft log growth without quorum r=nvanbenschoten a=nvanbenschoten Backport 2/2 commits from #27774. /cc @cockroachdb/release --- Fixes #27772. This change adds safeguards to prevent cases where a raft log would grow without bound during loss of quorum scenarios. It also adds a new test that demonstrates that the raft log does not grow without bound in these cases. There are two cases that need to be handled to prevent the unbounded raft log growth observed in #27772. 1. When the leader proposes a command and cannot establish a quorum. In this case, we know the leader has the entry in its log, so there's no need to refresh it with `reasonTicks`. To avoid this, we no longer use `refreshTicks` as a leader. 2. When a follower proposes a command that is forwarded to the leader who cannot establish a quorum. In this case, the follower can't be sure (currently) that the leader got the proposal, so it needs to refresh using `reasonTicks`. However, the leader now detects duplicate forwarded proposals and avoids appending redundant entries to its log. It does so by maintaining a set of in-flight forwarded proposals that it has received during its term as leader. This set is reset after every leadership change. Both of these cases are tested against in the new TestLogGrowthWhenRefreshingPendingCommands. Without both of the safeguards introduced in this commit, the test fails. Release note (bug fix): Prevent loss of quorum situations from allowing unbounded growth of a Range's Raft log. 28225: release-2.0: importccl: Preserve '\r\n' during CSV import r=dt a=dt Backport 1/1 commits from #28181. /cc @cockroachdb/release --- See #25344. Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com> Co-authored-by: neeral <neeral@users.noreply.github.com> Co-authored-by: David Taylor <tinystatemachine@gmail.com>
See #25344.