Fix PGCopyInputStream readFromCopy() returning last row twice and add tests#2019
Conversation
|
Ideally we need this in 42.2.x, then we can cherry-pick it later for 42.3.x |
|
Sounds good. Is there a branch for that? |
There was a problem hiding this comment.
Not a big deal, however, I would suggest "high-level -> lower-level" ordering (== move this join above.
In other words, place methods in such an order that represents their usages in the "above" methods. It helps readers since typically, people read from top to bottom (even when they start reading from the middle of the file :) )
There was a problem hiding this comment.
Oh, it was broken. Please, never use new String(chunkBytes) as byte sequence can contain half-codepoint which would corrupt data. The proper way here is build the full byte array, and convert it to String, or use something like WriterOutputStream from commons-io.
There was a problem hiding this comment.
What are you suggesting I replace it with here? The byte sequence is expected to exactly match the entirety of the string so I'm don't see how it could get anything else in response. Even if it did, it's checking for an exact match after anyway.
There was a problem hiding this comment.
E.g. add readFully(InputStream) helper method and convert everything into String at once.
There was a problem hiding this comment.
Ah I didn't do that because I wanted to count the chunks as well. It's not just testing that the totality of it creates the correct output, but that we had to read N (or N -1) chunks.
There was a problem hiding this comment.
Then collect chunks to List<byte[]>, join them, and then convert to string.
There was a problem hiding this comment.
Please refrain from new String(buf)
There was a problem hiding this comment.
Just wondering. Have you tried new StrangeInputStream(in,...) + full read from the stream?
Technically speaking, if you read from that StrangeInputStream then it would perform random-sized reads to the downstream inputstream, which might uncover more bugs.
There was a problem hiding this comment.
No I just wanted to add something that mixed the two APIs. It's not so much an issue of reading in random amounts as it is mixing the java.io API that allows for arbitrary sized byte reads and the CopyOut API that returns entire rows.
There was a problem hiding this comment.
If you want to mix Java APIs, then you should exercise read() and read(byte[] buf, int off, int siz) methods as well (which StrangeInputStream could probably do automatically)
There was a problem hiding this comment.
I guess this might be factored-out like readFromCopyFully :)
|
Would you please add a note to changelog to |
e998b73 to
148bead
Compare
|
I force pushed a new version rebased atop master. It squashes the test related commits and removed the String join(...) helper method as it's not used anymore. Pass tests and style locally. Once CI gives the okay should be good to merge. |
148bead to
bc8b99a
Compare
|
One more force push to include the changelog entry. |
sorry for the late reply, yes https://github.com/pgjdbc/pgjdbc/tree/release/42.2 |
Refactor PGCopyInputStreamTest and add a failing test for reading using the CopyOut interface method readFromCopy().
bc8b99a to
0037ce1
Compare
|
Okay I switched out the base for release/42.2 and cherry-picked the PR atop it. Once CI passes again I'll merge it in and cherry-pick and merge into master too. |
All Submissions:
New Feature Submissions:
./gradlew autostyleCheck checkstyleAllpass ?Changes to Existing Features:
Fixes #2016.
Actual fix is one line, the rest is cleaning up the tests and adding some new ones to catch this behavior.
Having read through that PGCopyInputStream class a bit more, I really don't like the internals. Half the methods have variables that shadow the private members and it's pretty confusing to understand what it's trying to do. To keep this PR simple I haven't touched any of it here, but I'm planning on submitting a separate PR that eventually cleans that up. Maybe we should deprecate the CopyOut interface half of those classes first though.