I found a bug that only reproduces with a very specific set of prerequisites (all of the following must be true):
- file with CRLF endings
- file with no EOL/trailing newline (removed using a hex editor since vim always adds them back)
- file larger than 32768 bytes
- CSV.foreach
- option strip: true
- option skip_lines: /\A,+\n?\z/
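The three file prerequisites can be met without a hex editor. Here is a minimal sketch, assuming the file name original.csv and the line AAAA1234567890 from the example in this report; the count of 2500 is a stand-in for the approximate line count:

```ruby
# Build a file with CRLF line endings, no trailing newline,
# and a size above 32768 bytes.
line = "AAAA1234567890"
count = 2500 # assumption: the report only says "~2500 times"

# Joining with "\r\n" puts CRLF between lines but none after the last one.
content = Array.new(count, line).join("\r\n")

# binwrite avoids any newline translation on Windows.
File.binwrite("original.csv", content)

puts File.size("original.csv")
```

Writing in binary mode keeps the bytes exactly as built, so the missing final EOL survives.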
The following example (where original.csv is a file containing the line AAAA1234567890 ~2500 times):
require 'csv'

CSV.foreach('original.csv', strip: true, skip_lines: /\A,+\n?\z/) do |data|
  puts data[0]
end
will print the last line duplicated:
AAAA1234567890
AAAA1234567890
AAAA1234567890
AAAA1234567890
AAAA1234567890AAAA1234567890
...
As a workaround I used CSV.parse(File.read, ...) with the same options, but I still wanted to flag this issue.
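The workaround might look like the sketch below. The path passed to File.read is an assumption (the report elides the arguments). The stand-in file is created only so the snippet runs on its own.

```ruby
require "csv"

path = "original.csv" # assumed: same file as in the failing example

# Stand-in data (CRLF endings, no trailing newline) if the file is missing.
unless File.exist?(path)
  File.binwrite(path, Array.new(2500, "AAAA1234567890").join("\r\n"))
end

# Read the whole file into a String first, then parse it with the same
# options that were passed to CSV.foreach.
rows = CSV.parse(File.read(path), strip: true, skip_lines: /\A,+\n?\z/)

rows.each { |data| puts data[0] }
```

This loads the entire file into memory, which is fine at this size but may not suit very large inputs.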