Intermittent AIOOB during GzipReader test #8391

@headius

Description

I have seen this failure appear once in a great while (in CI):

Error: test_gzip_reader_restricted_io(TestZlib): Java::JavaLang::ArrayIndexOutOfBoundsException: arraycopy: last destination index 11 out of bounds for byte[10]
java.base/java.lang.System.arraycopy(Native Method)
org.jruby.dist/org.jruby.util.IOInputStream.read(IOInputStream.java:144)
org.jruby.dist/com.jcraft.jzlib.GZIPInputStream.fill(GZIPInputStream.java:144)
org.jruby.dist/com.jcraft.jzlib.GZIPInputStream.readHeader(GZIPInputStream.java:94)
org.jruby.dist/org.jruby.ext.zlib.JZlibRubyGzipReader.initialize(JZlibRubyGzipReader.java:122)
org.jruby.dist/org.jruby.ext.zlib.JZlibRubyGzipReader.initialize19(JZlibRubyGzipReader.java:150)
org.jruby.dist/org.jruby.ext.zlib.JZlibRubyGzipReader$INVOKER$i$0$1$initialize19.call(JZlibRubyGzipReader$INVOKER$i$0$1$initialize19.gen)
org.jruby.dist/org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:825)
org.jruby.dist/org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:220)
org.jruby.dist/org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:843)
org.jruby.dist/org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:90)
org.jruby.dist/org.jruby.RubyObject.callInit(RubyObject.java:261)
org.jruby.dist/org.jruby.ext.zlib.JZlibRubyGzipReader.newInstance(JZlibRubyGzipReader.java:88)
org.jruby.dist/org.jruby.ext.zlib.RubyGzipFile.wrap19(RubyGzipFile.java:93)
org.jruby.dist/org.jruby.ext.zlib.RubyGzipFile$INVOKER$s$0$1$wrap19.call(RubyGzipFile$INVOKER$s$0$1$wrap19.gen)
org.jruby.dist/org.jruby.ir.targets.indy.InvokeSite.performIndirectCall(InvokeSite.java:725)
org.jruby.dist/org.jruby.ir.targets.indy.InvokeSite.invoke(InvokeSite.java:657)
home.runner.work.jruby.jruby.test.jruby.test_zlib.RUBY$method$test_gzip_reader_restricted_io$57(/home/runner/work/jruby/jruby/test/jruby/test_zlib.rb:369)

The test in question uses a StringIO to receive compressed bytes for "hello", and then uses a duck-typed Object as the read side for GzipReader.wrap to read them out.

When it fails it seems to fail across all of the JRuby suites at the same time. I have been unable to reproduce it locally, and when re-running the CI jobs they always pass.

After reviewing the top several methods in the stack and the test case itself, I have only one theory: sometimes the read call on the IO wrapped by GzipReader returns more bytes than requested, causing the following logic (from IOInputStream) to arraycopy past the end of the buffer:

IRubyObject readValue = readAdapter.call(runtime.getCurrentContext(), io, io, runtime.newFixnum(len));
if (readValue.isNil()) return -1;
ByteList str = readValue.convertToString().getByteList();
System.arraycopy(str.getUnsafeBytes(), str.getBegin(), b, off, str.getRealSize());

If the read call returns a String longer than requested, this code will blindly attempt to write the full length into the array, producing an ArrayIndexOutOfBoundsException like the one seen above. Normally that should not happen, because we expect the underlying IO to be well-behaved.

That's bug #1: if more bytes are returned than requested, we should not blindly write past the end of the buffer. I'm not sure whether that should raise an error or silently discard the excess bytes.
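For illustration, the clamping logic I have in mind looks like this. (The real fix belongs in IOInputStream's Java code; this is just a Ruby sketch, and `bounded_copy` is a hypothetical helper name, not anything in JRuby.)

```ruby
# Hypothetical sketch of the clamping IOInputStream could apply:
# never write more than `len` bytes into the destination buffer,
# regardless of how many bytes the underlying read actually returned.
def bounded_copy(src, dest, off, len)
  n = [src.bytesize, len].min        # clamp to the requested length
  dest[off, n] = src.byteslice(0, n) # copy only the clamped portion
  n                                  # report how many bytes were written
end
```

With this in place, a misbehaving IO that returns 15 bytes when asked for 10 would fill the 10-byte destination and stop, rather than overrunning it.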

Bug #2 is probably in the test itself: why did it produce more bytes than we requested?

The test is here:

def test_gzip_reader_restricted_io
  z = Object.new
  def z.read(size)
    @buf ||= TestZlib.create_gzip_stream("hello")
    @buf.slice!(0, size)
  end
  called = false
  Zlib::GzipReader.wrap(z) { |io|
    assert_equal("hello", io.read)
    called = true
  }
  assert(called)
end

And the create_gzip_stream method is here:

def self.create_gzip_stream(string)
  s = StringIO.new
  Zlib::GzipWriter.wrap(s) { |io|
    io.write(string)
  }
  s.string
end

My theory is that by creating a default StringIO here, we change the meaning of the length requested by JZlib from bytes to characters, so the resulting call to slice! may return more compressed data than requested. The IOInputStream wrapper calls read on the z object, which attempts to slice! that many units from the compressed buffer; but the default encoding for the StringIO's backing string is UTF-8, so the slice size is interpreted as a character count. If any bytes in the buffer happen to form a multibyte UTF-8 sequence, the returned slice will be longer, in bytes, than the requested length.
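The character-versus-byte mismatch is easy to demonstrate in plain Ruby: on a UTF-8 string containing a multibyte character, slice! counts characters, so the returned slice can carry more bytes than the requested size.

```ruby
# "café!" is 5 characters but 6 bytes in UTF-8 ("é" encodes as two bytes).
buf = "caf\u00E9!"

part = buf.slice!(0, 4) # ask for 4 "units"
part.length             # => 4 (characters, as requested)
part.bytesize           # => 5 (bytes, one more than requested)
```

In the test, any multibyte-looking sequence in the compressed buffer triggers the same off-by-N on the byte count.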

Bug #2 can be fixed easily enough: make sure the StringIO's encoding is US-ASCII or BINARY so that string operations are always interpreted in terms of byte lengths, not character lengths.
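With a binary (ASCII-8BIT) buffer, the same slice! counts bytes, so the requested and returned lengths always agree. A minimal sketch of the fix:

```ruby
# Force the buffer to binary, as the test's StringIO should ensure,
# so that lengths are interpreted as bytes rather than characters.
buf = "caf\u00E9!".dup.force_encoding(Encoding::BINARY)

part = buf.slice!(0, 4)
part.bytesize  # => 4, exactly the number of bytes requested
```

One way to get there in the test is to build the StringIO over a binary backing string, e.g. StringIO.new(''.b), so that s.string comes back as BINARY to begin with.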

Bug #1 needs some discussion, I suppose.
