Intermittent AIOOB during GzipReader test #8391

@headius

Description

I have seen this failure appear once in a great while (in CI):

Error: test_gzip_reader_restricted_io(TestZlib): Java::JavaLang::ArrayIndexOutOfBoundsException: arraycopy: last destination index 11 out of bounds for byte[10]
java.base/java.lang.System.arraycopy(Native Method)
org.jruby.dist/org.jruby.util.IOInputStream.read(IOInputStream.java:144)
org.jruby.dist/com.jcraft.jzlib.GZIPInputStream.fill(GZIPInputStream.java:144)
org.jruby.dist/com.jcraft.jzlib.GZIPInputStream.readHeader(GZIPInputStream.java:94)
org.jruby.dist/org.jruby.ext.zlib.JZlibRubyGzipReader.initialize(JZlibRubyGzipReader.java:122)
org.jruby.dist/org.jruby.ext.zlib.JZlibRubyGzipReader.initialize19(JZlibRubyGzipReader.java:150)
org.jruby.dist/org.jruby.ext.zlib.JZlibRubyGzipReader$INVOKER$i$0$1$initialize19.call(JZlibRubyGzipReader$INVOKER$i$0$1$initialize19.gen)
org.jruby.dist/org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:825)
org.jruby.dist/org.jruby.internal.runtime.methods.DynamicMethod.call(DynamicMethod.java:220)
org.jruby.dist/org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:843)
org.jruby.dist/org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:90)
org.jruby.dist/org.jruby.RubyObject.callInit(RubyObject.java:261)
org.jruby.dist/org.jruby.ext.zlib.JZlibRubyGzipReader.newInstance(JZlibRubyGzipReader.java:88)
org.jruby.dist/org.jruby.ext.zlib.RubyGzipFile.wrap19(RubyGzipFile.java:93)
org.jruby.dist/org.jruby.ext.zlib.RubyGzipFile$INVOKER$s$0$1$wrap19.call(RubyGzipFile$INVOKER$s$0$1$wrap19.gen)
org.jruby.dist/org.jruby.ir.targets.indy.InvokeSite.performIndirectCall(InvokeSite.java:725)
org.jruby.dist/org.jruby.ir.targets.indy.InvokeSite.invoke(InvokeSite.java:657)
home.runner.work.jruby.jruby.test.jruby.test_zlib.RUBY$method$test_gzip_reader_restricted_io$57(/home/runner/work/jruby/jruby/test/jruby/test_zlib.rb:369)

The test in question uses a StringIO to receive compressed bytes for "hello", and then uses a duck-typed Object as the read side for GzipReader.wrap to read them out.

When it fails it seems to fail across all of the JRuby suites at the same time. I have been unable to reproduce it locally, and when re-running the CI jobs they always pass.

After reviewing the top several methods in the stack and the test case itself, I have only one theory: sometimes the read call on the IO wrapped by GzipReader returns more bytes than requested, causing the following logic (from IOInputStream) to arraycopy past the end of the buffer:

IRubyObject readValue = readAdapter.call(runtime.getCurrentContext(), io, io, runtime.newFixnum(len));
if (readValue.isNil()) return -1;
ByteList str = readValue.convertToString().getByteList();
System.arraycopy(str.getUnsafeBytes(), str.getBegin(), b, off, str.getRealSize());

If the read call returns a String longer than requested, this code will blindly attempt to write the full length into the array, producing an ArrayIndexOutOfBoundsException like the one seen above. Normally that should not happen, because we expect the underlying IO to be well-behaved.

That's bug #1: if more bytes are returned than requested, we should not blindly write past the end of the buffer. I'm not sure whether that should raise an error or silently discard the excess bytes.
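For illustration, the clamping logic I have in mind looks like this. (The real fix belongs in IOInputStream's Java code; this is just a Ruby sketch, and `bounded_copy` is a hypothetical helper name, not anything in JRuby.)

```ruby
# Hypothetical sketch of the clamping IOInputStream could apply:
# never write more than `len` bytes into the destination buffer,
# regardless of how many bytes the underlying read actually returned.
def bounded_copy(src, dest, off, len)
  n = [src.bytesize, len].min        # clamp to the requested length
  dest[off, n] = src.byteslice(0, n) # copy only the clamped portion
  n                                  # report how many bytes were written
end
```

With this in place, a misbehaving IO that returns 15 bytes when asked for 10 would fill the 10-byte destination and stop, rather than overrunning it.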

Bug #2 is probably in the test itself: why did it produce more bytes than we requested?

The test is here:

def test_gzip_reader_restricted_io
  z = Object.new
  def z.read(size)
    @buf ||= TestZlib.create_gzip_stream("hello")
    @buf.slice!(0, size)
  end
  called = false
  Zlib::GzipReader.wrap(z) { |io|
    assert_equal("hello", io.read)
    called = true
  }
  assert(called)
end

And the create_gzip_stream method is here:

def self.create_gzip_stream(string)
  s = StringIO.new
  Zlib::GzipWriter.wrap(s) { |io|
    io.write(string)
  }
  s.string
end

My theory is that by creating a default StringIO here, we change the meaning of the length requested by JZlib from bytes to characters, so the resulting call to slice! may return more compressed data than requested. The IOInputStream wrapper calls read on the z object, which attempts to slice! that many units from the compressed buffer; but the default encoding for the StringIO's backing string is UTF-8, so the slice size is interpreted as a character count. If any bytes in the buffer happen to form a multibyte UTF-8 sequence, the returned slice will be longer, in bytes, than the requested length.
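The character-versus-byte mismatch is easy to demonstrate in plain Ruby: on a UTF-8 string containing a multibyte character, slice! counts characters, so the returned slice can carry more bytes than the requested size.

```ruby
# "café!" is 5 characters but 6 bytes in UTF-8 ("é" encodes as two bytes).
buf = "caf\u00E9!"

part = buf.slice!(0, 4) # ask for 4 "units"
part.length             # => 4 (characters, as requested)
part.bytesize           # => 5 (bytes, one more than requested)
```

In the test, any multibyte-looking sequence in the compressed buffer triggers the same off-by-N on the byte count.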

Bug #2 can be fixed easily enough: make sure the StringIO's encoding is US-ASCII or BINARY so that string operations are always interpreted in terms of byte lengths, not character lengths.
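With a binary (ASCII-8BIT) buffer, the same slice! counts bytes, so the requested and returned lengths always agree. A minimal sketch of the fix:

```ruby
# Force the buffer to binary, as the test's StringIO should ensure,
# so that lengths are interpreted as bytes rather than characters.
buf = "caf\u00E9!".dup.force_encoding(Encoding::BINARY)

part = buf.slice!(0, 4)
part.bytesize  # => 4, exactly the number of bytes requested
```

One way to get there in the test is to build the StringIO over a binary backing string, e.g. StringIO.new(''.b), so that s.string comes back as BINARY to begin with.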

Bug #1 needs some discussion, I suppose.
