rbByteEncode fails to no-op when encodings are the same #8686

While investigating fixes for #8682 I discovered that rbByteEncode does not properly handle the case of source and destination encoding being the same.

As described in https://github.com/jruby/jruby/issues/8682#issuecomment-2709893370:

IOOutputStream calls RubyIO.write and passes only the incoming bytes, offsets, and encoding:

https://github.com/jruby/jruby/blob/14ec399968bc9c1c2ff618f12be16b909d3e521a/core/src/main/java/org/jruby/util/IOOutputStream.java#L152-L163

This path was added as an optimization in https://github.com/jruby/jruby/commit/236f7ba25168af78b5b1635a828d6bc968c264d8 to avoid constructing string objects just to immediately unwrap them for IO writes.

Unfortunately, this path eventually hits a bug in EncodingUtils.rbByteEncode: when the source and destination encodings are identical, the call does not no-op but instead raises an encoding error, because transcoding from an encoding to itself is rejected.

The short-circuit checks here:

https://github.com/jruby/jruby/blob/844c1510f8b704925189b248873f54f49bdb569f/core/src/main/java/org/jruby/util/io/EncodingUtils.java#L983-L989

...were never updated when the sister logic for RubyString was updated in https://github.com/jruby/jruby/commit/621369cae0901d3f87ad7ccd27cb544cc7537017. The correct logic, with additional refactoring, looks like this:

https://github.com/jruby/jruby/blob/844c1510f8b704925189b248873f54f49bdb569f/core/src/main/java/org/jruby/util/io/EncodingUtils.java#L1094-L1101

Ignoring the problems with IOOutputStream only being able to specify a single encoding for all String writes, we need to fix this issue in rbByteEncode and ensure same-encoding calls no-op the same way as rbStrEncode.

IOOutputStream.java, lines 152-163:

	@Override
	public void write(final byte[] b,final int off, final int len) throws IOException {
	    ThreadContext context = runtime.getCurrentContext();

	    RubyIO realIO = this.realIO;
	    if (realIO != null) {
	        realIO.write(context, b, off, len, encoding);
	    } else {
	        IRubyObject io = this.io;
	        writeAdapter.call(context, io, io, RubyString.newStringLight(runtime, new ByteList(b, off, len, encoding, false)));
	    }
	}

EncodingUtils.java, lines 983-989 (the short-circuit checks in rbByteEncode):

	if (encoding.isAsciiCompatible() && to.isAsciiCompatible()) {
	    if (cr == StringSupport.CR_7BIT) {
	        return null;
	    }
	} else if (encodingEqual(sname, dname)) {
	    return null;
	}

EncodingUtils.java, lines 1094-1101 (the updated logic used for RubyString):

	if (senc != null && senc == denc) {
	    return strTranscodeScrub(context, forceEncoding, str, ecflags, ecopts, result, explicitlyInvalidReplace, denc, senc);
	} else if (is7BitCompat(str, denc, senc)) {
	    return result.apply(context, str, denc, str);
	} else if (encodingEqual(sname, dname)) {
	    if (forceEncoding.isNil()) denc = null;
	    return result.apply(context, str, denc, str);
	}
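
To make the difference between the two snippets concrete, here is a small standalone model of the two check orders. It is only a sketch and not JRuby code: `java.nio.charset.Charset` stands in for JRuby's encoding objects, all class and method names are hypothetical, and the scrub step that rbStrEncode performs for identical encodings is simplified to a plain copy. As I read the rbByteEncode snippet, the equality branch is an `else if` of the ASCII-compatibility test, so it is never reached when both encodings are ASCII-compatible and the data is not 7-bit (for example UTF-8 to UTF-8 with non-ASCII bytes).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified model only: Charset stands in for JRuby's encoding objects,
// and every name here is hypothetical rather than taken from EncodingUtils.
class SameEncodingShortCircuit {

    // Stand-in for a 7-bit coderange check: true when no byte has the high bit set.
    static boolean isSevenBit(byte[] b, int off, int len) {
        for (int i = off; i < off + len; i++) {
            if ((b[i] & 0x80) != 0) return false;
        }
        return true;
    }

    // Assumption for this sketch: only these three charsets count as ASCII-compatible.
    static boolean isAsciiCompatible(Charset cs) {
        return cs.equals(StandardCharsets.US_ASCII)
                || cs.equals(StandardCharsets.UTF_8)
                || cs.equals(StandardCharsets.ISO_8859_1);
    }

    // Mirrors the shape of the older check: the equality test is an else-if of
    // the ASCII-compatibility test, so it is skipped for two ASCII-compatible encodings.
    static boolean oldShapeSkips(byte[] b, int off, int len, Charset from, Charset to) {
        if (isAsciiCompatible(from) && isAsciiCompatible(to)) {
            if (isSevenBit(b, off, len)) return true;
        } else if (from.equals(to)) {
            return true;
        }
        return false;
    }

    // Checks identical encodings first, then the 7-bit shortcut.
    static boolean sameFirstSkips(byte[] b, int off, int len, Charset from, Charset to) {
        if (from.equals(to)) return true;
        return isAsciiCompatible(from) && isAsciiCompatible(to) && isSevenBit(b, off, len);
    }

    // Returns the bytes unchanged when no transcoding is needed, otherwise transcodes.
    static byte[] byteEncode(byte[] b, int off, int len, Charset from, Charset to) {
        if (sameFirstSkips(b, off, len, from, to)) {
            return Arrays.copyOfRange(b, off, off + len);
        }
        return new String(b, off, len, from).getBytes(to);
    }

    public static void main(String[] args) {
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        Charset utf8 = StandardCharsets.UTF_8;
        System.out.println(oldShapeSkips(data, 0, data.length, utf8, utf8));  // false
        System.out.println(sameFirstSkips(data, 0, data.length, utf8, utf8)); // true
    }
}
```

In this model the older shape falls through to transcoding for UTF-8 to UTF-8 input containing a non-ASCII byte, while the reordered check treats it as a no-op, which is the behavior the issue asks rbByteEncode to share with rbStrEncode.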
