The difference is significant when crawling the Maven repository index:
- reuse socket: 0.3s/page
- open a new socket: 1s/page
This reuses the socket connection:
Connection conn = HttpConnection.connect("http://localhost");
conn.execute().parse();
conn.url("http://localhost/index.html");
conn.execute().parse();
This closes the socket and opens a new one:
Connection conn = HttpConnection.connect("http://localhost");
conn.execute().body();
conn.url("http://localhost/index.html");
conn.execute().body();
- parse() invokes DataUtil.parseInputStream, which closes the bodyStream;
- body() invokes DataUtil.readToByteBuffer, which does not close the bodyStream.
If the InputStream has been closed and the host has not changed, java.net.HttpURLConnection will reuse the socket, even though disconnect() was invoked.
I made the following change (between the start and end markers):
private void prepareByteData() {
    Validate.isTrue(executed,
        "Request must be executed (with .execute(), .get(), or .post() before getting response body");
    if (byteData == null) {
        Validate.isFalse(inputStreamRead, "Request has already been read (with .parse())");
        try {
            byteData = DataUtil.readToByteBuffer(bodyStream, req.maxBodySize());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            inputStreamRead = true;
            ////////////////////////////// start
            try {
                bodyStream.close();
            } catch (IOException e) {
                // no-op
            } finally {
                bodyStream = null;
            }
            ////////////////////////////// end
            safeClose();
        }
    }
}
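
The pattern behind this change can be sketched outside of jsoup: read the stream to the end, then always close it in a finally block, so that the underlying connection is free to be returned to the keep-alive pool. This is only an illustration and not jsoup code; the class DrainAndClose, the helper readFullyAndClose and the TrackingStream test double are hypothetical names made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class DrainAndClose {
    /** Hypothetical test double: records whether close() was called. */
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Reads the stream to EOF and closes it, even if reading fails. */
    static byte[] readFullyAndClose(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            try {
                in.close();
            } catch (IOException ignored) {
                // nothing useful to do if close fails
            }
        }
    }

    public static void main(String[] args) {
        TrackingStream in = new TrackingStream("hello".getBytes(StandardCharsets.UTF_8));
        byte[] body = readFullyAndClose(in);
        // prints "hello closed=true"
        System.out.println(new String(body, StandardCharsets.UTF_8) + " closed=" + in.closed);
    }
}
```

With an HTTP response body in place of the in-memory stream, the close in the finally block is the step that body() was missing.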