Skip to content

Add some extra closing handshake tests#1906

Closed
pimterry wants to merge 4 commits intowebsockets:simplify/closing-handshakefrom
pimterry:extra-close-tests
Closed

Add some extra closing handshake tests#1906
pimterry wants to merge 4 commits intowebsockets:simplify/closing-handshakefrom
pimterry:extra-close-tests

Conversation

@pimterry
Copy link
Copy Markdown
Contributor

This is an attempt to test some of the behaviour discussed in #1899 (comment) and test how that's improved with the changes currently pending in #1902.

The tests are a bit rough in places, feel free to tweak or drop some of them if you think they aren't relevant. When run against master, 2 of these tests currently fail:

  • "cleanly shuts down after simultaneous errors"
  • "handles socket shutdown while compressing outgoing frames"

Both of these are current bugs on master: simultaneous errors (or anything else that means the remote peer doesn't respond) will break clean connection shutdown so the socket stays fully open until timeout, and also a remote ws.end() that starts socket shutdown immediately drops all pending outgoing data, incorrectly & unnecessarily.

Various tests here fail in interesting ways given the partial fixes I suggested in that PR thread, basically as predicted there. When run against the new PR with the full set of changes though, all the tests do happily pass, which is great!

This tries to cover the shutdown for the various half-open combinations too, by directly simulating no-half-open behaviour rather than using allowHalfOpen=false, so that it's still effective even when the new PR resets that. It's not a super rigorous real-world test by any means, and it doesn't test various possible data loss cases for each combination, but from this alone it seems like the basic behaviour is working great 👍

lpinca and others added 2 commits June 15, 2021 18:50
- When the socket emits the `'end'` event, do not call `socket.end()`.
  End only the `receiver` stream.
- Do not wait for a close frame to be received and sent before calling
  `socket.end()`. Call it right after the close frame is sent.
- When the `receiver` stream emits `'finish'`, send a close frame if no
  close frame is received.

The assumption is that the socket is allowed to be half-open. On the
server side this is always true (unless the user explicitly sets the
`allowHalfOpen` property of the socket to `false`). On the client side
the user might use an agent so we set `socket.allowHalfOpen` to `true`
when the `http.ClientRequest` object emits the `'upgrade'` event.

Refs: websockets#1899
@pimterry
Copy link
Copy Markdown
Contributor Author

I was testing in Node 14 locally - works great there but it turns out there are test failures in 8, 10 & 12. I'll look into those now.

If we're already closing, then we shouldn't move into the new -2 ready
state (i.e. closing-because-the-remote-end-closed). We're already
closing, we shouldn't go back to a state that allows sending another
close frame.

This didn't seem to affect the happy path, or v14, or the existing
tests, but in the new error test on v10 and similar it resulted in a
double close(), and a write-after-end error on the socket itself.
@pimterry pimterry force-pushed the extra-close-tests branch from 04dca84 to 104c4d3 Compare June 22, 2021 16:00
@pimterry
Copy link
Copy Markdown
Contributor Author

Ok, I think I've fixed this. This was an actual bug in the PR that the tests found, affecting older node versions only.

In Node older than v14, after a receiver error, the receiver still fires a 'finish' event when the socket itself ends. In Node v14+ it seems that this doesn't happen. No idea why, but this causes the bug.

In all cases: when the receiver hits the error, we close() the websocket with an error code, we resume() to ignore all future socket input, and we wait for the other end before shutting down the socket. At this point we're CLOSING.

Unfortunately later when socket actually ends, we set _readyState back to -2 (the new 'PRECLOSING' state where we're still allowed to send a close frame) even though we've already sent a close frame.

That wouldn't normally break anything, and it doesn't with Node v14+. With older node though, we get an extra 'finish' event. When that fires it tries to close the websocket, and it's allowed to close it again, even though we've already closed it, because the readyState is no longer CLOSING and it's not yet CLOSED. This double-close then throws write-after-end on the socket.

This is fixed here by not setting _readyState to -2 unless the connection is currently OPEN, so you can't go from CLOSING back to -2 again. Does that seem reasonable?

// `WebSocket.prototype.close()` from sending a close frame.
//
websocket._readyState = -2;
if (websocket._readyState === WebSocket.OPEN) {
Copy link
Copy Markdown
Member

@lpinca lpinca Jun 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not removing the receiverOnFinish listener from receiverOnError() instead?

Nvm. I think that is not sufficient because the issue could be triggered even on Node.js >= 14 and even if the receiver does not emit an 'error' event. For example if the user calls websocket.close() right after a 2 -> -2 readyState change.

@lpinca
Copy link
Copy Markdown
Member

lpinca commented Jun 22, 2021

Good catch. This confirms your finding

const { Writable } = require('stream');

const writable = new Writable({
  write(chunk, encoding, callback) {
    process.nextTick(function () {
      callback(new Error('Oops'));
    });
  }
});

writable.on('error', function () {
  writable.end();
});

writable.on('finish', function () {
  throw new Error("The 'finish' event was emitted after the 'error' event");
});

writable.write('foo');

It seems it was fixed by nodejs/node@ba3be578d8 included in Node.js 13.

@lpinca
Copy link
Copy Markdown
Member

lpinca commented Jun 22, 2021

I've cherry picked the fix into the simplify/closing-handshake branch (ef7c3c7). Will look at tests later.

@pimterry
Copy link
Copy Markdown
Contributor Author

I didn't notice before, but the Mac & Windows tests were still running, not passing. They've since timed out, because they were actually failing & stuck in that same parsing error test too.

In their case the failure is different, and I think it's acceptable behaviour, so I've updated the test.

The cause is that on Windows & Mac, unlike Linux, during this error shutdown the event order is:

  • Receiver error -> close()
  • Error close frame written, client calls socket.end()
  • Server receives close frame, server replies
  • Response close frame written, server calls socket.end()
  • Server socket 'end' event
  • Server socket 'close' event, so server emits 'close'
  • Client socket 'end' event
  • Client socket 'close' event, so client emits 'close'

Meanwhile on Linux, I see:

  • Receiver error -> close()
  • Error close frame written, client calls socket.end()
  • Server receives close frame, server replies
  • Response close frame written, server calls socket.end()
  • Server socket 'end' event
  • Client socket 'end' event
  • Client socket 'close' event, so client emits 'close'
  • Server socket 'close' event, so server emits 'close'

I.e. the order of server/client 'close' events is different. I suspect on Windows & Mac the 'close' event fires synchronously if the other end has already ended, while on Linux it's async, or there's some other step involved.

The test expected the client to always fire 'close' first because I tested it on Linux, but actually I don't think it really matters. Either way the test checks that both sides have sent close frames and closed their sockets without error, so I think it's fine either way. Does that sound right to you?

For now I've updated the test to allow 'close' to fire in either order, as long as both sides get their close frames with no errors, and it seems to work nicely now.


ws.on('open', () => {
// Add 200ms delay to compression:
makeDeflateSlow(ws, 200);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid using timers? We can push the data directly to the socket without relying on the other peer, similarly to how it is done here

ws._socket.push(Buffer.concat(list));
and then assert that sender._deflating === true.

});
});

it('handles connection closures over both half-open-allowed sockets', (done) => {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this and the next tests are not very useful because no data is buffered/compressed/decompressed. They are not testing edge cases.

@lpinca
Copy link
Copy Markdown
Member

lpinca commented Jun 23, 2021

Either way the test checks that both sides have sent close frames and closed their sockets without error, so I think it's fine either way. Does that sound right to you?

Yes, I did the same in #1902.

@lpinca lpinca force-pushed the simplify/closing-handshake branch 4 times, most recently from d5b758f to e350b86 Compare June 23, 2021 18:28

wss.on('connection', (ws) => {
// Server closes after 100ms, before the client finishes sending its compressed message:
setTimeout(() => ws.close(1000), 100);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this behaves differently depending on where the test is run:

  1. On the master branch this just calls socket.write(closeFrame);
  2. On this branch it calls socket.write(closeFrame) and socket.end().

The test should be independent from websocket.close() behavior. Should we test case 1, 2, or both?

@lpinca
Copy link
Copy Markdown
Member

lpinca commented Jun 24, 2021

I cherry picked fad87e3 and tweaked it in f713a6e.

@pimterry
Copy link
Copy Markdown
Contributor Author

I cherry picked fad87e3 and tweaked it in f713a6e.

Great! Sorry I haven't had time to work through the comments above myself. I think that's fine though, these tests were mostly useful as a way to explore how the old & new behaviour differ and confirm that we've fixed the original issues. If you've pulled in the useful ones then we can close this now.

@pimterry pimterry closed this Jun 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants