Skip to content

Conversation

@delvedor
Copy link
Member

Currently, if a user calls fastify.close and the server is under a heavy load, the server will not close until there is an open TCP connection, so every new request will execute the entire request lifecycle.

With this change once fastify.close is invoked, Fastify will automatically start to respond with a 503 error, closing every new connection as soon as possible.

There are not sensible changes in the benchmarks.

Checklist

  • run npm run test and npm run benchmark
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message and code follows Code of conduct

@delvedor delvedor added enhancement semver-minor Issue or PR that should land as semver minor labels Apr 15, 2018
@delvedor delvedor requested a review from a team April 15, 2018 20:22
@allevo
Copy link
Member

allevo commented Apr 15, 2018

I'm -1 with this feature (from my point of view this is a breaking behaviour change).

Anyway I'm +1 if the close method signature will be close(force, callback) with default set to false.

@delvedor
Copy link
Member Author

I'm -1 with this feature (from my point of view this is a breaking behaviour change).

Actually, it can be interpreted as a bugfix.
Close is a LIFO queue of onClose handlers, once a user calls close it will activate that queue, closing database connections for example. This will lead to many errors inside the various route handlers, which is not very nice.
With this change, we are actually protecting the server from all that kind of errors.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example with inject is wrong. Can you use a http request with an agent with keep-alive?

@allevo, this is necessary for send() to actually work. If there is a keepalive connection, the connection is never severed, making a graceful disconnect not possible.

@nwoltman
Copy link
Contributor

Close is a LIFO queue of onClose handlers

Really? I assumed it was FIFO. So in this code:

fastify.onClose(async () => {
  // This is called second
})

fastify.onClose(async () => {
  // And this is called first
})

?

@allevo
Copy link
Member

allevo commented Apr 15, 2018

I don't understand. I expect that the HTTP server is closed before any other dependencies. Is this the currently behaviour ?

@delvedor
Copy link
Member Author

Really? I assumed it was FIFO.

@nwoltman yes. The reason is because the last plugin you have have registered should be the first you must close.

@nwoltman
Copy link
Contributor

@delvedor In that case, what about registering this onClose handler last so that connections are closed before running the other onClose handlers?

fastify/fastify.js

Lines 112 to 118 in 5fefda9

fastify.onClose((instance, done) => {
if (listening) {
instance.server.close(done)
} else {
done(null)
}
})

@delvedor
Copy link
Member Author

@nwoltman done. But the problem persist, because of the keep alive connections as @mcollina pointed out.

@nwoltman
Copy link
Contributor

@delvedor You're right, I forgot about the keep-alive connections. Would it help to include a Connection: close header in the 503 response?

@delvedor
Copy link
Member Author

@nwoltman it should be the default, but it will not hurt ;)

@nwoltman
Copy link
Contributor

close is the default for HTTP/1.0 requests, but modern browsers use HTTP/1.1 where the default is to use a keep-alive connection.

The changes LGTM, although I think @mcollina has a good point about testing with something that uses keep-alive connections instead of .inject().

@delvedor
Copy link
Member Author

I'm trying to write a test with keep alive, but I'm having some trouble to get it working.
Any suggestion?
https://gist.github.com/delvedor/558a1569d7db6e2c0c9e3870eb96fd33

@mcollina
Copy link
Member

@delvedor you should verify it's reusing the same socket. I don't think it is, as you are not resuming the response. Also, add a maxSocket: 1 option to the agent.

@delvedor
Copy link
Member Author

It seems that AppVeyor is failing for a test that is not present in this branch 🤔

if (closing === true) {
res.writeHead(503, {
'Content-Type': 'application/json',
'Content-Length': '80',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you calculate it based on the payload as with 404 ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That 80 is the length of the json response ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know. It's better to write payload.length

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the payload length.
We can do the same at the line you have pointed pointed out, since the payload will be always the same.

@StarpTech
Copy link
Member

@delvedor I'm trying to reproduce it on my machine.

@StarpTech
Copy link
Member

StarpTech commented Apr 17, 2018

@delvedor can't reproduce it on local Node 6.13.1 I think the timeouts aren't solid try to increase them. retrigger the build.

@delvedor
Copy link
Member Author

@StarpTech one of the two AppVeyor logs shows
error is not logged because promise was fulfilled with undefined but response was sent before promise resolution not ok Request timed out.
Which is not a test we are running in this branch. Do you know why?

@StarpTech
Copy link
Member

@delvedor no idea. Try to rebuild.

@delvedor
Copy link
Member Author

I can't, @mcollina should do it.

@mcollina
Copy link
Member

done.

@mcollina
Copy link
Member

You'll need to increaase that timeout: https://ci.appveyor.com/project/mcollina/fastify/build/job/x2ml3eliuojq3au2#L41.

@StarpTech
Copy link
Member

@delvedor performance of what? The request / response shouldn't be effected.

@delvedor
Copy link
Member Author

@StarpTech for every new connection you should update a Set and every time that a connection will be closed you should again update that Set. This is overhead that I don't want to pay unless I don't have to.

@StarpTech
Copy link
Member

StarpTech commented Apr 18, 2018

@delvedor you are worried about to maintain a set? What's the alternative to close all pending connections? From my understanding, the problem is that server.close does not clean up the connections in the way we want?

@delvedor
Copy link
Member Author

I'm worried about the cost of maintaining a Set.

Regarding server.close, this is the behavior (documentation):

Stops the server from accepting new connections and keeps existing connections. This function is asynchronous, the server is finally closed when all connections are ended

@StarpTech
Copy link
Member

StarpTech commented Apr 18, 2018

@delvedor https://github.com/hunterloftis/stoppable This looks promising refer to design and performance.

@StarpTech
Copy link
Member

StarpTech commented Apr 18, 2018

There is also an interesting thread about it nodejs/node#2642 Node.js does not close existing, idle connections. We should support it.

@delvedor
Copy link
Member Author

@StarpTech I know, I've already read that issue before open this pr.
The problem is the same, you are keeping a Map, calling .set and .delete very often.
That is pure overhead. I'm looking for a solution with the least overhead.

What I don't understand it's why the test are failing in travis but not in my local machine.

@StarpTech
Copy link
Member

They are failing really randomly. Sometimes it works sometimes not. Try to run the single test in isolation.

@delvedor delvedor force-pushed the block-requests-on-close branch from 4a40e67 to 30428f3 Compare April 18, 2018 19:46
@delvedor delvedor force-pushed the block-requests-on-close branch from 30428f3 to 930292a Compare April 18, 2018 19:57
@delvedor
Copy link
Member Author

@mcollina can you check if the usage of autocannon is correct?

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the autocannon tests to a new file?

@delvedor delvedor force-pushed the block-requests-on-close branch from 89d7374 to affc212 Compare April 18, 2018 21:49
@delvedor
Copy link
Member Author

Done, I've also added another check.
What is the difference?

t.strictEqual(codes.shift(), statusCode)
})

instance.on('done', () => t.pass())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use t.done? It is more clear

} else {
done(null)
}
app.once('preReady', () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe overwriting avvio ready method fits better here. Or not?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rephrase?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting the onClose handler inside preReady event doesn't ensure that the callback is invoked before any others. It does only if the preReady callbacks are fired in FIFO.

We use

https://github.com/mcollina/avvio/blob/a957201b0b33facf06687a5ba8e52c82a46b1302/boot.js#L13

and creating ready method inside fastify.

The new method forces the add onClose handler as the last, so the first one to be invoked.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does only if the preReady callbacks are fired in FIFO.

They are, it is an event emitter.

We can't use ready inside Fastify, because ready will start the boot sequence.

Copy link
Member

@allevo allevo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@nwoltman nwoltman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (once tests are passing)

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with green CI

@mcollina
Copy link
Member

Green CI! 🎉

@delvedor delvedor merged commit cbd08af into master May 25, 2018
@delvedor delvedor deleted the block-requests-on-close branch May 25, 2018 08:46
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 17, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

semver-minor Issue or PR that should land as semver minor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants