There are some basic things we'll want to check everywhere (e.g., Nexus, Sled Agent, DNS servers) for availability:
TCP KeepAlive: we want to enable this on all network connections (in both directions) to identify failed systems. External and internal connections should probably use different values.
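To make the "different values" tradeoff concrete, here is a rough sketch of how the usual keepalive knobs (idle time, probe interval, probe count) translate into the time it takes to notice a dead peer on an idle connection. The struct, the constant names, and every number below are placeholders for illustration, not proposed settings.

```rust
use std::time::Duration;

/// Hypothetical keepalive settings. The three fields mirror the usual
/// idle / interval / probe-count knobs; nothing here is a real API.
#[derive(Debug, Clone, Copy)]
pub struct KeepaliveConfig {
    /// Idle time before the first probe is sent.
    pub idle: Duration,
    /// Time between unanswered probes.
    pub interval: Duration,
    /// Unanswered probes before the connection is declared dead.
    pub retries: u32,
}

impl KeepaliveConfig {
    /// Approximate time to notice a silently failed peer on an idle
    /// connection: the idle period plus one interval per unanswered probe.
    pub fn detection_time(&self) -> Duration {
        self.idle + self.interval * self.retries
    }
}

/// Placeholder values: internal links can afford aggressive probing.
pub const INTERNAL: KeepaliveConfig = KeepaliveConfig {
    idle: Duration::from_secs(10),
    interval: Duration::from_secs(5),
    retries: 3,
};

/// Placeholder values: external clients get a more forgiving budget.
pub const EXTERNAL: KeepaliveConfig = KeepaliveConfig {
    idle: Duration::from_secs(60),
    interval: Duration::from_secs(15),
    retries: 4,
};
```

With these placeholder numbers, `INTERNAL.detection_time()` works out to 25 seconds and `EXTERNAL.detection_time()` to 120 seconds, which is the kind of gap we'd be choosing between.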
HTTP KeepAlive: we probably want to just pick a value like 60 seconds. Consider having clients make dummy requests to keep the connections open. This avoids the problem of picking a connection that's been idle for just under 60 seconds, sending a request, and having the server slam the door in your face. We ran into this with Manta, admittedly only at very large scale, since it's fairly improbable.
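As a complement to dummy requests, a client could also refuse to reuse connections that are close to the server's idle timeout. This is only a sketch of that idea, not something proposed above; `PooledConn`, `safe_to_reuse`, and the `margin` parameter are hypothetical names.

```rust
use std::time::{Duration, Instant};

/// Hypothetical client-side record for a pooled HTTP connection.
pub struct PooledConn {
    /// When the connection last carried a request or response.
    pub last_used: Instant,
}

/// Returns true if the connection has been idle for comfortably less than
/// the server's keep-alive timeout. `margin` is a safety buffer covering
/// clock skew and the time the request spends in flight.
pub fn safe_to_reuse(
    conn: &PooledConn,
    now: Instant,
    server_timeout: Duration,
    margin: Duration,
) -> bool {
    let idle = now.saturating_duration_since(conn.last_used);
    idle + margin < server_timeout
}
```

With a 60-second server timeout and a 5-second margin, a connection idle for 10 seconds would be reused, while one idle for 58 seconds would be dropped and replaced with a fresh connection.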
We'll want to review these, too. They might be more security-related (see #2184):
limits for bad client behavior:
maximum time waiting for a client to send request headers (whether on a new connection or between requests)
minimum flow rate for request bodies (can be fairly low -- just want to avoid clients dribbling data in as a DoS vector to keep connections open)
maximum number of open connections (ideally limited separately for different APIs -- e.g., external vs. internal)
TCP listen socket backlog
maximum rate of new connections created [ideally per-client]
maximum rate of incoming requests [per authenticated user? per IP? as well as overall]
maximum number of connect-in-progress sockets
maximum number of TLS-session-establishment-in-progress sockets
size of the tokio worker thread pool and the blocking thread pool
maximum length of time that graceful server shutdown can take
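For the rate limits in the list above (new connections, incoming requests), one common approach is a token bucket, kept per client, per user, or globally. The sketch below is a generic illustration using only the standard library; it is not an existing Nexus or Dropshot interface, and `TokenBucket` and its parameters are made-up names.

```rust
use std::time::Instant;

/// Minimal token bucket: allows bursts of up to `capacity` events and a
/// sustained rate of `refill_per_sec` events per second.
pub struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Creates a bucket that starts full.
    pub fn new(capacity: u32, refill_per_sec: f64, now: Instant) -> Self {
        Self {
            capacity: f64::from(capacity),
            tokens: f64::from(capacity),
            refill_per_sec,
            last_refill: now,
        }
    }

    /// Tries to admit one event (a new connection or a request) at `now`.
    /// Returns false if the caller should reject or delay it.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        // Add tokens for the time elapsed since the last refill, capped
        // at the bucket's capacity.
        let elapsed = now.saturating_duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        // Never move the refill timestamp backwards.
        if now > self.last_refill {
            self.last_refill = now;
        }
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Passing `now` in explicitly keeps the limiter easy to test. A per-client limit would keep one bucket per key (authenticated user or IP), plus one shared bucket for the overall limit.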