
proposal: net/http: add .MaxConnLifespan or .SetMaxConnLifespan() to Transport #54429

@josephcopenhaver

Currently there is no way to tell a client, via the http.Transport struct, that a connection should not be reused once a given duration has elapsed since it was created.

Such a feature is useful when upstreams referenced via DNS gain more IP addresses over time as they scale out horizontally. When this occurs, it may be in the client's best interest to rebalance connections among the available upstream nodes to avoid creating hot nodes in the upstream.

Admittedly this is a naive approach: there is no real need to reset connections in such a situation until more upstreams are actually available to connect to. Nor is there any need to repeatedly kill all connections in such topologies, only the ones we intend to move over to the newly available upstream IPs. (More discussion on how the connection pool's FIFO behavior makes it hard to maintain the fewest connections necessary can be found here.)

Despite these drawbacks, the benefits are clear for people who accept the overhead of creating new connections this way because they want to avoid hot partitions and do not have the load-balancing capabilities offered by expensive cloud provider appliances or mesh framework sidecar proxies. In short, this is beneficial for the same reasons the SetConnMaxLifetime function is beneficial in the database/sql package (https://pkg.go.dev/database/sql). An argument can be made that this is a tradeoff, but it appears to be one people need. It seems safest to provide something in the net/http.Transport struct rather than encourage people to implement their own transport or adopt an alternative.

This is also beneficial when the maintainers know the upstream server will attempt to close connections after a max lifetime or idle period, but the close never reaches the client in edge cases such as network congestion or an event equivalent to complete packet loss. When that happens and the client has only idle connections, it can take quite a few seconds to recover and start sending requests to the new upstream instances (depending on the idle connection pool's max size, the keep-alive settings, and the client's throughput/burst behavior).


This would also address the concerns raised in #23427.

A toy implementation that adds MaxConnLifespan as a data member has already been created, and a PR is open with passing tests: #46714.

Minor tweaking to the above is required if exposing a way to mutate the MaxConnLifespan value over time is desirable. By adding this functionality to the existing mechanism for computing idle timeouts, we also ensure that clients close expired connections promptly, rather than keeping them in the connection pool only to close them when the client finally tries to use them again, which could be a long time if the idle connection timeout is high.
