HTTP2: Create additional connections when maximum active streams is reached #35088
Comments
|
Tagging subscribers to this area: @dotnet/ncl |
|
@JamesNK, Is this planned for 5.0? |
|
We would like to address it in some fashion for .NET 5. |
That's true, but this is a separate issue than just dealing with hitting the SETTINGS_MAX_CONCURRENT_STREAMS limit. In other words, if we really think this is an issue (and I agree it could be), then we may need more sophisticated settings/heuristics on the client for when to open an additional connection. |
This seems like a fine solution to me. We will need to figure out how this interacts with connection limit settings. Do we need a separate setting for max HTTP2 connections? Do we just apply the existing limit across both HTTP1.1 and HTTP2? etc. |
I'm having trouble thinking of times where a limit would be super meaningful, but it also feels wrong to have it be unlimited, or to apply the same limit to both. I guess it would serve as a sanity check against poorly configured servers with a low limit? If we were to apply a separate limit here, I would suggest having just an (also, think about what this means for HTTP/3 -- we will likely end up with the same request there)
I'm not convinced this is the correct choice. I would rather be compliant with the standard and respect the server's settings by default. We have had reports of servers detecting this as abusive behavior and serving error responses -- this seems like a reasonable tactic to me that we might see more of as HTTP/2 servers mature, and I worry that relaxing by default will create problems down the road. |
|
I'm ok with whatever default you think is best behavior for HttpClient. A gRPC client typically creates its own HttpClient internally. The gRPC client can change the setting to make adding additional connections the default for gRPC. |
|
@scalablecory Since WinHTTP exposes only a boolean flag enabling/disabling the opening of extra HTTP/2 connections when the stream limit is reached (WINHTTP_OPTION_DISABLE_STREAM_QUEUE), I think it would be better to also have a boolean On the other hand, it's currently not clear whether we can specify a separate HTTP/2 max-connections-per-server limit for WinHttpHandler, because WinHTTP has only one option, WINHTTP_OPTION_MAX_CONNS_PER_SERVER, controlling this limit for all HTTP versions higher than 1.0. |
|
Do we need the |
I think this is the first question that needs answering. |
I think |
|
@JamesNK Could you please clarify whether you need to balance load among open HTTP/2 connections or not? The proposed implementation in SocketsHttpHandler will open a new connection only when the active streams limit is reached on the existing connection. That means it will minimize the number of open connections, but will not balance load across them. |
That is fine. I'm about to start looking at load balancing with gRPC in detail, but I don't see the setting discussed in this issue as being used for it. Load balancing would likely use lower-level connection abstractions. With the current proposal in this issue I imagine that One thing about gRPC is you can have long lived streams. There might be a situation where:
I don't think this is a problem, and it shouldn't require additional logic to handle. Just a heads up that this situation could occur 😄 |
|
@geoffkizer @stephentoub Benchmarks show performance benefits of spreading streams across connections. Do you have any thoughts on using a feature like this to increase RPS? |
Yes, that's part of what we mean by "load balancing" here. I think it's beyond the scope of this particular feature, but it's interesting to look at longer-term. Ideally we can improve the connection concurrency here so that spreading streams across connections is less of a win. |
The original proposal was a boolean On the name, does HTTP/3 have an equivalent max concurrent streams feature? If so, consider whether there is a property name that isn't specific to HTTP/2. |
Yes.
Sort of. It more or less accomplishes the same goal, but in a very different way that makes a client-side bypass feature a little less exact. We'll likely want a similar bypass for HTTP/3, but we'll need to think/discuss on how to reasonably implement it.
I think because HTTP/3 uses UDP and is limited differently, a new setting separate from HTTP/2 is reasonable. But, this isn't a strong opinion. |
|
The HTTP/3 behavior here is pretty similar in terms of user impact. That is, there's a multiplexed connection and the server controls the max degree of multiplexing. The implementation is very different under the covers, but I'm inclined to think that's not relevant to the user experience here. As such I would lean towards naming this in such a way that we can use it for HTTP/3 as well. We can always add a new property for HTTP/3 if we think it's necessary in the future, but it's pretty awkward to have a name that's HTTP2-specific and apply it to HTTP3 as well. @scalablecory Do you have specific concerns here re HTTP3 usage or implementation? |
@geoffkizer I'm hesitant to merge the two because resource usage is different. Thinking in terms of e.g. SNAT which has been a big topic lately, an additional UDP connection could be "free" while a TCP connection will take up limited resources.
Thinking about it further today, I think we can do it reasonably well. QUIC does not use a "max concurrent streams" but rather a "max usable stream id" that is constantly increased (similar to receive window management). I guess a simple policy would just be "if we run out of streams, open a new connection", but the dependency on the server sending new stream IDs makes me think a little harder. I'm still a little concerned about our behavior on high-latency connections. It seems like the server's options will be to allocate a large number of streams (increase the window size) and risk a client entering too much concurrency, or to let them trickle in and starve the client. In the latter case we'd quickly open new connections where with HTTP/2 we would open far fewer. But, I can't think of a way to work around that. |
So am I. It sounds like the max usable stream ID is a mechanism for the server to throttle the rate of new, possibly short-lived, requests in addition to being a mechanism to limit concurrency. For this reason, we should probably treat running up against HTTP/3's max usable stream ID differently than running up against HTTP/2's MAX_CONCURRENT_STREAMS setting. Maybe HTTP/3 could still use the same connection limit as HTTP/2 but delay creating a new QUIC connection by some interval to make sure the limit wasn't only hit due to high latency. I'm leaning toward having separate options for HTTP/2 and HTTP/3 connection limits though. |
|
I'm not sure I understand your concerns re HTTP3 and low latency connections. My assumption here is that servers will effectively use HTTP3 the same way they do HTTP2; which is to say, (a) set a stream limit like 100 or whatever; (b) every time a stream is closed, explicitly increase the stream id by 1. This results in essentially the same behavior as HTTP2, it's just that the stream limit increase is explicit instead of implicit. Of course, they could do other things, but it's not clear to me what they would want to do differently here. And they can do similar things with HTTP2 by sending dynamic SETTINGS updates for max concurrent streams, but no one seems to actually do this much in practice. That said, the true test is to see how people actually use this in practice and we just don't have that data yet. |
Thinking about this more, I agree it is pretty similar. An HTTP/2 client is usually affected by latency waiting for a frame with an END_STREAM flag in much the same way an HTTP/3 client is affected by latency waiting for the QUIC max usable stream id to be explicitly incremented. The big difference is that when an HTTP/2 client sends a RST_STREAM frame, it is allowed to immediately consider that stream closed without any sort of acknowledgement. Even if the HTTP/2 max concurrent streams is set to 1, a client can start requests as fast as it can send them without violating the protocol by simply alternating between sending HEADERS frames and RST_STREAM frames in rapid succession. Because of this, Kestrel sometimes resorts to using the HTTP/2 ENHANCE_YOUR_CALM stream error code despite the client technically never going over the max concurrent stream limit. With HTTP/3, on the other hand, Kestrel can wait to increment the max usable stream id until after it fully cleans up the resources used by previous streams. That said, while on the server side we have to be prepared for this, on the client side we can tell people it's a bad idea to rapidly and continuously abort requests. |
Yea, I guess so long as they send the final stream frame with a new max stream frame, it'll be okay. Okay, I'm not too worried about high-latency connections anymore. |
|
Looks good as proposed.
namespace System.Net.Http
{
public partial class SocketsHttpHandler : HttpMessageHandler
{
public int MaxHttp2ConnectionsPerServer { get; set; }
}
public partial class WinHttpHandler : HttpMessageHandler
{
public bool EnableMultipleHttp2Connections { get; set; }
}
} |
* Ask about prior art in API review template
As an example, #35088 described an analogous pattern in the Go ecosystem. While we have distinct patterns and rules for .NET API, if there's analogous prior art it may be interesting to compare approaches.
* Update .github/ISSUE_TEMPLATE/02_api_proposal.md
Co-authored-by: Stephen Toub <stoub@microsoft.com>
New property EnableMultipleHttp2Connections on WinHttpHandler enables multiple HTTP/2 connections to the same server.
Contributes to #35088
|
@scalablecory please ping API review to use |
|
@dotnet/fxdc It was decided to remove the connection limit from
namespace System.Net.Http
{
public partial class SocketsHttpHandler : HttpMessageHandler
{
public bool EnableMultipleHttp2Connections { get; set; }
}
public partial class WinHttpHandler : HttpMessageHandler
{
public bool EnableMultipleHttp2Connections { get; set; }
}
} |
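For readers who want to see how the approved property is used, here is a minimal sketch, assuming .NET 5 or later where `SocketsHttpHandler.EnableMultipleHttp2Connections` is available. The `Http2ClientFactory` class and its `Create` method are hypothetical names introduced only for this illustration; they are not part of the proposal.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper, not part of the API proposal.
public static class Http2ClientFactory
{
    // Builds an HttpClient whose handler is allowed to open additional
    // HTTP/2 connections once the stream limit is reached on existing ones.
    public static HttpClient Create(Uri baseAddress)
    {
        var handler = new SocketsHttpHandler
        {
            // Opt in to multiple HTTP/2 connections per server.
            EnableMultipleHttp2Connections = true
        };

        return new HttpClient(handler)
        {
            BaseAddress = baseAddress,
            // Ask for HTTP/2 by default on requests sent via this client.
            DefaultRequestVersion = HttpVersion.Version20
        };
    }
}
```

A service would typically create one such client and reuse it for all calls to the same server, which is the scenario this issue is concerned with.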
…s limit is reached (#39439)
The HTTP/2 standard directs clients not to open more than one HTTP/2 connection to the same server. At the same time, the server has the right to limit the maximum number of active streams per HTTP/2 connection. These two directives combined limit the number of requests that can be sent to the server concurrently. This limitation is justified in client-to-server scenarios, but becomes a bottleneck in server-to-server cases like gRPC. This PR introduces a new SocketsHttpHandler API that enables establishing additional HTTP/2 connections to the same server when the maximum stream limit is reached on the existing ones.
**Note**. This version of the algorithm uses only retries to make a request choose another connection when all stream slots are occupied. It does not implement stream credit management in `HttpConnectionPool` and therefore exhibits sub-optimal request scheduling behavior in "request burst" and "infinite requests" scenarios.
Fixes #35088


API proposal
Rationale
The HTTP/2 standard directs clients to open no more than one connection per server and to handle concurrent requests via streams. This constraint is aimed at increasing network efficiency in the most common browser-to-service scenario, where many clients talk to a single server. However, it can become a bottleneck for service-to-service communication in the cloud, where a few services talk to each other: they usually need to make many concurrent requests on behalf of their users, while each service uses a single `HttpClient` instance with a single HTTP/2 connection for all calls. Thus, if that connection has SETTINGS_MAX_CONCURRENT_STREAMS set to 100 (default value), it won't allow sending more than 100 parallel requests or opening more than 100 concurrent gRPC streams.
It's proposed to add new API to SocketsHttpHandler and WinHttpHandler enabling opening multiple HTTP/2 connections per server.
SocketsHttpHandler
Native WinHTTP has only a boolean option that disables HTTP/2 stream queueing (WINHTTP_OPTION_DISABLE_STREAM_QUEUE); it doesn't allow setting a limit on the number of open HTTP/2 connections per server. Having no limit seems a bit risky, so for SocketsHttpHandler it's proposed to add an integer property `MaxHttp2ConnectionsPerServer` controlling the maximum number of HTTP/2 connections established to the same server. If this property is set to a value greater than 1, `SocketsHttpHandler` will open new HTTP/2 connections when all existing connections have reached the maximum number of open streams. Once the number of open connections equals `MaxHttp2ConnectionsPerServer`, stream queueing will be enabled again.
WinHttpHandler
It's proposed to add only the boolean property `EnableMultipleHttp2Connections`, without any way to limit the number of connections, to mirror the behavior of the underlying native implementation.
Problem:
HTTP/2 has a SETTINGS_MAX_CONCURRENT_STREAMS setting that is configured by the server. This is the upper limit of active streams for a single connection. The limit exists to prevent a caller from using up resources on the server by starting an unbounded number of streams on one connection. The recommended lower default for SETTINGS_MAX_CONCURRENT_STREAMS is 100. This is the value that Kestrel uses. Some HTTP/2 servers have a slightly higher limit, but 100-200 appears to be the normal default.
Today HttpClient with HTTP/2 will open a single connection for a host, and every HTTP/2 request opens a new stream on that single connection. If there are already 100 active requests in progress then `SendAsync` will await, and additional requests will form a FIFO queue, waiting for in-progress requests to complete. You can see this behavior discussed on issue #30596. FYI, if the client didn't wait and attempted to call the server anyway then the server would reject the request.
While the limit and queue behavior can make sense for multiple client applications that call a server, it is problematic for server to server communication. It is a common pattern in server applications to create a single HttpClient (either manually, or using the HttpClientFactory), and then use that client for all calls to another server.
In server to server communication requests will be limited to 100 at a time, decreasing throughput and increasing latency as requests pile up in a queue. Additionally, technologies like gRPC support the concept of long-lived streaming calls. A server app that is using them for real-time communication with another server will hang after the 100th long-lived stream is started.
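The hang with long-lived streams can be illustrated with a toy model. This is only a sketch under stated assumptions: `SingleConnectionModel` is a hypothetical class that stands in for one HTTP/2 connection and uses a `SemaphoreSlim` to represent the server's SETTINGS_MAX_CONCURRENT_STREAMS limit. It is not how `HttpClient` is implemented.

```csharp
using System.Threading;

// Hypothetical toy model of one HTTP/2 connection with a stream limit.
public sealed class SingleConnectionModel
{
    private readonly SemaphoreSlim _streams;

    public SingleConnectionModel(int maxConcurrentStreams)
    {
        // One semaphore slot per stream the server allows on the connection.
        _streams = new SemaphoreSlim(maxConcurrentStreams, maxConcurrentStreams);
    }

    // Returns true if a stream slot was free; false means the request
    // would have to wait in the queue for another stream to finish.
    public bool TryStartStream() => _streams.Wait(0);

    // Called when a stream completes, freeing its slot.
    public void EndStream() => _streams.Release();
}
```

With a limit of 100, the first 100 calls to `TryStartStream` succeed. If those streams are long-lived and never call `EndStream`, every later call returns false, which corresponds to the request waiting in the queue indefinitely.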
Two additional issues that worsen this situation:
It is hard for customers to figure out what has gone wrong and how to fix it.
Solution:
Two broad solutions:
In my opinion increasing/removing SETTINGS_MAX_CONCURRENT_STREAMS on the server isn't a good solution. Hundreds or thousands of streams multiplexed on one connection will likely degrade performance. TCP level head of line blocking is a thing in HTTP/2, and one dropped packet will hold up every request.
A better solution is for HttpClient to support opening an additional connection to the server when SETTINGS_MAX_CONCURRENT_STREAMS is reached. This will allow a high throughput of requests, or many active gRPC streams, without hanging on the client.
New setting on `SocketsHttpHandler`:
When `StrictMaxConcurrentStreams` is `false` then an additional connection to the server is created if all existing connections are at the SETTINGS_MAX_CONCURRENT_STREAMS limit.
The maximum number of HTTP/2 connections to a server will be limited by `MaxConnectionsPerServer`. When it is reached (max streams on max connections) then the existing behavior will resume, with additional requests awaiting in a FIFO queue.
Note that opening a new connection like this to get around SETTINGS_MAX_CONCURRENT_STREAMS is discouraged by the HTTP/2 spec. I think that this guidance is focused at browsers, and doesn't fit for server to server communication in a microservice environment.
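The placement rule described above (use a free stream slot on an existing connection, otherwise open a new connection up to the connection limit, otherwise queue) can be sketched as a small model. `Http2PoolModel` and `Placement` are hypothetical names made up for this illustration; the sketch only counts streams and does not reflect the real connection pool code.

```csharp
using System.Collections.Generic;

// Hypothetical outcome of trying to start a request.
public enum Placement { ExistingConnection, NewConnection, Queued }

// Hypothetical toy model of the proposed behavior; tracks stream counts only.
public sealed class Http2PoolModel
{
    private readonly int _maxStreamsPerConnection; // models SETTINGS_MAX_CONCURRENT_STREAMS
    private readonly int _maxConnections;          // models the per-server connection limit
    private readonly List<int> _activeStreams = new List<int>();

    public Http2PoolModel(int maxStreamsPerConnection, int maxConnections)
    {
        _maxStreamsPerConnection = maxStreamsPerConnection;
        _maxConnections = maxConnections;
    }

    public int ConnectionCount => _activeStreams.Count;

    public Placement StartRequest()
    {
        // Prefer any existing connection that still has a free stream slot.
        for (int i = 0; i < _activeStreams.Count; i++)
        {
            if (_activeStreams[i] < _maxStreamsPerConnection)
            {
                _activeStreams[i]++;
                return Placement.ExistingConnection;
            }
        }

        // All connections are full: open another one if the limit allows.
        if (_activeStreams.Count < _maxConnections)
        {
            _activeStreams.Add(1);
            return Placement.NewConnection;
        }

        // Max streams on max connections: fall back to queueing.
        return Placement.Queued;
    }
}
```

With `new Http2PoolModel(2, 2)`, five consecutive calls to `StartRequest` return `NewConnection`, `ExistingConnection`, `NewConnection`, `ExistingConnection`, `Queued`. Note that this rule keeps the number of connections minimal and does not balance streams across them, which matches the behavior discussed in the comments.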
I will leave the decision of whether StrictMaxConcurrentStreams defaults to true or false up to networking team.
gRPC usage
Because gRPC is commonly used in microservice scenarios, and gRPC streaming is a popular concept, it makes sense for gRPC to not queue in the client when the limit is reached.
The .NET gRPC client creates its own HttpClient. It can configure the underlying handler so that StrictMaxConcurrentStreams = false.
Prior art
golang has StrictMaxConcurrentStreams - https://godoc.org/golang.org/x/net/http2#Transport. In the latest version of golang StrictMaxConcurrentStreams defaults to false and MaxConnsPerHost has no limit. The golang client will "just work".
WinHttp has WINHTTP_OPTION_DISABLE_STREAM_QUEUE - https://docs.microsoft.com/en-us/windows/win32/winhttp/option-flags#WINHTTP_OPTION_DISABLE_STREAM_QUEUE. I believe this is false by default so queuing is the default behavior.
@davidfowl @karelz @scalablecory @Tratcher @halter73 @stephentoub @shirhatti