Replaced async model with true async #337
Rewrote DoAsyncRequest to use real async web methods instead of starting new tasks and allocating thread pool resources for waits and callbacks.
I love pull requests like this! May I ask what prompted the rewrite? I have not run this implementation and gathered numbers yet, but at first glance it's the same thread pool usage-wise. I use Task.Factory.FromAsync to wrap the BeginXXX/EndXXX pattern, which is async and should use IO completion ports just as manually calling BeginXXX and EndXXX does, so besides the minimal overhead of creating the task it should be just as async and thread-pool friendly. The original implementation iterates over the yielded tasks in a separate thread (LongRunning hint) that's not part of the thread pool, which is the reason for the semaphore. (I also do not like System.Net.ServicePointManager.DefaultConnectionLimit, because that setting applies to the whole AppDomain and you might have more moving parts that require different numbers.) ThreadPool.RegisterWaitForSingleObject is sadly the only way to implement timeouts in a controllable way. I've had many cases where the Timeout property on the request itself is not adhered to, and your new routine does not have an alternative implementation for that.
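A minimal sketch of the timeout approach described in this comment, assuming .NET 4.0 or later. The names `TimeoutSketch`, `OnTimeout` and `GetResponseWithTimeout` are hypothetical and are not taken from the library; this illustrates the general pattern only, not the code under review.

```csharp
using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutSketch
{
    // Hypothetical helper: invokes onTimeout if the task has not completed
    // within the timeout. The wait is registered with the thread pool, so no
    // thread is blocked while the task is pending.
    public static void OnTimeout(Task task, TimeSpan timeout, Action onTimeout)
    {
        ThreadPool.RegisterWaitForSingleObject(
            ((IAsyncResult)task).AsyncWaitHandle,
            (state, timedOut) => { if (timedOut) onTimeout(); },
            null,
            timeout,
            true); // executeOnlyOnce: fire on completion or timeout, never both
    }

    // Hypothetical helper: wraps BeginGetResponse/EndGetResponse in a task
    // and aborts the request if no response arrives in time.
    public static Task<WebResponse> GetResponseWithTimeout(
        HttpWebRequest request, TimeSpan timeout)
    {
        var task = Task.Factory.FromAsync<WebResponse>(
            request.BeginGetResponse, request.EndGetResponse, null);
        OnTimeout(task, timeout, request.Abort);
        return task;
    }
}
```

Aborting the request makes `EndGetResponse` throw, so the returned task ends up faulted rather than pending forever. A fuller version would probably also keep the `RegisteredWaitHandle` returned by `RegisterWaitForSingleObject` and unregister it once the task completes.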
On line 230 you allocate a worker thread to run the async code for you using Task.Factory.StartNew();
And besides creating a whole bunch of threads, I think it's a bad idea to create them outside the thread pool. LongRunning on threads that only live for the length of a web request sounds wrong to me.
Explicitly starting threads outside the thread pool is generally a good idea in something like ASP.NET, where you are competing with the server for the 200 max thread pool threads that are actually serving the requests. Thread pool starvation is a real issue there. Task.Factory.StartNew() is non-blocking: http://localghost.io/articles/oss-development,-a-continuous-lesson-in-humility-2013-03-07/ The LongRunning hint prevents it from being inlined on the calling thread. Before arguing the pros/cons too much I'd better pull the bits in and let the numbers speak for themselves. I'll pull this in one way or another: if the numbers are around the same throughput-wise, I'll pull this in as a 'NoTasksHttpConnection' implementation of IConnection. Again, thanks for taking the time @henkish
Why start threads at all? The whole point of async is to free resources. I like your lib and use it in a lot of projects, but you need to understand that the current async implementation is dangerous and needs to be fixed. I don't care if you use my code or fix your current code, as long as it gets fixed! The code below emulates 1000 concurrent requests, each taking 1 second.

    using System;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading.Tasks;

    class Program
    {
        static void Main( string[] args )
        {
            // Each "request" blocks a dedicated (non-pool) thread for one second.
            Console.WriteLine( "Wrapped in LongRunning task:" );
            RunTest( () => Task.Factory.StartNew( () => Task.Delay( 1000 ).Wait(), TaskCreationOptions.LongRunning ) );

            // Each "request" is a plain timer-based task; no thread waits on it.
            Console.WriteLine( "Not wrapped:" );
            RunTest( () => Task.Delay( 1000 ) );
        }

        private static void RunTest( Func<Task> createTask )
        {
            // Start all 1000 tasks, then sample the process while they are in flight.
            var tasks = Enumerable.Range( 0, 1000 ).Select( i => createTask() ).ToArray();
            var process = Process.GetCurrentProcess();
            Console.WriteLine( "\tThreads: " + process.Threads.Count );
            Console.WriteLine( "\tVirtual memory: " + process.VirtualMemorySize64 );
            Task.WhenAll( tasks ).Wait();
        }
    }
Hey @henkish, I'm well aware of the fact that threads are not cheap. Dangerous is a bit of an overreach here, since we do have a semaphore in place preventing us from creating 1000 threads at once. I'm not trying to argue which version is better, though; I'm trying to shed some light on why things are the way they are. The main reason for actually running in threads is concurrency throttling, not maximizing concurrency: i.e. single connections taking too long at Elasticsearch itself because we are slamming it too hard (yeah I know, bigger cluster), which I've seen cause hard-to-debug connection issues locally. I'm trying to argue (mostly with myself :)) whether that fear was warranted enough to keep it in or not. I get similar thread statistics to your async routine using the concurrency visualizer (shift+alt+f5 on the protocolloadtest) if I simply remove the wrapping in a LongRunning task. One other thing is that async web requests don't listen to Timeout (http://stackoverflow.com/questions/7368737/task-fromasync-timeout), so the ThreadPool.RegisterWaitForSingleObject really has to be in there. What I'm thinking now is to set the default MaximumAsyncConnections() to 0, in which case it will neither wrap in a LongRunning task nor use the semaphore. Love your thoughts on this!
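For illustration, concurrency throttling does not by itself require a dedicated thread per request. The sketch below is an assumption-laden example, not the library's implementation: it targets .NET 4.5 (for `SemaphoreSlim.WaitAsync` and `await`), and the `AsyncThrottle` class and its `Run` method are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: limits how many async operations are in flight at
// once without parking a thread for each waiting operation.
public sealed class AsyncThrottle
{
    private readonly SemaphoreSlim _slots;

    public AsyncThrottle(int maxConcurrent)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    public async Task<T> Run<T>(Func<Task<T>> operation)
    {
        // Asynchronously wait for a free slot; no thread blocks here.
        await _slots.WaitAsync().ConfigureAwait(false);
        try
        {
            return await operation().ConfigureAwait(false);
        }
        finally
        {
            // Free the slot whether the operation succeeded or failed.
            _slots.Release();
        }
    }
}
```

A caller would pass the request as a delegate, for example `throttle.Run(() => SendAsync(request))`, where `SendAsync` stands in for whatever task-returning call performs the request.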
This is now merged into master, ty for alerting me to this @henkish. The default I created a separate
Thanks! But I disagree with the naming of the class. Yes, it does not use tasks, that's true. But the main difference/advantage is that the NoTasksHttpConnection is not using any unnecessary resources (threads). If the codebase was .NET 4.5 I would have used Tasks and await, but since that's not an option I went for the classic Begin/End pattern to maintain readability. The default Connection implementation is still creating threads using TaskCreationOptions.LongRunning. You wrote: "Explicitly starting threads outside the thread pool is generally a good idea in something like ASP.NET, where you are competing with the server for the 200 max thread pool threads that are actually serving the requests. Thread pool starvation is a real issue there." I don't agree with you: if you use async the correct way there is no need to allocate threads, neither inside nor outside the thread pool, while waiting on resources. There will be no thread pool starvation, or starvation of any other threads for that matter. Whether or not the Timeout property works in HttpWebRequest while performing async calls I don't know. But maybe you could try the System.Net.Http.HttpClient class instead? It's async all the way, and the API for doing HTTP calls is in many ways improved over HttpWebRequest. From what I can see there are Timeout settings in the API. Regarding the test results, maybe you should consider monitoring other resources while running the performance comparison tests: how do the thread and memory usage compare for the process? Please consider once more the importance of removing the Task.Factory.StartNew call from the Connection async implementation. Ask other developers for their opinion!! Generally I think it's a bad idea to maintain two similar implementations of the same code, and if the thread issue is solved then I will gladly use the default implementation again. Cheers,
Yeah I realized after the last release that the StartNew is still not I realize there's a difference between threads and socket IO completion Even so, the slight overhead of yielding io tasks and iterating over them Thats why for now I included both, the hipster dev in me loves the task So to recap if you dont set MaximumAsyncConnections in the current version HttpClient is awesome but .net 4.5/mono 3.0 only and the Thanks for getting back to me Henrik, appreciated. On Sunday, September 15, 2013, Henrik Jönsson wrote:
These are the current numbers (I'm running the NoTasksHttpConnection first) HTTP (IndexManyAsync): 17.197/s 21 Threads 854593536 Virtual memory HTTP (IndexManyAsync): 16.988/s 21 Threads 854601728 Virtual memory These are 2 random outputs of many runs but the (this is on a VM against ES 0.90.1) I'm seeing the exact same Memory and Thread statistics as your implementation when running a concurrency profiler and inspecting the Combined with the fact I think the task yielding approach reads a bit better than the Begin_/End_ pattern I'm opting for the removal of the Many many thanks for the slap on the wrist @henkish
…ng it alltogether instead of removing it so we keep on testing it in the future
Rewrote DoAsyncRequest to use real async web methods instead of starting
new tasks and allocating thread pool resources for waits and callbacks.
Removed the MaximumAsyncConnections Semaphore because connection limits are already enforced in the System.Net.ServicePointManager...