2020-02-15

Royal Road To Async/Await


Scope and Purpose

This post will focus on C#'s async/await/Task stuff, as opposed to async/await for F#/JavaScript/Rust.

First, I will try to explain what the await operator does so that readers learn what is actually going on when you await something, and hopefully a bunch of async/await/Task stuff will start to make sense.  A lot of async/await resources don't tell you what is actually going on, so async/await still seems mysterious and full of obscure pitfalls/guidelines.  I want to help my readers take the "royal road" to async/await, getting that major epiphany as soon as possible.

Second, I will present an async/await/Task reading list that is selected and ordered for the benefit of a beginner, with some notes of my own.  The reading list doubles as my own reference of resources that were helpful for me, and as a place to review best practices and pitfalls.  This reading list is another "royal road" to fleshing out readers' understanding.

Note: I'm having trouble getting this blog platform to display less-than and greater-than symbols correctly, so please tell me if you suspect a formatting error.


What Async/Await Actually Does...Oversimplified

This section is simplistic/inaccurate to help smooth your way to major realizations.

In Case You've Never Seen Async/Await/Task

Synchronous Code Example:
int GetSum()
{
    int num1 = GetNum1FromInternet();
    int num2 = GetNum2FromInternet();
    return num1 + num2;
}

Analogous Asynchronous Code Example:
async Task<int> GetSumAsync()
{
    int num1 = await GetNum1FromInternetAsync();
    int num2 = await GetNum2FromInternetAsync();
    return num1 + num2;
}

Task represents an operation ("chunk of work") that will complete in the future (or has already completed).  Task<T> is the same, but contains a result of type T when complete.  "Task" will refer to both Task and Task<T> types.  Tasks have special support for notifying the system when they have completed (RanToCompletion/Canceled/Faulted).
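If you want to poke at those completion states yourself, here is a minimal sketch (the class name TaskStatusDemo is made up for illustration; the Task members used are the standard ones).  It creates already-completed, canceled, faulted, and still-pending tasks and prints what each reports.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TaskStatusDemo
{
    static void Main()
    {
        // A task that is already complete and carries a result.
        Task<int> done = Task.FromResult(42);
        Console.WriteLine(done.Status);          // RanToCompletion
        Console.WriteLine(done.Result);          // 42 (no blocking: already complete)

        // A task that completed by being canceled.
        Task canceled = Task.FromCanceled(new CancellationToken(canceled: true));
        Console.WriteLine(canceled.Status);      // Canceled

        // A task that completed by failing.
        Task faulted = Task.FromException(new InvalidOperationException("boom"));
        Console.WriteLine(faulted.Status);       // Faulted

        // A task that has not completed yet.
        Task pending = Task.Delay(TimeSpan.FromSeconds(1));
        Console.WriteLine(pending.IsCompleted);  // False
    }
}
```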

async is a C# keyword that marks a function as a function that might have some await statements, and thus a lot of invisible compiler magic will happen to that function.  await is a C# unary operator that operates on anything that is an "awaitable expression", but it's okay to think that the only things you'll be awaiting are tasks.

When you await a task, the execution of the function containing the await will not continue until the awaited task has completed.  Thus, GetNum2FromInternetAsync will not be executed until the Task<int> from GetNum1FromInternetAsync has completed.  This behavior might seem like the same behavior as the synchronous "GetNum2FromInternet will not be executed until GetNum1FromInternet has completed", but the next section will show that very different behavior is going on, due to compiler magic.

The Compiler Magic - Await Doesn't Wait

The "await GetNum1FromInternetAsync()" actually causes GetSumAsync to return a Task<int>, so GetSumAsync stops executing and its caller function continues execution.  After GetNum1FromInternetAsync's Task<int> completes, a thread from the thread pool will continue execution of GetSumAsync where it left off.  A lot of compiler magic goes on so that GetSumAsync can be resumed in the right place with all the right local variables.

Note that execution of GetSumAsync can be started by one thread, continued by a second thread, and then finished by a third thread.

Note that each await is like a secret return of a task.  An async function can return Task without a single written return statement and an async function can return Task<T> even if it seems to return a T.  See these two example async functions:

    async Task f1() {}
    async Task<int> f2() { return 0; }

In a way, await doesn't wait at all, since it immediately returns a task and no thread is blocked.  If you have synchronous code that needs to wait for the completion of a task, then use Task.Wait, which actually waits (blocks your thread).

Let's do a different example to really drive home the order of execution that results from doing things asynchronously.  You can execute the example at dotnetfiddle.

    public static void Main()
    {
        Console.WriteLine("m_1");
        var t = f();
        Console.WriteLine("m_2");
        t.Wait();
        Console.WriteLine("m_3");
    }
   
    static async Task f()
    {
        Console.WriteLine("f_1");
        await g();
        Console.WriteLine("f_2");
    }
   
    static async Task g()
    {
        Console.WriteLine("g_1");
        await Task.Delay(TimeSpan.FromSeconds(3));
        Console.WriteLine("g_2");
    }


Here is the output with some added explanation:
m_1: main thread
f_1: main thread still, have not hit await
g_1: main thread still, "await g()" still executes some of g
m_2: main thread still, await caused return up to Main

(main thread blocks on the t.Wait() at this point)
g_2: pool thread, g "resumed" after the await
f_2: pool thread (often the same one that ran g_2), f "resumed" after the await
m_3: main thread, unblocked by completion of f's task


If you want a more involved example that prints out thread info, see this other example at dotnetfiddle.

So, await allows a function to be split into pieces.  A piece executes until it hits some incomplete task and returns.  When that task completes, the next piece executes (not necessarily on the same thread).  await allows all of this complex asynchronous behavior even though your code looks very synchronous and sequential.

Hopefully by this time, you understand the basic idea behind the special execution flow of await.  If you want another attempt at a gentle intro, you should probably read Stephen Cleary's "Async and Await" blog post, stopping before his Context section.

The Compiler Magic - Async Functions Into State Machines

When you mark a function as async, the function is radically transformed to create and start a customized state machine (it will implement IAsyncStateMachine).  The state machine object will hold the logic and state of what used to be your function.  Local variables become member variables of this state machine object.

For example, GetSumAsync would be transformed to create an IAsyncStateMachine with a MoveNext method that is called when you originally call GetSumAsync and every time that we resume GetSumAsync's work after an await.  Imagine something like the following code snippet.  Don't try to fully understand it - you don't need to know, and it's horribly oversimplified; just appreciate that every async function creates a state machine that breaks the original function into pieces based on the await operators.  Look at how each "if(_step == ...)" corresponds to a piece of the original GetSumAsync function.

class GetSumAsyncCustomStateMachine : IAsyncStateMachine
{
    enum Step
    {
        Beginning,
        AfterGotNum1,
        AfterGotNum2,

        Faulted = -1,
        Completed = -2,
    }

    public AsyncTaskMethodBuilder<int> TaskBuilder;
    TaskAwaiter<int> _awaiterHelper;
    Step _step;
    int _num1;
    int _num2;


    public GetSumAsyncCustomStateMachine()
    {
        _step = Step.Beginning;
    }

    public void MoveNext()
    {
        var me = this;


        if(_step == Step.Beginning)
        {
            _awaiterHelper = GetNum1FromInternetAsync()
                .GetAwaiter();

            // record the next step BEFORE scheduling the resume,
            // since MoveNext could be called again right away
            _step = Step.AfterGotNum1;

            TaskBuilder.AwaitUnsafeOnCompleted(
                ref _awaiterHelper,
                ref me);
        }
        else if(_step == Step.AfterGotNum1)
        {
            _num1 = _awaiterHelper.GetResult();

            _awaiterHelper = GetNum2FromInternetAsync()
                .GetAwaiter();

            // same ordering as above: step first, then schedule
            _step = Step.AfterGotNum2;

            TaskBuilder.AwaitUnsafeOnCompleted(
                ref _awaiterHelper,
                ref me);
        }
        else if(_step == Step.AfterGotNum2)
        {
            _num2 = _awaiterHelper.GetResult();
            _step = Step.Completed;

            // completes the Task<int> that GetSumAsync returned
            TaskBuilder.SetResult(_num1 + _num2);
        }
    }
}


If you really want to see a more accurate and complete example, read Sergey's "Dissecting the async methods in C#" or Ranjeet Singh's Async 3 - Understanding How Async State Machine Works.

I've Lied To You Multiple Times

I've left out some complications to try to make it easier to understand the big picture.

1: Await Sometimes Continues Normally

If you await on a completed task, then execution continues normally; you do not return to your caller.  This added wrinkle is very good - if a task is already complete, it's very wasteful to do all the fancy await things for setting up the resuming of the function later, so don't do all that.


async Task DoesTheSmartThing()
{
    Task delayTask = Task.Delay(TimeSpan.FromSeconds(5));

    await delayTask; // this will return to caller

    // at this point, delayTask is complete
    // and we have resumed on possibly different thread
    await delayTask; //no return-and-later-resume
    Console.WriteLine("done");
}

My GetSumAsyncCustomStateMachine example doesn't attempt to do this smart behavior at all.

To more accurately state what await does:  "await doesn't wait at all. Awaiting an incomplete awaitable causes an immediate return of a task (the incomplete awaitable and the returned task do not have to be the same type).  Awaiting a complete awaitable immediately extracts the result (if any) and continues execution."
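To make that statement concrete, here is a rough sketch of doing an await "by hand" with the awaiter.  AwaitManually and SlowTwoAsync are hypothetical helpers of mine, not anything the compiler generates, and the real generated code is more involved (it also deals with contexts and exceptions).  The point is just the IsCompleted check that picks between "continue normally" and "return now, resume later".

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

static class AwaitByHand
{
    // Hypothetical helper: roughly "int x = await task; restOfMethod(x);".
    public static void AwaitManually(Task<int> task, Action<int> restOfMethod)
    {
        TaskAwaiter<int> awaiter = task.GetAwaiter();
        if (awaiter.IsCompleted)
        {
            // Already complete: extract the result and keep going right here.
            restOfMethod(awaiter.GetResult());
        }
        else
        {
            // Not complete: register the rest of the method as a continuation
            // and return to the caller immediately.
            awaiter.OnCompleted(() => restOfMethod(awaiter.GetResult()));
        }
    }

    // Hypothetical stand-in for a slow operation.
    static async Task<int> SlowTwoAsync()
    {
        await Task.Delay(200);
        return 2;
    }

    static void Main()
    {
        AwaitManually(Task.FromResult(1), x => Console.WriteLine("completed task: " + x));

        using var finished = new ManualResetEventSlim();
        AwaitManually(SlowTwoAsync(), x =>
        {
            Console.WriteLine("incomplete task, resumed later: " + x);
            finished.Set();
        });
        Console.WriteLine("back in Main while the slow task is still running");
        finished.Wait();   // keep the process alive until the continuation has run
    }
}
```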

2: Async Functions Are Not Always Resumed On The Thread Pool

I've previously said that async functions are resumed by a thread pool thread, but that is often not the case.  By default, await captures the current SynchronizationContext (if there is one) and work resumes on it.  Roughly speaking, these are the cases you care about:
  • No SynchronizationContext: work resumes on a thread pool thread (this is the case for console applications, whose main thread is not a pool thread but has no context, and for ASP.NET Core)
  • UI thread (for WPF/UWP/WinForms and maybe any GUI?)
  • Request context (for classic ASP.NET; it is tied to the request rather than to one specific thread)
So, by default, an async function that starts being executed on the UI thread will be resumed on the UI thread.  You can call ConfigureAwait(false) on the task you are awaiting to allow a thread pool thread to resume the work.  Imagine DoSomethingAsync was first executed by the UI thread in a WPF app:

async Task DoSomethingAsync()
{
    // started in UI thread
    await Task.Delay(TimeSpan.FromSeconds(5));

    // resumed in UI thread
    await Task.Delay(TimeSpan.FromSeconds(5))
        .ConfigureAwait(false);

    // resumed in a thread pool thread
    await Task.Delay(TimeSpan.FromSeconds(5))
        .ConfigureAwait(false);

    // resumed in a possibly different thread pool thread
    Console.WriteLine("done");
}

Warning: ConfigureAwait(false) does not guarantee a switch from UI thread to thread pool thread.  If you're awaiting a task that is already completed, execution will continue immediately in the same thread.

For more about SynchronizationContext...if you really want it right now.
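If you'd like to see the capture-and-resume behavior without a GUI app, here is a small sketch that installs a made-up SynchronizationContext in a console program.  CountingContext, ContextDemo, and RunAndCountPosts are hypothetical names of mine; only SynchronizationContext, Post, and ConfigureAwait are the real API.  The context simply counts how many continuations get posted to it and then runs them on the thread pool.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context: counts posted continuations, then runs them on the pool.
sealed class CountingContext : SynchronizationContext
{
    public int Posts;

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref Posts);
        ThreadPool.QueueUserWorkItem(_ =>
        {
            SynchronizationContext previous = Current;
            SetSynchronizationContext(this);   // continuation sees this context
            try { d(state); }
            finally { SetSynchronizationContext(previous); }
        });
    }
}

static class ContextDemo
{
    static async Task WorkAsync()
    {
        await Task.Delay(50);                        // context captured, resume goes through Post
        await Task.Delay(50).ConfigureAwait(false);  // context not captured, Post is skipped
    }

    public static int RunAndCountPosts()
    {
        SynchronizationContext original = SynchronizationContext.Current;
        var ctx = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(ctx);
        try
        {
            // Blocking is fine here only because this context never needs the blocked thread.
            WorkAsync().Wait();
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(original);
        }
        return ctx.Posts;
    }

    static void Main()
    {
        Console.WriteLine(RunAndCountPosts());   // 1: only the first await resumed via the context
    }
}
```

A real UI context does the same kind of thing, except its Post queues the continuation onto the UI thread's message loop instead of the thread pool.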

Ordered Tour Of Async/Await/Task Resources

The first time I tried to truly understand asynchronous programming in C# and did some googling, I was overwhelmed by the vast amount of material that was not appropriate for introducing a newcomer to asynchronous code and the huge amount of linking out to yet more material that had me wondering "is this even the right subtopic for me to learn?".  It was also very easy to go in circles.

The ordered tour below should give you a good starting point, and make sure you're prepared for the next item.  There will be overlap between the items, but I think it's a beneficial amount - you do want some repetition and rephrasing as you learn a new concept.

Read all of my bullet points before you decide to read the linked article.

Stephen Cleary's Async And Await (2012-02): good first page to read to intro the concept and gives some usage advice.
  • Awaitables section: "you can await the result of an async method that returns Task … because the method returns Task, not because it’s async"
  • "Tip: If you have a very simple asynchronous method, you may be able to write it without using the await keyword (e.g., using Task.FromResult). If you can write it without await, then you should write it without await, and remove the async keyword from the method. A non-async method returning Task.FromResult is more efficient than an async method returning a value." But keep in mind he mostly changes his mind in Eliding Async and Await.
  • Async Composition section: "It’s also possible to start several operations and await for one (or all) of them to complete. You can do this by starting the operations but not awaiting them until later"
  • The very short Guidelines section is good to read; full of "do this instead of that".
  • Links out to some resources that are okay to read early in your journey.
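To illustrate the Async Composition point from the article above, here is a minimal sketch of "start the operations first, await them later".  GetNumberAsync is a hypothetical stand-in for something like GetNum1FromInternetAsync; Task.WhenAll is the real API.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class CompositionDemo
{
    // Hypothetical stand-in for an I/O-bound call.
    static async Task<int> GetNumberAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    public static async Task<int> GetSumConcurrentlyAsync()
    {
        // Start both operations without awaiting either one yet...
        Task<int> t1 = GetNumberAsync(1, 300);
        Task<int> t2 = GetNumberAsync(2, 300);

        // ...then await both; the two delays overlap instead of adding up.
        int[] results = await Task.WhenAll(t1, t2);
        return results[0] + results[1];
    }

    static async Task Main()
    {
        var sw = Stopwatch.StartNew();
        int sum = await GetSumConcurrentlyAsync();
        Console.WriteLine($"sum = {sum}, elapsed = {sw.ElapsedMilliseconds} ms");
    }
}
```

Compare with GetSumAsync near the top of this post, which awaits the first operation before even starting the second, so its delays do add up.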
Stephen Cleary's There Is No Thread (2013-11): good to read this early to drive home the point that if you're doing async code correctly, no thread is ever blocked/waiting due to your async operations.
  • If you really, really want to read more on this, there is .NET Standard Guide's Async In Depth...which should have really been entitled "We Agree There Is No Thread" instead of anything "in depth".

MS C# Guide's Asynchronous Programming (2016); short and good next-thing-to-read
  • First in this reading list to distinguish between I/O-bound work and CPU-bound work and what you'll do differently
  • Links out to Async In Depth, but don't read it unless you really need additional explanation on why CPU-bound work should be treated differently than I/O-bound work.
  • Also, don't bother with the Task Parallel Library unless you want to get into parallelism (where threads execute independent pieces of work at the same time) as opposed to async/await (where threads execute dependent pieces of work in an ordered fashion).
MSDN Magazine's Best Practices in Asynchronous Programming (Stephen Cleary, 2013-03); nice bunch of do-this-instead-of-that.

Maybe at this point, you should do some fiddling around in async code if you haven't already, so that you get a sense of what is important for your use.

Stephen Toub's Async/Await FAQ (2012-04): answers a lot of questions you might have after being introduced to the world of async/await/Task
  • Also has a serious reading list of its own, a lot of which I have not read.  Feel free to follow his links only if you really want to read further on a subtopic right now.
  • For your convenience, here are a few corrections to broken links in the article

Stephen Toub's ConfigureAwait FAQ (2019-12): gets more into SynchronizationContext, TaskScheduler, and then a ton about ConfigureAwait


Stephen Cleary's StartNew Is Dangerous and ContinueWith Is Dangerous Too.
Stephen Cleary's Eliding Async and Await (2016-12): if you can make a function return a Task without doing async/await, should you get rid of async/await?
  • He has changed his mind to now recommend against getting rid of async/await, and he lists the pitfalls of eliding async/await
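Here is a small sketch of one such pitfall as I understand it: where an exception shows up.  ParseWithAsync and ParseElided are hypothetical methods I made up for illustration.  With async, the exception is stored on the returned task; with async/await elided, the same exception is thrown directly at the caller before any task exists.

```csharp
using System;
using System.Threading.Tasks;

static class ElidingDemo
{
    // With async/await: the exception is captured and placed on the returned task.
    public static async Task<int> ParseWithAsync(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        await Task.Yield();
        return int.Parse(s);
    }

    // Elided: an ordinary method that happens to return a task.
    public static Task<int> ParseElided(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        return Task.FromResult(int.Parse(s));
    }

    static void Main()
    {
        Task<int> t = ParseWithAsync(null);   // does not throw here
        Console.WriteLine(t.IsFaulted);       // True: the exception is inside the task

        try
        {
            ParseElided(null);                // throws right here, synchronously
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("elided version threw synchronously");
        }
    }
}
```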

TODO: resume with at least the following
https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html
https://blog.stephencleary.com/2012/02/creating-tasks.html
https://blog.stephencleary.com/2013/04/ui-guidelines-for-async.html
https://devblogs.microsoft.com/pfxteam/await-anything/

also the following explanations of the generated state machines
https://ranjeet.dev/understanding-how-async-state-machine-works/
https://blogs.msdn.microsoft.com/seteplia/2017/11/30/dissecting-the-async-methods-in-c/
https://www.markopapic.com/csharp-under-the-hood-async-await/

Unorganized Note Dump From Previous Research

Async Stuff

https://devblogs.microsoft.com/dotnet/configureawait-faq/
seems really good to give highlights/summary of

Notes from "Correcting Common Async/Await Mistakes in .NET - Brandon Minnick": https://www.youtube.com/watch?v=J0mcYVxJEl0
  • code execution on an await can plow forward to the following lines, such as if the task is already complete, so you cannot rely on an await doing a context switch
  • "Each async method has its own context, so if one async method calls another async method, their contexts are independent."; that means f() can await on g(), and g() can do ConfigureAwait(false) and that doesn't mess up f() even if f() needs to stay on UI context/thread.
Some code from Don't Block On Async Code has cute examples where a function deadlocks itself.

Maybe to understand SynchronizationContext better, read MSDN mag It's All About the SynchronizationContext.


Class Reference Pages:

Commentary/Explanation

“Notes” are not necessarily what is most central to the article.  “Notes” are new things/details beyond the very basics that I wanted to make sure to remember.

Notes from Async Intro:
  • When the await keyword is applied, it suspends the calling method and yields control back to its caller until the awaited task is complete.
  • Although it's less code, take care when mixing LINQ with asynchronous code. Because LINQ uses deferred (lazy) execution, async calls won't happen immediately as they do in a foreach() loop unless you force the generated sequence to iterate with a call to .ToList() or .ToArray().
  • async void should only be used for event handlers.
  • Tread carefully when using async lambdas in LINQ expressions.  Lambda expressions in LINQ use deferred execution, meaning code could end up executing at a time when you’re not expecting it to. The introduction of blocking tasks into this can easily result in a deadlock if not written correctly. Additionally, the nesting of asynchronous code like this can also make it more difficult to reason about the execution of the code. Async and LINQ are powerful, but should be used together as carefully and clearly as possible.
  • Don’t depend on the state of global objects or the execution of certain methods. Instead, depend only on the return values of methods. Why? [helps code understandability and be less prone to race conditions]
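The LINQ deferred-execution note above is easy to see in a tiny sketch.  DoubleAsync and the started counter are hypothetical names of mine; the counter just records how many times the async method has actually been called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class LinqDeferredDemo
{
    static int started = 0;

    public static async Task<int> DoubleAsync(int n)
    {
        started++;               // runs synchronously, at the moment of the call
        await Task.Delay(50);
        return n * 2;
    }

    static async Task Main()
    {
        // Select only describes the work; nothing has been called yet.
        IEnumerable<Task<int>> lazy = new[] { 1, 2, 3 }.Select(n => DoubleAsync(n));
        Console.WriteLine(started);   // 0

        // ToList forces iteration, which calls DoubleAsync three times.
        List<Task<int>> tasks = lazy.ToList();
        Console.WriteLine(started);   // 3

        int[] results = await Task.WhenAll(tasks);
        Console.WriteLine(string.Join(",", results));   // 2,4,6
    }
}
```

Without the ToList, every enumeration of lazy would call DoubleAsync again and create a fresh set of tasks, which is rarely what you want.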
Notes from Async in Depth:
  • By default, tasks execute on the current thread and delegate work to the Operating System, as appropriate. Optionally, tasks can be explicitly requested to run on a separate thread via the Task.Run API.
  • “Exceptions” section unhelpful.  Still don’t quite understand proper way to do exception handling/propagation.
  • When a task completes in the Canceled state, any continuations registered with the task are scheduled or executed, unless a continuation option such as NotOnCanceled was specified to opt out of continuation. Any code that is asynchronously waiting for a canceled task through use of language features continues to run but receives an OperationCanceledException or an exception derived from it. Code that is blocked synchronously waiting on the task through methods such as Wait and WaitAll also continue to run with an exception.
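A minimal sketch of the cancellation note above, using Task.Delay with a CancellationToken as the cancelable operation (the class name CancellationDemo is made up).  It shows what the asynchronous waiter (await) and the synchronous waiter (Wait) each see.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CancellationDemo
{
    static async Task Main()
    {
        using var cts = new CancellationTokenSource();
        Task delay = Task.Delay(TimeSpan.FromSeconds(30), cts.Token);
        cts.Cancel();

        try
        {
            await delay;
        }
        catch (OperationCanceledException ex)
        {
            // await surfaces OperationCanceledException or a type derived from it.
            Console.WriteLine("await threw " + ex.GetType().Name);
        }

        Console.WriteLine(delay.Status);   // Canceled

        try
        {
            delay.Wait();
        }
        catch (AggregateException ex)
        {
            // Wait wraps the cancellation exception in an AggregateException.
            Console.WriteLine("Wait threw AggregateException containing "
                + ex.InnerException.GetType().Name);
        }
    }
}
```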
Notes from Consuming the TAP
  • Basics repeated again: Under the covers, the await functionality installs a callback on the task by using a continuation. This callback resumes the asynchronous method at the point of suspension.
  • If a synchronization context (SynchronizationContext object) is associated with the thread that was executing the asynchronous method at the time of suspension (for example, if the System.Threading.SynchronizationContext.Current property is not null), the asynchronous method resumes on that same synchronization context by using the context’s Post method. Otherwise, it relies on the task scheduler (TaskScheduler object) that was current at the time of suspension. Typically, this is the default task scheduler (System.Threading.Tasks.TaskScheduler.Default), which targets the thread pool. This task scheduler determines whether the awaited asynchronous operation should resume where it completed or whether the resumption should be scheduled. The default scheduler typically allows the continuation to run on the thread that the awaited operation completed.
  • Some good stuff on C# support for cancellation and progress reporting
  • Task.FromResult is interesting for returning a result when a Task is expected. (and yes yes, Task.WhenAll and Task.WhenAny are mentioned yet again)
  • For exceptions: lots of Task-stuff might throw an AggregateException (because of child Tasks), but doing an await will only propagate one of those exceptions (so await makes exception handling a lot like what you would do with synchronous code)
  • TODO: finish reading this!  Resume at “Task.Delay” section
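For the exceptions bullet above, here is a short sketch of the difference between what await rethrows and what the task records.  FailAsync is a hypothetical method that always fails; Task.WhenAll and Task.Exception are the real API.

```csharp
using System;
using System.Threading.Tasks;

static class ExceptionDemo
{
    // Hypothetical operation that always fails asynchronously.
    public static async Task FailAsync(string message)
    {
        await Task.Yield();
        throw new InvalidOperationException(message);
    }

    static async Task Main()
    {
        Task both = Task.WhenAll(FailAsync("first"), FailAsync("second"));

        try
        {
            await both;
        }
        catch (InvalidOperationException ex)
        {
            // await rethrows a single exception, much like synchronous code would.
            Console.WriteLine("await caught: " + ex.Message);
        }

        // The task itself still records every failure in its AggregateException.
        Console.WriteLine(both.Exception.InnerExceptions.Count);   // 2
    }
}
```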
Notes from Implementing TAP:
  • This article is fairly basic and has lots of overlap with other articles
  • Use the static Task.Run for simple cases; use Task.Factory.StartNew (TaskFactory.StartNew) for more complicated cases with fine-grained control
  • TODO


Delegates
  • http://www.tutorialsteacher.com/csharp/csharp-delegates
    • what if you want to pass a function itself as a parameter? How does C# handle the callback functions or event handler? The answer is - delegate. A delegate is like a pointer to a function. It is a reference type data type and it holds the reference of a method. All the delegates are implicitly derived from System.Delegate class.
    • delegate syntax: [access modifier] delegate [return type] [delegate name]([parameters])
    • example declaration: public delegate void Print(int value);
    • The delegate can be invoked by two ways: using () operator or using the Invoke() method of delegate as shown below.
      • someDelegate(someArg)
      • someDelegate.Invoke(someArg)
    • Can do multicast delegate; "+" to add a function or "-" to remove; functions are executed sequentially in order
  • http://www.tutorialsteacher.com/csharp/csharp-func-delegate
    • Func is a delegate with generics-stuff to help express a delegate that has zero or more inputs and one returned output
  • http://www.tutorialsteacher.com/csharp/csharp-action-delegate
    • Action is a generics-delegate with zero or more inputs and no returned output
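The delegate notes above fit into one short sketch.  The Print declaration matches the example declaration from the notes; PrintDecimal, PrintHex, and the class name DelegateDemo are names I made up.

```csharp
using System;

static class DelegateDemo
{
    // Same shape as the example declaration in the notes.
    public delegate void Print(int value);

    static void PrintDecimal(int value) => Console.WriteLine("dec: " + value);
    static void PrintHex(int value) => Console.WriteLine("hex: " + value.ToString("X"));

    static void Main()
    {
        Print p = PrintDecimal;
        p(255);            // invoke with the () operator
        p.Invoke(255);     // invoke with the Invoke method

        p += PrintHex;     // multicast: both functions run, in the order added
        p(255);

        p -= PrintDecimal; // remove one; only PrintHex remains
        p(255);

        // Func: zero or more inputs and one returned output.
        Func<int, int, int> add = (a, b) => a + b;

        // Action: zero or more inputs and no returned output.
        Action<string> log = msg => Console.WriteLine(msg);

        log("sum: " + add(2, 3));
    }
}
```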

Delegate:

TODO:
  • read up on SynchronizationContext, and not just the reference page; also Action and delegates and “Invoke”

Knex & Bookshelf (JavaScript Database Stuff)

Commentary/Explanation
