Cancellation Tokens with Stephen Toub
By dotnet
Summary
## Key takeaways
- **Cancellation Enables Best Optimizations**: Cancellation is all about avoiding unnecessary work, which represents some of the best optimizations possible by eliminating work you no longer need instead of letting it waste resources. [01:28], [01:47]
- **Thread Aborts Were Violent Killers**: Early .NET had thread aborts, an extremely violent mechanism that could kill operations in bad spots, now effectively gone from .NET Core in favor of cooperative cancellation. [02:22], [02:39]
- **Explicit Tokens Beat Ambient Scopes**: Ambient cancellation scopes seemed lovely but led to foot guns like unexpected cancellations and hard-to-debug shielding needs; explicit token passing is safer, detectable by analyzers, and allows combining tokens. [11:17], [14:39]
- **Register Callbacks Enable Prompt Cancellation**: Leaf operations use CancellationToken.Register to attach callbacks that promptly signal OS-level APIs like CancelIoEx on Windows when cancellation occurs, making it much faster than polling. [18:28], [20:06]
- **Token vs Source: Observe vs Request**: CancellationToken is a passive, immutable shim purely for observing cancellation, while CancellationTokenSource is the active producer that can cancel and link tokens, preventing observers from unexpectedly cancelling others. [24:07], [25:13]
- **Volatile Prevents Compiler Loop Hoisting**: Without volatile on the cancellation flag, the JIT can lift reads out of loops assuming no multi-threaded changes, missing cancellations; volatile ensures fresh reads every iteration for visibility. [31:23], [34:12]
Topics Covered
- Cancellation Optimizes Performance
- Tokens Enable Composable Cancellation
- Explicit Beats Ambient Cancellation
- Register Enables Cooperative Cleanup
- Core Simplified Token for Serial Workloads
Full Transcript
Hey friends, it's Deep .NET. We're back.
We're back with Deep .NET. The people have asked, the people have demanded.
Stephen Toub has come out of retirement.
What have we been doing?
>> No, I just made that up. We've been busy. We've just been busy.
It's not that we don't want to do these as a full-time job. It's just not our job to do episodes of Deep .NET, but I appreciate you taking the time out of your busy schedule with a bit of a cold.
>> I do have a cold. Yeah. Sorry for all the snot and tissues that you'll see in the next hour. [laughter]
>> No, you're taking one for the team and the team appreciates it. [laughter]
>> Yeah, you know, it's that time of year. The kids bring home everything from school and then you just spend all winter sick.
>> And that does not change, because I've got a 20-year-old who's home from school and he's brought it home as well.
So, whether they're in kindergarten or they're sophomores in college, they're going to bring snot home and you're going to get sick. So, it is what it is.
>> All right. Today on Deep .NET we're talking about cancellation tokens.
Cancellation tokens.
>> Why not?
>> Well, you know, in one of our first episodes I think we talked about async and tasks and stuff, and there's a deep integration there with cancellation, and we didn't really touch on the subject. Cancellation has this really interesting overlap with performance, which is something that I care a lot about, because it's all about avoiding unnecessary work, and some of the best optimizations that are possible are the ones where you just avoid work that you don't have to do. And cancellation is effectively that: it means you have something that you no longer need and you can get rid of it.
You wouldn't need cancellation if you just let everything keep going. But then you're wasting a whole bunch of resources, and you're also running into issues with those things that keep going using other resources that you might want to be using. So, cancellation to get rid of it all.
>> But I think it's important to note that cancellation needs to be, for lack of a better word, nonviolent, right? You're in the middle of doing something and you... cooperative cancellation. It's not just smack it, kill.
>> It's interesting because the initial versions of .NET had sort of no built-in cooperative mechanism for cancellation. They did have, what did you say, the whacking mechanism for cancel; that did exist in the form of thread aborts, which are extremely violent, as you put it. You basically just walk up to something and kill it, and you could be in a really bad spot for that to happen. So that mechanism now in .NET Core effectively no longer exists. I'm waving my hands a little bit because it exists sort of in two pocket corner niche areas, but in the general programming model it's not there anymore, which is a good thing, and instead we have cancellation token. You look at the initial support for asynchrony in .NET and there was nothing about cancellation there at all. This is the APM pattern, the Asynchronous Programming Model pattern, or the Begin/End pattern. If I have a type like Stream, I can call Read on it, but I could also call BeginRead. And BeginRead just accepts an array, where in the array we want to do stuff, and then a callback that'll be invoked when the operation completes. There's nothing here about being able to cancel this. There was no argument about cancellation, and the IAsyncResult that I get back has nothing related to cancellation on its surface area. It just didn't exist at all in the programming model, which made it a very challenging thing for people to deal with, to be able to say, "Yeah, I scheduled this work. Now I don't want it anymore. How do I kill it?" And there wasn't really a good way. Even thread aborts, which are an extremely powerful and, as you put it, violent mechanism, don't really help here, because with thread aborts you're aborting a thread, and with these kinds of asynchronous operations there may not be a thread.
You might just have some little pending piece of work sitting in memory waiting for a message to come back over some socket. There's nothing to sort of, you know, shoot.
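For readers who have not seen the Begin/End pattern, here is a minimal sketch of the shape being described. `ApmDemo` and `ReadWithApm` are illustrative names that do not appear in the episode; the only point is that neither `BeginRead` nor the `IAsyncResult` it returns offers a way to request cancellation.

```csharp
using System;
using System.IO;

public static class ApmDemo
{
    // Illustrative helper (not from the episode): reads once using the
    // Begin/End pattern and returns the number of bytes read.
    public static int ReadWithApm(Stream stream, byte[] buffer)
    {
        // BeginRead takes the buffer, an offset, a count, an optional callback,
        // and a state object. There is no cancellation parameter.
        IAsyncResult pending = stream.BeginRead(buffer, 0, buffer.Length, callback: null, state: null);

        // IAsyncResult exposes completion state only; it has no Cancel member.
        // EndRead blocks until the read has finished.
        return stream.EndRead(pending);
    }
}
```

Once `BeginRead` has been called, the caller's only options are to wait for completion or to abandon the result.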
>> Now, this is called a cancellation token, and a token [clears throat] is a thing that you pass around, that you hand to people, that you use to get access to things, whether it be a token at the arcade... you know, it's just a little thing that you hold. Is this a convention? Is this just like, hey, we're just passing a boolean around and before you do anything, check for true?
>> Yeah, so that's a great question, and it's worth actually taking a look at the next asynchronous programming model that came about before we get to tasks and cancellation token, because it highlights something related to what you just said.
So after the Asynchronous Programming Model, there was the event-based programming model, or EAP (the Event-based Asynchronous Pattern), I guess we would call it. It was very short-lived. It was basically .NET Framework 2.0; there was a variety of new APIs that were released according to this pattern, and then very, very few after that point. Very briefly, if you look at a type like BackgroundWorker, for example, it basically followed this pattern where you would have some event, in this case DoWork, and you could then say RunWorkerAsync. So you're basically kicking this thing off. And this did have a cancellation mechanism. You could say CancelAsync. And then in the callback here, if I look at what's on here, there's a boolean flag like you were mentioning that basically gets that signal. And so you have a caller that's able to provide a signal and the recipient that's able to be notified if cancellation has been requested, and you can find out when the work completes if it completed because of success or failure or cancellation. But this also highlights one of the real challenges with this style, which is lack of composition. I can very easily send this single request, but now if this background work wanted to kick off another operation, how do I flow that cancellation request to that next thing? There's no token that I can then pass down to the next thing that will allow that request to propagate through my whole tree of work. And that's one of the main things that cancellation token addresses: providing this reusable type that everyone can agree on as the answer for how this signal flows, and you pass this to everyone and they can all observe cancellation, creating a very composable model.
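The event-based pattern described above can be sketched with `BackgroundWorker`. This is a hedged illustration rather than the code shown on screen: `EapDemo`, `RunAndCancel`, and the two wait handles are assumptions added so the example can run in a console app.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class EapDemo
{
    // Illustrative helper: starts a worker, cancels it, and reports whether
    // the completion event saw the work as cancelled.
    public static bool RunAndCancel()
    {
        using var worker = new BackgroundWorker { WorkerSupportsCancellation = true };
        using var started = new ManualResetEventSlim();
        using var finished = new ManualResetEventSlim();
        bool wasCancelled = false;

        worker.DoWork += (sender, e) =>
        {
            started.Set();
            // The recipient polls a boolean flag on the worker itself.
            while (!worker.CancellationPending)
            {
                Thread.Sleep(10);
            }
            e.Cancel = true; // report that we stopped because of cancellation
        };

        worker.RunWorkerCompleted += (sender, e) =>
        {
            wasCancelled = e.Cancelled; // success, failure, or cancellation
            finished.Set();
        };

        worker.RunWorkerAsync(); // kick the work off
        started.Wait();
        worker.CancelAsync();    // the caller provides the signal
        finished.Wait();
        return wasCancelled;
    }
}
```

Note that the flag lives on the `BackgroundWorker` instance. If `DoWork` started a second operation, there would be nothing to hand to it, which is the composition gap described above.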
So you look at what we have now with CancellationToken, which was introduced circa 2010 or so, or 2012, that general time frame, with tasks, .NET Framework 4.0, and it's just this little pipe, a little struct that you pass around to methods.
So if I have a method Foo, it would take a cancellation token, and then this method can observe the cancellation token. But if it calls another method, that one can also take a cancellation token, and it can pass the cancellation token down. So you can kind of thread it down through all of your calls, synchronous or asynchronous, and allow that signal to propagate through to everyone, and everyone along the chain can either be polling this or they can be registering for a callback to say, when cancellation occurs, I want to know what's going on. And then all of the asynchronous APIs in the framework, "all" in quotes, but 99.999% of them, accept a cancellation token. So again, if I have my stream and I call stream.ReadAsync, there's an overload that takes a cancellation token, such that if cancellation is requested of this token, this method will be registering with that token to get a signal to say, you know, I'd like to go away, I'd like to be notified if cancellation occurs.
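A minimal sketch of the threading-through described above. `FlowDemo`, `FooAsync`, and `BarAsync` are hypothetical names; `Stream.ReadAsync(Memory<byte>, CancellationToken)` is the real overload being relied on.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class FlowDemo
{
    // Intermediate method: accepts the token and hands it to whatever it calls.
    public static async Task<int> FooAsync(Stream stream, CancellationToken cancellationToken)
    {
        return await BarAsync(stream, cancellationToken);
    }

    // Another layer: again it only forwards. The leaf operation (ReadAsync)
    // is the one that actually observes the token.
    private static async Task<int> BarAsync(Stream stream, CancellationToken cancellationToken)
    {
        var buffer = new byte[1024];
        return await stream.ReadAsync(buffer.AsMemory(), cancellationToken);
    }
}
```

A caller that has nothing to cancel with can pass `CancellationToken.None`; a caller that does passes the `Token` of a `CancellationTokenSource` it owns.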
>> Okay. So this is then by definition conventional.
>> It is a convention, and it doesn't work if async methods don't accept a cancellation token.
>> The convention is accepting a cancellation token when you are cancelable, or if you might be cancelable in the future. There were actually a bunch of APIs that were introduced around Framework 4.0 where, for one reason or another, they accepted a cancellation token, but they may not have been super responsive to it. So maybe they would do an upfront check for cancellation, but then if cancellation occurred after that, they just wouldn't notice. But that's gotten a lot better since then. Most of those were places where we just didn't have the means to implement the cancellation, or it was some base type where derived types would override it and appropriately respond to cancellation requests, but the base type wouldn't have the capability to do anything. But yes, it is a convention, and there are then various analyzers that exist to say, hey, I see you're an asynchronous method but you're not accepting a cancellation token, you should look to add one. Or if I'm writing, say, a copy loop, so I'm copying from Stream source to Stream destination, and I've got some buffer.
Sure. Oh yeah. Okay. I'll just let Copilot do it.
If I accept a cancellation token here, now I can thread that through to all the places where I can accept a cancellation token. And there are analyzers where if I didn't have that, it would warn and say, actually, is this it? Yeah. Let's do it again so we can see it.
>> Forward. Yeah.
>> Like, you're not forwarding this cancellation token to this method. You should do that, so you appropriately pass cancellation signals everywhere.
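The copy loop being typed on screen is not captured in the transcript, so the following is a reconstruction under assumptions: the name `CopyAsync` matches what is said later in the episode, while the class name and the 81,920-byte buffer size are guesses. In the .NET code analyzers, the forwarding warning discussed here is rule CA2016.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class CopyDemo
{
    public static async Task CopyAsync(Stream source, Stream destination, CancellationToken cancellationToken)
    {
        var buffer = new byte[81920]; // assumed size, not confirmed by the transcript
        int bytesRead;

        // The token is forwarded to both leaf operations. Dropping it from
        // either call is what the forwarding analyzer would flag.
        while ((bytesRead = await source.ReadAsync(buffer.AsMemory(), cancellationToken)) > 0)
        {
            await destination.WriteAsync(buffer.AsMemory(0, bytesRead), cancellationToken);
        }
    }
}
```

`CopyAsync` never inspects the token itself; it only passes it along to the calls that can act on it.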
>> And again, my role on Deep .NET is to ask dumb questions, so bear with me here. Is it important? Is it a feature that it is explicit?
>> That's a great question.
>> It gets called out so many times, because at this point I'm looking at this thinking, can this be made more DRY?
>> Yeah. So actually the first two attempts we had at this did not pass the cancellation token around.
We had this concept of a cancellation scope. I think we called it cancellation scope, and this was first in maybe .NET Framework 3.5, and then again we had a design for this in .NET 4, .NET Framework 4, and we threw those away in favor of cancellation token. But the basic design was you would create a scope where you would say anything within this scope, not even just within this method but anything this method calls, synchronously or asynchronously, will be able to witness this ambient cancellation scope that's hanging out there. So you just create your scope, and anything within there would sort of implicitly pick up the token, or pick up the signal. And on the surface this seems lovely, right? You just declare, hey, anything in this region I want to be cancelable, and then no one has to do anything about it. But once you start getting into the weeds, it starts getting a lot more complicated and you actually start having a whole bunch of foot guns. There are a bunch of places where it would be dangerous to have cancellation occur, and because there's this sort of ambient notion, cancellation could effectively occur at any point. You're now kind of back out of that cooperative model into sort of a non-cooperative model, even if there are very fixed join points where that cancellation can be observed.
You also have situations where you don't actually want that scope to be... sorry, I'm getting a phone call. Just turn that off. You don't actually want the cancellation associated with that scope to be what's propagated down.
You want something else to be propagated down, either more or less. So there are lots of APIs that accept an external cancellation token today, but internally they have their own cancellation mechanism. Maybe they add their own timeout or something, and so it's a combination of the two. And if you were only respecting that ambient scope, now you're not respecting that additional token. Or there are places where you don't want to propagate the cancellation. For example, there are lots of cases where you might do something like acquire a resource and then kick off a task, and the body of that task will release the resource. If that task were to be implicitly cancelable based on that scope, you might never release the resource if the cancellation were to affect it. So then someone would have to explicitly do something to shield that code from that scope, and it becomes really easy to make those mistakes. Whereas with this sort of explicit model, it's in your face, and you can accidentally not forward the cancellation, but you can have analyzers that can statically detect it. Whereas in these other models, it becomes really hard to know, am I in a region that is implicitly cancelable or not, and what should I do about it?
>> And so for all those reasons and others, we basically moved away from that model, and every once in a while, every few years, someone re-proposes it, and then we end up back where we are.
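Two of the situations mentioned above are easy to express once tokens are explicit. The sketch below is illustrative only: `LinkedDemo`, `WithTimeoutAsync`, and `ReleaseWhenDoneAsync` are hypothetical helpers, built on the real `CancellationTokenSource.CreateLinkedTokenSource`, `CancelAfter`, and `CancellationToken.None` APIs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LinkedDemo
{
    // "More": combine the caller's token with an internal timeout, then pass
    // the combined token down instead of the original one.
    public static async Task<T> WithTimeoutAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        TimeSpan timeout,
        CancellationToken callerToken)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(callerToken);
        linked.CancelAfter(timeout);
        return await operation(linked.Token);
    }

    // "Less": deliberately do not propagate cancellation, so the resource
    // acquired by the caller is always released by the task body.
    public static Task ReleaseWhenDoneAsync(SemaphoreSlim gate, Func<Task> work)
    {
        return Task.Run(async () =>
        {
            try { await work(); }
            finally { gate.Release(); }
        }, CancellationToken.None);
    }
}
```

With an ambient scope, both decisions would need some extra opt-out mechanism; with explicit tokens they are visible at the call site as an ordinary argument.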
>> Okay.
>> Excuse me.
Um, >> you can if you if you want to give it a good like a good sneeze or maybe next.
No, you can just press the mute button on >> and just go for it.
>> Where's my carous? Just give me >> fantastic.
>> You couldn't hear it, but it was awesome.
>> I heard nothing. It was amazing.
[laughter] So, okay, that brings up two questions. One: in this little chunk of sample code here I see cancellation token being explicitly called out in the method, being passed into two other methods, but I don't see you, because of the definition of this code, checking it. There's no "if cancellation token" or so. Where does that happen?
>> The vast majority of use of cancellation token is just propagating it. Most code that uses a cancellation token is intermediate methods; they're not the ones actually performing the leaf operation. They're compositional over those leaf operations or over other intermediate operations. So the leaf operations here are the ReadAsync and the WriteAsync.
My CopyAsync is just kind of coordinating this stuff, but it's not actually needing to respond to cancellation itself.
>> So I'm imagining a long-running thing like copying a file, and we've all copied a file in Explorer and hit cancel. It doesn't happen instantly. It happens a couple of seconds later, presumably when it becomes more cancelable, when it's not currently on the disk or currently hitting the...
>> There are some operations where the underlying leaf operation, for whatever reason, just doesn't expose a cancellation capability, and so in those situations you end up effectively waiting for that sub-operation to finish before you get to something where either an explicit check can be done, or where you call some other operation that is implicitly cancelable. So I could have a cancellationToken.ThrowIfCancellationRequested call in the middle here. And now, even if these calls weren't implicitly cancelable, when I get through my read, the first thing I do is poll the cancellation token.
>> In general, we discourage folks from doing this. Instead, we just say that things that take a cancellation token should respect it.
>> But it's a workaround if you're dealing with things that aren't implicitly cancelable.
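As a sketch of that workaround, assume the leaf calls have no token parameter at all (here, the synchronous `Stream.Read` and `Stream.Write`). `ExplicitCheckDemo` and `Copy` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading;

public static class ExplicitCheckDemo
{
    public static void Copy(Stream source, Stream destination, CancellationToken cancellationToken)
    {
        var buffer = new byte[4096];
        int bytesRead;

        // Read and Write cannot be interrupted here, so cancellation is only
        // noticed between them: each chunk runs to completion first.
        while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Throws OperationCanceledException if cancellation was requested.
            cancellationToken.ThrowIfCancellationRequested();
            destination.Write(buffer, 0, bytesRead);
        }
    }
}
```

The responsiveness of this loop is bounded by how long a single `Read` or `Write` takes, which is the delay discussed in the file-copy example.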
>> This is a silly example, but in my brain, when you made the byte array there of, you know, 81,000 bytes, I'm like, okay, he's in a tight loop, one byte at a time.
He's going to check the cancellation token 81,000 times, or, you know, each chunk or whatever. How often is this thing looked at by leaf nodes?
>> So in general what happens is, if we were to look at the implementation of one of these methods, so I'm going to just make something up here, but if I had a method, async Task ReadAsync, that accepts a cancellation token, often as a leaf method this will do one of a few things. It might be sitting in some kind of loop where it itself is just polling that cancellation token. But generally we discourage that, because everything here is not actually cancelable. You're just sort of polling, and you end up with that delay that you were talking about, where there's this several seconds of pause before you get back around the loop.
>> Instead, typically what happens is at the beginning of this there will be a cancellationToken.Register call, and basically you provide this with a callback that can do something in response to cancellation being requested.
>> So it's the finally in the try/catch of the thing.
>> Yeah, kind of. It's like someone is sitting over on the side watching the cancellation token, and when it fires, then they come in and they say, "Oh, okay. Now I'm going to go..."
>> Right, it's lock the door on the way out, like whatever little tidy cleanup needs to happen.
>> Yeah.
>> It would not happen if it was not canceled. Correct?
>> Right. This will not be invoked if cancellation is never requested.
>> What's a thing I might want to do on a cancel?
>> So, typically what happens is you're actually wrapping some other OS-level API at this point. And so on Windows you're calling some I/O completion port-based, you know, overlapped I/O operation, or on Linux you're doing something with epoll or whatever of some kind, right?
>> You might need...
>> Yeah, basically this is going to end up signaling the thing. On Windows it'll call the CancelIoEx function.
And on Linux it'll basically update some data structures and then signal the file descriptor to say, hey, you're done, or it'll call the underlying function associated with whatever kicked off the operation. So it becomes this very prompt thing where you're basically just forwarding that cancellation request, and this allows it to be a much more prompt operation. This also returns something that's disposable.
So you typically will see it in this sort of using block, such that after the operation completes you remove the registration, because you no longer need it when the operation is done.
>> Yeah.
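The Register-in-a-using shape described above can be sketched as follows. `PendingOperation` is a hypothetical stand-in for an OS-level operation, with `RequestCancel` playing the role that something like CancelIoEx would play in a real implementation; it is not an actual .NET type.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a pending OS-level operation.
public sealed class PendingOperation
{
    private readonly TaskCompletionSource<int> _completion =
        new TaskCompletionSource<int>(TaskCreationOptions.RunContinuationsAsynchronously);

    public Task<int> Completion => _completion.Task;

    // Called when the underlying work finishes normally.
    public void Complete(int bytesRead) => _completion.TrySetResult(bytesRead);

    // Called to forward a cancellation request to the underlying work.
    public void RequestCancel() => _completion.TrySetCanceled();
}

public static class LeafDemo
{
    public static async Task<int> ReadAsync(PendingOperation operation, CancellationToken cancellationToken)
    {
        // The callback runs only if cancellation is requested. Disposing the
        // registration at the end of the using block removes it once the
        // operation is done.
        using (cancellationToken.Register(operation.RequestCancel))
        {
            return await operation.Completion.ConfigureAwait(false);
        }
    }
}
```

No polling loop is involved: the callback is invoked as part of the `Cancel` call on the source, which is what makes this approach prompt.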
>> Okay. And the second question, and we've hit this, but I want to make sure we hit it hard for people who may never have seen this before.
>> This is not implicit. It is not ambient.
It's not HttpContext. It's not AsyncLocal.
>> There's no AsyncLocal context. I mean, you could put one in an AsyncLocal if you want to.
>> But just to be clear, it is explicit for a reason, because cooperative cancellation only works if every layer opts in.
>> Exactly. And if a layer doesn't opt in, then you end up with these brief periods where things aren't cancelable. And over the years, you know, it's been a decade basically, or 15 years maybe, since cancellation token was introduced. And at this point, there are very few methods that accept a cancellation token that don't aggressively respect it.
But that wasn't always the case. And over the years we've fixed this additional case or that additional case, either threading it through to the underlying mechanisms that exist in the underlying OS, or coming up with clever mechanisms to sort of pretend and enable it to be more aggressive. So that pretty much now, when you pass these in, they're prompt.
>> Gotcha. Okay. So other dumb question, but let's say that we inserted between line 11 and 12 a call to another public async Task Foo, and the body for public async Task Foo was await Task.Delay(5000). It was a 5-second delay, but it doesn't take the token.
>> So if this was like this?
>> That's basically what I'm saying, right?
>> Yeah. So if someone cancels here, this operation is going to end up pretending like there was no cancellation request until it gets past that delay, because the cancellation token wasn't forwarded.
>> Bingo, there it is. Okay.
>> So, the method can't observe cancellation because it wasn't passed in. It could not take less than five.
>> Some people in certain domains would refer to this as a capability, right? You're passing around this capability, and unless you're passed it, you don't know that it exists.
You don't see it. So what happens with the delay here is it is basically creating a timer under the covers, and when the cancellation token is signaled, it both completes the task that was returned from here and shuts down the timer.
This is a case where we can be extremely prompt, because we can just complete the task immediately when cancellation occurs and delete all the background resources that were being used.
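The two variants of that hypothetical `Foo` can be put side by side. `DelayDemo` and `MeasureCanceledFooAsync` are illustrative additions; `Task.Delay(int)` and `Task.Delay(int, CancellationToken)` are the real overloads.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class DelayDemo
{
    // Never sees the token, so it always takes the full five seconds.
    public static async Task FooIgnoringTokenAsync() => await Task.Delay(5000);

    // Forwards the token, so the delay ends as soon as cancellation is requested.
    public static async Task FooAsync(CancellationToken cancellationToken) =>
        await Task.Delay(5000, cancellationToken);

    // Illustrative helper: requests cancellation after about 100 ms and
    // reports how long the cancelable version actually ran.
    public static async Task<TimeSpan> MeasureCanceledFooAsync()
    {
        using var cts = new CancellationTokenSource();
        cts.CancelAfter(TimeSpan.FromMilliseconds(100));

        var stopwatch = Stopwatch.StartNew();
        try
        {
            await FooAsync(cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Expected: the delay was cut short by the cancellation request.
        }
        return stopwatch.Elapsed;
    }
}
```

Swapping `FooAsync` for `FooIgnoringTokenAsync` in the helper would make it run for the full delay, because the cancellation request has no path into the method.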
>> Okay. Then what is a cancellation token source?
>> Yes. So cancellation token source is basically... actually, let's just implement it, and that'll be easier to kind of understand.
>> Let's just write it from scratch.
>> Exactly. So cancellation token is purely a made-up...
>> Convention.
>> ...convention that we use just to separate the ability to cancel from the ability to observe cancellation. Because for most of these things where you're passing around a cancellation token, it would be super surprising if one of them actually caused cancellation to occur.
The 99.999% case for methods is they observe cancellation, but some other thing over here is what's actually causing cancellation to be requested. It's, you know, a client shutting down its connection to a server, or it's a timer firing, or whatever it may be. If the methods that were receiving the token also had the ability to poke at it and cause everyone else to cancel, that would be super surprising. So the mechanism for canceling and the mechanism for observing that cancellation were just split into two: one of which is cancellation token, which is there purely to observe, and one of them is cancellation token source, which is there to produce. It's like producer/consumer: this guy can both produce the token and cancel it, and this guy can only observe that cancellation occurred. So as we saw in using this, there's really two or three members on cancellation token that everyone uses. There's a bool IsCancellationRequested. We'll get to the implementation in a moment.
There's a, I'm just going to do void for right now but we'll change that, there's some sort of method that takes a callback, and then there's a ThrowIfCancellationRequested, which is just: if IsCancellationRequested, throw an exception. That's just a little helper. The guts actually live on cancellation token source. So my cancellation token really just contains a cancellation token source. Let me give this a constructor here. So that's good.
And then pretty much everything on this type is just delegating to corresponding things on cancellation token source. So let me just copy this.
And then this guy is just going to do cts.IsCancellationRequested.
I'm going to make this shorter.
>> Well, okay. So cancellation token source is active. It does the creation, the canceling. It can be linked to different tokens. Cancellation token source is passive. It's just a data structure.
>> Exactly.
>> Sorry, a cancellation token is the data structure. It's immutable.
>> So yeah, you end up with something like this where the token is just a thin wrapper around a cancellation token source instance. It has effectively no logic. It's just creating a little shim, a little veneer over the cancellation token source to protect the ability to cancel.
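Using the real types, the producer/consumer split looks roughly like this. `ProducerConsumerDemo` and its two methods are hypothetical names; the polling loop is only there to keep the sketch short.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ProducerConsumerDemo
{
    // Consumer: it is handed only the token, so it can observe cancellation
    // but has no member with which to cancel anyone else.
    public static int CountUntilCanceled(CancellationToken token)
    {
        int iterations = 0;
        while (!token.IsCancellationRequested)
        {
            iterations++;
            Thread.Sleep(1);
        }
        return iterations;
    }

    // Producer: it owns the source, so it alone decides when to cancel.
    public static int Run()
    {
        using var cts = new CancellationTokenSource();
        CancellationToken token = cts.Token;

        Task<int> consumer = Task.Run(() => CountUntilCanceled(token));
        Thread.Sleep(50);
        cts.Cancel();
        return consumer.GetAwaiter().GetResult();
    }
}
```

`CancellationToken` exposes `IsCancellationRequested`, `Register`, and `ThrowIfCancellationRequested`, but no `Cancel`; that method exists only on `CancellationTokenSource`.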
>> Oh, hang on a second.
Sorry. Give me one second. I'm so sorry.
>> This is fine.
>> We have an edit point at 27 minutes.
All right, cool. Sorry, my 80-year-old dad is getting a haircut, and the haircut place has two locations, and I needed to check Find My Friends to see if he was going to the wrong location.
Otherwise, it would be drama.
So, I apologize.
>> All right, so we had an edit point at 27 minutes. Hopefully, we'll see that.
Otherwise, this will just be on YouTube.
Okay. [laughter]
>> Cool.
>> All right. Anyhow, so this MyCancellationToken is just a shim, just a little veneer over MyCancellationTokenSource, with everything delegating through. So the real meat is on MyCancellationTokenSource. There's this IsCancellationRequested. There's also a Cancel method, and from a public surface area perspective that's basically it. There's other stuff, but that's the core of it.
>> And why is this important? Is this important so cancellation token source can link tokens? Couldn't you just have a cancellation token that was just too smart and had all these methods?
>> You could. You could have Cancel up here, but then anyone who was passed the cancellation token could cancel it and could cause everyone else who had received the cancellation token to receive that. And that would be a perfectly valid design choice. It is not the design choice we made. We chose to say, in the 99.999% case, if you're getting a token, you shouldn't be able to cancel it for everyone else. And so we separated that functionality out into this separate instance. And so you have this instantiable thing and then this little struct wrapper that just provides that little shim around it. And if all you cared about was polling, the implementation of MyCancellationTokenSource is trivial. You have a Boolean, IsCancellationRequested, and then IsCancellationRequested is just returning that, and Cancel is just setting it, and with that, if all you are using is IsCancellationRequested and ThrowIfCancellationRequested, this is a perfectly valid system. You can come up here and you can construct your MyCancellationTokenSource. And I could have a method that took a MyCancellationToken, and this could sit in a loop doing...
>> Whatever, and that would work just fine. Right.
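The on-screen code is not part of the transcript, so what follows is a reconstruction of the polling-only design from the description, not the code written in the episode. The type names follow the spoken "my cancellation token" and "my cancellation token source"; marking the field `volatile` is an assumption tied to the volatile discussion in this episode.

```csharp
using System;

public sealed class MyCancellationTokenSource
{
    // volatile so that a polling loop on another thread observes the write
    // made by Cancel (an assumption; the description only says "a Boolean").
    private volatile bool _isCancellationRequested;

    public bool IsCancellationRequested => _isCancellationRequested;

    public void Cancel() => _isCancellationRequested = true;

    public MyCancellationToken Token => new MyCancellationToken(this);
}

// A thin, immutable shim: it can observe the source but exposes no Cancel.
public readonly struct MyCancellationToken
{
    private readonly MyCancellationTokenSource _source;

    internal MyCancellationToken(MyCancellationTokenSource source) => _source = source;

    // A default(MyCancellationToken) has no source and is never canceled.
    public bool IsCancellationRequested => _source != null && _source.IsCancellationRequested;

    public void ThrowIfCancellationRequested()
    {
        if (IsCancellationRequested)
        {
            throw new OperationCanceledException();
        }
    }
}
```

This supports polling only. The real types also support callback registration, which this sketch deliberately leaves out.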
>> Um, >> but we really don't want people just doing that kind of polling. And the vast majority of implementations that consume
cancellation token for real um either pass it down to something else or use register or one of the other similar methods or overloads. Um,
>> it might it might be worth just a very brief uh sidecar conversation about why volatile matters in this context and why Sure.
why that visibility and ordering in multi-threaded code is important.
>> Yeah. So just to make it a little bit more obvious, let's say my loop was... let's just say I had that.
>> Okay.
>> So really, if I simplify this down, because IsCancellationRequested is just calling cts.IsCancellationRequested and that's just a Boolean field, really this is just: there was some bool, requested, and this is just reading that bool. Okay. Now the compiler, or one of the compilers, in this case typically the just-in-time compiler, the JIT, can look at this and say: you're reading a field, and because of the semantics that are afforded to threads and single-threaded code, I don't have to care about what some other thread might be doing with this field. I can just ignore that. And so from my perspective, since no one in this loop is changing the value of IsCancellationRequested, I can lift that read out of the loop.
>> Right? And there's arguably then no line 7.5. There's no space where someone can jump in and mess with it on the way.
>> Well, that's true. But it also means that once you're in this loop, there's no line 9.5. This value will never change, which means that the compiler could legitimately do that. Or the compiler could legitimately do that, depending on the value of the Boolean.
>> Typically, it's more like this. So because of the semantics that are afforded to these kinds of fields, this is a legitimate transformation that the compiler does do, and so we need to tell it: don't do that. Now, what it's basically doing when it lifts the read out of the loop like that is this: every time through the loop there's logically a read on this field, and the question is whether that read is observable or not. With normal fields it's perfectly valid for the compiler to coalesce reads, to basically say, well, I already read the value and I don't see anything else that could have changed it, therefore I don't need to read it again. If I have
>> So the read is happening after any memory operations before it, so there's a barrier.
>> If I have this
>> Yeah, the compiler is not going to, like... look, it collapsed that, but yeah.
>> It's perfectly valid for it to do that. This is the same thing, it's just an unbounded number of those. And so there's three reads here, and the compiler is saying, "Yeah, but because I can't observe the difference, it's okay for me to coalesce some of those reads." What volatile does in this situation is say: uh-uh, you can't do that. You cannot eliminate reads
>> because there's stuff you can't see.
>> Compiler. So with this being volatile, it would be illegal for the JIT to do that. It has to.
>> So volatile is literally saying this thing is volatile. It could change. It could blow up at any second and you have no way of knowing. So let's just treat it with respect. It's volatile.
>> Effectively, yeah. And the same thing happens here: because there's a read on this field every time through the loop, the compiler, with this being volatile, is unable to lift it out of the loop, because that would be eliding reads. That would be an illegal transformation for it to make.
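To make the hoisting discussion concrete, here is a small sketch under stated assumptions: `StopFlag` and `VolatileDemo` are hypothetical names, and the commented-out shape only illustrates what the JIT is permitted to do with a non-volatile field, not what any particular JIT version actually emits.

```csharp
using System;
using System.Threading;

public sealed class StopFlag
{
    // With a plain (non-volatile) bool, the JIT may read the field once and
    // effectively rewrite the loop below as:
    //     if (!_requested) { while (true) { spins++; } }
    // Marking the field volatile makes that transformation illegal.
    private volatile bool _requested;

    public void Request() => _requested = true;

    // Spins until another thread calls Request(); returns how many times it looped.
    public long SpinUntilRequested()
    {
        long spins = 0;
        while (!_requested) // volatile read, performed on every iteration
        {
            spins++;
        }
        return spins;
    }
}

public static class VolatileDemo
{
    // Returns true if the spinning thread observed the write and exited.
    public static bool Run()
    {
        var flag = new StopFlag();
        var worker = new Thread(() => flag.SpinUntilRequested());
        worker.IsBackground = true; // do not keep the process alive if it never exits
        worker.Start();

        Thread.Sleep(50);  // give the worker time to enter its loop
        flag.Request();    // write from a different thread

        return worker.Join(TimeSpan.FromSeconds(5));
    }
}
```

As the discussion that follows stresses, this only addresses visibility of the flag; it does not make compound operations atomic and is not a substitute for a lock.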
>> Okay.
>> But volatile is not there to prevent race conditions. It's not trying to replace locking. It's not trying to synchronize operations. It's a visibility keyword.
>> It is used if you are doing lock-free programming, which very few people should actually be doing. It's really only relevant in super-high-performance, low-level
>> framework-level code. Because it's really easy to get wrong.
>> So I shouldn't be doing that as a person that does text.
>> I do it, but I shouldn't. Or, you know, any such code needs to be super heavily scrutinized.
>> Gotcha.
>> If you were doing lock-free programming, volatile shows up all over the place though,
>> right?
>> Because locks basically provide the kinds of barriers that prevent the compiler from doing these same kinds of transformations and movement. If you elide the locks, now you need to step in yourself and provide those same instructions to the compiler, to say: there be dragons. You need to not do the transformations that you would otherwise do. You need to not move stuff around, because there is stuff happening in a multi-threaded fashion. Your assumptions about this being single-threaded only don't apply.
>> Gotcha. Okay. I just wanted to call it out because it's a keyword that people don't usually see.
>> And that's a good thing. If you see it too much, it means something's wrong.
>> Well, I joke about text boxes over data, but there are no volatile keywords in my blog,
>> right? Yeah.
>> So, if you catch yourself doing business applications and you're throwing volatile around,
>> I don't... I mean, I have priorities,
>> you know, a 200-something-page blog post on performance improvements in .NET 10, and
>> I don't know if volatile shows up there at all either. If it does, it's like once or twice. It's a small number. It's a very rarely used thing.
>> Right. Right. Right.
>> So, you know, there's really not that much here. There's not much to CancellationToken. We do need to implement this Register, but even this is pretty simple. Now, it's interesting: there are a variety of ways that this could be implemented. We're going to do something really basic. I'm just going to have, say, a list of actions here, and then Register effectively is just going to add into this list. Now we do need some synchronization. We need to coordinate some things. We need to make sure that if someone is calling Register and someone is calling Cancel, the right things happen. So I'm going to make a few changes here. We'll take a lock around this Add. And down here as well, we're also going to start locking, because Cancel also needs to look at the callbacks to possibly invoke them. So we'll do something like this. We'll say: if cancellation has not been requested, then we're going to add the callback and return. If cancellation has already been requested, we'll just invoke the callback right here. And we're doing it outside of the lock to avoid the possible problems that result from invoking callbacks while the lock is held. And then in Cancel, we'll do something like: if cancellation was already requested,
>> then there's nothing to do. So we can just return. And then effectively do something like that. And I'm playing a little trick here. I'm basically acknowledging the fact that by the point I get to here, no other thread can be looking at the callbacks, because the only other thing that looks at the contents of this list is this guy up here, and if cancellation was already requested then it's not going to look at the list. So down here I can do this outside of the lock again and not be invoking these callbacks while the lock is held. That's basically it. There's a little bit more to it. I mentioned earlier that Register returns a disposable thing. So if we were doing this for real, this method would return an IDisposable. In the real .NET, it's a CancellationTokenRegistration, which is just an IDisposable thing. And this returns an instance that basically wraps the callback and the CancellationTokenSource so that it can remove it from the list. But it's one of those things where it's ubiquitous now across all of .NET. You see CancellationToken everywhere. But the core implementation is basically this. Interestingly, it was a lot more complicated on .NET Framework. Hold on, I just got to close my door because it's noisy out there.
>> No worries.
>> Got the new dog.
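The Register and Cancel walkthrough above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the demo or the real CancellationTokenSource: the class name `MyCallbackTokenSource`, the nested `Registration` type, and the `Unregister` helper are hypothetical, and unlike the real type, disposing a registration here does not wait for a callback that is already running.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical teaching type: a lock, a flag, and a list of callbacks.
public sealed class MyCallbackTokenSource
{
    private readonly object _lock = new object();
    private readonly List<Action> _callbacks = new List<Action>();
    private volatile bool _cancellationRequested;

    public bool IsCancellationRequested => _cancellationRequested;

    public IDisposable Register(Action callback)
    {
        lock (_lock)
        {
            if (!_cancellationRequested)
            {
                _callbacks.Add(callback);
                return new Registration(this, callback);
            }
        }

        // Cancellation already happened: invoke now, outside the lock.
        callback();
        return new Registration(null, callback);
    }

    public void Cancel()
    {
        lock (_lock)
        {
            if (_cancellationRequested) return; // nothing to do
            _cancellationRequested = true;
        }

        // The "trick": once the flag has been set under the lock, Register and
        // Unregister no longer touch the list, so it can be walked lock-free here.
        foreach (Action callback in _callbacks) callback();
    }

    private void Unregister(Action callback)
    {
        lock (_lock)
        {
            // After cancellation the list belongs to Cancel; leave it alone.
            if (!_cancellationRequested) _callbacks.Remove(callback);
        }
    }

    // Wraps the source and the callback so Dispose can remove the callback again.
    private sealed class Registration : IDisposable
    {
        private readonly MyCallbackTokenSource _source; // null if the callback already ran
        private readonly Action _callback;

        public Registration(MyCallbackTokenSource source, Action callback)
        {
            _source = source;
            _callback = callback;
        }

        public void Dispose() => _source?.Unregister(_callback);
    }
}
```

Keeping callback invocation outside the lock means a callback that itself calls Register or Cancel on the same source cannot deadlock on `_lock`.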
This is really interesting, because I was speaking at a conference a couple of weeks ago and someone asked: are there any optimizations that were made in the past that were a mistake, that we shouldn't have made? I was on stage at the time and I said, there are, but I can't think of any right now. And interestingly, this is one of them. So the CancellationToken implementation in .NET Framework was actually highly optimized, just not for the same things that we care about today. So the optimizations actually made it slower for the things that we care about and faster for the things that we don't.
>> And what are the things they cared about, and what changed? Was it the rise of server-side, or containers, or cross-platform, or
>> So let me just show an example of this, and then we can talk about what changed.
So we're just going to do, you know, the kind of benchmarking you're not supposed to do. You're supposed to use BenchmarkDotNet.
>> Basic thing.
>> Well, it's okay as long as you're using dark mode. That's all that matters.
>> Yeah. Everyone who watched Deep .NET in the past commented on YouTube, how can Stephen use light mode? And so for all of you, I'm trying dark mode. It's really hard for me.
>> [laughter] You could try blue. You could have a compromise. You could do hot dog.
>> Yeah, I could, or go through some of the other themes that are available, but I'm trying.
>> We appreciate your sacrifice.
>> Thank you. [laughter] So normally I would reach for GetAllocatedBytesForCurrentThread, but I want to show this on .NET Framework and I don't believe that's available on... oh, it is, great. It's on 4.8. We'll use that. I think maybe it was added in 4.8 and it wasn't in 4.7.2. All right. So we're going to do GetAllocatedBytesForCurrentThread minus mem, and then we'll do allocated blah in blah. All right, let me also have a constant, iters. We'll do some number of iterations and we'll divide, and then here we'll say i equals zero, i less than iters, and we'll do our operation here. So the operation we're going to do: let's say we have a CancellationTokenSource and I want to register and unregister from this CancellationTokenSource in parallel. So I'm going to do, let's say, Parallel.For from zero to iters, and we'll do cts.Token.Register. Doesn't matter what we're registering. And so we're registering, unregistering, registering, unregistering. And we're doing that on, I think I'm on a 16-core machine, so we're doing it 16 at a time, for a thousand of them. And if I run this on .NET Framework 4.8... started, that succeeded. Where's my window? Oh, that was a little too fast. Let's do some more iterations.
>> It's pretty fast.
>> That's still too fast.
>> It's still like three orders of magnitude too fast.
>> Yeah.
>> Darn these fast computers. I've got a loop, and repeatedly I'm just getting the amount of memory, starting a timer, and in parallel I'm registering, unregistering, registering, unregistering, and so on. Then I'm stopping and I'm printing out... actually I don't even need the memory, but right now we're just going to focus on time... printing out the number of nanoseconds that this took. So on .NET Framework 4.8 this is taking about 25 to 30 milliseconds. Okay.
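A rough reconstruction of the timing loop being described, as a sketch only: the `RegisterBenchmark` name and the shape of `Measure` are assumptions, the iteration count used in the demo is not stated precisely, and, as noted in the talk, BenchmarkDotNet is the right tool for real measurements.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class RegisterBenchmark
{
    // Registers and immediately unregisters a no-op callback 'iters' times,
    // spread across cores by Parallel.For, and returns the elapsed wall-clock time.
    public static TimeSpan Measure(int iters)
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            Stopwatch sw = Stopwatch.StartNew();

            Parallel.For(0, iters, i =>
            {
                // Register returns a CancellationTokenRegistration;
                // disposing it unregisters the callback again.
                token.Register(() => { }).Dispose();
            });

            sw.Stop();
            return sw.Elapsed;
        }
    }
}
```

A caller could print `RegisterBenchmark.Measure(1_000_000).TotalMilliseconds` in a loop on each runtime to compare; the absolute numbers will depend on the machine and core count.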
On .NET 10 this is taking 50 or 60 nanoseconds. So it's arguably about double
>> Slower.
>> Yeah.
>> Sorry, milliseconds. Yeah.
>> It's about twice as slow as it was on .NET Framework. That's not normally the direction that I show these improvements. Normally I say, "Oh, look, .NET Framework was slow and .NET 10 is fast."
>> But this is a case where .NET Framework was optimized for something that was very different. So to your question,
>> you asked what was it optimized for? So we've got to rewind time back 15 years. We're thinking about many cores. We're thinking about the end of Moore's law. We're thinking about how do I take a single problem and spread it over the 4, 8, 16, 32, 1,024 cores that are in my machine.
And in such a situation, you're partitioning a particular problem up into lots of little pieces to all run on different cores, so that they can come back together to form the answer to the single problem. And in that world, you're then passing a cancellation token to all of these individual little pieces. So imagine you were doing some sort of recursive divide and conquer, like quicksort. You were parallelizing quicksort. The quicksort algorithm is: you partition your data into values that are less than the pivot and greater than the pivot, and then you recursively quicksort each part. And then for each of those you partition into less than the pivot and greater than the pivot, and you recursively quicksort those until you get to some base case. In that world, if I was passing a cancellation token into a task that represented that entry point, and that is kicking off two subtasks, I'm going to be passing the cancellation token to each of those subtasks, and each of those is going to have two subtasks and pass the cancellation token to those. So I'm going to end up in a world where I've got lots of cores, each processing these small pieces of work, all registering and unregistering with this token in a massively parallel way.
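The fan-out pattern described above might look like the following sketch. It is an illustration of one token flowing into every recursive piece, not code from the talk: `ParallelQuickSort`, the `SequentialThreshold` cutoff, and the middle-element pivot are all assumptions, and each piece here simply polls the token rather than registering callbacks.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelQuickSort
{
    // Arbitrary cutoff for this sketch: small ranges are sorted sequentially.
    private const int SequentialThreshold = 2048;

    public static void Sort(int[] data, CancellationToken token) =>
        Sort(data, 0, data.Length - 1, token);

    private static void Sort(int[] data, int lo, int hi, CancellationToken token)
    {
        if (lo >= hi) return;
        token.ThrowIfCancellationRequested();

        if (hi - lo < SequentialThreshold)
        {
            Array.Sort(data, lo, hi - lo + 1); // base case
            return;
        }

        int p = Partition(data, lo, hi);

        // The same token is handed to both halves, and from there to their halves.
        var options = new ParallelOptions { CancellationToken = token };
        Parallel.Invoke(
            options,
            () => Sort(data, lo, p - 1, token),
            () => Sort(data, p + 1, hi, token));
    }

    // Lomuto partition using the middle element as the pivot.
    private static int Partition(int[] data, int lo, int hi)
    {
        int mid = lo + (hi - lo) / 2;
        (data[mid], data[hi]) = (data[hi], data[mid]); // move pivot to the end
        int pivot = data[hi];
        int store = lo;
        for (int i = lo; i < hi; i++)
        {
            if (data[i] < pivot)
            {
                (data[i], data[store]) = (data[store], data[i]);
                store++;
            }
        }
        (data[store], data[hi]) = (data[hi], data[store]); // pivot into final position
        return store;
    }
}
```

If the token is canceled while the sort is running, the call ends with an exception (an OperationCanceledException, possibly wrapped in an AggregateException) and the array is left partially sorted.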
>> They're paying a tax, and the tax is huge, often in parallel.
>> Right. And so we were in a world where we thought this was the future, and we optimized CancellationToken for that, which means that CancellationToken had a really nice and sophisticated lock-free algorithm for doing these registrations and unregistrations in a partitioned, or distributed, way. There were data structures that held per-core collections of these callbacks, and that made it so that two different cores that were registering and unregistering didn't have to synchronize with each other. So you end up with much better throughput as a result. Now, if I were to show memory here,
well, let me change the demo slightly. If I just put this in the middle,
>> It's going to get GC'd at the end.
>> Yeah. Let me just use this.
>> Yep.
>> So we can see basically how much this instance costs.
>> Okay.
>> Let me just register with that.
>> Actually, I don't even need to. If I just register with it and unregister, that'll be sufficient.
>> So you just register it doing nothing and then dispose it immediately.
>> So if I were to print out memory instead. So let me print out mem, and for i equals iters, and then we'll divide by... which version am I on? Oh, let me lower this a little bit. If I'm on .NET 10 and I run this, we're getting about 192 bytes per iteration. If I'm on Framework and I run this,
>> we can see the cost associated with these extra data structures. It's about twice as big. And if I were to do other stuff, it would be even bigger, in order to accommodate this partitioned data structure. The thing is, this scenario
>> almost never happens these days. That's not how CancellationToken is typically used.
>> Well, it seems like it would be too granular a use of a token as well. Like, you're
>> What usually
>> let the thing happen, let it finish a little bit more of a chunk of work before you decide to cancel.
>> Yeah. What usually happens is a cancellation token is passed into some sort of more serial operation. If there's concurrency, it's limited to a specific operation. You know, maybe this is a cancellation token associated with an ASP.NET web request, right? And you're going to do a whole bunch of reads and writes as part of that operation, but they're serialized. They're one after the other. So you're registering and unregistering with the token, but you're doing so one after the other, not concurrently. And if there is concurrency, maybe you kicked off three or four operations and they may happen to have conflicts with the token at the same time, but it's rare. And so it was just optimized for something that is no longer, or maybe never was, the dominant case. And so in Core, we basically undid that optimization and just made it much simpler, much closer to what we wrote by hand, where there's effectively just a lock, and we don't care about the overhead if there are lots of threads banging on it, because it's just not a scenario that's interesting. What's neat about that: we talk about optimizations adding more code, right? In the general case, optimization is adding additional code in order to make something faster, and adding that code makes it less maintainable. It makes it harder to do other things. Removing that optimization made it a lot simpler, and then it allowed us to do other things that would have been really hard in the initial implementation, like pool the underlying objects that get allocated when you register and unregister. So if I move this back out here, now I'm only measuring the cost of registering and unregistering. If I run this on .NET Framework, we can see we're spending 56 bytes per register and unregister. On .NET 10, it's zero, because it's able to just reuse the same node. It created a node for the registration, and then when that was disposed, it's able to reuse that node under the covers in CancellationToken, the thing that was being put into the list. So it affords us these additional optimization capabilities because we actually simplified what was there on .NET Framework.
>> I think it's important to note that .NET changes over time, that what we were writing 20 years ago is different from what we're writing now. You're acknowledging that there is a fundamental difference and it's okay to change with that, and it's about tradeoffs. It's not about moving forward or moving backwards. In this case, it's a lateral move.
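The per-registration allocation measurement described a little earlier can be sketched like this. The `RegisterAllocations` name is hypothetical and the warm-up call is an addition that was not mentioned in the talk; `GC.GetAllocatedBytesForCurrentThread` is the API named in the demo, which the talk says is available on .NET Framework 4.8.

```csharp
using System;
using System.Threading;

public static class RegisterAllocations
{
    // Average number of bytes allocated on the current thread
    // per Register + Dispose pair against a single, long-lived source.
    public static double BytesPerRegistration(int iters)
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            Action callback = () => { };

            // Warm-up (an assumption of this sketch): excludes one-time setup
            // allocations from the measured loop.
            token.Register(callback).Dispose();

            long before = GC.GetAllocatedBytesForCurrentThread();
            for (int i = 0; i < iters; i++)
            {
                token.Register(callback).Dispose();
            }
            long after = GC.GetAllocatedBytesForCurrentThread();

            return (after - before) / (double)iters;
        }
    }
}
```

Running the same method on both runtimes is what lets the two figures quoted in the talk be compared; the exact values can vary by runtime version.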
>> Well, they often are, almost entirely. The best optimization is one where you can just eliminate unnecessary work.
>> But those are unicorns. Like, those
>> those very rarely happen. When they do, it's awesome.
>> But the vast majority of optimizations are trade-offs. They're penalizing something that you expect to be relatively rare in exchange for making something that you expect to be more common faster. And so you can see those trade-offs very clearly here. There are these different use cases and we're having to choose which ones we optimize for. Now, in certain situations we choose to kind of have our cake and eat it too, like with dictionary: there is an IDictionary interface and then there are many different implementations of that interface. If you just want the standard workhorse, you can use Dictionary. If you want one that's read-only, you can use ReadOnlyDictionary. If you want one that's optimized for being long-lived and read-only, you can use FrozenDictionary. If you want one that's optimized for being able to tear off immutable copies, you have ImmutableDictionary. But you can use IDictionary to back them all.
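As a small illustration of the "one interface, several implementations" point, here is a sketch assuming .NET 8 or later (FrozenDictionary is not available before that). The `DictionaryChoices` class and its members are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Collections.ObjectModel;

public static class DictionaryChoices
{
    // Written once against the shared interface; works with any implementation.
    public static int Lookup(IDictionary<string, int> map, string key) =>
        map.TryGetValue(key, out int value) ? value : -1;

    // Four implementations with different trade-offs, all usable as IDictionary.
    public static IDictionary<string, int>[] Build()
    {
        var standard = new Dictionary<string, int> { ["a"] = 1, ["b"] = 2 };

        return new IDictionary<string, int>[]
        {
            standard,                                       // general-purpose workhorse
            new ReadOnlyDictionary<string, int>(standard),  // read-only wrapper
            standard.ToFrozenDictionary(),                  // built once, optimized for reads
            standard.ToImmutableDictionary(),               // cheap immutable "copies"
        };
    }
}
```

The consumer chooses the implementation whose trade-offs fit; with CancellationToken, as the next passage explains, there is only one implementation, so that choice has to be made inside the library.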
With CancellationToken, we made the choice to just have a single thing, to keep it simple from a consumption perspective. There's just one use case: you get a token and you either pass it along or you register with it. And so we basically have to decide what case we're optimizing the implementation for, and we have chosen the one that is pervasive now across the workloads that we can see with 20/20 hindsight. When it was initially being developed, we were speculating, and we were speculating at a very different time than we're in now. At that time, Task wasn't even really focused on asynchrony; it was focused on parallelism.
>> And that has obviously changed.
>> So, CancellationToken: less of a mystery than we thought. It turns out that this was a pretty solid one-hour Deep .NET, performed entirely on caffeine and NyQuil. So we appreciate, again, your sacrifice
>> and your time. Thanks so much. And again, let us know in the comments if you enjoyed the show.
Oh.