Context

Introduction

In this post, I've tried to summarize some notes I've taken while reading through the code of the context package. I

will share my mental model that has been effective for me so far, as well as some code I wrote to validate my

understanding.

Mental model

I think of the context package as an abstraction for working with cancellation trees. We can pass nodes from these

trees to functions in order to give them the ability to detect cancellation signals as they ripple through the branches.

The package exports a function called Background that can be used to create new root nodes, and a bunch of other
functions, like WithCancel, WithTimeout, and WithDeadline, to derive new branches.

There is no magic involved though. The functions you pass your context to won't get stopped automatically by the

runtime. Instead, it works by having developers follow a convention: any function that receives a context can use its

Done method to access a read-only channel. The expectation is that upon receiving a signal on this channel, the

function should stop.
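To make the convention concrete, here is a minimal sketch of a function that cooperates with its context. The names countUntilCanceled, runCanceled, and runUncanceled are hypothetical helpers I made up for illustration; only the context calls come from the standard library.

```go
package main

import (
	"context"
	"fmt"
)

// countUntilCanceled is a hypothetical worker. It processes items one at a
// time and checks ctx.Done() between items, so it can stop early once the
// context has been canceled.
func countUntilCanceled(ctx context.Context, items int) (int, error) {
	processed := 0
	for i := 0; i < items; i++ {
		select {
		case <-ctx.Done():
			// The channel is closed: stop and report why.
			return processed, ctx.Err()
		default:
			// No cancellation yet, keep working.
		}
		processed++
	}
	return processed, nil
}

// runCanceled cancels the context before the worker starts.
func runCanceled() (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return countUntilCanceled(ctx, 10)
}

// runUncanceled uses a root node, which is never canceled.
func runUncanceled() (int, error) {
	return countUntilCanceled(context.Background(), 10)
}

func main() {
	fmt.Println(runCanceled())   // 0 context canceled
	fmt.Println(runUncanceled()) // 10 <nil>
}
```

Nothing forces the worker to look at the channel. If we removed the select statement, it would happily process all ten items with a cancelled context.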

There is one caveat here. The root node that we retrieve using context.Background is always going to return a

nil channel:

package context

type backgroundCtx struct{ emptyCtx }

func (emptyCtx) Done() <-chan struct{} {
	return nil
}

Hence, reading from a root node is going to block forever. To be honest, I don't find the name Background very
intuitive. Whenever I review a pull request, I try to substitute it in my head with UncancellableRootNode. By doing
so, I find it easier to ask myself if I think this function should be allowed to potentially run forever, and if it
justifies the creation of a new cancellation tree, or if it's merely an extension of some larger operation.
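The nil channel is easy to verify from the outside. The following is a small sketch; blocksFor, rootIsNil, and rootBlocks are hypothetical helper names, and the 50 millisecond wait is an arbitrary value I picked for the demonstration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// blocksFor reports whether a receive from ch is still blocked after d.
func blocksFor(ch <-chan struct{}, d time.Duration) bool {
	select {
	case <-ch:
		return false
	case <-time.After(d):
		return true
	}
}

// rootIsNil checks that the root node hands out a nil channel.
func rootIsNil() bool {
	return context.Background().Done() == nil
}

// rootBlocks checks that receiving from the root node's channel blocks.
func rootBlocks() bool {
	return blocksFor(context.Background().Done(), 50*time.Millisecond)
}

func main() {
	fmt.Println(rootIsNil())  // true
	fmt.Println(rootBlocks()) // true
}
```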

Propagation

Let's proceed by constructing a cancellation tree in its simplest form: a single straight line:

            ┌────────────┐
            │    root    │
            └────────────┘
                  │
                  ▼
            ┌────────────┐
            │  nodeOne   │
            └────────────┘
                  │
                  ▼
            ┌────────────┐
            │  nodeTwo   │
            └────────────┘
                  │
                  ▼
            ┌────────────┐
            │ nodeThree  │
            └────────────┘

To create the tree as illustrated above, we're going to use Background for the root node, and WithCancel for the

branches:

package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// print waits until its context is canceled or 2 seconds have passed,
// whichever happens first, and reports which one it was.
func print(ctx context.Context, wg *sync.WaitGroup, name string) {
	select {
	case <-time.After(2 * time.Second):
		fmt.Println(name, "timed out")
	case <-ctx.Done():
		fmt.Println(name, "canceled")
	}
	wg.Done()
}

func main() {
	root := context.Background()

	wg := sync.WaitGroup{}
	wg.Add(3)

	nodeOne, cancelOne := context.WithCancel(root)
	defer cancelOne()
	go print(nodeOne, &wg, "nodeOne")

	nodeTwo, cancelTwo := context.WithCancel(nodeOne)
	defer cancelTwo()
	go print(nodeTwo, &wg, "nodeTwo")

	nodeThree, cancelThree := context.WithCancel(nodeTwo)
	defer cancelThree()
	go print(nodeThree, &wg, "nodeThree")

	wg.Wait()
}

To create a new branch using WithCancel, we'll have to specify an existing node that we'd like to branch from. In

return, we'll get the newly created node and a function for cancelling it.

The code above also highlights the importance of deferring a call to cancel each node. Nothing cancels the contexts in
this version, so every print call takes the time.After case of its select statement rather than the ctx.Done case. Once
all three calls have returned, wg.Wait unblocks and the main function returns. The deferred calls then cancel the nodes
and release the resources associated with them, even though the operations completed without ever being cancelled.
Therefore, it's important to make sure that we always call cancel. Otherwise, a node and its children can leak until
its parent is cancelled.
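One detail that makes the defer pattern safe is that a cancel function can be called more than once. Here is a short sketch of that behaviour; errAfterCancel and canceledCleanly are hypothetical helper names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errAfterCancel returns the context's error before and after cancelling it.
func errAfterCancel() (before, after error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // Still safe, even though we cancel explicitly below.

	before = ctx.Err()
	cancel()
	cancel() // Calls after the first one do nothing.
	after = ctx.Err()
	return before, after
}

// canceledCleanly reports whether the errors match what we expect.
func canceledCleanly() bool {
	before, after := errAfterCancel()
	return before == nil && errors.Is(after, context.Canceled)
}

func main() {
	before, after := errAfterCancel()
	fmt.Println(before) // <nil>
	fmt.Println(after)  // context canceled
}
```

Because repeated calls are harmless, we can defer the cancel right after creating a node and still cancel it earlier elsewhere.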

If we run this program, we should see the following being printed to our terminal:

❯ go run .

nodeOne timed out

nodeTwo timed out

nodeThree timed out

Now, to observe how cancellations propagate, we can modify the code so that nodeTwo, which sits in the middle of the

tree, is cancelled 1 second earlier:

func main() {
	// ...

	nodeTwo, cancelTwo := context.WithCancel(nodeOne)
	time.AfterFunc(time.Second, cancelTwo) // This line was changed.
	go print(nodeTwo, &wg, "nodeTwo")

	// ...
}

Running the code again yields the following result:

❯ go run .

nodeThree canceled

nodeTwo canceled

nodeOne timed out

Here, we can see that cancelling a node will traverse the tree and cancel the node's children as well. The
cancellations only propagate down, never up.
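We can also check the direction of propagation without any goroutines by inspecting each node's Err method right after cancelling the middle node. This is a sketch under the same three-node layout; cancelMiddleNode and propagatesDownOnly are hypothetical helper names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// cancelMiddleNode builds parent -> middle -> child, cancels only the
// middle node, and returns the error of each node afterwards.
func cancelMiddleNode() (parentErr, middleErr, childErr error) {
	parent, cancelParent := context.WithCancel(context.Background())
	defer cancelParent()
	middle, cancelMiddle := context.WithCancel(parent)
	defer cancelMiddle()
	child, cancelChild := context.WithCancel(middle)
	defer cancelChild()

	cancelMiddle()
	return parent.Err(), middle.Err(), child.Err()
}

// propagatesDownOnly reports whether the parent survived while the
// middle node and its child were canceled.
func propagatesDownOnly() bool {
	p, m, c := cancelMiddleNode()
	return p == nil && errors.Is(m, context.Canceled) && errors.Is(c, context.Canceled)
}

func main() {
	fmt.Println(cancelMiddleNode()) // <nil> context canceled context canceled
}
```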

Looking at the output, one might mistakenly assume that the context package traverses the tree all the way down, and
then performs the cancellations bottom-up, but this is not the case. In reality, if we examine the code within the
context package, we'll find that each cancellable context maintains an internal map of its children:

children map[canceler]struct{}

and when we call cancel on a node, it's going to close its own channel first, and then perform a depth-first
traversal to cancel all of its descendants:

func (c *cancelCtx) cancel(removeFromParent bool, err, cause error) {
	// ...

	d, _ := c.done.Load().(chan struct{})
	if d == nil {
		c.done.Store(closedchan)
	} else {
		close(d) // NOTE: This is where this node's channel is closed.
	}
	for child := range c.children {
		child.cancel(false, err, cause) // NOTE: This is where it calls the same cancel function for all of its children.
	}

	// ...
}

Seeing this, it might feel unintuitive that "nodeThree canceled" was printed before "nodeTwo canceled". However, the
channels for both nodes are closed almost simultaneously, probably within nanoseconds of each other.

The decision of which goroutine to wake up first is going to fall on the scheduler. Therefore, if we were to run the

program multiple times, we should be able to see the messages alternate:

❯ go run .

nodeTwo canceled

nodeThree canceled

nodeOne timed out

❯ go run .

nodeThree canceled

nodeTwo canceled

nodeOne timed out

The key takeaway here is that we shouldn't structure our programs in a way where we rely on our nodes to be cancelled in

a specific order. Regardless of whether a goroutine listens to a node at the top of the tree, or to another node 100

branches down, the order in which their cancellation logic gets to execute is going to be nondeterministic.

More branching

So far, we've used Background to create a new tree, and WithCancel for our branches.

We've observed that the channels of nodes created with WithCancel close only when we explicitly invoke the cancel
function, or when a cancellation propagates from one of their ancestors.

In addition to this, there is a third set of nodes that can be created using the WithTimeout and WithDeadline
functions. These nodes have, in addition to the cancel function and propagation from their ancestors, a third,
time-based mechanism for closing their channels.

And although these functions have different names, the nodes they create are functionally equivalent:

func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
	return WithDeadline(parent, time.Now().Add(timeout))
}

Your choice between them depends solely on whether you want to specify the self-cancellation timing using a
time.Duration or a time.Time.
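To convince myself of the equivalence, I compared the deadlines reported by nodes created with each function. This is a rough sketch; deadlinesAgree and timeoutMatchesDeadline are hypothetical helper names, and the one second tolerance only accounts for the two calls to time.Now happening at slightly different moments.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// deadlinesAgree creates one node with WithTimeout and one with
// WithDeadline, and reports whether their deadlines are within tolerance.
func deadlinesAgree(timeout, tolerance time.Duration) bool {
	root := context.Background()

	a, cancelA := context.WithTimeout(root, timeout)
	defer cancelA()
	b, cancelB := context.WithDeadline(root, time.Now().Add(timeout))
	defer cancelB()

	da, okA := a.Deadline()
	db, okB := b.Deadline()
	if !okA || !okB {
		return false
	}
	diff := db.Sub(da)
	if diff < 0 {
		diff = -diff
	}
	return diff <= tolerance
}

// timeoutMatchesDeadline runs the comparison with a one minute timeout.
func timeoutMatchesDeadline() bool {
	return deadlinesAgree(time.Minute, time.Second)
}

func main() {
	fmt.Println(timeoutMatchesDeadline()) // true
}
```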

Let us proceed by modifying nodeOne to cancel itself after 100 milliseconds like this:

func main() {
	// ...

	nodeOne, cancelOne := context.WithTimeout(root, time.Millisecond*100)
	defer cancelOne()
	go print(nodeOne, &wg, "nodeOne")

	// ...
}

Running the code again, we should see the cancellation message being printed for all of our nodes:

❯ go run .
nodeOne canceled
nodeTwo canceled
nodeThree canceled
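A function that receives one of these nodes can tell why it was cancelled by looking at the Err method. As a sketch, the snippet below lets a parent time out and then inspects both the parent and a child derived with WithCancel. The names errsAfterTimeout and timedOutEverywhere are hypothetical, and the 20 millisecond timeout is arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errsAfterTimeout waits for a child of a timed-out parent to be
// canceled, then returns the error reported by each node.
func errsAfterTimeout() (parentErr, childErr error) {
	parent, cancelParent := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancelParent()
	child, cancelChild := context.WithCancel(parent)
	defer cancelChild()

	<-child.Done() // Closed when the parent's timer fires.
	return parent.Err(), child.Err()
}

// timedOutEverywhere reports whether both nodes saw a deadline error.
func timedOutEverywhere() bool {
	p, c := errsAfterTimeout()
	return errors.Is(p, context.DeadlineExceeded) && errors.Is(c, context.DeadlineExceeded)
}

func main() {
	p, c := errsAfterTimeout()
	fmt.Println(p) // context deadline exceeded
	fmt.Println(c) // context deadline exceeded
}
```

The child reports the same error as the ancestor that triggered the cancellation, even though the child itself was created with WithCancel.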

Having this ability to create branches based on time is really powerful. It allows us to build a cancellation tree based
on priority, and distribute the nodes across different functions.

Let's use a search endpoint as an example. The entire search operation could be divided further into multiple
sub-operations: one for performing a text search, another for images, a third based on location, and so on.

If we deem the image search to be a less critical feature, we could assign it a node with a shorter timeout. By doing
so, we're able to restrict its ability to affect the response time of the search operation as a whole.
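A related property is that a branch can shorten the time it has available, but it cannot extend it past what its ancestors allow. The sketch below asks for a one hour timeout on a child of a node that expires after one second, and then compares their deadlines. The helper name childDeadlineCapped is hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// childDeadlineCapped reports whether a child that asks for a longer
// timeout than its parent ends up with the parent's deadline.
func childDeadlineCapped() bool {
	parent, cancelParent := context.WithTimeout(context.Background(), time.Second)
	defer cancelParent()
	child, cancelChild := context.WithTimeout(parent, time.Hour)
	defer cancelChild()

	parentDeadline, okParent := parent.Deadline()
	childDeadline, okChild := child.Deadline()
	return okParent && okChild && childDeadline.Equal(parentDeadline)
}

func main() {
	fmt.Println(childDeadlineCapped()) // true
}
```

This is what makes it safe to hand a node to a less critical sub-operation: whatever timeout it picks for its own branches, it stays within the budget of the operation as a whole.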

Ending notes

Making the context.Context abstraction part of the standard library was a really wise decision by the Go team.

Having cancellations propagate to release resources at scale can be notoriously difficult to achieve. Often, it's under
high load or, unfortunately, during an incident, that we realize that an expensive operation wasn't terminated in time.

I also appreciate how the standard library usually handles the creation of the more complex cancellation trees for
us. For example, consider this basic HTTP server I've set up to mirror the search scenario we discussed earlier:

package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"time"
)

func main() {
	fmt.Println("Starting server on :8080")
	http.HandleFunc("/search", searchHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

// searchHandler orchestrates the search operation, initiating two
// parallel sub-operations: one for text and another for images.
func searchHandler(w http.ResponseWriter, r *http.Request) {
	query := r.URL.Query().Get("query")
	// We don't have to create a cancellation tree ourselves, instead we're able to
	// add branches to an existing one that the standard library has created for us.
	ctx := r.Context()

	// Here, we're adding one branch to the tree that is cancelled in
	// 2 seconds. We'll use this node when performing the text search.
	highPriorityBranch, highPriorityCancel := context.WithTimeout(ctx, time.Second*2)
	defer highPriorityCancel()
	go performSearch(highPriorityBranch, "text", query)

	// We consider the image search more of a nice-to-have, therefore we'll cancel this
	// branch 1 second earlier to reduce its ability to affect the overall response time.
	lowPrioBranch, lowPrioCancel := context.WithTimeout(ctx, time.Second)
	defer lowPrioCancel()
	go performSearch(lowPrioBranch, "image", query)

	// Sleep to allow the timeouts to trigger.
	time.Sleep(time.Second * 2)
	w.Write([]byte("Search completed"))
}

// performSearch captures the current time, waits for the context
// to cancel, and then prints the elapsed waiting time.
func performSearch(ctx context.Context, operation, query string) {
	before := time.Now()
	<-ctx.Done()
	duration := time.Since(before)
	fmt.Printf("Cancelling the %s search for %s after %s\n", operation, query, duration)
}

If we were to start this server:

❯ go run .
Starting server on :8080

and curl it from a separate terminal window:

❯ curl "http://localhost:8080/search?query=go"

We can see that the image search was cancelled 1 second before the text search:

Cancelling the image search for go after 1.001227208s
Cancelling the text search for go after 2.001186333s

The important part here is that we didn't construct the cancellation tree ourselves. Instead, we added two branches to
the node that the standard library attached to the request. To see why this was beneficial, we'll make another request
and immediately cancel it by hitting CTRL + C on our keyboard. As a result, we should be able to see the following:

Cancelling the image search for go after 185.223375ms
Cancelling the text search for go after 185.257167ms

As you can see, we were able to release all of our resources as soon as the client chose to close the connection. This
works because the server generates a node for each request, and if the client closes the connection prematurely, it
invokes this node's cancel method. So, by making this node our ancestor, we ensure that the cancellation signal
reaches our branches too.
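The same behaviour can be reproduced without curl and a keyboard. The sketch below is my own test-style approximation of the experiment, not the setup used above: it starts a throwaway server with net/http/httptest, sends a request that the client abandons after 50 milliseconds, and checks whether the handler saw its request context being cancelled. The helper name handlerSawCancel and both durations are assumptions I picked for illustration.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// handlerSawCancel reports whether the handler observed the cancellation
// of r.Context() after the client gave up on the request.
func handlerSawCancel() bool {
	const waitLimit = 5 * time.Second
	canceled := make(chan struct{})

	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-r.Context().Done():
			close(canceled) // The request's node was canceled.
		case <-time.After(waitLimit):
			// Give up so the server can always shut down.
		}
	}))
	defer srv.Close()

	// The client abandons the request after 50 milliseconds.
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, srv.URL, nil)
	if err != nil {
		return false
	}
	resp, err := srv.Client().Do(req)
	if err == nil {
		resp.Body.Close()
	}

	select {
	case <-canceled:
		return true
	case <-time.After(waitLimit):
		return false
	}
}

func main() {
	fmt.Println(handlerSawCancel())
}
```

Any branch the handler derives from r.Context() would be cancelled at the same moment, which is exactly what we observed with the two search operations.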

This concludes the post. I hope you've enjoyed it!

The end

I usually tweet something when I've finished writing a new post. You can find me on Twitter by clicking

File name	Tags	Time to read	Created at
context	go context	8 minutes	2024-02-28
circular-buffers	go concurrency data processing	5 minutes	2024-02-04
go-directives	go compiler performance	4 minutes	2023-10-21
async-tree-traversals	node trees graphs typescript	19 minutes	2023-09-10