1

Introduction

2

In this post, I've tried to summarize some notes I've taken while reading through the code of the context package. I

3

will share my mental model that has been effective for me so far, as well as some code I wrote to validate my

4

understanding.

5

6

Mental model

7

I think of the context package as an abstraction for working with cancellation trees. We can pass nodes from these

8

trees to functions in order to give them the ability to detect cancellation signals as they ripple through the branches.

9

10

The package exports a function called Background that can be used to create new root nodes, and a bunch of other

11

functions like, WithCancel, WithTimeout, and WithDeadline to derive new branches.

12

13

There is no magic involved though. The functions you pass your context to won't get stopped automatically by the

14

runtime. Instead, it works by having developers follow a convention: any function that receives a context can use its

15

Done method to access a read-only channel. The expectation is that upon receiving a signal on this channel, the

16

function should stop.

17

18

There is one caveat here. The root node that we retrieve using context.Background is always going to return a

19

nil channel:

20

21
package context
22
 
23
type backgroundCtx struct{ emptyCtx }
24
 
25
func (emptyCtx) Done() <-chan struct{} {
26
	return nil
27
}
22

23

Hence, reading from a root node is going to block forever. To be honest, I don't find the name Background very

24

intuitive. Whenever I review a pull request, I to try substitute it in my head with UncancellableRootNode. By doing

25

so, I find it easier to ask myself if I think this function should be allowed to potentially run forever, and if

26

justifies the creation of a new cancellation tree, or if its merely an extension of some larger operation.

27

28

Propagation

29

Let's proceed by constructing a cancellation tree in its simplest form: a single straight line:

30

31
            ┌────────────┐
32
            │    root    │
33
            └────────────┘
34
35
36
            ┌────────────┐
37
            │  nodeOne   │
38
            └────────────┘
39
40
41
            ┌────────────┐
42
            │  nodeTwo   │
43
            └────────────┘
44
45
46
            ┌────────────┐
47
            │ nodeThree  │
48
            └────────────┘
32

33

To create the tree as illustrated above, we're going to use Background for the root node, and WithCancel for the

34

branches:

35

36
// print is going to self-cancel after 2 seconds.
37
func print(ctx context.Context, wg *sync.WaitGroup, name string) {
38
	select {
39
	case <-time.After(2 * time.Second):
40
		fmt.Println(name, "timed out")
41
	case <-ctx.Done():
42
		fmt.Println(name, "canceled")
43
	}
44
	wg.Done()
45
}
46
 
47
func main() {
48
	root := context.Background()
49
 
50
	wg := sync.WaitGroup{}
51
	wg.Add(3)
52
 
53
	nodeOne, cancelOne := context.WithCancel(root)
54
	defer cancelOne()
55
	go print(nodeOne, &wg, "nodeOne")
56
 
57
	nodeTwo, cancelTwo := context.WithCancel(nodeOne)
58
	defer cancelTwo()
59
	go print(nodeTwo, &wg, "nodeTwo")
60
 
61
	nodeThree, cancelThree := context.WithCancel(nodeTwo)
62
	defer cancelThree()
63
	go print(nodeThree, &wg, "nodeThree")
64
 
65
	wg.Wait()
66
}
37

38

To create a new branch using WithCancel, we'll have to specify an existing node that we'd like to branch from. In

39

return, we'll get the newly created node and a function for cancelling it.

40

41

The code above also highlights the importance of deferring a call to cancel the node. This is because, upon entering the

42

select statement, we're going to enter the case where a message is received on the time.After channel - rather

43

than the context. This makes the print function and, subsequently, the main function return and exit. Should this

44

happen, e.g the other operation completes first, we'll release the resources for the context at the same time.

45

Therefore, it's important to make sure that we always call cancel or we'll create a memory leak.

46

47

If we run this program, we should see the following being printed to our terminal:

48

49
❯ go run .
50
nodeOne timed out
51
nodeTwo timed out
52
nodeThree timed out
50

51

Now, to observe how cancellations propagate, we can modify the code so that nodeTwo, which sits in the middle of the

52

tree, is cancelled 1 second earlier:

53

54
func main() {
55
    // ...
56
 
57
	nodeTwo, cancelTwo := context.WithCancel(nodeOne)
58
	time.AfterFunc(time.Second, cancelTwo) // This line was changed.
59
	go print(nodeTwo, &wg, "nodeTwo")
60
 
61
    // ...
62
}
55

56

Running the code again, yields the following result:

57

58
❯ go run .
59
nodeThree canceled
60
nodeTwo canceled
61
nodeOne timed out
59

60

Here, we can see that cancelling a node will traverse the tree and cancel the nodes children as well. The

61

cancellations only propagate down never up.

62

63

Looking at the output, one might mistakenly assume that the context package traverses the tree the all the way down, and

64

then performs the cancellations bottoms up, but this is not the case. In reality, if we examine the code within the

65

context package, we'll find that each cancellable context maintains an internal map of its children:

66

67
children map[canceler]struct{}
68

69

and when we call cancel on a node, it's going to close it's own channel first, and then perform a depth first

70

traversal to cancel all of its descendants:

71

72
func (c *cancelCtx) cancel(removeFromParent bool, err, cause error) {
73
    // ...
74
	d, _ := c.done.Load().(chan struct{})
75
	if d == nil {
76
		c.done.Store(closedchan)
77
	} else {
78
		close(d) // NOTE: This where this nodes channel is being closed.
79
	}
80
 
81
    for child := range c.children {
82
        child.cancel(false, err, cause) // NOTE: This is where it calls the same cancel function for all of its children.
83
    }
84
}
73

74

Seeing this, it might feel unintuiative that "nodeThree canceled" was printed before "nodeTwo canceled", however, the

75

channels for both nodes are closed almost simultaneously, probably within nanoseconds of each other.

76

77

The decision of which goroutine to wake up first is going to fall on the scheduler. Therefore, if we were to run the

78

program multiple times, we should be able to see the messages alternate:

79

80
❯ go run .
81
nodeTwo canceled
82
nodeThree canceled
83
nodeOne timed out
84
 
85
❯ go run .
86
nodeThree canceled
87
nodeTwo canceled
88
nodeOne timed out
81

82

The key takeaway here is that we shouldn't structure our programs in a way where we rely on our nodes to be cancelled in

83

a specific order. Regardless of whether a goroutine listens to a node at the top of the tree, or to another node 100

84

branches down, the order in which their cancellation logic gets to execute is going to be nondeterministic.

85

86

More branching

87

So far, we've used Background to create a new tree, and WithCancel for our branches.

88

89

We've observed that the channels from nodes created with WithCancel close only when we explicitly invoke the cancel

90

function, or if a signal is propagating from one of its ancestors.

91

92

In addition to this, there is a third set of nodes that can be created using the WithTimeout and WithDeadline

93

functions. These Nodes have, in addition to the cancel function, a third time-based mechanism for closing their

94

channels.

95

96

And although these function have different names, the nodes they create are functionally equivalent:

97

98
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc) {
99
	return WithDeadline(parent, time.Now().Add(timeout))
100
}
99

100

Your choice between them depends solely on wether you want to specify the self-cancellation timing using a

101

time.Duration or a time.Time.

102

103

Let us proceed by modifying nodeOne to cancel itself after 100 milliseconds like this:

104

105
func main() {
106
    // ...
107
 
108
	nodeOne, cancelOne := context.WithTimeout(root, time.Millisecond * 100)
109
	defer cancelOne()
110
	go print(nodeOne, &wg, "nodeOne")
111
 
112
    // ...
113
}
106

107

And running the code again we should see the cancellation message being printed for all of our nodes:

108

109
❯ go run .
110
nodeOne canceled
111
nodeTwo canceled
112
nodeThree canceled
110

111

Having this ability to create branches based on time is really powerful. It allows us to build a cancellation tree based

112

on priority, and distribute the nodes across different functions.

113

114

Let's use a search endpoint as an example. The entire search operation could be divided further into multiple multiple

115

sub-operations. One for performing a text search another for images, a third based on location, and so on.

116

117

If we deem the image search to be a less critical feature, we could assign it a node with a shorter timeout. By doing

118

so, we're able to restrict it's abillity to effect the search operations response time as a whole.

119

120

Ending notes

121

Making the context.Context abstraction part of the standard library was a really wise decision by the Go team.

122

123

Having cancellations propagate to release resources at scale can be notoriously difficult to achieve. Often, it's under

124

high load or, unfortunately, during an incident, that we realize that expensive operation wasn't terminated in time.

125

126

I also appreciate how the the standard library usually handles the creation of the more complex cancellation trees for

127

us. For example, consider this basic HTTP server I've set up to mirror the search scenario we discussed earlier:

128

129
func main() {
130
	fmt.Println("Starting server on :8080")
131
	http.HandleFunc("/search", searchHandler)
132
	log.Fatal(http.ListenAndServe(":8080", nil))
133
}
134
 
135
// searchHandler orchestrates the search operation, initiating two
136
// parallel sub-operations: one for text and another for images.
137
func searchHandler(w http.ResponseWriter, r *http.Request) {
138
	query := r.URL.Query().Get("query")
139
	// We don't have to create a cancellation tree ourselves, instead we're able to
140
	// add branches to an existing one that the standard library has created for us.
141
	ctx := r.Context()
142
 
143
	// Here, we're adding one branch to the tree that is cancelled in
144
	// 2 seconds. We'll use this node when performing the text search.
145
	highPriorityBranch, highPriorityCancel := context.WithTimeout(ctx, time.Second*2)
146
	defer highPriorityCancel()
147
	go performSearch(highPriorityBranch, "text", query)
148
 
149
	// We consider the image search more of a nice-to-have, therefore we'll cancel this
150
	// branch 1 second earlier to reduce it's abillity to affect the overall response time.
151
	lowPrioBranch, lowPrioCancel := context.WithTimeout(ctx, time.Second)
152
	defer lowPrioCancel()
153
	go performSearch(lowPrioBranch, "image", query)
154
 
155
	// Sleep to allow the timeouts to trigger.
156
	time.Sleep(time.Second * 2)
157
	w.Write([]byte("Search completed"))
158
}
159
 
160
// performSearch captures the current time, waits for the context
161
// to cancel, and then prints the elapsed waiting time.
162
func performSearch(ctx context.Context, operation, query string) {
163
	before := time.Now()
164
	<-ctx.Done()
165
	duration := time.Since(before)
166
	fmt.Printf("Cancelling the %s search for %s after %s\n", operation, query, duration)
167
}
130

131

If we were to start this server:

132

133
❯ go run .
134
Starting server on :8080
134

135

and curl it from a separate terminal window:

136
❯ curl "http://localhost:8080/search?query=go"
137

138

We can see that the image search was cancelled 1 second before the the text search:

139

140
Cancelling the image search for go after 1.001227208s
141
Cancelling the text search for go after 2.001186333s
141

142

The important part here is that we didn't construct the cancellation tree ourselves. Instead, we added two branches to

143

the node that the standard library attached to the request. Too see why this was beneficial, we'll make another request

144

and immediately cancel it by hitting CTRL + C on our keyboard. As a result, we should be able to see the following:

145

146
Cancelling the image search for go after 185.223375ms
147
Cancelling the text search for go after 185.257167ms
147

148

As you can see, we were able to release all of our resources as soon as the client chose to close the connection. This

149

works because the server generates a node for each request, and if the client closes the connection prematurely, it

150

invokes this node's cancel method. So, by making this node our ancestor, we ensure that the cancellation signal

151

reaches our branches too.

152

153

This concludes the post, I hope you've enjoyed it!

153

154

The end

155

I usually tweet something when I've finished writing a new post. You can find me on Twitter

156

by clicking 

normalintroduction.md
||364:50

Recently Edited

Recently Edited

File name

Tags

Time to read

Created at

context

  • go
  • context
8 minutes2024-02-28

circular-buffers

  • go
  • concurrency
  • data processing
5 minutes2024-02-04

go-directives

  • go
  • compiler
  • performance
4 minutes2023-10-21

async-tree-traversals

  • node
  • trees
  • graphs
  • typescript
19 minutes2023-09-10

All Files

All Files

  • go

    5 files

  • node

    2 files

  • typescript

    1 file

  • frontend

    1 file

  • workflow

    7 files