golang design pattern for cancelling routines inflight

I would use a single channel to communicate results: it is much easier to gather the results from one place, and it scales naturally to any number of workers. If you need to identify the source of a result, simply use a wrapper which includes the source. Something like this:

type Result struct {
    ID     string
    Result bool
}

To simulate "real" work, the workers should use a loop doing their work in an iterative manner, and in each iteration they should check the cancellation signal. Something like this:

func foo(ctx context.Context, pretendWorkMs int, resch chan<- Result) {
    log.Printf("foo started...")
    for i := 0; i < pretendWorkMs; i++ {
        time.Sleep(time.Millisecond)
        // Non-blocking check: return if cancelled, otherwise keep working.
        select {
        case <-ctx.Done():
            log.Printf("foo terminated.")
            return
        default:
        }
    }
    log.Printf("foo finished")
    resch <- Result{ID: "foo", Result: false}
}

In our example bar() is identical; just replace every occurrence of foo with bar.
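Since foo() and bar() differ only in their name, the duplication could be avoided with one parameterised worker. The following is a minimal sketch under that assumption; worker() and firstResult() are hypothetical names introduced here, and the durations are shortened so it runs quickly:

```go
package main

import (
	"context"
	"log"
	"time"
)

// Result wraps a worker's outcome together with its source.
type Result struct {
	ID     string
	Result bool
}

// worker is a hypothetical generalisation of foo() and bar():
// the id parameter replaces the hard-coded name.
func worker(ctx context.Context, id string, pretendWorkMs int, resch chan<- Result) {
	log.Printf("%s started...", id)
	for i := 0; i < pretendWorkMs; i++ {
		time.Sleep(time.Millisecond)
		select {
		case <-ctx.Done():
			log.Printf("%s terminated.", id)
			return
		default:
		}
	}
	log.Printf("%s finished", id)
	resch <- Result{ID: id, Result: false}
}

// firstResult (hypothetical helper) starts both workers and returns
// whichever result arrives first; the deferred cancel stops the other.
func firstResult() Result {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	resch := make(chan Result, 2)
	go worker(ctx, "foo", 30, resch)
	go worker(ctx, "bar", 50, resch)
	return <-resch
}

func main() {
	r := firstResult()
	log.Printf("Result of %s: %v", r.ID, r.Result)
}
```

The rest of this answer keeps the separate foo() and bar() functions; the sketch only shows that nothing in the pattern depends on having one function per job.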

Executing the jobs, and terminating the rest early if one does not meet our expectation, looks like this:

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

resch := make(chan Result, 2)

log.Println("Kicking off workers...")
go foo(ctx, 3000, resch)
go bar(ctx, 5000, resch)

for i := 0; i < cap(resch); i++ {
    result := <-resch
    log.Printf("Result of %s: %v", result.ID, result.Result)
    if !result.Result {
        cancel()
        break
    }
}
log.Println("Done.")

Running this app will output (try it on the Go Playground):

2009/11/10 23:00:00 Kicking off workers...
2009/11/10 23:00:00 bar started...
2009/11/10 23:00:00 foo started...
2009/11/10 23:00:03 foo finished
2009/11/10 23:00:03 Result of foo: false
2009/11/10 23:00:03 Done.

Some things to note. If we terminate early due to an unexpected result, cancel() is called and we break out of the loop. The remaining workers may complete their work at that same moment and send their results; this is not a problem because the channel is buffered with room for every worker, so their sends will not block and they will end properly. If they have not completed yet, they check ctx.Done() in their loop and terminate early, so the goroutines are cleaned up nicely.

Also note that the output of the above code does not print bar terminated. This is because the main() function returns right after the loop, and once main() returns, the program exits without waiting for other goroutines to complete. For details, see No output from goroutine in Go. If the app did not terminate immediately, we would see that line printed too. If we add a time.Sleep() at the end of main():

log.Println("Done.")
time.Sleep(3 * time.Millisecond)

Output will be (try it on the Go Playground):

2009/11/10 23:00:00 Kicking off workers...
2009/11/10 23:00:00 bar started...
2009/11/10 23:00:00 foo started...
2009/11/10 23:00:03 foo finished
2009/11/10 23:00:03 Result of foo: false
2009/11/10 23:00:03 Done.
2009/11/10 23:00:03 bar terminated.

Now if you must wait for all workers to end either "normally" or "early" before moving on, you can achieve that in many ways.

One way is to use a sync.WaitGroup. For an example, see Prevent the main() function from terminating before goroutines finish in Golang. Another way is to have each worker send a Result no matter how it ends, where the Result also records the termination condition, e.g. normal or aborted. The main() goroutine can then continue the receive loop until it has received n values from resch. If you choose this solution, you must ensure each worker sends a value in every case (even if a panic occurs), for example by sending from a deferred function; otherwise main() would block forever.
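The second approach could be sketched as follows. This is an illustration under assumptions, not a definitive implementation: the Aborted field, the worker() function and the runAll() helper are hypothetical names, and the durations are shortened. Note that the deferred function also recovers from a panic, which swallows it; whether that is acceptable depends on your application.

```go
package main

import (
	"context"
	"log"
	"time"
)

// Result carries the outcome plus how the worker ended.
// Aborted is a hypothetical field added for this sketch.
type Result struct {
	ID      string
	Result  bool
	Aborted bool
}

// worker sends exactly one Result in every case (normal end,
// cancellation or panic) because the send is deferred.
func worker(ctx context.Context, id string, pretendWorkMs int, resch chan<- Result) {
	res := Result{ID: id, Aborted: true} // assume aborted until the work completes
	defer func() {
		if r := recover(); r != nil {
			log.Printf("%s panicked: %v", id, r)
		}
		resch <- res
	}()

	for i := 0; i < pretendWorkMs; i++ {
		time.Sleep(time.Millisecond)
		select {
		case <-ctx.Done():
			return // res still reports Aborted: true
		default:
		}
	}
	res.Aborted = false // finished normally; Result stays false as in foo()
}

// runAll starts n workers and returns only after all n have reported.
func runAll() []Result {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	const n = 2
	resch := make(chan Result, n)
	go worker(ctx, "foo", 30, resch)
	go worker(ctx, "bar", 5000, resch)

	results := make([]Result, 0, n)
	for i := 0; i < n; i++ {
		r := <-resch
		results = append(results, r)
		if !r.Aborted && !r.Result {
			cancel() // unexpected result: tell the others to stop
		}
	}
	return results
}

func main() {
	for _, r := range runAll() {
		log.Printf("%s: result=%v aborted=%v", r.ID, r.Result, r.Aborted)
	}
}
```

Unlike the earlier loop, this one does not break after calling cancel(): it keeps receiving until every worker has reported, so by the time runAll() returns no worker goroutine is still doing work. Calling cancel() more than once is safe.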
