Golang 3

Concurrency in Go course notes by Oliver Frolovs, 2020.

This is different from other two sets of notes in that I had recorded only the concepts or facts that were truly new to me… so the notes are very short))

Moore’s law and other animals

Concurrent and parallel

Concurrent means that task start and end times overlap, so that their execution may happen in parallel.

True parallel execution requires having more than one instance of hardware to run several sets of instructions at the same time.

Concurrent execution is very likely to still offer advantage over sequential execution because often the tasks have to wait on IO (incl. memory access).

Parallel compilers are hard and they don’t work (yet).

Processes

On Linux, the scheduler gives every process a 20 ms time slice.

Goroutines

Interleavings are instructions in two different tasks. The order of execution between concurrent tasks is not deterministic. A race condition is a problem that might happen as a result of interleaving. It happens when the outcome of a program depends on interleaving order. Race conditions happen due to communication between tasks.

A goroutine is a lightweight thread managed by the Go runtime.

go f(x, y, z) // starts a new coroutine running f(x,y,z)

The evaluation of f, x, y, and z happens in the current goroutine and the execution of f happens in the new goroutine.

Goroutines run in the same address space, so access to shared memory must be synchronised. The sync package provides useful primitives, although you won’t need them much as there are other primitives. (???)

Synchronisation

By using synchronisation we restrict the scheduling. Every time we use synchronisation, we reduce performance, but this is necessary evil.

// GOROUTINE 1:

x = 0
x = x + 1

// MAKE GLOBAL EVENT HAPPEN
if <GLOBAL EVENT HAPPENED> {
    fmt.Printf("x = %d", x)
}

Wait groups

The sync package contains functions to synchronise between Goroutines. It includes WaitGroups.

FIXME is this implemented using defer?

// Main Thread:

var wg sync.WaitGroup

wg.Add(1)
go MouseDoTheJob(&wg) // create another goroutine
wg.Wait() // the current (main) goroutine stops until MouseDoTheJob() returns
// MouseDoTheJob() goroutine

// ...
wg.Done() // can use defer to make sure it's done at the end

A summary of WorkGroup methods:

Example

func foo(wg *sync.WaitGroup) {
    fmt.Println("'Mamka Tvoya', said the Horse.")
}

func main() {
    var wg sync.WaitGroup
    wg.Add(1)
    go foo(&wg) // because Go is pass by value, remember, and foo must call Done,
    // which decrements the counter
    wg.Wait()
    fmt.Println("But the Mouse heard only 'Neigh!'")
}

Goroutine communication

Simple form of communication between the goroutines:

Use make() to create a channel:

c := make(chan int)

Once the channel is made, one cand send and receive the data using the arrow operator:

By default the channels are unbuffered. It means it cannot hold the data in channel:

So, when using the channels in this way (unbuffered), they also perform synchronisation.

This means the channels can be used just for synchronisation, and the result can be dropped.

Not assigning to a variable, throw away the value received, synchronise only:

<- c

Buffered channels

The default capacity of a channel is zero, that is the channel is unbuffered.

But one can make a buffered channel, which have some capacity to hold the data in transit.

The example below creates a buffered channel of capacity three, that is it can hold three integers without blocking, unless the channel is full:

c := make(chan int, 3)

So, this channel would block iff the buffer is full.

The main reason to use buffering is when two goroutines operate at a different speeds. It’s a classic consumer/producer problem.

Producer -> [ B U F F E R ] -> Consumer

If the producer is faster than a consumer (on average), the buffer will allow both of them to continue without blocking, that is without reducing concurrency.

On average requirement is there because if the producer is always faster than the consumer, with a fixed buffer size, at some point the buffer is going to be full.

Revision interlude

… the size of an array is part of its type …

There is a built-in function called len that returns the number of elements of an array or slice and also of a few other data types.

Arrays have their place—they are a good representation of a transformation matrix for instance—but their most common purpose in Go is to hold storage for a slice.

Arrays, slices (and strings): The mechanics of ‘append’ by… Rob Pike!

Iterating through a channel

The following loop will continue to read from channel c, blocking when there is no data to be read. The loop will run one iteration each time a new value is received. The variable i is assigned to the read value.

// "c" is a channel as defined before

for i := range c {
    // ...
}

This for loop is infinite loop, unless the sender calls Close method on the channel c:

// Some sender goroutine...
close(c)
// Now another goroutine which listens on that channel, can quit the for loop

There is no need to call close on a channel, unless there is a for listening on it.

Receiving from multiple goroutines

The data can be read from multiple goroutines using multiple channels.

Sequential read (blocking):

a := <- c1
b := <- c2
fmt.Println(a*b)

Select statement (having a value read from just one channel is enough):

// Whichever value comes first, it gets processed and that's that...
select {
    case a = <- c1:
        fmt.Println(a)
    case b = <- c2:
        fmt.Println(b)
}

It reads whatever comes first and doesn’t block on the other channel. First come first served!

Synchronised communication

In the previous example using select we’ve been blocking on receiving data. But we can block on sending data, too.

May select either send or receive operations.

select {
    case a = <- inchan:
      fmt.Println("Received a")
    case outchan <- b:
      fmt.Println("sent b")
}

Either one of these cases can block. Whatever is unblocked first, happens.

Select with an Abort channel

for {
    select {
        case a <- c:
            fmt.Println(a) // process the data
        case <- abort
            return  // another signal received on the abort channel
    }
}

Default select

To avoid blocking, use a default clause in a select statement.

select {
    case a = <- c1:
      fmt.Println(a)
    case b = <- c2:
      fmt.Println(b)
    default:
      fmt.Println("nop")
}

Mutual exclusion

Sharing variables between goroutines can cause problems. A thread program is said to be concurrency-safe if it can be invoked concurrently with other goroutines without interfering with those other goroutines.

Variable sharing example

The following example does not always return two.

var i int
var wg sync.WaitGroup

func inc() {
    i = i + 1
    wg.Done()
}

func main() {
    wg.Add(2)
    go inc()
    go inc()
    wg.Wait()
    fmt.Println(i)

}

To understand why, think about how the statement i = i + 1 is represented in assembly. It may take more than a single instruction. Now, the interleaving between the two goroutines happens at the machine instruction level, so it is possible that the first goroutine has read the value from memory but got interrupted right after, and then the second goroutine has done the same… I see where this is going!

The granularity of concurrency is at the level of machine code instructions, not Go statements.

So, with the wrong interleaving, the value of 1, instead of 2 will be written into the variable.

Mutex

Mutex methods

Methods of sync.Mutex:

Mutex example

var i int = 0
var mut sync.Mutex     // creates a mutex; shared between all instances of inc()

func inc() {
    mut.Lock()         // locks a mutex, "flag up"
    i = i + 1
    mut.Unlock()       // unlocks a mutex, "flag down"
}

Once synchronisation

Synchronous initialisation

By definition, initialisation happens once and before everything else. But this can be hard to guarantee in a concurrent environment.

How do you guarantee in a concurrent environment, that:

Solution:

once.Do(f)

This can be put in lots of different goroutines, but it’ll be run only once.

Synchronous initialisation example

The main program creates two goroutines which both rely on some shared data which must be initialised before anything else happens:

var wg sync.WaitGroup

func main() {
    wg.Add(2)
    go bleat()
    go bleat()
    wg.Wait()
}

The setup function sets up the environment and will have been finished before the goroutines can proceed. This is ensured by the on.Do(setup) statement.

var on sync.Once

func setup() {
    fmt.Println("Init")
}

func bleat() {
    on.Do(setup)
    fmt.Println("Baa!")
    wg.Done()
}

Result:

Init
Baa!
Baa!

Deadlocks

So, channels and the primitives provided in the sync package allow for various forms of synchronisation.

Deadlocks are caused by synchronisation dependencies. Specifically, the circular depenendencies. A goroutine A is blocked because it is waiting on something done by a goroutine B. The latter goroutine, actually is blocked too, because it is waiting on the goroutine A to complete something else! Both goroutines are blocked waiting on one another!

Example

Deadlock detection

Dining philosophers problem

The problem:

This problem was lifted from Hoare’s book. I highly recommend it, it’s rather accessible and it explaines the problem and the model on which concurrency in Go is based on!