Golang 3

Concurrency in Go course notes by Oliver Frolovs, 2020.

This is different from other two sets of notes in that I had recorded only the concepts or facts that were truly new to me… so the notes are very short))

Moore’s law and other animals

Dennard scaling (1974) – an observation stating that *as the transistors get smaller their power density stays constant. Not true anymore.

Concurrent and parallel

Concurrent means that task start and end times overlap, so that their execution may happen in parallel.

True parallel execution requires having more than one instance of hardware to run several sets of instructions at the same time.

Concurrent execution is very likely to still offer advantage over sequential execution because often the tasks have to wait on IO (incl. memory access).

Parallel compilers are hard and they don’t work (yet).

Processes

On Linux, the scheduler gives every process a 20 ms time slice.

Goroutines

Goroutines are a thread but in Go.
Many goroutines execute within a single OS thread.
From OS point of view, it schedules the “main thread”, but inside it Go switches its goroutines.
Go Runtime Scheduling does switching.
Go coroutines mapped to a logical processor which is tied to the main thread. There is no opportunity for the parallelism in this scheme of things, but then you can add cores, hence add logical processors, and then Go runtime can do parallel scheduling.

Interleavings are instructions in two different tasks. The order of execution between concurrent tasks is not deterministic. A race condition is a problem that might happen as a result of interleaving. It happens when the outcome of a program depends on interleaving order. Race conditions happen due to communication between tasks.

A goroutine is a lightweight thread managed by the Go runtime.

go f(x, y, z) // starts a new coroutine running f(x,y,z)

The evaluation of f, x, y, and z happens in the current goroutine and the execution of f happens in the new goroutine.

Goroutines run in the same address space, so access to shared memory must be synchronised. The sync package provides useful primitives, although you won’t need them much as there are other primitives. (???)

Synchronisation

By using synchronisation we restrict the scheduling. Every time we use synchronisation, we reduce performance, but this is necessary evil.

// GOROUTINE 1:

x = 0
x = x + 1

// MAKE GLOBAL EVENT HAPPEN

if <GLOBAL EVENT HAPPENED> {
    fmt.Printf("x = %d", x)
}

Wait groups

The sync package contains functions to synchronise between Goroutines. It includes WaitGroups.

sync.WaitGroup forces a goroutine to wait for other goroutines. The goroutine would not continue until all goroutines in a waiting group have finished.
A WaitGroup is a general type with a lot of methods defined for it.
The WaitGroup object internally contains a counting semaphore. The counter is increment for each goroutine added to the group, and decremented when a goroutine from the group is completed.

FIXME is this implemented using defer?

// Main Thread:

var wg sync.WaitGroup

wg.Add(1)
go MouseDoTheJob(&wg) // create another goroutine
wg.Wait() // the current (main) goroutine stops until MouseDoTheJob() returns

// MouseDoTheJob() goroutine

// ...
wg.Done() // can use defer to make sure it's done at the end

A summary of WorkGroup methods:

Add increments the counter
Done decrements the counter. It’s executed at end of each goroutine.
Wait would block its calling goroutine, until the counter is zero.

Example

func foo(wg *sync.WaitGroup) {
    fmt.Println("'Mamka Tvoya', said the Horse.")
}

func main() {
    var wg sync.WaitGroup
    wg.Add(1)
    go foo(&wg) // because Go is pass by value, remember, and foo must call Done,
    // which decrements the counter
    wg.Wait()
    fmt.Println("But the Mouse heard only 'Neigh!'")
}

Goroutine communication

Simple form of communication between the goroutines:

the initial data send to the goroutines
the results are sent back from the goroutines
The goroutines communicate using channels.
The channels are, essentially, pipes.
The channels are typed.

Use make() to create a channel:

c := make(chan int)

Once the channel is made, one cand send and receive the data using the arrow operator:

Send data on a channel: c <- 3
Receive data from a channel: x := <- c

By default the channels are unbuffered. It means it cannot hold the data in channel:

the sending operation has to block until the data is received.
the receiving operation has to block until the data is sent.

So, when using the channels in this way (unbuffered), they also perform synchronisation.

Channel communication is synchronous.
Blocking is the same as waiting for communication.

This means the channels can be used just for synchronisation, and the result can be dropped.

Not assigning to a variable, throw away the value received, synchronise only:

<- c

Buffered channels

The default capacity of a channel is zero, that is the channel is unbuffered.

But one can make a buffered channel, which have some capacity to hold the data in transit.

The example below creates a buffered channel of capacity three, that is it can hold three integers without blocking, unless the channel is full:

c := make(chan int, 3)

So, this channel would block iff the buffer is full.

The main reason to use buffering is when two goroutines operate at a different speeds. It’s a classic consumer/producer problem.

Producer -> [ B

R ] -> Consumer

If the producer is faster than a consumer (on average), the buffer will allow both of them to continue without blocking, that is without reducing concurrency.

On average requirement is there because if the producer is always faster than the consumer, with a fixed buffer size, at some point the buffer is going to be full.

Revision interlude

… the size of an array is part of its type …

…

There is a built-in function called len that returns the number of elements of an array or slice and also of a few other data types.

…

Arrays have their place—they are a good representation of a transformation matrix for instance—but their most common purpose in Go is to hold storage for a slice.

…

Arrays, slices (and strings): The mechanics of ‘append’ by… Rob Pike!

Iterating through a channel

The following loop will continue to read from channel c, blocking when there is no data to be read. The loop will run one iteration each time a new value is received. The variable i is assigned to the read value.

// "c" is a channel as defined before

for i := range c {
    // ...
}

This for loop is infinite loop, unless the sender calls Close method on the channel c:

// Some sender goroutine...
close(c)
// Now another goroutine which listens on that channel, can quit the for loop

There is no need to call close on a channel, unless there is a for listening on it.

Receiving from multiple goroutines

The data can be read from multiple goroutines using multiple channels.

Sequential read (blocking):

a := <- c1
b := <- c2
fmt.Println(a*b)

Select statement (having a value read from just one channel is enough):

// Whichever value comes first, it gets processed and that's that...
select {
    case a = <- c1:
        fmt.Println(a)
    case b = <- c2:
        fmt.Println(b)
}

It reads whatever comes first and doesn’t block on the other channel. First come first served!

Synchronised communication

In the previous example using select we’ve been blocking on receiving data. But we can block on sending data, too.

May select either send or receive operations.

select {
    case a = <- inchan:
      fmt.Println("Received a")
    case outchan <- b:
      fmt.Println("sent b")
}

Either one of these cases can block. Whatever is unblocked first, happens.

Select with an Abort channel

for {
    select {
        case a <- c:
            fmt.Println(a) // process the data
        case <- abort
            return  // another signal received on the abort channel
    }
}

Default select

To avoid blocking, use a default clause in a select statement.

select {
    case a = <- c1:
      fmt.Println(a)
    case b = <- c2:
      fmt.Println(b)
    default:
      fmt.Println("nop")
}

Mutual exclusion

Sharing variables between goroutines can cause problems. A thread program is said to be concurrency-safe if it can be invoked concurrently with other goroutines without interfering with those other goroutines.

The following example does not always return two.

var i int
var wg sync.WaitGroup

func inc() {
    i = i + 1
    wg.Done()
}

func main() {
    wg.Add(2)
    go inc()
    go inc()
    wg.Wait()
    fmt.Println(i)

}

To understand why, think about how the statement i = i + 1 is represented in assembly. It may take more than a single instruction. Now, the interleaving between the two goroutines happens at the machine instruction level, so it is possible that the first goroutine has read the value from memory but got interrupted right after, and then the second goroutine has done the same… I see where this is going!

The granularity of concurrency is at the level of machine code instructions, not Go statements.

So, with the wrong interleaving, the value of 1, instead of 2 will be written into the variable.

Mutex

Don’t let two goroutines write to a shared variable at the same time! Interleavings can cause problems.
Need to restrict possible interleavings. Access to shared variables cannot be interleaved.
The technique is called mutual exclusion, or mutex.
The means there will be code segments in different goroutines which cannot execute concurrently, so they’ll never interleave.
Mutex (as in sync.Mutex) ensure mutual exclusion. Uses a binary semaphore: flag up, flag down.

Mutex methods

Methods of sync.Mutex:

Lock method puts the flag up; shared variable in use; second call to Lock method from another goroutine will block until the first flag is lifted.
Unlock method puts the flag down; done using shared variable; when Unlock is called, a blocked call can proceed.

Mutex example

var i int = 0
var mut sync.Mutex     // creates a mutex; shared between all instances of inc()

func inc() {
    mut.Lock()         // locks a mutex, "flag up"
    i = i + 1
    mut.Unlock()       // unlocks a mutex, "flag down"
}

Once synchronisation

Synchronous initialisation

By definition, initialisation happens once and before everything else. But this can be hard to guarantee in a concurrent environment.

How do you guarantee in a concurrent environment, that:

initialisation happens only once; and
initialisation happens first, before anything else?

Solution:

One way is to initialise before starting the goroutines. But sometimes we don’t have that option.
Another way is to use sync.Once object. It has a method Do, which runs the given function only once.

once.Do(f)

This can be put in lots of different goroutines, but it’ll be run only once.

All calls to once.Do() block until the first returns. This ensures that initialisation executes first.

Synchronous initialisation example

The main program creates two goroutines which both rely on some shared data which must be initialised before anything else happens:

var wg sync.WaitGroup

func main() {
    wg.Add(2)
    go bleat()
    go bleat()
    wg.Wait()
}

The setup function sets up the environment and will have been finished before the goroutines can proceed. This is ensured by the on.Do(setup) statement.

var on sync.Once

func setup() {
    fmt.Println("Init")
}

func bleat() {
    on.Do(setup)
    fmt.Println("Baa!")
    wg.Done()
}

Result:

Init
Baa!
Baa!

Deadlocks

So, channels and the primitives provided in the sync package allow for various forms of synchronisation.

Deadlocks are caused by synchronisation dependencies. Specifically, the circular depenendencies. A goroutine A is blocked because it is waiting on something done by a goroutine B. The latter goroutine, actually is blocked too, because it is waiting on the goroutine A to complete something else! Both goroutines are blocked waiting on one another!

Example

G1 is waiting on G2 to unlock some mutex
G2 is waiting on G1 to write something into some channel

Deadlock detection

If all goroutines are deadlockes, Go runtime will detect it and panic()
If only some goroutines are deadlocked, the Go runtime cannot detect that.

Dining philosophers problem

Classic problem involving concurrency and synchronisation

The problem:

Five philosophers sitting at a round table
One chopstick is placed between each adjacent pair
Want to eat rice from their plate, but needs two chopsticks
Only one philosopher can hold a chopstick at a time
Not enough chopsticks for everyone to eat at once

This problem was lifted from Hoare’s book. I highly recommend it, it’s rather accessible and it explaines the problem and the model on which concurrency in Go is based on!

Golang 3

Moore’s law and other animals

Concurrent and parallel

Processes

Goroutines

Synchronisation

Wait groups

Example

Goroutine communication

Buffered channels

Revision interlude

Iterating through a channel

Receiving from multiple goroutines

Synchronised communication

Select with an Abort channel

Default select

Mutual exclusion

Variable sharing example

Mutex

Mutex methods

Mutex example

Once synchronisation

Synchronous initialisation

Synchronous initialisation example

Deadlocks

Example

Deadlock detection

Dining philosophers problem