Parallel For-Loop

One common way to add controlled parallelism to a loop like this is to spawn a number of worker goroutines that will read tasks from a channel. The runtime.NumCPU function may help in deciding how many workers it makes sense to spawn (make sure you set GOMAXPROCS appropriately to take advantage of those CPUs though). You then simply write the jobs to the channel and they will be handled by the workers.
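
A minimal sketch of that setup (nworkers is just an assumed name here; on older Go releases GOMAXPROCS defaults to 1, so it needs to be raised explicitly for the workers to actually run in parallel):

// one worker per CPU is a reasonable starting point
nworkers := runtime.NumCPU()
// older Go versions default GOMAXPROCS to 1; raise it so the workers can run in parallel
runtime.GOMAXPROCS(nworkers)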

In this case the job is to initialise elements of the population slice, so using a channel of *Individual pointers might make sense. Something like this:

ch := make(chan *Individual)
// start the workers first so they are ready to pull tasks off the channel
for i := 0; i < nworkers; i++ {
    go initIndividuals(individualSize, ch)
}

population := make([]Individual, populationSize)
// each task is a pointer to the element the worker should initialise
for i := 0; i < len(population); i++ {
    ch <- &population[i]
}
// closing the channel signals the workers that there is no more work
close(ch)

The worker goroutine would look something like this:

func initIndividuals(size int, ch <-chan *Individual) {
    for individual := range ch {
        // Or alternatively inline the createIndividual() code here if it is the only call
        *individual = createIndividual(size)
    }
}

Since tasks are not portioned out ahead of time, it doesn't matter if createIndividual takes a variable amount of time: each worker will only take on a new task when the last is complete, and will exit when there are no tasks left (since the channel is closed at that point).

But how do we know when the job has completed? The sync.WaitGroup type can help here. The code to spawn the worker goroutines can be modified like so:

ch := make(chan *Individual)
var wg sync.WaitGroup
wg.Add(nworkers)
for i := 0; i < nworkers; i++ {
    go initIndividuals(individualSize, ch, &wg)
}

The initIndividuals function is also modified to take the additional parameter, and add defer wg.Done() as the first statement. Now a call to wg.Wait() will block until all the worker goroutines have completed. You can then return the fully constructed population slice.
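
Spelled out, the modified worker and the final wait look roughly like this (a sketch assembled from the snippets above):

func initIndividuals(size int, ch <-chan *Individual, wg *sync.WaitGroup) {
    // signal completion once the channel is closed and drained
    defer wg.Done()
    for individual := range ch {
        *individual = createIndividual(size)
    }
}

and, back in the spawning code, after close(ch):

wg.Wait()          // blocks until every worker has called Done
return population  // the slice is now fully initialised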


So basically the goroutine should not return a value but push it down a channel. If you want to wait for all the goroutines to finish, you can either receive from the channel once per goroutine or use a sync.WaitGroup. In this example the WaitGroup is overkill because the size is known in advance, but it's good practice anyway. Here's a modified example:

package main

import (
    "math/rand"
    "sync"
)

type Individual struct {
    gene    []bool
    fitness int
}


func createPopulation(populationSize int, individualSize int) []Individual {

    // we create a slice with a capacity of populationSize but zero length,
    // so appending the results below won't trigger extra allocations
    population := make([]Individual, 0, populationSize)

    // the channel is buffered with room for every result, so the workers can send
    // to it without blocking even though nothing reads from it until after wg.Wait()
    ch := make(chan Individual, populationSize)

    // we create a waitgroup - basically block until N tasks say they are done
    wg := sync.WaitGroup{}

    for i := 0; i < populationSize; i++ {

        //we add 1 to the wait group - each worker will decrease it back
        wg.Add(1)

        //now we spawn a goroutine
        go createIndividual(individualSize, ch, &wg)
    }

    // now we wait for everyone to finish - again, not a must.
    // you can just receive from the channel N times, and use a timeout or something for safety
    wg.Wait()

    // we need to close the channel or the following loop will get stuck
    close(ch)

    // we iterate over the closed channel and receive all data from it
    for individual := range ch {

        population = append(population, individual)
    }
    return population

}

func createIndividual(size int, ch chan Individual, wg *sync.WaitGroup) {

    var individual = Individual{make([]bool, size), 0}

    for i := 0; i < len(individual.gene); i++ {
        individual.gene[i] = rand.Intn(2) == 1
    }

    // push the individual down the channel
    ch <- individual
    // let the wait group know we finished
    wg.Done()

}

// a minimal main so the example compiles and runs on its own; the sizes here are arbitrary
func main() {
    createPopulation(100, 50)
}

For your specific problem you don't need to use channels at all.

However, unless createIndividual spends real time on computation, the overhead of scheduling and context switching between goroutines will make the parallel version slower than the sequential one, as the timings below show (a chunked sketch after the output is one way to reduce that overhead).

package main

import (
    "flag"
    "fmt"
    "math/rand"
    "runtime"
    "sync"
)

type Individual struct {
    gene    []bool
    fitness int
}

func createPopulation(populationSize int, individualSize int) (population []*Individual) {
    var wg sync.WaitGroup
    population = make([]*Individual, populationSize)

    wg.Add(populationSize)
    for i := 0; i < populationSize; i++ {
        go func(i int) {
            population[i] = createIndividual(individualSize)
            wg.Done()
        }(i)
    }
    wg.Wait()
    return
}

func createIndividual(size int) *Individual {
    individual := &Individual{make([]bool, size), 0}

    for i := 0; i < size; i++ {
        individual.gene[i] = rand.Intn(2) == 1
    }

    return individual
}

func main() {
    numcpu := flag.Int("cpu", runtime.NumCPU(), "")
    flag.Parse()
    runtime.GOMAXPROCS(*numcpu)
    pop := createPopulation(1e2, 21e3)
    fmt.Println(len(pop))
}

Output:

┌─ oneofone@Oa [/tmp]                                                                                                            
└──➜ go build blah.go; xtime ./blah -cpu 1
100
0.13u 0.00s 0.13r 4556kB ./blah -cpu 1
┌─ oneofone@Oa [/tmp]                                                                                                            
└──➜ go build blah.go; xtime ./blah -cpu 4
100
2.10u 0.12s 0.60r 4724kB ./blah -cpu 4
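
As the timings show, when each createIndividual call is this cheap the parallel version mostly pays for goroutine scheduling. A rough sketch of one way to amortise that cost is to spawn only one goroutine per CPU and hand each a contiguous chunk of the slice (createPopulationChunked is just an illustrative name, reusing the createIndividual above):

func createPopulationChunked(populationSize, individualSize int) []*Individual {
    population := make([]*Individual, populationSize)

    nworkers := runtime.NumCPU()
    if nworkers > populationSize {
        nworkers = populationSize
    }
    chunk := (populationSize + nworkers - 1) / nworkers // ceiling division

    var wg sync.WaitGroup
    wg.Add(nworkers)
    for w := 0; w < nworkers; w++ {
        start := w * chunk
        end := start + chunk
        if end > populationSize {
            end = populationSize
        }
        go func(start, end int) {
            defer wg.Done()
            // each goroutine writes only its own indices, so no locking is needed
            for i := start; i < end; i++ {
                population[i] = createIndividual(individualSize)
            }
        }(start, end)
    }
    wg.Wait()
    return population
}

Whether this beats the single-goroutine run still depends on how much real work createIndividual does; with little more than rand.Intn calls per gene, the sequential version may well stay ahead.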