Stack vs heap allocation of structs in Go, and how they relate to garbage collection

It's worth noting that the words "stack" and "heap" do not appear anywhere in the language spec. Your question is worded with "...is declared on the stack," and "...declared on the heap," but note that Go declaration syntax says nothing about stack or heap.

That technically makes the answer to all of your questions implementation dependent. In actuality of course, there is a stack (per goroutine!) and a heap and some things go on the stack and some on the heap. In some cases the compiler follows rigid rules (like "new always allocates on the heap") and in others the compiler does "escape analysis" to decide if an object can live on the stack or if it must be allocated on the heap.

In your example 2, escape analysis would show the pointer to the struct escaping and so the compiler would have to allocate the struct. I think the current implementation of Go follows a rigid rule in this case however, which is that if the address is taken of any part of a struct, the struct goes on the heap.

For question 3, we risk getting confused about terminology. Everything in Go is passed by value, there is no pass by reference. Here you are returning a pointer value. What's the point of pointers? Consider the following modification of your example:

type MyStructType struct{}

func myFunction1() (*MyStructType, error) {
    var chunk *MyStructType = new(MyStructType)
    // ...
    return chunk, nil
}

func myFunction2() (MyStructType, error) {
    var chunk MyStructType
    // ...
    return chunk, nil
}

type bigStruct struct {
    lots [1e6]float64
}

func myFunction3() (bigStruct, error) {
    var chunk bigStruct
    // ...
    return chunk, nil
}

I modified myFunction2 to return the struct rather than the address of the struct. Compare the assembly output of myFunction1 and myFunction2 now,

--- prog list "myFunction1" ---
0000 (s.go:5) TEXT    myFunction1+0(SB),$16-24
0001 (s.go:6) MOVQ    $type."".MyStructType+0(SB),(SP)
0002 (s.go:6) CALL    ,runtime.new+0(SB)
0003 (s.go:6) MOVQ    8(SP),AX
0004 (s.go:8) MOVQ    AX,.noname+0(FP)
0005 (s.go:8) MOVQ    $0,.noname+8(FP)
0006 (s.go:8) MOVQ    $0,.noname+16(FP)
0007 (s.go:8) RET     ,

--- prog list "myFunction2" ---
0008 (s.go:11) TEXT    myFunction2+0(SB),$0-16
0009 (s.go:12) LEAQ    chunk+0(SP),DI
0010 (s.go:12) MOVQ    $0,AX
0011 (s.go:14) LEAQ    .noname+0(FP),BX
0012 (s.go:14) LEAQ    chunk+0(SP),BX
0013 (s.go:14) MOVQ    $0,.noname+0(FP)
0014 (s.go:14) MOVQ    $0,.noname+8(FP)
0015 (s.go:14) RET     ,

Don't worry that myFunction1 output here is different than in peterSO's (excellent) answer. We're obviously running different compilers. Otherwise, see that I modfied myFunction2 to return myStructType rather than *myStructType. The call to runtime.new is gone, which in some cases would be a good thing. Hold on though, here's myFunction3,

--- prog list "myFunction3" ---
0016 (s.go:21) TEXT    myFunction3+0(SB),$8000000-8000016
0017 (s.go:22) LEAQ    chunk+-8000000(SP),DI
0018 (s.go:22) MOVQ    $0,AX
0019 (s.go:22) MOVQ    $1000000,CX
0020 (s.go:22) REP     ,
0021 (s.go:22) STOSQ   ,
0022 (s.go:24) LEAQ    chunk+-8000000(SP),SI
0023 (s.go:24) LEAQ    .noname+0(FP),DI
0024 (s.go:24) MOVQ    $1000000,CX
0025 (s.go:24) REP     ,
0026 (s.go:24) MOVSQ   ,
0027 (s.go:24) MOVQ    $0,.noname+8000000(FP)
0028 (s.go:24) MOVQ    $0,.noname+8000008(FP)
0029 (s.go:24) RET     ,

Still no call to runtime.new, and yes it really works to return an 8MB object by value. It works, but you usually wouldn't want to. The point of a pointer here would be to avoid pushing around 8MB objects.


type MyStructType struct{}

func myFunction1() (*MyStructType, error) {
    var chunk *MyStructType = new(MyStructType)
    // ...
    return chunk, nil
}

func myFunction2() (*MyStructType, error) {
    var chunk MyStructType
    // ...
    return &chunk, nil
}

In both cases, current implementations of Go would allocate memory for a struct of type MyStructType on a heap and return its address. The functions are equivalent; the compiler asm source is the same.

--- prog list "myFunction1" ---
0000 (temp.go:9) TEXT    myFunction1+0(SB),$8-12
0001 (temp.go:10) MOVL    $type."".MyStructType+0(SB),(SP)
0002 (temp.go:10) CALL    ,runtime.new+0(SB)
0003 (temp.go:10) MOVL    4(SP),BX
0004 (temp.go:12) MOVL    BX,.noname+0(FP)
0005 (temp.go:12) MOVL    $0,AX
0006 (temp.go:12) LEAL    .noname+4(FP),DI
0007 (temp.go:12) STOSL   ,
0008 (temp.go:12) STOSL   ,
0009 (temp.go:12) RET     ,

--- prog list "myFunction2" ---
0010 (temp.go:15) TEXT    myFunction2+0(SB),$8-12
0011 (temp.go:16) MOVL    $type."".MyStructType+0(SB),(SP)
0012 (temp.go:16) CALL    ,runtime.new+0(SB)
0013 (temp.go:16) MOVL    4(SP),BX
0014 (temp.go:18) MOVL    BX,.noname+0(FP)
0015 (temp.go:18) MOVL    $0,AX
0016 (temp.go:18) LEAL    .noname+4(FP),DI
0017 (temp.go:18) STOSL   ,
0018 (temp.go:18) STOSL   ,
0019 (temp.go:18) RET     ,

Calls

In a function call, the function value and arguments are evaluated in the usual order. After they are evaluated, the parameters of the call are passed by value to the function and the called function begins execution. The return parameters of the function are passed by value back to the calling function when the function returns.

All function and return parameters are passed by value. The return parameter value with type *MyStructType is an address.


According to Go's FAQ:

if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors.