What’s the default stack size of a goroutine?
The default stack size of a goroutine is basically 2 KB per goroutine. But the source code soon shows that more stack space is needed for OS-specific purposes like signal handling, and that the code rounds the sum of the two up to a power of 2:
// The minimum stack size to allocate.
// The hackery here rounds FixedStack0 up to a power of 2.
_FixedStack0 = _StackMin + _StackSystem
_FixedStack1 = _FixedStack0 - 1
_FixedStack2 = _FixedStack1 | (_FixedStack1 >> 1)
_FixedStack3 = _FixedStack2 | (_FixedStack2 >> 2)
_FixedStack4 = _FixedStack3 | (_FixedStack3 >> 4)
_FixedStack5 = _FixedStack4 | (_FixedStack4 >> 8)
_FixedStack6 = _FixedStack5 | (_FixedStack5 >> 16)
_FixedStack = _FixedStack6 + 1
The resulting _FixedStack value is finally assigned to the variable s, the stack, in func stackpoolalloc.
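To see what the _FixedStack0 to _FixedStack chain computes, here is a minimal sketch of the same bit trick as an ordinary function. The name roundUpPow2 and the sample _StackSystem values are made up for illustration; they are not taken from the runtime.

```go
package main

import "fmt"

// roundUpPow2 mirrors the _FixedStack0 .. _FixedStack chain: subtract one,
// smear the highest set bit into every lower bit, then add one.
// It assumes 0 < n <= 1<<31. The name is hypothetical, not a runtime function.
func roundUpPow2(n uint32) uint32 {
	n--          // _FixedStack1
	n |= n >> 1  // _FixedStack2
	n |= n >> 2  // _FixedStack3
	n |= n >> 4  // _FixedStack4
	n |= n >> 8  // _FixedStack5
	n |= n >> 16 // _FixedStack6
	return n + 1 // _FixedStack
}

func main() {
	const stackMin = 2048 // the 2 KB minimum mentioned above

	// Illustrative _StackSystem values only; the real value depends on
	// the target OS and architecture.
	for _, stackSystem := range []uint32{0, 512, 4096} {
		fmt.Println(stackSystem, roundUpPow2(stackMin+stackSystem))
	}
}
```

A sum that is already a power of 2 stays unchanged, which is why the chain starts by subtracting one.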
Where to increase the stack size?
I know the answer is pretty clear, but let’s find it in the assembly.
Given this example code:
// example.go
package main

func f() {}
func main() {
	f()
}
We run dlv debug example.go and type disass -l main.main, which gives the following output:
example.go:5 0x1054c00 65488b0c2530000000 mov rcx, qword ptr gs:[0x30]
example.go:5 0x1054c09 483b6110 cmp rsp, qword ptr [rcx+0x10]
example.go:5 0x1054c0d 761a jbe 0x1054c29
example.go:5 0x1054c0f 4883ec08 sub rsp, 0x8
example.go:5 0x1054c13 48892c24 mov qword ptr [rsp], rbp
example.go:5 0x1054c17 488d2c24 lea rbp, ptr [rsp]
example.go:6 0x1054c1b e8d0ffffff call $main.f
example.go:7 0x1054c20 488b2c24 mov rbp, qword ptr [rsp]
example.go:7 0x1054c24 4883c408 add rsp, 0x8
example.go:7 0x1054c28 c3 ret
example.go:5 0x1054c29 e86284ffff call $runtime.morestack_noctxt
example.go:5 0x1054c2e ebd0 jmp $main.main
In jbe 0x1054c29, jbe means “jump if below or equal (unsigned)”. If the jump is taken, execution continues at 0x1054c29, which is call $runtime.morestack_noctxt. The preceding cmp rsp, qword ptr [rcx+0x10] sets CF or ZF, the flags that jbe tests, when rsp is below or equal to the value at [rcx+0x10], the goroutine's stack guard. Because the stack grows downward, that means “the stack is not enough”.
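The check performed by the cmp/jbe pair can be modeled in a few lines of Go. This is only a sketch of the comparison, not runtime code: needsMoreStack and the addresses are hypothetical, and the real check lives in the compiler-generated prologue shown above.

```go
package main

import "fmt"

// needsMoreStack models "cmp rsp, [rcx+0x10]; jbe": the jump to
// runtime.morestack_noctxt is taken when the stack pointer is at or
// below the guard value (unsigned comparison).
// The function name is hypothetical, used only for illustration.
func needsMoreStack(sp, stackGuard uint64) bool {
	return sp <= stackGuard
}

func main() {
	const guard = 0x1000 // made-up guard address

	fmt.Println(needsMoreStack(0x2000, guard)) // sp well above the guard
	fmt.Println(needsMoreStack(0x1000, guard)) // sp equal to the guard
	fmt.Println(needsMoreStack(0x0ff8, guard)) // sp below the guard
}
```

Using an unsigned type matters here, since jbe is the unsigned form of the comparison.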
Why put $runtime.morestack_noctxt at the end of the function?
The main reason is static branch prediction, which implies that this conditional jump will not be taken that frequently. The critical instructions can then be packed tightly and executed smoothly by the CPU, which improves performance.
Environment
go version go1.12 darwin/amd64
Delve Debugger
Version: 1.2.0
Build: $Id: 068e2451004e95d0b042e5257e34f0f08ce01466 $