What’s the default stack size of a goroutine?

By default, each goroutine starts with a 2 KB stack. The source code shows, however, that some operating systems need extra stack space for OS-specific purposes such as signal handling, and that the runtime rounds the sum of the two up to a power of 2:

// The minimum stack size to allocate.
// The hackery here rounds FixedStack0 up to a power of 2.
_FixedStack0 = _StackMin + _StackSystem
_FixedStack1 = _FixedStack0 - 1
_FixedStack2 = _FixedStack1 | (_FixedStack1 >> 1)
_FixedStack3 = _FixedStack2 | (_FixedStack2 >> 2)
_FixedStack4 = _FixedStack3 | (_FixedStack3 >> 4)
_FixedStack5 = _FixedStack4 | (_FixedStack4 >> 8)
_FixedStack6 = _FixedStack5 | (_FixedStack5 >> 16)
_FixedStack  = _FixedStack6 + 1

The value of _FixedStack is eventually assigned to the variable s, the stack, in func stackpoolalloc.

Where to increase the stack size?

The answer is fairly obvious, but let’s find it in the assembly.
Given this example code:

// example.go
package main

func f() {}

func main() {
	f()
}

We run dlv debug example.go and type disass -l main.main, which gives the following output:

example.go:5	0x1054c00	65488b0c2530000000	mov rcx, qword ptr gs:[0x30]
example.go:5	0x1054c09	483b6110		cmp rsp, qword ptr [rcx+0x10]
example.go:5	0x1054c0d	761a			jbe 0x1054c29
example.go:5	0x1054c0f	4883ec08		sub rsp, 0x8
example.go:5	0x1054c13	48892c24		mov qword ptr [rsp], rbp
example.go:5	0x1054c17	488d2c24		lea rbp, ptr [rsp]
example.go:6	0x1054c1b	e8d0ffffff		call $main.f
example.go:7	0x1054c20	488b2c24		mov rbp, qword ptr [rsp]
example.go:7	0x1054c24	4883c408		add rsp, 0x8
example.go:7	0x1054c28	c3			ret
example.go:5	0x1054c29	e86284ffff		call $runtime.morestack_noctxt
example.go:5	0x1054c2e	ebd0			jmp $main.main

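In jbe 0x1054c29, jbe means “jump if below or equal (unsigned)”. If the jump is taken, execution continues at 0x1054c29, which is call $runtime.morestack_noctxt. The preceding cmp rsp, qword ptr [rcx+0x10] sets CF or ZF, the flags that jbe tests, when rsp is below or equal to the value at [rcx+0x10], which is the stack guard of the current goroutine. Because the stack grows downward, this means “the stack is not enough”. Once the stack has been grown, jmp $main.main restarts the function from the top.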
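
The check performed by those first three instructions can be modeled in plain Go. This is only a sketch: the g type below is a hypothetical, heavily simplified stand-in for the runtime's own structure, and the addresses and the guardGap value are made up for illustration.

```go
package main

import "fmt"

// g is a hypothetical, simplified stand-in for the runtime's g struct.
// On amd64 the two stack bounds occupy offsets 0x0 and 0x8, so
// stackguard0 sits at offset 0x10, matching [rcx+0x10] in the listing.
type g struct {
	stackLo     uintptr
	stackHi     uintptr
	stackguard0 uintptr
}

// needsMoreStack models `cmp rsp, [rcx+0x10]` followed by `jbe`:
// the jump is taken when sp <= stackguard0 (unsigned comparison).
func needsMoreStack(sp uintptr, gp *g) bool {
	return sp <= gp.stackguard0
}

func main() {
	// Illustrative addresses only; guardGap is an assumed value.
	const guardGap = 0x400
	gp := &g{stackLo: 0x1000, stackHi: 0x1800}
	gp.stackguard0 = gp.stackLo + guardGap // 0x1400

	fmt.Println(needsMoreStack(0x1700, gp)) // false: still above the guard
	fmt.Println(needsMoreStack(0x1400, gp)) // true: equal also takes the jump
	fmt.Println(needsMoreStack(0x1200, gp)) // true: below the guard
}
```

The true cases correspond to the path that ends in call $runtime.morestack_noctxt; the false case falls through into the function body.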


Why put $runtime.morestack_noctxt at the end of the function?

The main reason is static branch prediction. A forward conditional jump is typically predicted as not taken, which fits this jump because it is rarely taken. Moving the call out of the way also keeps the critical instructions packed tightly together, so the CPU can execute them without interruption, which improves performance.



go version go1.12 darwin/amd64
Delve Debugger
Version: 1.2.0
Build: $Id: 068e2451004e95d0b042e5257e34f0f08ce01466 $