This post takes a deeper look into how Go compiles method invocations; specifically, how it compiles interface method invocations. Many online descriptions of this process are outdated, because the Go compiler has switched to a new, register-based ABI in the 1.17 release for AMD64. Go 1.18 enabled the register-based ABI for additional architectures, but this post will focus on AMD64.
We'll look at a simple piece of Go code to demonstrate the machine code Go generates and explaining how it works. This code represents the inner loop of Bubble sort:
func bubbleUp(x sort.Interface) {
n := x.Len()
for i := 1; i < n; i++ {
if x.Less(i, i-1) {
x.Swap(i, i-1)
}
}
}
It goes through a data structure once, moving at least one element into its final position. The data structure is abstracted away with the sort.Interface interface, which lets us query its length, check if one element is smaller than another and to swap elements.
type Interface interface {
Len() int
Less(i, j int) bool
Swap(i, j int)
}
I've covered sort.Interface in more detail in an older post. Typically, the actual value passed into bubbleUp will be a slice of some sort, but it can really be anything, as long as it implements sort.Interface.
Interface layout in memory
First, let's discuss how values of interface types are laid out in memory by the Go compiler. This was also covered in another post, which I encourage you to read. A quick reminder: any interface value occupies two quadwords (on a 64-bit machine), holding two pointers: the first points to the dispatch table for the methods of the value, and the second points to the runtime value itself.
This structure can be observed in the Go runtime:
type iface struct {
tab *itab
data unsafe.Pointer
}
The itab type is:
type itab struct {
inter *interfacetype
_type *_type
hash uint32 // copy of _type.hash. Used for type switches.
_ [4]byte
fun [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}
I'll leave a fuller coverage of this type to another time, but for our purposes the important field is fun, which is a variable-sized array of function pointers - the dispatch table. This is where the generated code will expect to find functions, as we'll see shortly.
Some notes on the internal Go ABI
The Go ABI discussed here is the internal ABI of Go. As the linked page is quick to warn the reader - this ABI is completely unstable and can change between Go releases; in fact, as mentioned above it just changed in very fundamental way very recently.
In the next section we're going to walk through a detailed disassembly of the bubbleUp function. In preparation for that, I'd like to cover some salient points from the current Go ABI that will be useful to grok the generated machine code.
The current Go ABI is uses registers for function call arguments and return values. On the AMD64 architecture, the following registers are used in order:
RAX, RBX, RCX, RDI, RSI, R8, R9, R10, R11
Some values occupy multiple registers. One relevant example is interface values. As described in the previous section, interface values are represented by the iface structure, which occupies two quadwords. Therefore, when an interface value is passed as the first argument to a function, it occupies RAX and RBX for the table pointer and the data pointer, respectively. If the interface is passed as a second argument and the first argument already takes up RAX, it would be passed in RBX and RCX, and so on. By the way, the Go disassembly refers to these by the names AX, BX etc. (without the R prefix).
The same registers are used for the return values of functions, in the same order. For a function returning a single value (fitting into a quadword), this value will be placed in RAX.
Moreover, in the current Go register ABI there are no callee-saved registers; values that are live across calls have to be stored on the stack by the caller before a call is performed.
Disassembling bubbleUp
Let's now work through the disassembly generated for bubbleUp by running:
$ go tool compile -S bubble.go
The disassembly is cleaned up a bit to remove some directives for the garbage collector (PCDATA and FUNCDATA). Each block of disassembly is followed by an explanation of what it does.
This is the function prologue, which we won't cover:
"".bubbleUp STEXT size=182 args=0x10 locals=0x38 funcid=0x0 align=0x0
0x0000 00000 (bubble.go:11) TEXT "".bubbleUp(SB), ABIInternal, $56-16
0x0000 00000 (bubble.go:11) CMPQ SP, 16(R14)
0x0004 00004 (bubble.go:11) JLS 152
0x000a 00010 (bubble.go:11) SUBQ $56, SP
0x000e 00014 (bubble.go:11) MOVQ BP, 48(SP)
0x0013 00019 (bubble.go:11) LEAQ 48(SP), BP
The main code generated for the function's body is next:
0x0018 00024 (bubble.go:12) MOVQ AX, "".x+64(SP)
0x001d 00029 (bubble.go:12) MOVQ BX, "".x+72(SP)
0x0022 00034 (bubble.go:12) MOVQ 24(AX), CX
0x0026 00038 (bubble.go:12) MOVQ BX, AX
0x0029 00041 (bubble.go:12) CALL CX
The parameter to bubbleUp is x sort.Interface - the interface value. As we've seen above, the interface value in Go is represented by two quadwords, and according to the ABI will be laid out in two consecutive registers: AX and BX. The code starts by saving these onto the stack because the registers will be needed in a subsequent call.
Then the compiler needs to arrange for the call x.Len(). For this it needs to find which function to call, and then generate the code to call it.
The function to call is taken from 24(AX). He's how it works:
- As mentioned earlier, AX is the first quadword of the interface value x passed into bubbleUp.
- Based on the layout of iface, AX is therefore a pointer to the itab structure.
- The addressing mode 24(AX) means the address at AX+24 (24 is decimal and means literally "24 bytes"), so this is the value loaded into CX.
- Since offset 24 in itab points to fun, this means that the address of the first method in the interface is loaded into CX.
- The methods in each interface are sorted by name [1], so the address of x's Len method ends up in CX.
Now that the function is found, the code sets up its arguments for invocation. Since Len has no parameters, the only argument passed is the receiver (the value this method is invoked on). bubbleUp's own second argument - which lives in BX is passed as the first argument to Len. Since BX is the second quadword of the interface value - it holds the data pointer, which is the value itself - exactly what we need.
0x002b 00043 (bubble.go:12) MOVQ AX, "".n+24(SP)
Len returns a single integer value, which is returned in AX. That's n in the Go code. Here AX is saved onto the stack (because it's going to be clobbered soon).
0x0030 00048 (bubble.go:12) MOVL $1, CX
0x0035 00053 (bubble.go:13) JMP 69 ----\
0x0037 00055 (bubble.go:13) MOVQ "".i+32(SP), DX |
0x003c 00060 (bubble.go:13) LEAQ 1(DX), CX |
0x0040 00064 (bubble.go:13) MOVQ "".n+24(SP), AX |
0x0045 00069 (bubble.go:13) CMPQ AX, CX <<---/
0x0048 00072 (bubble.go:13) JLE 142
This implements a part of the loop's condition. It starts by assigning i = 1 in CX and jumps to the condition, which compares AX (remember, that's n) to CX and jumps out of the loop if n <= i.
The code between the JMP and its target handles the i++ part, acting on registers that were saved on the stack throughout the loop iteration.
Now it's time to call x.Less. It's the second function in the interface's dispatch table, so its offset in itab is 32 (recall that the first function's offset is 24).
0x004a 00074 (bubble.go:13) MOVQ CX, "".i+32(SP)
0x004f 00079 (bubble.go:14) MOVQ "".x+64(SP), DX
0x0054 00084 (bubble.go:14) MOVQ 32(DX), SI
0x0058 00088 (bubble.go:14) LEAQ -1(CX), DI
0x005c 00092 (bubble.go:14) MOVQ DI, ""..autotmp_4+40(SP)
0x0061 00097 (bubble.go:14) MOVQ "".x+72(SP), AX
0x0066 00102 (bubble.go:14) MOVQ CX, BX
0x0069 00105 (bubble.go:14) MOVQ DI, CX
0x006c 00108 (bubble.go:14) CALL SI
This code grabs the iface back from the stack where it was saved, and places it in DX. It then loads 32(DX) into SI - so SI will contain the address of Less method of x, as described before. We have to pass in three arguments:
- The value itself (the method receiver) in AX
- i in BX
- i-1 in CX (you see it happening with the LEAQ instruction that's used to subtract 1 from i
There's a bit of a register dance here that's required to set up all the arguments properly, but otherwise it should be very clear.
0x006e 00110 (bubble.go:14) TESTB AL, AL
0x0070 00112 (bubble.go:14) JEQ 55
If Less returns a zero value (meaning false), we skip back to the next loop iteration without calling Swap. Otherwise, the call to Swap follows:
0x0072 00114 (bubble.go:15) MOVQ "".x+64(SP), DX
0x0077 00119 (bubble.go:15) MOVQ 40(DX), SI
0x007b 00123 (bubble.go:15) MOVQ "".x+72(SP), AX
0x0080 00128 (bubble.go:15) MOVQ "".i+32(SP), BX
0x0085 00133 (bubble.go:15) MOVQ ""..autotmp_4+40(SP), CX
0x008a 00138 (bubble.go:15) CALL SI
By now this should be familiar; this code finds the third function in the interface's itab with 40(DX), sets up its arguments and calls it.
This is the back-edge to the loop condition handler (recall that it will jump past this instruction when the loop is done):
0x008c 00140 (bubble.go:15) JMP 55
And finally the function's epilogue and return:
0x008e 00142 (bubble.go:18) MOVQ 48(SP), BP
0x0093 00147 (bubble.go:18) ADDQ $56, SP
0x0097 00151 (bubble.go:18) RET
0x0098 00152 (bubble.go:18) NOP
[1] | This is all, of course, implementation details, and can easily change in a future go release. |