Description
What version of Go are you using (go version)?
go1.11beta1 linux/amd64
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/rk/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/rk/GoSpace/Projects"
GOPROXY=""
GORACE=""
GOROOT="/home/rk/GoSpace/GO"
GOTMPDIR=""
GOTOOLDIR="/home/rk/GoSpace/GO/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build473408654=/tmp/go-build -gno-record-gcc-switches"
VGOMODROOT=""
What did you do?
I built the code below with both Go 1.10 and Go 1.11.
https://play.golang.org/p/22MEbiXFpzo
What did you expect to see?
The binary built by Go 1.11 is as fast as the one built by Go 1.10.
What did you see instead?
The binary built by Go 1.11 is much slower than the one built by Go 1.10.
Go 1.11 compiles the function "down" to assembly like this:
MOVQ pos+32(SP), AX
MOVQ list+8(SP), CX
MOVL (CX)(AX*4), DX
LEAQ 1(AX)(AX*1), BX
MOVQ list+16(SP), SI
DECQ SI
JMP L42
L28:
MOVL DI, (CX)(AX*4)
LEAQ 1(BX)(BX*1), DI
MOVQ BX, AX
MOVQ DI, BX
L42:
CMPQ BX, SI
JGE L73
MOVL 4(CX)(BX*4), DI
LEAQ 1(BX), R8
MOVL (CX)(BX*4), R9
CMPL DI, R9
CMOVQHI R8, BX
// JHI L100
L66:
MOVL (CX)(BX*4), DI
CMPL DX, DI
JCS L28
L73:
CMPQ BX, SI
JNE L97
MOVL (CX)(BX*4), SI
CMPL DX, SI
JCC L92
MOVL SI, (CX)(AX*4)
L88:
MOVL DX, (CX)(BX*4)
RET
L92:
MOVQ AX, BX
JMP L88
L97:
MOVQ AX, BX
JMP L88
L100:
// MOVQ R8, BX
// JMP L66
If the CMOVQHI instruction is replaced with a branch (the commented-out JHI L100 path in the listing above), the function can be 80% faster.
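For readers who cannot open the playground link, here is a rough sketch of a heap sift-down that would plausibly compile to the listing above. It is reconstructed from the assembly only, so the element type (uint32, guessed from the unsigned compares), the variable names, and the heapSort driver are assumptions and may not match the linked source. The "pick the larger child" step is the data-dependent choice that shows up as CMOVQHI.

```go
package main

import "fmt"

// down sifts list[pos] toward the leaves of a binary max-heap.
// Reconstructed from the assembly listing; the original source may differ.
func down(list []uint32, pos int) {
	tmp := list[pos]
	child := 2*pos + 1
	last := len(list) - 1
	for child < last {
		// Pick the larger of the two children. In the listing this
		// selection is the CMOVQHI instruction.
		if list[child+1] > list[child] {
			child++
		}
		if tmp >= list[child] {
			break
		}
		list[pos] = list[child]
		pos = child
		child = 2*pos + 1
	}
	// Handle a final node that has only one child.
	if child == last && tmp < list[child] {
		list[pos] = list[child]
		pos = child
	}
	list[pos] = tmp
}

// heapSort is a hypothetical driver, included only to exercise down.
func heapSort(list []uint32) {
	for i := len(list)/2 - 1; i >= 0; i-- {
		down(list, i)
	}
	for n := len(list) - 1; n > 0; n-- {
		list[0], list[n] = list[n], list[0]
		down(list[:n], 0)
	}
}

func main() {
	data := []uint32{5, 3, 9, 1, 7, 2, 8}
	heapSort(data)
	fmt.Println(data) // prints [1 2 3 5 7 8 9]
}
```

On random input the outcome of the child comparison is hard to predict, and with CMOV the next load address depends on the result of that comparison, which may explain why the branch version runs faster in this loop.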