Go in Go
Gopherfest
26 May 2015
Rob Pike
As of the 1.5 release of Go, the entire system is now written in Go.
(And a little assembler.)
C is gone.
Side note: gccgo is still going strong.
This talk is about the original compiler, gc.
Bootstrapping.
(Also Go was not intended primarily as a compiler implementation language.)
Not for validation; we have more pragmatic motives:
Already seeing benefits, and it's early yet.
Design document: /s/go13compiler
We had our own C compiler just to compile the runtime.
We needed a compiler with the same ABI as Go, such as segmented stacks.
Switching it to Go means we can get rid of the C compiler.
That's more important than converting the compiler to Go.
(All the reasons for moving the compiler apply to the runtime as well.)
Now only one language in the runtime; easier integration, stack management, etc.
As always, simplicity is the overriding consideration.
Why do we have our own tool chain at all?
Our own ABI?
Our own file formats?
History, familiarity, and ease of moving forward. And speed.
Many of Go's big changes would be much harder with GCC or LLVM.
news.ycombinator.com/item?id=8817990
All made easier by owning the tools and/or moving to Go:
The last three are all but impossible in C:
(Gccgo will have segmented stacks and imprecise (stack) collection for a while yet.)
These were each huge steps, made quickly (led by khr@).
Mostly done by hand with machine assistance.
Challenge to implement the runtime in a safe language.
Some use of unsafe to deal with pointers as raw bits in the GC, for instance.
But less than you might think.
The translator (next sections) helped for some of the translation.
Why translate it, not write it from scratch? Correctness, testing.
Steps:
First output was C line-by-line translated to (bad!) Go.
Tool to do this written by rsc@ (talked about at GopherCon 2014).
Custom written for this job, not a general C-to-Go translator.
Steps:
... yacc)
... *p++ as an expression
The Yacc grammar was translated by sam-powered hands.
Aided by hand-written rewrite rules, such as:
Also diff-like rewrites for things such as using the standard library:
    diff {
    -   g.Rpo = obj.Calloc(g.Num*sizeof(g.Rpo[0]), 1).([]*Flow)
    -   idom = obj.Calloc(g.Num*sizeof(idom[0]), 1).([]int32)
    -   if g.Rpo == nil || idom == nil {
    -       Fatal("out of memory")
    -   }
    +   g.Rpo = make([]*Flow, g.Num)
    +   idom = make([]int32, g.Num)
    }
This one due to semantic difference between the languages.
    diff {
    -   if nreg == 64 {
    -       mask = ^0 // can't rely on C to shift by 64
    -   } else {
    -       mask = (1 << uint(nreg)) - 1
    -   }
    +   mask = (1 << uint(nreg)) - 1
    }
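The semantic difference behind that rewrite can be checked directly. In C, shifting a 64-bit value by 64 is undefined. In Go, an unsigned shift by the full width or more is defined and yields 0. The sketch below assumes the mask is a uint64 (the talk does not show its type), and the function name regMask is hypothetical.

```go
package main

import "fmt"

// regMask returns a mask with the low nreg bits set.
// Assumption: the mask is a uint64.
func regMask(nreg int) uint64 {
	// For nreg == 64 the shift yields 0, and 0 - 1 wraps to all ones,
	// so no special case is needed.
	return (uint64(1) << uint(nreg)) - 1
}

func main() {
	for _, n := range []int{0, 3, 64} {
		fmt.Printf("nreg=%d mask=%#x\n", n, regMask(n))
	}
}
```

With nreg of 64 the result is all 64 bits set, the same value the removed special case produced by hand.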
Once in Go, new tool grind deployed (by rsc@):
Changes guided by profiling and other analysis:
var declarations nearer to first use
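To illustrate the kind of change involved in moving var declarations nearer to first use, here is a hand-written before and after. It is not grind output, and both function names are hypothetical.

```go
package main

import "fmt"

// sumSquaresCStyle mirrors line-by-line translated C:
// every variable is declared at the top of the function.
func sumSquaresCStyle(xs []int) int {
	var i int
	var sq int
	var total int

	total = 0
	for i = 0; i < len(xs); i++ {
		sq = xs[i] * xs[i]
		total += sq
	}
	return total
}

// sumSquares is the same logic with each declaration moved to its first use,
// which narrows the scope of i and sq to the loop.
func sumSquares(xs []int) int {
	total := 0
	for i := 0; i < len(xs); i++ {
		sq := xs[i] * xs[i]
		total += sq
	}
	return total
}

func main() {
	xs := []int{1, 2, 3}
	fmt.Println(sumSquaresCStyle(xs), sumSquares(xs))
}
```

The behavior is identical; only the scope of each variable changes.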
Output from translator was poor Go, and ran about 10X slower.
Most of that slowdown has been recovered.
Problems with C to Go:
for loops
fmt.Stringer vs. C's varargs
No unions in Go, so use structs instead: bloat
C compiler didn't free much memory, but Go has a GC.
Adds CPU and memory overhead.
Profile! (Never done before!)
... vars closer to first use
... vars into multiple
... math/big
... struct fields
... drchase@).
Use tools like grind, gofmt -r and eg for much of this.
Removing interface argument from a debugging print library got 15% overall!
More remains to be done.
Other benefits of the conversion:
Garbage collection means no more worry about introducing a dangling pointer.
Chance to clean up the back ends.
Unified 386 and amd64 architectures throughout the tool chain.
New architectures are easier to add.
Unified the tools: now one compiler, one assembler, one linker.
GOOS=YYY GOARCH=XXX go tool compile
One compiler; no more 6g, 8g etc.
About 50K lines of portable code.
Even the registerizer is portable now; architectures well characterized.
Non-portable: Peepholing, details like registers bound to instructions.
Typically around 10% of the portable LOC.
GOOS=YYY GOARCH=XXX go tool asm
New assembler, all in Go, written from scratch by r@.
Clean, idiomatic Go code.
Less than 4000 lines, <10% machine-dependent.
Almost completely compatible with previous yacc and C assemblers.
How is this possible?
... (liblink, now internal/obj)
GOOS=YYY GOARCH=XXX go tool link
Mostly hand- and machine-translated from C code.
New library, internal/obj, part of original linker, captures details about machines, writes object files.
27000 lines summed across 4 architectures, mostly tables (plus some ugliness).
arm: 4000
arm64: 6000
ppc64: 5000
x86: 7500 (386 and amd64)
Example benefit: one print routine to print any instruction for any architecture.
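One way to picture a single print routine that works for every architecture: keep the machine-specific details in tables and let a single function consult them. The sketch below is illustrative only. Arch, Inst, Format, and the two toy architectures are hypothetical and do not reflect the actual internal/obj API.

```go
package main

import "fmt"

// Arch describes one architecture purely as data.
type Arch struct {
	Name    string
	OpNames map[int]string     // opcode number -> mnemonic
	RegName func(r int) string // register number -> name
}

// Inst is an architecture-independent instruction.
type Inst struct {
	Op       int
	From, To int // register numbers
}

// Format is the one print routine. Everything machine-specific
// comes from the tables in a.
func Format(a *Arch, in Inst) string {
	op, ok := a.OpNames[in.Op]
	if !ok {
		op = fmt.Sprintf("OP(%d)", in.Op)
	}
	return fmt.Sprintf("%s %s, %s", op, a.RegName(in.From), a.RegName(in.To))
}

var toyARM = &Arch{
	Name:    "toyarm",
	OpNames: map[int]string{1: "MOVW", 2: "ADD"},
	RegName: func(r int) string { return fmt.Sprintf("R%d", r) },
}

var toyX86 = &Arch{
	Name:    "toyx86",
	OpNames: map[int]string{1: "MOVL", 2: "ADDL"},
	RegName: func(r int) string { return []string{"AX", "BX", "CX", "DX"}[r&3] },
}

func main() {
	in := Inst{Op: 2, From: 0, To: 1}
	fmt.Println(Format(toyARM, in)) // prints "ADD R0, R1"
	fmt.Println(Format(toyX86, in)) // prints "ADDL AX, BX"
}
```

Adding an architecture in this scheme means adding tables, not another printer.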
With no C compiler, bootstrapping requires a Go compiler.
Therefore need to build or download a working Go installation to build 1.5 from source.
We use Go 1.4+ as the base to build the 1.5+ tool chain. (Newer is OK too.)
Details: /s/go15bootstrap
Much work still to do, but 1.5 is mostly set.
Future work:
Better escape analysis.
New compiler back end using SSA (much easier in Go than C).
Will allow much more optimization.
Generate machine descriptions from PDFs (or maybe XML).
Will have a purely machine-generated instruction definition:
"Read in PDF, write out an assembler configuration".
Already deployed for the disassemblers.
Getting rid of C was a huge advance for the project.
Code is cleaner, testable, profilable, easier to work on.
New unified tool chain reduces code size, increases maintainability.
Flexible tool chain, portability still paramount.