Go at Google
SPLASH, Tucson, Oct 25, 2012

Rob Pike
Google, Inc.
https://go.dev

* Preamble

* What is Go?

Go is:

- open source
- concurrent
- garbage-collected
- efficient
- scalable
- simple
- fun
- boring (to some)

.link / go.dev

* History

Design began in late 2007.

Key players:

- Robert Griesemer, Rob Pike, Ken Thompson
- Later: Ian Lance Taylor, Russ Cox

Became open source in November 2009.

Developed entirely in the open; very active community.
Language stable as of Go 1, early 2012.

* Go at Google

Go is a programming language designed by Google to help solve Google's problems.

Google has big problems.

* Big hardware

.image splash/datacenter.jpg

* Big software

- C++ (mostly) for servers, plus lots of Java and Python
- thousands of engineers
- gazillions of lines of code
- distributed build system
- one tree

And of course:

- zillions of machines, which we treat as a modest number of compute clusters

Development at Google can be slow, often clumsy.

But it _is_ effective.

* The reason for Go

Goals:

- eliminate slowness
- eliminate clumsiness
- improve effectiveness
- maintain (even improve) scale

Go was designed by and for people who write (and read and debug and maintain) large software systems.

Go's purpose is _not_ research into programming language design.

Go's purpose is to make its designers' programming lives better.

* Today's theme

A talk about software engineering more than language design.

To be more accurate:

- Language design in the service of software engineering.

In short:

- How does a language help software engineering?

* Features?

Reaction upon launch:

    My favorite feature isn't in Go! Go Sucks!

This misses the point.

* Pain

What makes large-scale development hard with C++ or Java (at least):

- slow builds
- uncontrolled dependencies
- each programmer using a different subset of the language
- poor program understanding (documentation, etc.)
- duplication of effort
- cost of updates
- version skew
- difficulty of automation (auto rewriters etc.): tooling
- cross-language builds

Language _features_ don't usually address these.

* Focus

In the design of Go, we tried to focus on solutions to _these_ problems.

Example: indentation for structure vs. C-like braces

* Dependencies in C and C++

* A personal history of dependencies in C

`#ifdef`

`#ifndef` "guards":

    #ifndef _SYS_STAT_H_

1984: `<sys/stat.h>` times 37

ANSI C and `#ifndef` guards:

- dependencies accumulate
- throw includes at the program until it compiles
- no way to know what can be removed

* A personal history of dependencies in Plan 9's C

Plan 9, another take:

- no `#ifdefs` (or `#ifndefs`)
- documentation and topological sort
- easy to find out what can be removed

Need to document dependencies, but much faster compilation.

In short:

- _ANSI_C_made_a_costly_mistake_ in requiring `#ifndef` guards.

* A personal history of dependencies in C++

C++ exacerbated the problem:

- one `#include` file per class
- code (not just declarations) in `#include` files
- `#ifndef` guards persist

2004: Mike Burrows and Chubby: `` times 37,000

1984: Tom Cargill and pi

* A personal history of dependencies at Google

Plan 9 demo: a story

Early Google: one `Makefile`

2003: `Makefile` generated from per-directory `BUILD` files

- explicit dependencies
- 40% smaller binaries

Dependencies still not checkable!

* Result

To build a large Google binary on a single computer is impractical.

In 2007, instrumented the build of a major Google binary:

- 2000 files
- 4.2 megabytes
- 8 gigabytes delivered to compiler
- 2000 bytes sent to compiler for every C++ source byte
- it's real work too: `` for example
- hours to build

* Tools can help

New distributed build system:

- no more `Makefile` (still uses `BUILD` files)
- many buildbots
- much caching
- much complexity (a large program in its own right)

Even with Google's massive distributed build system, a large build still takes minutes.
(In 2007 that binary took 45 minutes; today, 27 minutes.)

Poor quality of life.

* Enter Go

While that build runs, we have time to think.

Want a language to improve the quality of life.

And dependencies are only one such problem....

* Primary considerations

Must work at scale:

- large programs
- large teams
- large number of dependencies

Must be familiar, roughly C-like

* Modernize

The old ways are _old_.

Go should be:

- suitable for multicore machines
- suitable for networked machines
- suitable for web stuff

* The design of Go

From a software engineering perspective.

* Dependencies in Go

* Dependencies

Dependencies are defined (syntactically) in the language.

Explicit, clear, computable.

    import "encoding/json"

Unused dependencies cause error at compile time.

Efficient: dependencies traversed once per source file...

* Hoisting dependencies

Consider:
`A` imports `B` imports `C` but `A` does not directly import `C`.

The object code for `B` includes all the information about `C` needed to import `B`.
Therefore in `A` the line

    import "B"

does not require the compiler to read `C` when compiling `A`.

Also, the object files are designed so the "export" information comes first; compiler doing import does not need to read whole file.

Exponentially less data read than with `#include` files.

With Go in Google, about 40X fanout (recall C++ was 2000x)
Plus in C++ it's general code that must be parsed; in Go it's just export data.

* No circular imports

Circular imports are illegal in Go.

The big picture in a nutshell:

- occasional minor pain,
- but great reduction in annoyance overall
- structural typing makes it less important than with type hierarchies
- keeps us honest!

Forces clear demarcation between packages.

Simplifies compilation, linking, initialization.

* API design

Through the design of the standard library, great effort spent on controlling dependencies.

It can be better to copy a little code than to pull in a big library for one function.
(A test in the system build complains if new core dependencies arise.)

Dependency hygiene trumps code reuse.

Example:
The (low-level) `net` package has own `itoa` to avoid dependency on the big formatted I/O package.

* Packages

* Packages

Every Go source file, e.g. `"encoding/json/json.go"`, starts

    package json

where `json` is the "package name", an identifier.
Package names are usually concise.

To use a package, need to identify it by path:

    import "encoding/json"

And then the package name is used to qualify items from package:

    var dec = json.NewDecoder(reader)

Clarity: can always tell if name is local to package from its syntax: `Name` vs. `pkg.Name`.
(More on this later.)

Package combines properties of library, name space, and module.

* Package paths are unique, not package names

The path is `"encoding/json"` but the package name is `json`.
The path identifies the package and must be unique.
Project or company name at root of name space.

    import "google/base/go/log"

Package name might not be unique; can be overridden. These are both `package`log`:

    import "log"                          // Standard package
    import googlelog "google/base/go/log" // Google-specific package

Every company might have its own `log` package; no need to make the package name unique.

Another: there are many `server` packages in Google's code base.

* Remote packages

Package path syntax works with remote repositories.
The import path is just a string.

Can be a file, can be a URL:

    go get github.com/4ad/doozer   // Command to fetch package

    import "github.com/4ad/doozer" // Doozer client's import statement

    var client doozer.Conn         // Client's use of package

* Go's Syntax

* Syntax

Syntax is not important...

- unless you're programming
- or writing tools

Tooling is essential, so Go has a clean syntax.
Not super small, just clean:

- regular (mostly)
- only 25 keywords
- straightforward to parse (no type-specific context required)
- easy to predict, reason about

* Declarations

Uses Pascal/Modula-style syntax: name before type, more type keywords.

    var fn func([]int) int
    type T struct { a, b int }

not

    int (*fn)(int[]);
    struct T { int a, b; }

Easier to parse: no symbol table needed. Tools become easier to write.

One nice effect: can drop `var` and derive type of variable from expression:

    var buf *bytes.Buffer = bytes.NewBuffer(x) // explicit
    buf := bytes.NewBuffer(x)                  // derived

For more information:

.link /s/decl-syntax go.dev/s/decl-syntax

* Function syntax

Function on type `T`:

    func Abs(t T) float64

Method of type `T`:

    func (t T) Abs() float64

Variable (closure) of type `T`:

    negAbs := func(t T) float64 { return -Abs(t) }

In Go, functions can return multiple values. Common case: `error`.

    func ReadByte() (c byte, err error)

    c, err := ReadByte()
    if err != nil { ... }

More about errors later.

* No default arguments

Go does not support default function arguments.

Why not?

- too easy to throw in defaulted args to fix design problems
- encourages too many args
- too hard to understand the effect of the fn for different combinations of args

Extra verbosity may happen but that encourages extra thought about names.

Related: Go has easy-to-use, type-safe support for variadic functions.

* Naming

* Export syntax

Simple rule:

- upper case initial letter: `Name` is visible to clients of package
- otherwise: `name` (or `_Name`) is not visible to clients of package

Applies to variables, types, functions, methods, constants, fields....

That Is It.

Not an easy decision.
One of the most important things about the language.

Can see the visibility of an identifier without discovering the declaration.

Clarity.
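
The export rule depends only on the spelling of the name, so a tool can apply it without looking up any declaration. A minimal sketch, assuming the standard `go/ast` package's `IsExported` helper; `visibility` is a hypothetical function written for this illustration, and the three identifiers are the ones from the rule above.

```go
package main

import (
	"fmt"
	"go/ast"
)

// visibility is a hypothetical helper for illustration: it reports how a
// client of the package would see an identifier, judged from the name alone.
func visibility(name string) string {
	if ast.IsExported(name) {
		return "visible to clients"
	}
	return "not visible to clients"
}

func main() {
	// The three spellings used in the export rule.
	for _, name := range []string{"Name", "name", "_Name"} {
		fmt.Printf("%-5s %s\n", name, visibility(name))
	}
}
```

No symbol table or type information is consulted here; the answer comes from the first character of the name.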
* Scope

Go has very simple scope hierarchy:

- universe
- package
- file (for imports only)
- function
- block

* Locality of naming

Nuances:

- no implicit `this` in methods (receiver is explicit); always see `rcvr.Field`
- package qualifier always present for imported names
- (first component of) every name is always declared in current package

No surprises when importing:

- adding an exported name to my package cannot break your package!

Names do not leak across boundaries.

In C, C++, Java the name `y` could refer to anything.
In Go, `y` (or even `Y`) is always defined within the package.
In Go, `x.Y` is clear: find `x` locally, `Y` belongs to it.

* Function and method lookup

Method lookup by name only, not type.
A type cannot have two methods with the same name, ever.
Easy to identify which function/method is referred to.
Simple implementation, simpler program, fewer surprises.

Given a method `x.M`, there's only ever one `M` associated with `x`.

* Semantics

* Basics

Generally C-like:

- statically typed
- procedural
- compiled
- pointers etc.

Should feel familiar to programmers from the C family.

* But...

Many small changes in the aid of robustness:

- no pointer arithmetic
- no implicit numeric conversions
- array bounds checking
- no type aliases
- `++` and `--` as statements not expressions
- assignment not an expression
- legal (encouraged even) to take address of stack variable
- many more

Plus some big ones...

* Bigger things

Some elements of Go step farther from C, even C++ and Java:

- concurrency
- garbage collection
- interface types
- reflection
- type switches

* Concurrency

* Concurrency

Important to modern computing environment.
Not well served by C++ or even Java.

Go embodies a variant of CSP with first-class channels.

Why CSP?

- The rest of the language can be ordinary and familiar.

Must be able to couple concurrency with computation.

Example: concurrency and cryptography.
* CSP is practical

For a web server, the canonical Go program, the model is a great fit.

Go _enables_ simple, safe concurrent programming.
It doesn't _forbid_ bad programming.

Focus on _composition_ of regular code.

Caveat: not purely memory safe; sharing is legal.
Passing a pointer over a channel is idiomatic.

Experience shows this is a practical design.

* Garbage collection

* The need for garbage collection

Too much programming in C and C++ is about memory allocation.
But also the design revolves too much around memory management.
Leaky abstractions, leaky dependencies.

Go has garbage collection, only.

Needed for concurrency: tracking ownership too hard otherwise.
Important for abstraction: separate behavior from resource management.
A key part of scalability: APIs remain local.

Use of the language is much simpler because of GC.
Adds run-time cost, latency, complexity to the implementation.

Day 1 design decision.

* Garbage collection in Go

A garbage-collected systems language is heresy!
Experience with Java: Uncontrollable cost, too much tuning.

But Go is different.
Go lets you limit allocation by controlling memory layout.

Example:

    type X struct {
        a, b, c int
        buf [256]byte
    }

Example: Custom arena allocator with free list.

* Interior pointers

Early decision: allow interior pointers such as `X.buf` from previous slide.

Tradeoff: Affects which GC algorithms can be used, but in return reduces pressure on the collector.

Gives the _programmer_ tools to control GC overhead.

Experience, compared to Java, shows it has significant effect on memory pressure.

GC remains an active subject.
Current design: parallel mark-and-sweep.
With care to use memory wisely, works well in production.

* Interfaces

Composition not inheritance

* Object orientation and big software

Go is object-oriented.

Go does not have classes or subtype inheritance.

What does this mean?

* No type hierarchy

O-O is important because it provides uniformity of interface.
Outrageous example: the Plan 9 kernel.

Problem: subtype inheritance encourages _non-uniform_ interfaces.

* O-O and program evolution

Design by type inheritance oversold.
Generates brittle code.
Early decisions hard to change, often poorly informed.

Makes every programmer an interface designer.
(Plan 9 was built around a single interface everything needed to satisfy.)

Therefore encourages overdesign early on: predict every eventuality.
Exacerbates the problem, complicates designs.

* Go: interface composition

In Go an interface is _just_ a set of methods:

    type Hash interface {
        Write(p []byte) (n int, err error)
        Sum(b []byte) []byte
        Reset()
        Size() int
        BlockSize() int
    }

No `implements` declaration.

All hash implementations satisfy this implicitly. (Statically checked.)

* Interfaces in practice: composition

Tend to be small: one or two methods are common.

Composition falls out trivially. Easy example, from package `io`:

    type Reader interface {
        Read(p []byte) (n int, err error)
    }

`Reader` (plus the complementary `Writer`) makes it easy to chain:

- files, buffers, networks, encryptors, compressors, GIF, JPEG, PNG, ...

Dependency structure is not a hierarchy; these also implement other interfaces.

Growth through composition is _natural_, does not need to be pre-declared.

And that growth can be _ad_hoc_ and linear.

* Compose with functions, not methods

Hard to overstate the effect that Go's interfaces have on program design.

One big effect: functions with interface arguments.

    func ReadAll(r io.Reader) ([]byte, error)

Wrappers:

    func LoggingReader(r io.Reader) io.Reader
    func LimitingReader(r io.Reader, n int64) io.Reader
    func ErrorInjector(r io.Reader) io.Reader

The designs are nothing like hierarchical, subtype-inherited methods.
Much looser, organic, decoupled, independent.

* Errors

* Error handling

Multiple function return values inform the design for handling errors.

Go has no `try-catch` control structures for exceptions.
Return `error` instead: built-in interface type that can "stringify" itself:

    type error interface { Error() string }

Clear and simple.

Philosophy:

Forces you to think about errors, and deal with them, when they arise.
Errors are _normal_. Errors are _not_exceptional_.
Use the existing language to compute based on them.
Don't need a sublanguage that treats them as exceptional.

Result is better code (if more verbose).

* (OK, not all errors are normal. But most are.)

.image splash/fire.jpg

* Tools

* Tools

Software engineering requires tools.

Go's syntax, package design, naming, etc. make tools easy to write.

Standard library includes lexer and parser; type checker nearly done.

* Gofmt

Always intended to do automatic code formatting.
Eliminates an entire class of argument.
Runs as a "presubmit" to the code repositories.

Training:

- The community has always seen `gofmt` output.

Sharing:

- Uniformity of presentation simplifies sharing.

Scaling:

- Less time spent on formatting, more on content.

Often cited as one of Go's best features.

* Gofmt and other tools

Surprise: The existence of `gofmt` enabled _semantic_ tools:
Can rewrite the tree; `gofmt` will clean up output.

Examples:

- `gofmt`-r`'a[b:len(a)]`->`a[b:]'`
- `gofix`

And good front-end libraries enable ancillary tools:

- `godoc`
- `go`get`, `go`build`, etc.
- `api`

* Gofix

The `gofix` tool allowed us to make sweeping changes to APIs and language features leading up to the release of Go 1.

- change to map deletion syntax
- new time API
- many more

Also allows us to _update_ code even if the old code still works.

Recent example:

Changed Go's protocol buffer implementation to use getter functions; updated _all_ Google Go code to use them with `gofix`.

* Conclusion

* Go at Google

Go's use is growing inside Google.

Several big services use it:

- golang.org
- youtube.com
- dl.google.com

Many small ones do, many using Google App Engine.
* Go outside Google

Many outside companies use it, including:

- BBC Worldwide
- Canonical
- Heroku
- Nokia
- SoundCloud

* What's wrong?

Not enough experience yet to know if Go is truly successful.
Not enough big programs.

Some minor details wrong. Examples:

- declarations still too fussy
- `nil` is overloaded
- lots of library details

`Gofix` and `gofmt` gave us the opportunity to fix many problems, ranging from eliminating semicolons to redesigning the `time` package.
But we're still learning (and the language is frozen for now).

The implementation still needs work, the run-time system in particular.

But all indicators are positive.

* Summary

Software engineering guided the design.
But a productive, fun language resulted because that design enabled productivity.

Clear dependencies

Clear syntax

Clear semantics

Composition not inheritance

Simplicity of model (GC, concurrency)

Easy tooling (the `go` tool, `gofmt`, `godoc`, `gofix`)

* Try it!

.link / go.dev

.image splash/appenginegophercolor.jpg