1 Go at Google: Language Design in the Service of Software Engineering
2
3 Rob Pike
4 Google, Inc.
5 https://go.dev
6
7 * Abstract
8
9 (This is a modified version of the keynote talk given by Rob Pike
10 at the SPLASH 2012 conference in Tucson, Arizona, on October 25, 2012.)
11
12 The Go programming language was conceived in late 2007 as an answer to
13 some of the problems we were seeing developing software infrastructure
14 at Google.
15 The computing landscape today is almost unrelated to the environment
16 in which the languages being used, mostly C++, Java, and Python, had
17 been created.
18 The problems introduced by multicore processors, networked systems,
19 massive computation clusters, and the web programming model were being
20 worked around rather than addressed head-on.
21 Moreover, the scale has changed: today's server programs comprise tens
22 of millions of lines of code, are worked on by hundreds or even
23 thousands of programmers, and are updated literally every day.
24 To make matters worse, build times, even on large compilation
25 clusters, have stretched to many minutes, even hours.
26
27 Go was designed and developed to make working in this environment more
28 productive.
29 Besides its better-known aspects such as built-in concurrency and
30 garbage collection, Go's design considerations include rigorous
31 dependency management, the adaptability of software architecture as
32 systems grow, and robustness across the boundaries between components.
33
34 This article explains how these issues were addressed while building
35 an efficient, compiled programming language that feels lightweight and
36 pleasant.
37 Examples and explanations will be taken from the real-world problems
38 faced at Google.
39
40 * Introduction
41
42 Go is a compiled, concurrent, garbage-collected, statically typed language
43 developed at Google.
44 It is an open source project: Google
45 imports the public repository rather than the other way around.
46
47 Go is efficient, scalable, and productive. Some programmers find it fun
48 to work in; others find it unimaginative, even boring.
49 In this article we
50 will explain why those are not contradictory positions.
51 Go was designed to address the problems faced in software development
52 at Google, which led to a language that is not a breakthrough research language
53 but is nonetheless an excellent tool for engineering large software projects.
54
55 * Go at Google
56
57 Go is a programming language designed by Google to help solve Google's problems, and Google has big problems.
58
59 The hardware is big and the software is big.
60 There are many millions of lines of software, with servers mostly in C++
61 and lots of Java and Python for the other pieces.
62 Thousands of engineers work on the code,
63 at the "head" of a single tree comprising all the software,
64 so from day to day there are significant changes to all levels of the tree.
65 A large
66 [[http://google-engtools.blogspot.com/2011/06/build-in-cloud-accessing-source-code.html][custom-designed distributed build system]]
67 makes development at this scale feasible, but it's still big.
68
69 And of course, all this software runs on zillions of machines, which are treated as a modest number of independent, networked compute clusters.
70
71 .image splash/datacenter.jpg
72
73 In short, development at Google is big, can be slow, and is often clumsy. But it _is_ effective.
74
75 The goals of the Go project were to eliminate the slowness and clumsiness of software development at Google,
76 and thereby to make the process more productive and scalable.
77 The language was designed by and for people who write—and read and debug and maintain—large software systems.
78
79 Go's purpose is therefore _not_ to do research into programming language design;
80 it is to improve the working environment for its designers and their coworkers.
81 Go is more about software engineering than programming language research.
82 Or to rephrase, it is about language design in the service of software engineering.
83
84 But how can a language help software engineering?
85 The rest of this article is an answer to that question.
86
87 * Pain points
88
89 When Go launched, some claimed it was missing particular features or methodologies that were regarded as _de_rigueur_ for a modern language.
90 How could Go be worthwhile in the absence of these facilities?
91 Our answer to that is that the properties Go _does_ have address the issues that make large-scale software development difficult.
92 These issues include:
93
94 - slow builds
95 - uncontrolled dependencies
96 - each programmer using a different subset of the language
97 - poor program understanding (code hard to read, poorly documented, and so on)
98 - duplication of effort
99 - cost of updates
100 - version skew
101 - difficulty of writing automatic tools
102 - cross-language builds
103
104 Individual features of a language don't address these issues.
105 A larger view of software engineering is required, and
106 in the design of Go we tried to focus on solutions to _these_ problems.
107
108 As a simple, self-contained example, consider the representation of program structure.
109 Some observers objected to Go's C-like block structure with braces, preferring the use of spaces for indentation, in the style of Python or Haskell.
110 However, we have had extensive experience tracking down build and test failures caused by cross-language builds where a Python snippet embedded in another language,
111 for instance through a SWIG invocation,
112 is subtly and _invisibly_ broken by a change in the indentation of the surrounding code.
Our position is therefore that, although using spaces for indentation is nice for small programs, it doesn't scale well,
114 and the bigger and more heterogeneous the code base, the more trouble it can cause.
115 It is better to forgo convenience for safety and dependability, so Go has brace-bounded blocks.
116
117 * Dependencies in C and C++
118
119 A more substantial illustration of scaling and other issues arises in the handling of package dependencies.
120 We begin the discussion with a review of how they work in C and C++.
121
122 ANSI C, first standardized in 1989, promoted the idea of `#ifndef` "guards" in the standard header files.
123 The idea, which is ubiquitous now, is that each header file be bracketed with a conditional compilation clause so that the file may be included multiple times without error.
124 For instance, the Unix header file `<sys/stat.h>` looks schematically like this:
125
	/* Large copyright and licensing notice */
	#ifndef _SYS_STAT_H_
	#define _SYS_STAT_H_
	/* Types and other definitions */
	#endif
131
132 The intent is that the C preprocessor reads in the file but disregards the contents on
133 the second and subsequent
134 readings of the file.
135 The symbol `_SYS_STAT_H_`, defined the first time the file is read, "guards" the invocations that follow.
136
137 This design has some nice properties, most important that each header file can safely `#include`
138 all its dependencies, even if other header files will also include them.
139 If that rule is followed, it permits orderly code that, for instance, sorts the `#include`
140 clauses alphabetically.
141
142 But it scales very badly.
143
144 In 1984, a compilation of `ps.c`, the source to the Unix `ps` command, was observed
145 to `#include` `<sys/stat.h>` 37 times by the time all the preprocessing had been done.
146 Even though the contents are discarded 36 times while doing so, most C
147 implementations would open the file, read it, and scan it all 37 times.
148 Without great cleverness, in fact, that behavior is required by the potentially
149 complex macro semantics of the C preprocessor.
150
151 The effect on software is the gradual accumulation of `#include` clauses in C programs.
152 It won't break a program to add them, and it's very hard to know when they are no
153 longer needed.
154 Deleting a `#include` and compiling the program again isn't even sufficient to test that,
155 since another `#include` might itself contain a `#include` that pulls it in anyway.
156
157 Technically speaking, it does not have to be like that.
158 Realizing the long-term problems with the use of `#ifndef` guards, the designers
159 of the Plan 9 libraries took a different, non-ANSI-standard approach.
160 In Plan 9, header files were forbidden from containing further `#include` clauses; all
161 `#includes` were required to be in the top-level C file.
162 This required some discipline, of course—the programmer was required to list
163 the necessary dependencies exactly once, in the correct order—but documentation
164 helped and in practice it worked very well.
165 The result was that, no matter how many dependencies a C source file had,
166 each `#include` file was read exactly once when compiling that file.
167 And, of course, it was also easy to see if an `#include` was necessary by taking
168 it out: the edited program would compile if and only if the dependency was unnecessary.
169
170 The most important result of the Plan 9 approach was much faster compilation: the amount of
171 I/O the compilation requires can be dramatically less than when compiling a program
172 using libraries with `#ifndef` guards.
173
174 Outside of Plan 9, though, the "guarded" approach is accepted practice for C and C++.
175 In fact, C++ exacerbates the problem by using the same approach at finer granularity.
By convention, C++ programs are usually structured with one header file per class, or perhaps
a small set of related classes, a grouping much smaller than, say, `<stdio.h>`.
178 The dependency tree is therefore much more intricate, reflecting not library dependencies but the full type hierarchy.
179 Moreover, C++ header files usually contain real code—type, method, and template
180 declarations—not just the simple constants and function signatures typical of a C header file.
181 Thus not only does C++ push more to the compiler, what it pushes is harder to compile,
182 and each invocation of the compiler must reprocess this information.
183 When building a large C++ binary, the compiler might be taught thousands of times how to
184 represent a string by processing the header file `<string>`.
185 (For the record, around 1984 Tom Cargill observed that the use of the
186 C preprocessor for dependency management would be a long-term liability for C++ and
187 should be addressed.)
188
189 The construction of a single C++ binary at Google can open and read hundreds of individual header files
190 tens of thousands of times.
191 In 2007, build engineers at Google instrumented the compilation of a major Google binary.
The build comprised about two thousand source files that, if simply concatenated together, totaled 4.2 megabytes.
193 By the time the `#includes` had been expanded, over 8 gigabytes were being delivered to the input of the compiler, a blow-up of 2000 bytes for every C++ source byte.
194
195 As another data point, in 2003 Google's build system was moved from a single Makefile to a per-directory design
196 with better-managed, more explicit dependencies.
197 A typical binary shrank about 40% in file size, just from having more accurate dependencies recorded.
198 Even so, the properties of C++ (or C for that matter) make it impractical to verify those dependencies automatically,
199 and today we still do not have an accurate understanding of the dependency requirements
200 of large Google C++ binaries.
201
202 The consequence of these uncontrolled dependencies and massive scale is that it is
203 impractical to build Google server binaries on a single computer, so
204 a large distributed compilation system was created.
205 With this system, involving many machines, much caching, and
206 much complexity (the build system is a large program in its own right), builds at
207 Google are practical, if still cumbersome.
208
209 Even with the distributed build system, a large Google build can still take many minutes.
210 That 2007 binary took 45 minutes using a precursor distributed build system; today's
211 version of the same program takes 27 minutes, but of course the program and its
212 dependencies have grown in the interim.
213 The engineering effort required to scale up the build system has barely been able
214 to stay ahead of the growth of the software it is constructing.
215
216 * Enter Go
217
218 When builds are slow, there is time to think.
The origin myth for Go states that it was during one of those 45-minute builds
220 that Go was conceived. It was believed to be worth trying to design a new language
221 suitable for writing large Google programs such as web servers,
222 with software engineering considerations that would improve the quality
223 of life of Google programmers.
224
225 Although the discussion so far has focused on dependencies,
226 there are many other issues that need attention.
227 The primary considerations for any language to succeed in this context are:
228
229 - It must work at scale, for large programs with large numbers of dependencies, with large teams of programmers working on them.
230
231 - It must be familiar, roughly C-like. Programmers working at Google are early in their careers and are most familiar with procedural languages, particularly from the C family. The need to get programmers productive quickly in a new language means that the language cannot be too radical.
232
233 - It must be modern. C, C++, and to some extent Java are quite old, designed before the advent of multicore machines, networking, and web application development. There are features of the modern world that are better met by newer approaches, such as built-in concurrency.
234
235 With that background, then, let us look at the design of Go from a software engineering perspective.
236
237 * Dependencies in Go
238
239 Since we've taken a detailed look at dependencies in C and C++, a good place to start
240 our tour is to see how Go handles them.
241 Dependencies are defined, syntactically and semantically, by the language.
242 They are explicit, clear, and "computable", which is to say, easy to write tools to analyze.
243
244 The syntax is that, after the `package` clause (the subject of the next section),
245 each source file may have one or more import statements, comprising the
246 `import` keyword and a string constant identifying the package to be imported
247 into this source file (only):
248
	import "encoding/json"
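
As a rough illustration of what "computable" means in practice, the sketch below lists the imports of a source file using the standard `go/parser` package. The source text `src` and the helper name `listImports` are assumptions made up for this example; only the parser calls come from the standard library.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// src is a hypothetical source file used only for this illustration.
const src = `package demo

import (
	"encoding/json"
	"os"
)

var enc = json.NewEncoder(os.Stdout)
`

// listImports returns the import paths named in a Go source file.
// parser.ImportsOnly tells the parser to stop after the import declarations.
func listImports(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", source, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		// Path.Value is the quoted string literal from the source.
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	paths, err := listImports(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

The tool never needs to look past the import section or resolve any types, because the import clauses alone describe the file's dependencies.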
250
251 The first step to making Go scale, dependency-wise, is that the _language_ defines
252 that unused dependencies are a compile-time error (not a warning, an _error_).
253 If the source file imports a package it does not use, the program will not compile.
254 This guarantees by construction that the dependency tree for any Go program
255 is precise, that it has no extraneous edges. That, in turn, guarantees that no
256 extra code will be compiled when building the program, which minimizes
257 compilation time.
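
A small sketch of the rule follows. The `point` type and `encode` function are hypothetical names invented for the example; the file compiles because each import is used, and the comment marks where an unused import would cause the compiler to reject it.

```go
package main

import (
	"encoding/json"
	"os"
	// Adding an import that nothing below refers to, for example
	//	"strings"
	// would make the compiler reject this file rather than warn about it.
)

// point is a hypothetical type used only for this illustration.
type point struct {
	X, Y int
}

// encode uses the json package, so that import is required.
func encode(p point) (string, error) {
	b, err := json.Marshal(p)
	return string(b), err
}

func main() {
	s, err := encode(point{1, 2})
	if err != nil {
		os.Exit(1)
	}
	os.Stdout.WriteString(s + "\n")
}
```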
258
259 There's another step, this time in the implementation of the compilers, that
260 goes even further to guarantee efficiency.
261 Consider a Go program with three packages and this dependency graph:
262
263 - package `A` imports package `B`;
264 - package `B` imports package `C`;
265 - package `A` does _not_ import package `C`
266
267 This means that package `A` uses `C` only transitively through its use of `B`;
268 that is, no identifiers from `C` are mentioned in the source code to `A`,
269 even if some of the items `A` is using from `B` do mention `C`.
270 For instance, package `A` might reference a `struct` type defined in `B` that has a field with
271 a type defined in `C` but that `A` does not reference itself.
272 As a motivating example, imagine that `A` imports a formatted I/O package
273 `B` that uses a buffered I/O implementation provided by `C`, but that `A` does
274 not itself invoke buffered I/O.
275
To build this program, first, `C` is compiled;
a package's dependencies must be built before the package itself.
278 Then `B` is compiled; finally `A` is compiled, and then the program can be linked.
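
The build order described above can be sketched as a depth-first walk of the import graph. This is only an illustration, not how the Go tools are implemented; the function `buildOrder` and the map representation of the graph are assumptions made for the example, and the graph is assumed to be acyclic.

```go
package main

import "fmt"

// buildOrder returns an order in which every package appears after
// all of the packages it imports. The graph maps a package name to its
// direct imports and is assumed to contain no cycles.
func buildOrder(imports map[string][]string, root string) []string {
	var order []string
	done := make(map[string]bool)
	var visit func(pkg string)
	visit = func(pkg string) {
		if done[pkg] {
			return
		}
		done[pkg] = true
		for _, dep := range imports[pkg] {
			visit(dep)
		}
		order = append(order, pkg) // emitted only after its imports
	}
	visit(root)
	return order
}

func main() {
	imports := map[string][]string{
		"A": {"B"},
		"B": {"C"},
		"C": nil,
	}
	fmt.Println(buildOrder(imports, "A")) // prints [C B A]
}
```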
279
280 When `A` is compiled, the compiler reads the object file for `B`, not its source code.
281 That object file for `B` contains all the type information necessary for the compiler
282 to execute the
283
	import "B"
285
286 clause in the source code for `A`. That information includes whatever information
287 about `C` that clients of `B` will need at compile time.
288 In other words, when `B` is compiled, the generated object file includes type
289 information for all dependencies of `B` that affect the public interface of `B`.
290
291 This design has the important
292 effect that when the compiler executes an import clause,
293 _it_opens_exactly_one_file_, the object file identified by the string in the import clause.
294 This is, of course, reminiscent of the Plan 9 C (as opposed to ANSI C)
295 approach to dependency management, except that, in effect, the compiler
296 writes the header file when the Go source file is compiled.
297 The process is more automatic and even
298 more efficient than in Plan 9 C, though: the data being read when evaluating the import is just
299 "exported" data, not general program source code. The effect on overall
300 compilation time can be huge, and scales well as
301 the code base grows. The time to execute the dependency graph, and
302 hence to compile, can be exponentially less than in the "include of
303 include file" model of C and C++.
304
305 It's worth mentioning that this general approach to dependency management
306 is not original; the ideas go back to the 1970s and flow through languages like
307 Modula-2 and Ada. In the C family Java has elements of this approach.
308
309 To make compilation even more efficient, the object file is arranged so the export
310 data is the first thing in the file, so the compiler can stop reading as soon
311 as it reaches the end of that section.
312
313 This approach to dependency management is the single biggest reason
314 why Go compilations are faster than C or C++ compilations.
315 Another factor is that Go places the export data in the object file; some
316 languages require the author to write or the compiler to
317 generate a second file with that information. That's twice as many files
318 to open. In Go there is only one file to open to import a package.
319 Also, the single file approach means that the export data (or header
320 file, in C/C++) can never go out of date relative to the object file.
321
322 For the record, we measured the compilation of a large Google program
323 written in Go to see how the source code fanout compared to the C++
324 analysis done earlier. We found it was about 40X, which is
325 fifty times better than C++ (as well as being simpler and hence faster
326 to process), but it's still bigger than we expected. There are two reasons for
327 this. First, we found a bug: the Go compiler was generating a substantial
328 amount of data in the export section that did not need to be there. Second,
329 the export data uses a verbose encoding that could be improved.
330 We plan to address these issues.
331
332 Nonetheless, a factor of fifty less to do turns minutes into seconds,
333 coffee breaks into interactive builds.
334
335 Another feature of the Go dependency graph is that it has no cycles.
336 The language defines that there can be no circular imports in the graph,
337 and the compiler and linker both check that they do not exist.
338 Although they are occasionally useful, circular imports introduce
339 significant problems at scale.
340 They require the compiler to deal with larger sets of source files
341 all at once, which slows down incremental builds.
342 More important, when allowed, in our experience such imports end up
343 entangling huge swaths of the source tree into large subpieces that are
344 difficult to manage independently, bloating binaries and complicating
345 initialization, testing, refactoring, releasing, and other tasks of
346 software development.
347
348 The lack of circular imports causes occasional annoyance but keeps the tree clean,
349 forcing a clear demarcation between packages. As with many of the
350 design decisions in Go, it forces the programmer to think earlier about a
351 larger-scale issue (in this case, package boundaries) that if left until
352 later may never be addressed satisfactorily.
353
354 Through the design of the standard library, great effort was spent on controlling
355 dependencies. It can be better to copy a little code than to pull in a big
356 library for one function. (A test in the system build complains if new core
357 dependencies arise.) Dependency hygiene trumps code reuse.
358 One example of this in practice is that
359 the (low-level) `net` package has its own integer-to-decimal conversion routine
360 to avoid depending on the bigger and dependency-heavy formatted I/O package.
361 Another is that the string conversion package `strconv` has a private implementation
362 of the definition of 'printable' characters rather than pull in the large Unicode
363 character class tables; that `strconv` honors the Unicode standard is verified by the
364 package's tests.
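
To give a sense of how little code such a private routine needs, here is a sketch of an integer-to-decimal conversion that imports neither `fmt` nor `strconv`. It is an illustration written for this article's argument, not the routine the `net` package actually contains, and the name `itoa` is an assumption.

```go
package main

import "os"

// itoa converts an integer to its decimal string without importing
// strconv or fmt. It is an illustrative sketch only.
func itoa(n int) string {
	if n == 0 {
		return "0"
	}
	neg := n < 0
	var buf [20]byte // enough for a 64-bit value and a sign
	i := len(buf)
	for n != 0 {
		d := n % 10 // negative when n is negative
		if d < 0 {
			d = -d
		}
		i--
		buf[i] = byte('0' + d)
		n /= 10
	}
	if neg {
		i--
		buf[i] = '-'
	}
	return string(buf[i:])
}

func main() {
	os.Stdout.WriteString(itoa(-1234) + "\n") // prints -1234
}
```

Twenty-odd lines of duplicated logic is the price; the benefit is that a low-level package does not drag a large one into every program that uses it.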
365
366 * Packages
367
368 The design of Go's package system combines some of the properties of libraries,
369 name spaces, and modules into a single construct.
370
371 Every Go source file, for instance `"encoding/json/json.go"`, starts with a package clause, like this:
372
	package json
374
375 where `json` is the "package name", a simple identifier.
376 Package names are usually concise.
377
378 To use a package, the importing source file identifies it by its _package_path_
379 in the import clause.
380 The meaning of "path" is not specified by the language, but in
381 practice and by convention it is the slash-separated directory path of the
382 source package in the repository, here:
383
	import "encoding/json"
385
386 Then the package name (as distinct from path) is used to qualify items from
387 the package in the importing source file:
388
	var dec = json.NewDecoder(reader)
390
391 This design provides clarity.
One may always tell whether a name is local to the package from its syntax: `Name` vs. `pkg.Name`.
393 (More on this later.)
394
395 For our example, the package path is `"encoding/json"` while the package name is `json`.
396 Outside the standard repository, the convention is to place the
397 project or company name at the root of the name space:
398
	import "google/base/go/log"
400
401 It's important to recognize that package _paths_ are unique,
402 but there is no such requirement for package _names_.
403 The path must uniquely identify the package to be imported, while the
404 name is just a convention for how clients of the package can refer to its
405 contents.
406 The package name need not be unique and can be overridden
407 in each importing source file by providing a local identifier in the
408 import clause. These two imports both reference packages that
409 call themselves `package` `log`, but to import them in a single source
410 file one must be (locally) renamed:
411
	import "log"                          // Standard package
	import googlelog "google/base/go/log" // Google-specific package
414
415 Every company might have its own `log` package but
416 there is no need to make the package name unique.
417 Quite the opposite: Go style suggests keeping package names short and clear
418 and obvious in preference to worrying about collisions.
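
The same situation arises inside the standard library, which has two packages named `rand`. The sketch below imports both, renaming one locally; the identifier `crand` and the function `sample` are choices made for this example, not a convention required by the language.

```go
package main

import (
	crand "crypto/rand" // renamed locally to avoid the clash
	"fmt"
	"math/rand"
)

// sample uses both packages in one source file.
func sample() (int, []byte, error) {
	n := rand.Intn(10) // rand is math/rand
	buf := make([]byte, 4)
	if _, err := crand.Read(buf); err != nil { // crand is crypto/rand
		return 0, nil, err
	}
	return n, buf, nil
}

func main() {
	n, buf, err := sample()
	if err != nil {
		fmt.Println("crypto/rand failed:", err)
		return
	}
	fmt.Println(n >= 0 && n < 10, len(buf))
}
```

The rename affects only this source file; other files in the same package are free to choose differently.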
419
420 Another example: there are many `server` packages in Google's code base.
421
422 * Remote packages
423
424 An important property of Go's package system is that the package path,
425 being in general an arbitrary string, can be co-opted to refer to remote
426 repositories by having it identify the URL of the site serving the repository.
427
428 Here is how to use the `doozer` package from `github`. The `go` `get` command
429 uses the `go` build tool to fetch the repository from the site and install it.
430 Once installed, it can be imported and used like any regular package.
431
	$ go get github.com/4ad/doozer // Shell command to fetch package

	import "github.com/4ad/doozer" // Doozer client's import statement

	var client doozer.Conn         // Client's use of package
437
438 It's worth noting that the `go` `get` command downloads dependencies
439 recursively, a property made possible only because the dependencies are
440 explicit.
441 Also, the allocation of the space of import paths is delegated to URLs,
442 which makes the naming of packages decentralized and therefore scalable,
443 in contrast to centralized registries used by other languages.
444
445 * Syntax
446
447 Syntax is the user interface of a programming language. Although it has
448 limited effect on the semantics of the language, which is arguably the
449 more important component, syntax determines the readability and hence
450 clarity of the language. Also, syntax is critical to tooling: if the language
451 is hard to parse, automated tools are hard to write.
452
453 Go was therefore designed with clarity and tooling in mind, and has
454 a clean syntax.
455 Compared to other languages in the C family, its
456 grammar is modest in size, with only 25 keywords (C99 has
457 37; C++11 has 84; the numbers continue to grow).
458 More important,
459 the grammar is regular and therefore easy to parse (mostly; there
460 are a couple of quirks we might have fixed but didn't discover early
461 enough).
462 Unlike C and Java and especially C++, Go can be parsed without
463 type information or a symbol table;
464 there is no type-specific context. The grammar is
465 easy to reason about and therefore tools are easy to write.
466
467 One of the details of Go's syntax that surprises C programmers is that
468 the declaration syntax is closer to Pascal's than to C's.
469 The declared name appears before the type and there are more keywords:
470
	var fn func([]int) int
	type T struct { a, b int }
473
474 as compared to C's
475
	int (*fn)(int[]);
	struct T { int a, b; };
478
479 Declarations introduced by keyword are easier to parse both for people and
480 for computers, and having the type syntax not be the expression syntax
481 as it is in C has a significant effect on parsing: it adds grammar
482 but eliminates ambiguity.
483 But there is a nice side effect, too: for initializing declarations,
484 one can drop the `var` keyword and just take the type of the variable
485 from that of the expression. These two declarations are equivalent;
486 the second is shorter and idiomatic:
487
	var buf *bytes.Buffer = bytes.NewBuffer(x) // explicit
	buf := bytes.NewBuffer(x)                  // derived
490
491 There is a blog post at [[/s/decl-syntax][go.dev/s/decl-syntax]] with more detail about the syntax of declarations in Go and
492 why it is so different from C.
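
A runnable sketch of these declaration forms might look as follows. The names `pair` and `sum` are hypothetical, introduced only so the declarations have something to refer to.

```go
package main

import (
	"bytes"
	"fmt"
)

// The name comes first, then the type, read left to right:
// fn is a func taking a slice of int and returning an int.
var fn func([]int) int

// pair is a hypothetical struct type; the keyword type introduces it.
type pair struct{ a, b int }

// sum adds the elements of xs.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fn = sum
	p := pair{a: 1, b: 2}
	fmt.Println(fn([]int{p.a, p.b})) // prints 3

	var explicit *bytes.Buffer = bytes.NewBufferString("go") // type written out
	derived := bytes.NewBufferString("go")                   // type taken from the expression
	fmt.Printf("%T %T\n", explicit, derived)                 // both are *bytes.Buffer
}
```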
493
494 Function syntax is straightforward for simple functions.
495 This example declares the function `Abs`, which accepts a single
496 variable `x` of type `T` and returns a single `float64` value:
497
	func Abs(x T) float64
499
500 A method is just a function with a special parameter, its _receiver_,
501 which can be passed to the function using the standard "dot" notation.
502 Method declaration syntax places the receiver in parentheses before the
503 function name. Here is the same function, now as a method of type `T`:
504
	func (x T) Abs() float64
506
507 And here is a variable (closure) with a type `T` argument; Go has first-class
508 functions and closures:
509
	negAbs := func(x T) float64 { return -Abs(x) }
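
Putting the three forms together gives a complete program. The article does not say what `T` is, so the sketch below assumes, purely for illustration, that it is a numeric type based on `float64`.

```go
package main

import "fmt"

// T is a hypothetical numeric type chosen for this illustration.
type T float64

// Abs as a plain function: the value is an ordinary parameter.
func Abs(x T) float64 {
	if x < 0 {
		return float64(-x)
	}
	return float64(x)
}

// Abs as a method: the value is the receiver, named x here.
// A method and a package-level function may share a name.
func (x T) Abs() float64 {
	return Abs(x)
}

func main() {
	// A closure stored in a variable.
	negAbs := func(x T) float64 { return -Abs(x) }

	v := T(-2.5)
	fmt.Println(Abs(v), v.Abs(), negAbs(v)) // prints 2.5 2.5 -2.5
}
```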
511
512 Finally, in Go functions can return multiple values. A common case is to
513 return the function result and an `error` value as a pair, like this:
514
	func ReadByte() (c byte, err error)

	c, err := ReadByte()
	if err != nil { ... }
519
520 We'll talk more about errors later.
521
522 One feature missing from Go is that it
523 does not support default function arguments. This was a deliberate
524 simplification. Experience tells us that defaulted arguments make it
525 too easy to patch over API design flaws by adding more arguments,
526 resulting in too many arguments with interactions that are
527 difficult to disentangle or even understand.
528 The lack of default arguments requires more functions or methods to be defined,
529 as one function cannot hold the entire interface,
530 but that leads to a clearer API that is easier to understand.
531 Those functions all need separate names, too, which makes it clear
532 which combinations exist, as well as encouraging more
533 thought about naming, a critical aspect of clarity and readability.
534
535 One mitigating factor for the lack of default arguments is that Go
536 has easy-to-use, type-safe support for variadic functions.
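
A minimal sketch of a variadic function follows; the name `total` is an assumption chosen for the example. The trailing parameter accepts zero or more arguments of a single type, so the compiler can check every call.

```go
package main

import "fmt"

// total adds any number of int arguments, including none.
// Inside the function, values is an ordinary []int.
func total(values ...int) int {
	t := 0
	for _, v := range values {
		t += v
	}
	return t
}

func main() {
	fmt.Println(total())        // prints 0
	fmt.Println(total(1, 2, 3)) // prints 6

	xs := []int{4, 5}
	fmt.Println(total(xs...)) // an existing slice can be passed with ...; prints 9
}
```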
537
538 * Naming
539
540 Go takes an unusual approach to defining the _visibility_ of an identifier,
541 the ability for a client of a package to use the item named by the identifier.
542 Unlike, for instance, `private` and `public` keywords, in Go the name itself
543 carries the information: the case of the initial letter of the identifier
544 determines the visibility. If the initial character is an upper case letter,
545 the identifier is _exported_ (public); otherwise it is not:
546
- upper case initial letter: `Name` is visible to clients of the package
- otherwise: `name` (or `_Name`) is not visible to clients of the package
549
550 This rule applies to variables, types, functions, methods, constants, fields...
551 everything. That's all there is to it.
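
Because the rule depends only on the spelling of the name, a tool can apply it without consulting any declaration. The sketch below uses `ast.IsExported` from the standard `go/ast` package; the wrapper `exported` is a hypothetical helper added for the example.

```go
package main

import (
	"fmt"
	"go/ast"
)

// exported reports whether name would be visible to clients of its package.
func exported(name string) bool {
	return ast.IsExported(name)
}

func main() {
	for _, name := range []string{"Name", "name", "_Name"} {
		fmt.Println(name, exported(name))
	}
	// Prints:
	// Name true
	// name false
	// _Name false
}
```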
552
553 This was not an easy design decision.
554 We spent over a year struggling to
555 define the notation to specify an identifier's visibility.
556 Once we settled on using the case of the name, we soon realized it had
become one of the most important properties of the language.
558 The name is, after all, what clients of the package use; putting
559 the visibility in the name rather than its type means that it's always
560 clear when looking at an identifier whether it is part of the public API.
561 After using Go for a while, it feels burdensome when going back to
562 other languages that require looking up the declaration to discover
563 this information.
564
565 The result is, again, clarity: the program source text expresses the
566 programmer's meaning simply.
567
568 Another simplification is that Go has a very compact scope hierarchy:
569
570 - universe (predeclared identifiers such as `int` and `string`)
571 - package (all the source files of a package live at the same scope)
572 - file (for package import renames only; not very important in practice)
573 - function (the usual)
574 - block (the usual)
575
576 There is no scope for name space or class or other wrapping
577 construct. Names come from very few places in Go, and all names
578 follow the same scope hierarchy: at any given location in the source,
579 an identifier denotes exactly one language object, independent of how
580 it is used. (The only exception is statement labels, the targets of `break`
581 statements and the like; they always have function scope.)
582
583 This has consequences for clarity. Notice for instance that methods
584 declare an explicit receiver and that it must be used to access fields and
585 methods of the type. There is no implicit `this`. That is, one always
586 writes
587
	rcvr.Field
589
590 (where rcvr is whatever name is chosen for the receiver variable)
591 so all the elements of the type always appear lexically bound to
592 a value of the receiver type. Similarly, a package qualifier is always present
593 for imported names; one writes `io.Reader` not `Reader`.
594 Not only is this clear, it frees up the identifier `Reader` as a useful
595 name to be used in any package. There are in fact multiple exported
596 identifiers in the standard library with name `Reader`, or `Printf`
597 for that matter, yet which one is being referred to is always unambiguous.
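
As a small illustration of both points, here is a sketch in which the type `Counter` and the helper `size` are hypothetical names invented for the example; only `strings.Reader`, `bytes.Reader`, and the `io` functions come from the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// Counter is a hypothetical type used only for this illustration.
type Counter struct {
	n int
}

// Add names its receiver c explicitly; there is no implicit "this".
func (c *Counter) Add(delta int) int {
	c.n += delta // the field is always reached through the receiver
	return c.n
}

// size drains r and reports how many bytes it held (-1 on error).
func size(r io.Reader) int {
	data, err := io.ReadAll(r)
	if err != nil {
		return -1
	}
	return len(data)
}

func main() {
	c := &Counter{}
	c.Add(2)
	fmt.Println(c.Add(3)) // 5

	// Two unrelated exported types are both named Reader; the package
	// qualifier makes it unambiguous which one is meant.
	var sr *strings.Reader = strings.NewReader("hello")
	var br *bytes.Reader = bytes.NewReader([]byte("hi"))
	fmt.Println(size(sr), size(br)) // 5 2
}
```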
598
599 Finally, these rules combine to guarantee that, other than the top-level
600 predefined names such as `int`, (the first component of) every name is
601 always declared in the current package.
602
603 In short, names are local. In C, C++, or Java the name `y` could refer to anything.
604 In Go, `y` (or even `Y`) is always defined within the package,
605 while the interpretation of `x.Y` is clear: find `x` locally, `Y` belongs to it.
606
607 These rules provide an important property for scaling because they guarantee
608 that adding an exported name to a package can never break a client
609 of that package. The naming rules decouple packages, providing
610 scaling, clarity, and robustness.
611
612 There is one more aspect of naming to be mentioned: method lookup
613 is always by name only, not by signature (type) of the method.
614 In other words, a single type can never have two methods with the same name.
615 Given a method `x.M`, there's only ever one `M` associated with `x`.
616 Again, this makes it easy to identify which method is referred to given
617 only the name.
618 It also makes the implementation of method invocation simple.
619
620 * Semantics
621
622 The semantics of Go statements is generally C-like. It is a compiled, statically typed,
623 procedural language with pointers and so on. By design, it should feel
624 familiar to programmers accustomed to languages in the C family.
625 When launching a new language
626 it is important that the target audience be able to learn it quickly; rooting Go
627 in the C family helps make sure that young programmers, most of whom
628 know Java, JavaScript, and maybe C, find Go easy to learn.
629
630 That said, Go makes many small changes to C semantics, mostly in the
631 service of robustness. These include:
632
633 - there is no pointer arithmetic
634 - there are no implicit numeric conversions
635 - array bounds are always checked
636 - there are no type aliases (after `type X int`, `X` and `int` are distinct types, not aliases)
637 - `++` and `--` are statements not expressions
638 - assignment is not an expression
639 - it is legal (encouraged even) to take the address of a stack variable
640 - and many more
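
A few of these differences can be seen in a short program. This is only a sketch: the names `newCounter`, `average`, `safeIndex`, and `Celsius` are made up for the example.

```go
package main

import "fmt"

// newCounter returns the address of a local variable. This is legal in Go:
// the variable lives as long as something refers to it.
func newCounter() *int {
	n := 0
	return &n
}

// average needs explicit conversions; int and float64 never mix implicitly.
func average(sum, count int) float64 {
	return float64(sum) / float64(count)
}

// safeIndex reports whether s[i] could be read, recovering from the
// run-time panic that an out-of-range index produces.
func safeIndex(s []int, i int) (v int, ok bool) {
	defer func() {
		if recover() != nil {
			v, ok = 0, false
		}
	}()
	return s[i], true
}

// Celsius is a distinct type from float64, not an alias for it.
type Celsius float64

func main() {
	p := newCounter()
	*p++ // ++ is a statement, so "x := *p++" would not compile
	fmt.Println(*p)

	fmt.Println(average(7, 2))

	fmt.Println(safeIndex([]int{1, 2, 3}, 5))

	var c Celsius = 36.6
	fmt.Println(float64(c) + 0.4) // mixing Celsius and float64 needs a conversion
}
```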
641
642 There are some much bigger changes too, stepping far from the traditional
643 C, C++, and even Java models. These include linguistic support for:
644
645 - concurrency
646 - garbage collection
647 - interface types
648 - reflection
649 - type switches
650
651 The following sections provide brief discussions of two of these topics in Go,
652 concurrency and garbage collection,
653 mostly from a software engineering perspective.
654 For a full discussion of the language semantics and uses see the many
655 resources on the [[/][go.dev]] web site.
656
657 * Concurrency
658
659 Concurrency is important to the modern computing environment with its
660 multicore machines running web servers with multiple clients,
661 what might be called the typical Google program.
662 This kind of software is not especially well served by C++ or Java,
663 which lack sufficient concurrency support at the language level.
664
665 Go embodies a variant of CSP with first-class channels.
666 CSP was chosen partly due to familiarity (one of us had worked on
667 predecessor languages that built on CSP's ideas), but also because
668 CSP has the property that it is easy to add to a procedural programming
669 model without profound changes to that model.
670 That is, given a C-like language, CSP can be added to the language
671 in a mostly orthogonal way, providing extra expressive power without
672 constraining the language's other uses. In short, the rest of the
673 language can remain "ordinary".
674
675 The approach is thus the composition of independently executing
676 functions of otherwise regular procedural code.
677
678 The resulting language allows us to couple concurrency with computation
679 smoothly. Consider a web server that must verify security certificates for
680 each incoming client call; in Go it is easy to construct the software using
681 CSP to manage the clients as independently executing procedures but
682 to have the full power of an efficient compiled language available for
683 the expensive cryptographic calculations.
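
The shape of such a program can be sketched in a few lines. The code below is an illustration under stated assumptions, not a real server: `verify` is a hypothetical stand-in for the certificate check, using a SHA-256 digest as the "expensive" work, and `verifyAll` plays the part of the dispatcher.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// result pairs a client number with the digest computed for it.
type result struct {
	client int
	sum    [sha256.Size]byte
}

// verify stands in for an expensive cryptographic check (hypothetical).
// It runs as ordinary procedural code and reports on a channel.
func verify(client int, payload []byte, out chan<- result) {
	out <- result{client, sha256.Sum256(payload)}
}

// verifyAll starts one goroutine per payload and collects every result.
func verifyAll(payloads [][]byte) map[int][sha256.Size]byte {
	out := make(chan result)
	for i, p := range payloads {
		go verify(i, p, out)
	}
	sums := make(map[int][sha256.Size]byte)
	for range payloads {
		r := <-out
		sums[r.client] = r.sum
	}
	return sums
}

func main() {
	sums := verifyAll([][]byte{[]byte("alice"), []byte("bob")})
	for client, sum := range sums {
		fmt.Printf("client %d: %x\n", client, sum[:4])
	}
}
```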
684
685 In summary, CSP is practical for Go and for Google. When writing
686 a web server, the canonical Go program, the model is a great fit.
687
688 There is one important caveat: Go is not purely memory safe in the presence
689 of concurrency. Sharing is legal and passing a pointer over a channel is idiomatic
690 (and efficient).
691
692 Some concurrency and functional programming experts are disappointed
693 that Go does not take a write-once approach to value semantics
694 in the context of concurrent computation, that Go is not more like
695 Erlang for example.
696 Again, the reason is largely about familiarity and suitability for the
697 problem domain. Go's concurrent features work well in a context
698 familiar to most programmers.
699 Go _enables_ simple, safe concurrent
700 programming but does not _forbid_ bad programming.
701 We compensate by convention, training programmers to think
702 about message passing as a version of ownership control. The motto is,
703 "Don't communicate by sharing memory, share memory by communicating."
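
The convention might look like this in practice. In this sketch (the `job` type and the function names are hypothetical), a pointer is sent over a channel and the sender does not touch the value afterwards; nothing in the language enforces that, which is exactly the caveat above.

```go
package main

import "fmt"

// job is a hypothetical unit of work; ownership travels with the pointer.
type job struct {
	data  []int
	total int
}

// summer owns each job from the moment it receives it until it sends it on.
func summer(in <-chan *job, out chan<- *job) {
	for j := range in {
		for _, v := range j.data {
			j.total += v
		}
		out <- j
	}
	close(out)
}

// sumAll feeds jobs to a summer goroutine and gathers the totals in order.
func sumAll(inputs [][]int) []int {
	in := make(chan *job)
	out := make(chan *job)
	go summer(in, out)
	go func() {
		for _, d := range inputs {
			// By convention, this goroutine gives up the job once it is sent.
			in <- &job{data: d}
		}
		close(in)
	}()
	var totals []int
	for j := range out {
		totals = append(totals, j.total)
	}
	return totals
}

func main() {
	fmt.Println(sumAll([][]int{{1, 2, 3}, {10, 20}})) // [6 30]
}
```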
704
705 Our limited experience with programmers new to both Go and concurrent
706 programming shows that this is a practical approach. Programmers
707 enjoy the simplicity that support for concurrency brings to network
708 software, and simplicity engenders robustness.
709
710 * Garbage collection
711
712 For a systems language, garbage collection can be a controversial feature,
713 yet we spent very little time deciding that Go would be a
714 garbage-collected language.
715 Go has no explicit memory-freeing operation: the only way allocated
716 memory returns to the pool is through the garbage collector.
717
718 It was an easy decision to make because memory management
719 has a profound effect on the way a language works in practice.
720 In C and C++, too much programming effort is spent on memory allocation
721 and freeing.
722 The resulting designs tend to expose details of memory management
723 that could well be hidden; conversely memory considerations
724 limit how they can be used. By contrast, garbage collection makes interfaces
725 easier to specify.
726
727 Moreover, in a concurrent object-oriented language it's almost essential
728 to have automatic memory management because the ownership of a piece
729 of memory can be tricky to manage as it is passed around among concurrent
730 executions. It's important to separate behavior from resource management.
731
732 The language is much easier to use because of garbage collection.
733
734 Of course, garbage collection brings significant costs: general overhead,
735 latency, and complexity of the implementation. Nonetheless, we believe
736 that the benefits, which are mostly felt by the programmer, outweigh
737 the costs, which are largely borne by the language implementer.
738
739 Experience with Java in particular as a server language has made some
740 people nervous about garbage collection in a user-facing system.
741 The overheads are uncontrollable, latencies can be large, and much
742 parameter tuning is required for good performance.
743 Go, however, is different. Properties of the language mitigate some of these
744 concerns. Not all of them of course, but some.
745
746 The key point is that Go gives the programmer tools to limit allocation
747 by controlling the layout of data structures. Consider this simple
748 type definition of a data structure containing a buffer (array) of bytes:
749
750 type X struct {
751 a, b, c int
752 buf [256]byte
753 }
754
755 In Java, the `buf` field would require a second allocation and accesses
756 to it a second level of indirection. In Go, however, the buffer is allocated
757 in a single block of memory along with the containing struct and no
758 indirection is required. For systems programming, this design can
759 improve performance as well as reduce the number
760 of items known to the collector. At scale it can make a significant
761 difference.
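
One way to observe the layout is with package `unsafe`, as in the sketch below. The helpers `sizeOfX` and `bufOffset` are hypothetical names for the example, and the exact numbers depend on the platform; with 8-byte `int` values one would expect 280 and 24.

```go
package main

import (
	"fmt"
	"unsafe"
)

type X struct {
	a, b, c int
	buf     [256]byte
}

// sizeOfX reports the size of one X value, buffer included.
func sizeOfX() uintptr { return unsafe.Sizeof(X{}) }

// bufOffset reports how far into an X value the buf field begins.
func bufOffset() uintptr { return unsafe.Offsetof(X{}.buf) }

func main() {
	// The buffer is part of the struct itself, so the struct is at least
	// 256 bytes and buf begins right after the three ints.
	fmt.Println(sizeOfX(), bufOffset())
}
```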
762
763 As a more direct example, in Go it is easy and efficient to provide
764 second-order allocators, for instance an arena allocator that allocates
765 a large array of structs and links them together with a free list.
766 Libraries that repeatedly use many small structures like this can,
767 with modest prearrangement, generate no garbage yet
768 be efficient and responsive.
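
A minimal sketch of such an allocator follows. The `node` and `arena` types are hypothetical, and the code ignores concurrency and growth to keep the idea visible: one array is allocated up front, and nodes are recycled through a free list.

```go
package main

import "fmt"

// node is a hypothetical small structure that a library uses repeatedly.
type node struct {
	value int
	next  *node // links free nodes together
}

// arena hands out nodes from one preallocated array.
type arena struct {
	nodes []node
	free  *node
}

// newArena allocates n nodes in a single block and threads them on a free list.
func newArena(n int) *arena {
	a := &arena{nodes: make([]node, n)}
	for i := range a.nodes {
		a.nodes[i].next = a.free
		a.free = &a.nodes[i]
	}
	return a
}

// get returns a free node, or nil if the arena is exhausted.
func (a *arena) get() *node {
	n := a.free
	if n != nil {
		a.free = n.next
		n.next = nil
	}
	return n
}

// put clears a node and returns it to the free list for reuse.
func (a *arena) put(n *node) {
	n.value = 0
	n.next = a.free
	a.free = n
}

func main() {
	a := newArena(2)
	n1, n2 := a.get(), a.get()
	fmt.Println(n1 != n2)       // true: two distinct nodes
	fmt.Println(a.get() == nil) // true: the arena is exhausted
	a.put(n1)
	fmt.Println(a.get() == n1) // true: the node is reused, not reallocated
}
```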
769
770 Therefore, although Go is a garbage-collected language, a knowledgeable
771 programmer can limit the pressure placed on the collector and thereby
772 improve performance. (Also, the Go installation comes with good tools
773 for studying the dynamic memory performance of a running program.)
774
775 To give the programmer this flexibility, Go must support
776 what we call _interior_pointers_ to objects
777 allocated in the heap. The `X.buf` field in the example above lives
778 within the struct but it is legal to capture the address of this inner field,
779 for instance to pass it to an I/O routine. In Java, as in many garbage-collected
780 languages, it is not possible to construct an interior pointer like this,
781 but in Go it is idiomatic.
782 This design point affects which collection algorithms can be used,
783 and may make them more difficult, but after careful thought we decided
784 that it was necessary to allow interior pointers because of the benefits
785 to the programmer and the ability to reduce pressure on the (perhaps
786 harder to implement) collector.
787 So far, our experience comparing similar Go and Java programs shows
788 that use of interior pointers can have a significant effect on total arena size,
789 latency, and collection times.
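
To make the idea concrete, here is a sketch that reads data straight into the embedded buffer. The helpers `fill` and `fillFromString` are hypothetical; `x.buf[:]` and `&x.buf[0]` are the interior pointers.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

type X struct {
	a, b, c int
	buf     [256]byte
}

// fill reads from r directly into the buffer embedded in x.
// The slice x.buf[:] points into the middle of *x.
func fill(x *X, r io.Reader) (int, error) {
	return r.Read(x.buf[:])
}

// fillFromString is a convenience wrapper used by the example.
func fillFromString(x *X, s string) int {
	n, _ := fill(x, strings.NewReader(s))
	return n
}

func main() {
	x := new(X)
	n := fillFromString(x, "hello")
	fmt.Println(n, string(x.buf[:n])) // 5 hello

	p := &x.buf[0] // the address of a single byte inside the struct
	*p = 'J'
	fmt.Println(string(x.buf[:n])) // Jello
}
```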
790
791 In summary, Go is garbage collected but gives the programmer
792 some tools to control collection overhead.
793
794 The garbage collector remains an active area of development.
795 The current design is a parallel mark-and-sweep collector and there remain
796 opportunities to improve its performance or perhaps even its design.
797 (The language specification does not mandate any particular implementation
798 of the collector.)
799 Still, if the programmer takes care to use memory wisely,
800 the current implementation works well for production use.
801
802 * Composition not inheritance
803
804 Go takes an unusual approach to object-oriented programming, allowing
805 methods on any type, not just classes, but without any form of type-based inheritance
806 like subclassing.
807 This means there is no type hierarchy.
808 This was an intentional design choice.
809 Although type hierarchies have been used to build much successful
810 software, it is our opinion that the model has been overused and that it
811 is worth taking a step back.
812
813 Instead, Go has _interfaces_, an idea that has been discussed at length elsewhere (see
814 [[http://research.swtch.com/interfaces]]
815 for example), but here is a brief summary.
816
817 In Go an interface is _just_ a set of methods. For instance, here is the definition
818 of the `Hash` interface from the standard library.
819
820 type Hash interface {
821 Write(p []byte) (n int, err error)
822 Sum(b []byte) []byte
823 Reset()
824 Size() int
825 BlockSize() int
826 }
827
828 All data types that implement these methods satisfy this interface implicitly;
829 there is no `implements` declaration.
830 That said, interface satisfaction is statically checked at compile time
831 so despite this decoupling interfaces are type-safe.
832
833 A type will usually satisfy many interfaces, each corresponding
834 to a subset of its methods. For example, any type that satisfies the `Hash`
835 interface also satisfies the `Writer` interface:
836
837 type Writer interface {
838 Write(p []byte) (n int, err error)
839 }
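
A short sketch may help. The type `byteCounter` below is hypothetical; it never mentions `io.Writer`, yet it can be used wherever one is expected. The blank-identifier declarations are one common way to ask the compiler to confirm satisfaction.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"hash"
	"io"
)

// byteCounter is a hypothetical type that counts the bytes written to it.
type byteCounter struct{ n int }

// Write is all it takes to satisfy io.Writer; nothing is declared.
func (c *byteCounter) Write(p []byte) (n int, err error) {
	c.n += len(p)
	return len(p), nil
}

// Compile-time checks: both assignments fail to compile if the type on
// the right does not have the methods io.Writer requires.
var (
	_ io.Writer = (*byteCounter)(nil)
	_ io.Writer = hash.Hash(nil)
)

// formattedLen reports how many bytes Fprintf produces for the values.
func formattedLen(id int, name string) int {
	var c byteCounter
	fmt.Fprintf(&c, "%d-%s", id, name)
	return c.n
}

func main() {
	fmt.Println(formattedLen(42, "go")) // 5

	h := sha256.New() // a hash.Hash, used here purely as an io.Writer
	io.WriteString(h, "hello")
	fmt.Printf("%x\n", h.Sum(nil)[:4])
}
```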
840
841 This fluidity of interface satisfaction encourages a different approach
842 to software construction. But before explaining that, we should explain
843 why Go does not have subclassing.
844
845 Object-oriented programming provides a powerful insight: that the
846 _behavior_ of data can be generalized independently of the
847 _representation_ of that data.
848 The model works best when the behavior (method set) is fixed,
849 but once you subclass a type and add a method,
850 _the_behaviors_are_no_longer_identical_.
851 If instead the set of behaviors is fixed, such as in Go's statically
852 defined interfaces, the uniformity of behavior enables data and
853 programs to be composed uniformly, orthogonally, and safely.
854
855 One extreme example is the Plan 9 kernel, in which all system data items
856 implemented exactly the same interface, a file system API defined
857 by 14 methods.
858 This uniformity permitted a level of object composition seldom
859 achieved in other systems, even today.
860 Examples abound. Here's one: A system could import (in Plan 9 terminology) a TCP
861 stack to a computer that didn't have TCP or even Ethernet, and over that network
862 connect to a machine with a different CPU architecture, import its `/proc` tree,
863 and run a local debugger to do breakpoint debugging of the remote process.
864 This sort of operation was workaday on Plan 9, nothing special at all.
865 The ability to do such things fell out of the design; it required no special
866 arrangement (and was all done in plain C).
867
868 We argue that this compositional style of system construction has been
869 neglected by the languages that push for design by type hierarchy.
870 Type hierarchies result in brittle code.
871 The hierarchy must be designed early, often as the first step of
872 designing the program, and early decisions can be difficult to change once
873 the program is written.
874 As a consequence, the model encourages early overdesign as the
875 programmer tries to predict every possible use the software might
876 require, adding layers of type and abstraction just in case.
877 This is upside down.
878 The way pieces of a system interact should adapt as it grows,
879 not be fixed at the dawn of time.
880
881 Go therefore encourages _composition_ over inheritance, using
882 simple, often one-method interfaces to define trivial behaviors
883 that serve as clean, comprehensible boundaries between components.
884
885 Consider the `Writer` interface shown above, which is defined in
886 package `io`: Any item that has a `Write` method with this
887 signature works well with the complementary `Reader` interface:
888
889 type Reader interface {
890 Read(p []byte) (n int, err error)
891 }
892
893 These two complementary methods allow type-safe chaining
894 with rich behaviors, like generalized Unix pipes.
895 Files, buffers, networks,
896 encryptors, compressors, image encoders, and so on can all be
897 connected together.
898 The `Fprintf` formatted I/O routine takes an `io.Writer` rather than,
899 as in C, a `FILE*`.
900 The formatted printer has no knowledge of what it is writing to; it may
901 be an image encoder that is in turn writing to a compressor that
902 is in turn writing to an encryptor that is in turn writing to a network
903 connection.
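
As a sketch of such a chain, the program below formats text through a gzip compressor into an in-memory buffer and then reads it back through a decompressor. The function name `roundTrip` is hypothetical; the packages are from the standard library.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// roundTrip formats a message through a compressor into a buffer, then
// reads it back through a decompressor. Fprintf knows only that it is
// writing to an io.Writer.
func roundTrip(name string, n int) (string, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := fmt.Fprintf(zw, "hello, %s: %d", name, n); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil { // flush the compressed stream
		return "", err
	}

	zr, err := gzip.NewReader(&buf)
	if err != nil {
		return "", err
	}
	data, err := io.ReadAll(zr)
	return string(data), err
}

func main() {
	s, err := roundTrip("gopher", 7)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s) // hello, gopher: 7
}
```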
904
905 Interface composition is a different style of programming, and
906 people accustomed to type hierarchies need to adjust their thinking to
907 do it well, but the result is an adaptability of
908 design that is harder to achieve through type hierarchies.
909
910 Note too that the elimination of the type hierarchy also eliminates
911 a form of dependency hierarchy.
912 Interface satisfaction allows the program to grow organically without
913 predetermined contracts.
914 And it is a linear form of growth; a change to an interface affects
915 only the immediate clients of that interface; there is no subtree to update.
916 The lack of `implements` declarations disturbs some people but
917 it enables programs to grow naturally, gracefully, and safely.
918
919 Go's interfaces have a major effect on program design.
920 One place we see this is in the use of functions that take interface
921 arguments. These are _not_ methods, they are functions.
922 Some examples should illustrate their power.
923 `ReadAll` returns a byte slice (array) holding all the data that can
924 be read from an `io.Reader`:
925
926 func ReadAll(r io.Reader) ([]byte, error)
927
928 Wrappers—functions that take an interface and return an interface—are
929 also widespread.
930 Here are some prototypes.
931 `LoggingReader` logs every `Read` call on the incoming `Reader`.
932 `LimitingReader` stops reading after `n` bytes.
933 `ErrorInjector` aids testing by simulating I/O errors.
934 And there are many more.
935
936 func LoggingReader(r io.Reader) io.Reader
937 func LimitingReader(r io.Reader, n int64) io.Reader
938 func ErrorInjector(r io.Reader) io.Reader
939
940 The designs are nothing like hierarchical, subtype-inherited methods.
941 They are looser (even _ad_hoc_), organic, decoupled, independent, and therefore scalable.
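
To show how little code a wrapper needs, here is one possible sketch of two of the prototypes above. The implementations are assumptions made for illustration (the logger simply uses package `log`), and `firstN` is a hypothetical helper that stacks the two wrappers.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"strings"
)

// limitingReader stops after n bytes have been read from r.
type limitingReader struct {
	r io.Reader
	n int64 // bytes still allowed
}

func (l *limitingReader) Read(p []byte) (int, error) {
	if l.n <= 0 {
		return 0, io.EOF
	}
	if int64(len(p)) > l.n {
		p = p[:l.n]
	}
	n, err := l.r.Read(p)
	l.n -= int64(n)
	return n, err
}

// LimitingReader wraps r so that at most n bytes can be read.
func LimitingReader(r io.Reader, n int64) io.Reader {
	return &limitingReader{r: r, n: n}
}

// loggingReader logs every Read call made on r.
type loggingReader struct{ r io.Reader }

func (l loggingReader) Read(p []byte) (int, error) {
	n, err := l.r.Read(p)
	log.Printf("Read(len %d) = %d, %v", len(p), n, err)
	return n, err
}

// LoggingReader wraps r so that each Read is logged.
func LoggingReader(r io.Reader) io.Reader { return loggingReader{r} }

// firstN returns at most n bytes of s, read through both wrappers.
func firstN(s string, n int64) string {
	r := LoggingReader(LimitingReader(strings.NewReader(s), n))
	data, err := io.ReadAll(r)
	if err != nil {
		return ""
	}
	return string(data)
}

func main() {
	fmt.Println(firstN("hello, world", 5)) // hello
}
```

Neither wrapper knows about the other, or about what it is ultimately reading from.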
942
943 * Errors
944
945 Go does not have an exception facility in the conventional sense,
946 that is, there is no control structure associated with error handling.
947 (Go does provide mechanisms for handling exceptional situations
948 such as division by zero. A pair of built-in functions
949 called `panic` and `recover` allow the programmer to protect
950 against such things. However, these functions
951 are intentionally clumsy, rarely used, and not integrated
952 into the library the way, say, Java libraries use exceptions.)
953
954 The key language feature for error handling is a pre-defined
955 interface type called `error` that represents a value that has an
956 `Error` method returning a string:
957
958 type error interface {
959 Error() string
960 }
961
962 Libraries use the `error` type to return a description of the error.
963 Combined with the ability for functions to return multiple
964 values, it's easy to return the computed result along with an
965 error value, if any.
966 For instance, the equivalent
967 to C's `getchar` does not return an out-of-band value at EOF,
968 nor does it throw an exception; it just returns an `error` value
969 alongside the character, with a `nil` `error` value signifying success.
970 Here is the signature of the `ReadByte` method of the buffered
971 I/O package's `bufio.Reader` type:
972
973 func (b *Reader) ReadByte() (c byte, err error)
974
975 This is a clear and simple design, easily understood.
976 Errors are just values and programs compute with
977 them as they would compute with values of any other type.
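
For instance, a loop over `ReadByte` treats end of file as just another value to compare against. The functions `countBytes` and `countString` in this sketch are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// countBytes reads r one byte at a time until end of file.
// io.EOF is an ordinary error value, tested with an ordinary comparison.
func countBytes(r io.Reader) (int, error) {
	br := bufio.NewReader(r)
	n := 0
	for {
		_, err := br.ReadByte()
		if err == io.EOF {
			return n, nil // a normal end of input, not a failure
		}
		if err != nil {
			return n, err
		}
		n++
	}
}

// countString is a convenience wrapper used by the example.
func countString(s string) int {
	n, _ := countBytes(strings.NewReader(s))
	return n
}

func main() {
	fmt.Println(countString("gopher")) // 6
}
```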
978
979 It was a deliberate choice not to incorporate exceptions in Go.
980 Although a number of critics disagree with this decision, there
981 are several reasons we believe it makes for better software.
982
983 First, there is nothing truly exceptional about errors in computer programs.
984 For instance, the inability to open a file is a common issue that
985 does not deserve special linguistic constructs; `if` and `return` are fine.
986
987 f, err := os.Open(fileName)
988 if err != nil {
989 return err
990 }
991
992 Also, if errors use special control structures, error handling distorts
993 the control flow for a program that handles errors.
994 The Java-like style of `try-catch-finally` blocks interlaces multiple overlapping flows
995 of control that interact in complex ways.
996 Although in contrast Go makes it more
997 verbose to check errors, the explicit design keeps the flow of control
998 straightforward—literally.
999
1000 There is no question the resulting code can be longer,
1001 but the clarity and simplicity of such code offsets its verbosity.
1002 Explicit error checking forces the programmer to think about
1003 errors—and deal with them—when they arise. Exceptions make
1004 it too easy to _ignore_ them rather than _handle_ them, passing
1005 the buck up the call stack until it is too late to fix the problem or
1006 diagnose it well.
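
Here is what the explicit style looks like in a complete function. This is a sketch only: `firstLine` is a hypothetical helper, and the path in `main` is deliberately one that is unlikely to exist.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// firstLine returns the first line of the named file.
// Each step that can fail is checked where the failure can occur.
func firstLine(fileName string) (string, error) {
	f, err := os.Open(fileName)
	if err != nil {
		return "", err
	}
	defer f.Close()

	s := bufio.NewScanner(f)
	if !s.Scan() {
		if err := s.Err(); err != nil {
			return "", err
		}
		return "", nil // empty file: no line, but no error either
	}
	return s.Text(), nil
}

func main() {
	line, err := firstLine("/path/that/probably/does/not/exist")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(line)
}
```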
1007
1008 * Tools
1009
1010 Software engineering requires tools.
1011 Every language operates in an environment with other languages
1012 and myriad tools to compile, edit, debug, profile, test, and run programs.
1013
1014 Go's syntax, package system, naming conventions, and other features
1015 were designed to make tools easy to write, and the library
1016 includes a lexer, parser, and type checker for the language.
1017
1018 Tools to manipulate Go programs are so easy to write that
1019 many such tools have been created,
1020 some with interesting consequences for software engineering.
1021
1022 The best known of these is `gofmt`, the Go source code formatter.
1023 From the beginning of the project, we intended Go programs
1024 to be formatted by machine, eliminating an entire class of argument
1025 between programmers: how do I lay out my code?
1026 `Gofmt` is run on all Go programs we write, and most of the open
1027 source community uses it too.
1028 It is run as a "presubmit" check for the code repositories to
1029 make sure that all checked-in Go programs are formatted the same.
1030
1031 `Gofmt` is often cited by users as one of Go's best features even
1032 though it is not part of the language.
1033 The existence and use of `gofmt` means that
1034 from the beginning, the community has always
1035 seen Go code as `gofmt` formats it, so Go programs have a single
1036 style that is now familiar to everyone. Uniform presentation
1037 makes code easier to read and therefore faster to work on.
1038 Time not spent on formatting is time saved.
1039 `Gofmt` also affects scalability: since all code looks the same,
1040 teams find it easier to work together or with others' code.
1041
1042 `Gofmt` enabled another class of tools that we did not foresee as clearly.
1043 The program works by parsing the source code and reformatting it
1044 from the parse tree itself.
1045 This makes it possible to _edit_ the parse tree before formatting it,
1046 so a suite of automatic refactoring tools sprang up.
1047 These are easy to write, can be semantically rich because they work
1048 directly on the parse tree, and automatically produce canonically
1049 formatted code.
1050
1051 The first example was a `-r` (rewrite) flag on `gofmt` itself, which
1052 uses a simple pattern-matching language to enable expression-level
1053 rewrites. For instance, one day we introduced a default value for the
1054 right-hand side of a slice expression: the length itself. The entire
1055 Go source tree was updated to use this default with the single
1056 command:
1057
1058 gofmt -r 'a[b:len(a)] -> a[b:]'
1059
1060 A key point about this transformation is that, because the input and
1061 output are both in the canonical format, the only changes made to
1062 the source code are semantic ones.
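
The same rewrite can be sketched by hand with the parsing and formatting packages in the standard library. This is not how `gofmt -r` is implemented; it is a simplified illustration that handles only the case where `a` is a plain identifier, and `simplifySlices` is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// simplifySlices rewrites a[b:len(a)] to a[b:] wherever a is a plain
// identifier, then prints the edited tree in canonical format.
func simplifySlices(src string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	ast.Inspect(file, func(n ast.Node) bool {
		s, ok := n.(*ast.SliceExpr)
		if !ok || s.High == nil || s.Slice3 {
			return true
		}
		x, ok := s.X.(*ast.Ident)
		if !ok {
			return true
		}
		call, ok := s.High.(*ast.CallExpr)
		if !ok || len(call.Args) != 1 {
			return true
		}
		fn, ok := call.Fun.(*ast.Ident)
		arg, ok2 := call.Args[0].(*ast.Ident)
		if ok && ok2 && fn.Name == "len" && arg.Name == x.Name {
			s.High = nil // drop the redundant upper bound
		}
		return true
	})
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	src := "package p\n\nfunc tail(a []int, b int) []int {\n\treturn a[b:len(a)]\n}\n"
	out, err := simplifySlices(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```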
1063
1064 A similar but more intricate process allowed `gofmt` to be used to
1065 update the tree when the language no longer required semicolons
1066 as statement terminators if the statement ended at a newline.
1067
1068 Another important tool is `gofix`, which runs tree-rewriting modules
1069 written in Go itself and are therefore capable of more advanced
1070 refactorings.
1071 The `gofix` tool allowed us to make sweeping changes to APIs and language
1072 features leading up to the release of Go 1, including a change to the syntax
1073 for deleting entries from a map, a radically different API for manipulating
1074 time values, and many more.
1075 As these changes rolled out, users could update all their code by running
1076 the simple command
1077
1078 gofix
1079
1080 Note that these tools allow us to _update_ code even if the old code still
1081 works.
1082 As a result, Go repositories are easy to keep up to date as libraries evolve.
1083 Old APIs can be deprecated quickly and automatically so only one version
1084 of the API needs to be maintained.
1085 For example, we recently changed Go's protocol buffer implementation to use
1086 "getter" functions, which were not in the interface before.
1087 We ran `gofix` on _all_ of Google's Go code to update all programs that
1088 use protocol buffers, and now there is only one version of the API in use.
1089 Similar sweeping changes to the C++ or Java libraries are almost infeasible
1090 at the scale of Google's code base.
1091
1092 The existence of a parsing package in the standard Go library has enabled
1093 a number of other tools as well. Examples include the `go` tool, which
1094 manages program construction including acquiring packages from
1095 remote repositories;
1096 the `godoc` document extractor;
1097 a program to verify that the API compatibility contract is maintained as
1098 the library is updated; and many more.
1099
1100 Although tools like these are rarely mentioned in the context of language
1101 design, they are an integral part of a language's ecosystem and the fact
1102 that Go was designed with tooling in mind has a huge effect on the
1103 development of the language, its libraries, and its community.
1104
1105 * Conclusion
1106
1107 Go's use is growing inside Google.
1108
1109 Several big user-facing services use it, including `youtube.com` and `dl.google.com`
1110 (the download server that delivers Chrome, Android and other downloads),
1111 as well as our own [[/][go.dev]].
1112 And of course many small ones do, mostly
1113 built using Google App Engine's native support for Go.
1114
1115 Many other companies use Go as well; the list is very long, but a few of the
1116 better known are:
1117
1118 - BBC Worldwide
1119 - Canonical
1120 - Heroku
1121 - Nokia
1122 - SoundCloud
1123
1124 It looks like Go is meeting its goals. Still, it's too early to declare it a success.
1125 We don't have enough experience yet, especially with big programs (millions
1126 of lines of code) to know whether the attempts to build a scalable language
1127 have paid off. All the indicators are positive though.
1128
1129 On a smaller scale, some minor things aren't quite right and might get
1130 tweaked in a later (Go 2?) version of the language. For instance, there are
1131 too many forms of variable declaration syntax, programmers are
1132 easily confused by the behavior of nil values inside non-nil interfaces,
1133 and there are many library and interface details that could use another
1134 round of design.
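
The confusion over nil values is easy to reproduce. In this sketch (the names `MyErr`, `mayFail`, and `mayFailFixed` are hypothetical), a nil pointer stored in an interface yields an interface value that is not itself nil.

```go
package main

import "fmt"

// MyErr is a hypothetical error type with a pointer receiver.
type MyErr struct{}

func (e *MyErr) Error() string { return "my error" }

// mayFail returns a nil *MyErr wrapped in an error interface.
// The interface value holds a type (*MyErr), so it is not nil.
func mayFail() error {
	var p *MyErr // nil pointer
	return p
}

// mayFailFixed returns a true nil interface value.
func mayFailFixed() error {
	return nil
}

func main() {
	fmt.Println(mayFail() == nil)      // false
	fmt.Println(mayFailFixed() == nil) // true
}
```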
1135
1136 It's worth noting, though, that `gofix` and `gofmt` gave us the opportunity to
1137 fix many other problems during the lead-up to Go version 1.
1138 Go as it is today is therefore much closer to what the designers wanted
1139 than it would have been without these tools, which were themselves
1140 enabled by the language's design.
1141
1142 Not everything was fixed, though. We're still learning (but the language
1143 is frozen for now).
1144
1145 A significant weakness of the language is that the implementation still
1146 needs work. The compilers' generated code and the performance of the
1147 runtime in particular should be better, and work continues on them.
1148 There is progress already; in fact some benchmarks show a
1149 doubling of performance with the development version today compared
1150 to the first release of Go version 1 early in 2012.
1151
1152 * Summary
1153
1154 Software engineering guided the design of Go.
1155 More than most general-purpose
1156 programming languages, Go was designed to address a set of software engineering
1157 issues that we had been exposed to in the construction of large server software.
1158 Offhand, that might make Go sound rather dull and industrial, but in fact
1159 the focus on clarity, simplicity and composability throughout the design
1160 instead resulted in a productive, fun language that many programmers
1161 find expressive and powerful.
1162
1163 The properties that led to that include:
1164
1165 - Clear dependencies
1166 - Clear syntax
1167 - Clear semantics
1168 - Composition over inheritance
1169 - Simplicity provided by the programming model (garbage collection, concurrency)
1170 - Easy tooling (the `go` tool, `gofmt`, `godoc`, `gofix`)
1171
1172 If you haven't tried Go already, we suggest you do.
1173
1174
1175 .link / go.dev
1176
1177 .image splash/appenginegophercolor.jpg
1178
1179