Go for C programmers
ACCU, Silicon Valley Chapter, Nov. 14, 2012
Robert Griesemer
Google Inc.
Robert Griesemer
Google Inc.
Part 1: Introduction to Go
Break
Part 2: Object-oriented and concurrent programming in Go
2Incomprehensible and unsafe code.
Extremely slow builds.
Missing concurrency support.
Outdated tools.
4Readable, safe, and efficient code.
A build system that scales.
Good concurrency support.
Tools that can operate at Google-scale.
5Compact, yet expressive
Statically typed and garbage collected
Object- but not type-oriented
Strong concurrency support
Efficient implementation
Rich standard library
Fast compilers
Scalable tools
6package main import "fmt" func main() { fmt.Println("Hello, 世界") }
Syntax is not important... - unless you are a programmer.
Rob Pike.
The readability of programs is immeasurably more important than their writeability.
Hints on Programming Language Design
C. A. R. Hoare 1973
scoped_ptr<logStats::LogStats> logStats(logStats::LogStats::NewLogStats(FLAGS_logStats, logStats::LogStats::kFIFO));
(n: Int) => (2 to n) |> (r => r.foldLeft(r.toSet)((ps, x) => if (ps(x)) ps -- (x * x to n by x) else ps))
t := time.Now() switch { case t.Hour() < 12: return "morning" case t.Hour() < 18: return "afternoon" default: return "evening" }
A Go program consists of packages.
A package consists of one or more source files (.go
files).
Each source file starts with a package clause followed by declarations.
package main import "fmt" func main() { fmt.Println("Hello, 世界") }
By convention, all files belonging to a single package are located in a single directory.
14Pattern: keyword names [type] [initializer]
import "fmt" const digits = "0123456789abcdef" type Point struct { x, y int tag string } var s [32]byte var msgs = []string{"Hello, 世界", "Ciao, Mondo"} func itoa(x, base int) string
p, q *Point func adder(delta int) func(x int) int
In Go, constants are mathematically precise.
There is no need for qualifiers (-42LL
, 7UL
, etc.)
const ( MaxUInt = 1<<64 - 1 Pi = 3.14159265358979323846264338327950288419716939937510582097494459 Pi2 = Pi * Pi Delta = 2.0 )
Only when used, constants are truncated to size:
// +build ignore,OMIT
package main
import "fmt"
// START1 OMIT
const (
MaxUInt = 1<<64 - 1
Pi = 3.14159265358979323846264338327950288419716939937510582097494459
Pi2 = Pi * Pi
Delta = 2.0
)
// STOP1 OMIT
func main() {
var x uint64 = MaxUInt var pi2 float32 = Pi2 var delta int = Delta
fmt.Println("x =", x)
fmt.Println("pi2 =", pi2)
fmt.Println("delta =", delta)
}
Huge win in readability and ease of use.
17
Familiar:
- Basic types, arrays, structs, pointers, functions.
But:
- string
is a basic type.
- No automatic conversion of basic types in expressions.
- No pointer arithmetic; pointers and arrays are different.
- A function type represents a function; context and all.
New:
- Slices instead of array pointers + separate length: []int
- Maps because everybody needs them: map[string]int
- Interfaces for polymorphism: interface{}
- Channels for communication between goroutines: chan
int
Familiar:
var i int var p, q *Point var threshold float64 = 0.75
New: Type can be inferred from type of initializing expression.
var i = 42 // type of i is int var z = 1 + 2.3i // type of z is complex128
Shortcut (inside functions):
i := 42 // type of i is int
It is safe to take the address of any variable:
return &i
Functions may have multiple return values:
func atoi(s string) (i int, err error)
Functions are first-class citizens (closures):
func adder(delta int) func(x int) int { f := func(x int) int { return x + delta } return f }
// +build ignore,OMIT
package main
import "fmt"
// START1 OMIT
func adder(delta int) func(x int) int {
f := func(x int) int { // HL
return x + delta // HL
} // HL
return f
}
// STOP1 OMIT
func main() {
var inc = adder(1) fmt.Println(inc(0)) fmt.Println(adder(-1)(10))
}
t := x switch { case x == 0: return "0" case x < 0: t = -x } var s [32]byte i := len(s) for t != 0 { // Look, ma, no ()'s! i-- s[i] = digits[t%base] t /= base } if x < 0 { i-- s[i] = '-' } return string(s[i:])
Cleanups:
++
and --
are statementsbreak
in switch
; explicit fallthrough
New:
for
range
switch
go
, select
defer
Assignments may assign multiple values simultaneously:
a, b = x, y
Equivalent to:
t1 := x t2 := y a = t1 b = t2
For instance:
a, b = b, a // swap a and b i, err = atoi(s) // assign results of atoi i, _ = atoi(991) // discard 2nd value
Switch statements may have multiple case values, and break
is implicit:
switch day { case 1, 2, 3, 4, 5: tag = "workday" case 0, 6: tag = "weekend" default: tag = "invalid" }
The case values don't have to be constants:
switch { case day < 0 || day > 6: tag = "invalid" case day == 0 || day == 6: tag = "weekend" default: tag = "workday" }
for i := 0; i < len(primes); i++ { fmt.Println(i, primes[i]) }
A range
clause permits easy iteration over arrays and slices:
// +build ignore,OMIT
package main
import "fmt"
var primes = [...]int{2, 3, 5, 7, 11, 13, 17, 19}
func _() {
// START1 OMIT
for i := 0; i < len(primes); i++ {
fmt.Println(i, primes[i])
}
// STOP1 OMIT
// START2 OMIT
var sum int
for _, x := range primes {
sum += x
}
// STOP2 OMIT
}
func main() {
for i, x := range primes { fmt.Println(i, x) }
}
Unused values are discarded by assigning to the blank (_) identifier:
var sum int for _, x := range primes { sum += x }
An import declaration is used to express a dependency on another package:
import "net/rpc"
Here, the importing package depends on the Go package "rpc".
The import path ("net/rpc") uniquely identifies a package; multiple packages may have the same name, but they all reside at different locations (directories).
By convention, the package name matches the last element of the import path (here: "rpc").
Exported functionality of the rpc
package is available via the package qualifier (rpc
):
rpc.Call
A Go import declaration takes the place of a C include.
27
How names work in a programming language is critical to readability.
Scopes control how names work.
Go has very simple scope hierarchy:
Upper case names are exported: Name
vs. name
.
The package qualifier is always present for imported names.
(First component of) every name is always declared in the current package.
One of the best (and hardest!) decisions made in Go.
29No surprises when importing:
Names do not leak across boundaries.
In C, C++, Java the name y
could refer to anything.
In Go, y
(or even Y
) is always defined within the package.
In Go, x.Y
is clear: find x
locally, Y
belongs to it, and there is only one such Y
.
Immediate consequences for readability.
30Importing a package means reading a package's exported API.
This export information is self-contained. For instance:
B's export data contains all necessary information about C. There is no need for a compiler to read the export data of C.
This has huge consequences for build times!
31
.h
files are not self-contained.
As a result, a compiler ends up reading core header files over and over again.
ifdef
still requires the preprocessor to read a lot of code.
No wonder it takes a long time to compile...
At Google scale: dependencies explode, are exponential, become almost non-computable.
A large Google C++ build can read the same header file tens of thousands of times!
32Two compilers: gc, gccgo
Support for multiple platforms: x86 (32/64bit), ARM (32bit), Linux, BSD, OS X, ...
Automatic formatting of source code: gofmt
Automatic documentation extraction: godoc
Automatic API adaption: gofix
All (and more!) integrated into the go
command.
A Go program can be compiled and linked without additional build information (make
files, etc.).
By convention, all files belonging to a package are found in the same directory.
All depending packages are found by inspecting the import paths of the top-most (main) package.
A single integrated tool takes care of compiling individual files or entire systems.
35Usage:
go command [arguments]
Selected commands:
build compile packages and dependencies clean remove object files doc run godoc on package sources fix run go tool fix on packages fmt run gofmt on package sources get download and install packages and dependencies install compile and install packages and dependencies list list packages run compile and run Go program test test packages vet run go tool vet on packages
"Object-oriented programming (OOP) is a programming paradigm using objects – usually instances of a class – consisting of data fields and methods together with their interactions – to design applications and computer programs. Programming techniques may include features such as data abstraction, encapsulation, messaging, modularity, polymorphism, and inheritance. Many modern programming languages now support forms of OOP, at least as an option."
(Wikipedia)
39We only need
Claim: Data abstraction, encapsulation, and modularity are mechanisms
in their own right, not OOP specific, and a modern language (OO or not)
should have support for them independently.
Methods without classes
Interfaces without hierarchies
Code reuse without inheritance
Specifically:
package main import "fmt" type Point struct{ x, y int } func PointToString(p Point) string { return fmt.Sprintf("Point{%d, %d}", p.x, p.y) } func (p Point) String() string { return fmt.Sprintf("Point{%d, %d}", p.x, p.y) } func main() { p := Point{3, 5} fmt.Println(PointToString(p)) // static dispatch fmt.Println(p.String()) // static dispatch fmt.Println(p) }
package main import "fmt" type Celsius float32 type Fahrenheit float32 func (t Celsius) String() string { return fmt.Sprintf("%g°C", t) } func (t Fahrenheit) String() string { return fmt.Sprintf("%g°F", t) } func (t Celsius) ToFahrenheit() Fahrenheit { return Fahrenheit(t*9/5 + 32) } func main() { var t Celsius = 21 fmt.Println(t.String()) fmt.Println(t) fmt.Println(t.ToFahrenheit()) }
type Stringer interface { String() string } type Reader interface { Read(p []byte) (n int, err error) } type Writer interface { Write(p []byte) (n int, err error) } type Empty interface{}
An interface defines a set of methods.
A type that implements all methods of an interface is said to implement the interface.
All types implement the empty interface interface{}.
// +build ignore,OMIT
package main
import "fmt"
type Point struct{ x, y int }
func (p Point) String() string { return fmt.Sprintf("Point{%d, %d}", p.x, p.y) }
type Celsius float32
type Fahrenheit float32
func (t Celsius) String() string { return fmt.Sprintf("%g°C", t) }
func (t Fahrenheit) String() string { return fmt.Sprintf("%g°F", t) }
func (t Celsius) ToFahrenheit() Fahrenheit { return Fahrenheit(t*9/5 + 32) }
func main() {
type Stringer interface { String() string } var v Stringer var corner = Point{1, 1} var boiling = Celsius(100) v = corner fmt.Println(v.String()) // dynamic dispatch fmt.Println(v) v = boiling.ToFahrenheit() fmt.Println(v.String()) // dynamic dispatch fmt.Println(v)
}
A value (here: corner
, boiling
) of a type (Point
, Celsius
) that implements
an interface (Stringer
) can be assigned to a variable (v
) of that interface type.
Typically, interfaces are small (1-3 methods).
Pervasive use of key interfaces in the standard library make it easy to chain APIs together.
package io func Copy(dst Writer, src Reader) (int64, error)
The io.Copy function copies by reading from any Reader and writing to any Writer.
Interfaces are often introduced ad-hoc, and after the fact.
There is no explicit hierarchy and thus no need to design one!
46package main import ( "flag" "io" "os" ) func main() { flag.Parse() for _, arg := range flag.Args() { f, err := os.Open(arg) if err != nil { panic(err) } defer f.Close() _, err = io.Copy(os.Stdout, f) if err != nil { panic(err) } } }
Methods on any types and ad hoc interfaces make for a light-weight OO programming style.
Go interfaces enable post-facto abstraction.
No explicit type hierarchies.
"Plug'n play" in a type-safe way.
48Concurrency is the composition of independently executing computations.
Concurrency is a way to structure software, particularly as a way to write clean code that interacts well with the real world.
It is not parallelism.
50Concurrency is not parallelism, although it enables parallelism.
If you have only one processor, your program can still be concurrent but it cannot be parallel.
On the other hand, a well-written concurrent program might run efficiently in parallel on a multiprocessor. That property could be important...
For more on that distinction, see the link below. Too much to discuss here.
go.dev/s/concurrency-is-not-parallelism
51Easy to understand.
Easy to use.
Easy to reason about.
You don't need to be an expert!
(Much nicer than dealing with the minutiae of parallelism (threads, semaphores, locks, barriers, etc.))
There is a long history behind Go's concurrency features, going back to Hoare's CSP in 1978 and even Dijkstra's guarded commands (1975).
52We need an example to show the interesting properties of the concurrency primitives.
To avoid distraction, we make it a boring example.
func main() { f("Hello, World", 500*time.Millisecond) }
// +build ignore,OMIT
package main
import (
"fmt"
"time"
)
// START1 OMIT
func main() {
f("Hello, World", 500*time.Millisecond)
}
// STOP1 OMIT
func f(msg string, delay time.Duration) { for i := 0; ; i++ { fmt.Println(msg, i) time.Sleep(delay) } }
The go
statement runs the function as usual, but doesn't make the caller wait.
It launches a goroutine.
The functionality is analogous to the & on the end of a shell command.
// +build ignore,OMIT
package main
import (
"fmt"
"time"
)
func main() { go f("three", 300*time.Millisecond) go f("six", 600*time.Millisecond) go f("nine", 900*time.Millisecond) }
func f(msg string, delay time.Duration) {
for i := 0; ; i++ {
fmt.Println(msg, i)
time.Sleep(delay)
}
}
When main returns, the program exits and takes the function f
down with it.
We can hang around a little, and on the way show that both main and the launched goroutine are running.
// +build ignore,OMIT
package main
import (
"fmt"
"time"
)
func main() { go f("three", 300*time.Millisecond) go f("six", 600*time.Millisecond) go f("nine", 900*time.Millisecond) time.Sleep(3 * time.Second) fmt.Println("Done.") }
func f(msg string, delay time.Duration) {
for i := 0; ; i++ {
fmt.Println(msg, i)
time.Sleep(delay)
}
}
What is a goroutine? It's an independently executing function, launched by a go
statement.
It has its own call stack, which grows and shrinks as required.
It's very cheap. It's practical to have thousands, even hundreds of thousands of goroutines.
It's not a thread.
There might be only one thread in a program with thousands of goroutines.
Instead, goroutines are multiplexed dynamically onto threads as needed to keep all the goroutines running.
But if you think of it as a very cheap thread, you won't be far off.
57A channel in Go provides a connection between two goroutines, allowing them to communicate.
// Declaring and initializing. var c chan int c = make(chan int) // or c := make(chan int)
// Sending on a channel. c <- 1
// Receiving from a channel. // The "arrow" indicates the direction of data flow. value = <-c
A channel connects the main
and f
goroutines so they can communicate.
// +build ignore,OMIT
package main
import (
"fmt"
"time"
)
func main() { c := make(chan string) go f("three", 300*time.Millisecond, c) for i := 0; i < 10; i++ { fmt.Println("Received", <-c) // Receive expression is just a value. } fmt.Println("Done.") }
// START2 OMIT
func f(msg string, delay time.Duration, c chan string) {
for i := 0; ; i++ {
c <- fmt.Sprintf("%s %d", msg, i) // Any suitable value can be sent. // HL
time.Sleep(delay)
}
}
// STOP2 OMIT
func f(msg string, delay time.Duration, c chan string) { for i := 0; ; i++ { c <- fmt.Sprintf("%s %d", msg, i) // Any suitable value can be sent. time.Sleep(delay) } }
When the main function executes <–c
, it will wait for a value to be sent.
Similarly, when the function f
executes c<–value
, it waits for a receiver to be ready.
A sender and receiver must both be ready to play their part in the communication. Otherwise we wait until they are.
Thus channels both communicate and synchronize.
Channels can be unbuffered or buffered.
60// +build ignore,OMIT
package main
import (
"fmt"
"time"
)
func main() { c := make(chan string) go f("three", 300*time.Millisecond, c) go f("six", 600*time.Millisecond, c) go f("nine", 900*time.Millisecond, c) for i := 0; i < 10; i++ { fmt.Println("Received", <-c) } fmt.Println("Done.") }
// START2 OMIT
func f(msg string, delay time.Duration, c chan string) {
for i := 0; ; i++ {
c <- fmt.Sprintf("%s %d", msg, i) // Any suitable value can be sent. // HL
time.Sleep(delay)
}
}
// STOP2 OMIT
A single channel may be used to communicate between many (not just two) goroutines; many goroutines may communicate via one or multiple channels.
This enables a rich variety of concurrency patterns.
61func worker(in chan int, out chan []int) { for { order := <-in // Receive a work order. result := factor(order) // Do some work. out <- result // Send the result back. } }
The worker
uses two channels to communicate:
- The in
channel waits for some work order.
- The out
channel communicates the result.
- As work load, a worker (very slowly) computes the list of prime factors for a given order.
func producer(out chan int) { for order := 0; ; order++ { out <- order // Produce a work order. } } func consumer(in chan []int, n int) { for i := 0; i < n; i++ { result := <-in // Consume a result. fmt.Println("Consumed", result) } }
The producer
produces and endless supply of work orders and sends them out
.
The consumer
receives n
results from the in
channel and then terminates.
// +build ignore,OMIT
package main
import "fmt"
import "time"
func main() { start := time.Now() in := make(chan int) // Channel on which work orders are received. out := make(chan []int) // Channel on which results are returned. go producer(in) go worker(in, out) // Launch one worker. consumer(out, 100) fmt.Println(time.Since(start)) }
// START2 OMIT
func worker(in chan int, out chan []int) {
for {
order := <-in // Receive a work order. // HL
result := factor(order) // Do some work. // HL
out <- result // Send the result back. // HL
}
}
// STOP2 OMIT
// START3 OMIT
func producer(out chan int) {
for order := 0; ; order++ {
out <- order // Produce a work order. // HL
}
}
func consumer(in chan []int, n int) {
for i := 0; i < n; i++ {
result := <-in // Consume a result. // HL
fmt.Println("Consumed", result)
}
}
// STOP3 OMIT
// factor returns a list containing x and its prime factors.
func factor(x int) (list []int) {
list = append(list, x)
for f := 2; x >= f; f++ {
for x%f == 0 {
x /= f
list = append(list, f)
// Slow down so we can see what happens.
time.Sleep(50 * time.Millisecond)
}
}
return
}
We use one worker to handle the entire work load.
Because there is only one worker, we see the result coming back in order.
This is running rather slow...
64// +build ignore,OMIT
package main
import "fmt"
import "time"
func main() {
start := time.Now()
in := make(chan int) out := make(chan []int) go producer(in) // Launch 10 workers. for i := 0; i < 10; i++ { go worker(in, out) } consumer(out, 100)
fmt.Println(time.Since(start))
}
// START2 OMIT
func worker(in chan int, out chan []int) {
for {
order := <-in // Receive a work order. // HL
result := factor(order) // Do some work. // HL
out <- result // Send the result back. // HL
}
}
// STOP2 OMIT
// START3 OMIT
func producer(out chan int) {
for order := 0; ; order++ {
out <- order // Produce a work order. // HL
}
}
func consumer(in chan []int, n int) {
for i := 0; i < n; i++ {
result := <-in // Consume a result. // HL
fmt.Println("Consumed", result)
}
}
// STOP3 OMIT
// factor returns a list containing x and its prime factors.
func factor(x int) (list []int) {
list = append(list, x)
for f := 2; x >= f; f++ {
for x%f == 0 {
x /= f
list = append(list, f)
// Slow down so we can see what happens.
time.Sleep(50 * time.Millisecond)
}
}
return
}
A ready worker will read the next order from the in
channel and start working on it. Another ready worker will proceed with the next order, and so forth.
Because we have many workers and since different orders take different amounts of time to work on, we see the results coming back out-of-order.
On a multi-core system, many workers may truly run in parallel.
This is running much faster...
65Don't communicate by sharing memory, share memory by communicating.
66Go Home Page:
Go Tour (learn Go in your browser)
Package documentation:
Articles galore:
68