The Research Problems of Implementing Go
Russ Cox
Russ Cox
I gave this talk at Google's Cambridge, Massachusetts office at an event for area Ph.D. students. The purpose of the event and the talk was to give a sense of some of the research that goes on at Google. The talk presents some research questions motivated by Go. We have answered some well, but others remain open.
2Go is an open source programming language that makes it easy to build simple, reliable, and efficient software.
Design began in late 2007.
Became open source in November 2009.
Developed entirely in the open; very active community.
Language stable as of Go 1, early 2012.
Work continues.
Started as an answer to software problems at Google:
Deployed: parts of YouTube, dl.google.com, Blogger, Google Code, Google Fiber, ...
5A simple but powerful and fun language.
For more background on design:
6Go is designed for building production systems at Google.
Plenty of research questions about how to implement Go well.
Go provides two important concepts:
A goroutine is a thread of control within the program, with its own local variables and stack. Cheap, easy to create.
A channel carries typed messages between goroutines.
9package main import "fmt" func main() { c := make(chan string) go func() { c <- "Hello" c <- "World" }() fmt.Println(<-c, <-c) }
Channels adopted from Hoare's Communicating Sequential Processes.
Go enables simple, safe concurrent programming.
It doesn't forbid bad programming.
Caveat: not purely memory safe; sharing is legal.
Passing a pointer over a channel is idiomatic.
Experience shows this is practical.
11Sequential network address resolution, given a work list:
// +build ignore,OMIT
package main
import (
"fmt"
"math/rand"
"time"
)
func lookup() {
for _, w := range worklist { w.addrs, w.err = LookupHost(w.host) }
}
func main() {
rand.Seed(time.Now().UnixNano())
t0 := time.Now()
lookup()
fmt.Printf("\n")
for _, w := range worklist {
if w.err != nil {
fmt.Printf("%s: error: %v\n", w.host, w.err)
continue
}
fmt.Printf("%s: %v\n", w.host, w.addrs)
}
fmt.Printf("total lookup time: %.3f seconds\n", time.Since(t0).Seconds())
}
var worklist = []*Work{
{host: "fast.com"},
{host: "slow.com"},
{host: "fast.missing.com"},
{host: "slow.missing.com"},
}
type Work struct {
host string
addrs []string
err error
}
func LookupHost(name string) (addrs []string, err error) {
t0 := time.Now()
defer func() {
fmt.Printf("lookup %s: %.3f seconds\n", name, time.Since(t0).Seconds())
}()
h := hosts[name]
if h == nil {
h = failure
}
return h(name)
}
type resolver func(string) ([]string, error)
var hosts = map[string]resolver{
"fast.com": delay(10*time.Millisecond, fixedAddrs("10.0.0.1")),
"slow.com": delay(2*time.Second, fixedAddrs("10.0.0.4")),
"fast.missing.com": delay(10*time.Millisecond, failure),
"slow.missing.com": delay(2*time.Second, failure),
}
func fixedAddrs(addrs ...string) resolver {
return func(string) ([]string, error) {
return addrs, nil
}
}
func delay(d time.Duration, f resolver) resolver {
return func(name string) ([]string, error) {
time.Sleep(d/2 + time.Duration(rand.Int63n(int64(d/2))))
return f(name)
}
}
func failure(name string) ([]string, error) {
return nil, fmt.Errorf("unknown host %v", name)
}
Parallel network address resolution, given a work list:
// +build ignore,OMIT
package main
import (
"fmt"
"math/rand"
"time"
)
func lookup() {
done := make(chan bool, len(worklist)) for _, w := range worklist { go func(w *Work) { w.addrs, w.err = LookupHost(w.host) done <- true }(w) } for i := 0; i < len(worklist); i++ { <-done }
}
func main() {
rand.Seed(time.Now().UnixNano())
t0 := time.Now()
lookup()
fmt.Printf("\n")
for _, w := range worklist {
if w.err != nil {
fmt.Printf("%s: error: %v\n", w.host, w.err)
continue
}
fmt.Printf("%s: %v\n", w.host, w.addrs)
}
fmt.Printf("total lookup time: %.3f seconds\n", time.Since(t0).Seconds())
}
var worklist = []*Work{
{host: "fast.com"},
{host: "slow.com"},
{host: "fast.missing.com"},
{host: "slow.missing.com"},
}
type Work struct {
host string
addrs []string
err error
}
func LookupHost(name string) (addrs []string, err error) {
t0 := time.Now()
defer func() {
fmt.Printf("lookup %s: %.3f seconds\n", name, time.Since(t0).Seconds())
}()
h := hosts[name]
if h == nil {
h = failure
}
return h(name)
}
type resolver func(string) ([]string, error)
var hosts = map[string]resolver{
"fast.com": delay(10*time.Millisecond, fixedAddrs("10.0.0.1")),
"slow.com": delay(2*time.Second, fixedAddrs("10.0.0.4")),
"fast.missing.com": delay(10*time.Millisecond, failure),
"slow.missing.com": delay(2*time.Second, failure),
}
func fixedAddrs(addrs ...string) resolver {
return func(string) ([]string, error) {
return addrs, nil
}
}
func delay(d time.Duration, f resolver) resolver {
return func(name string) ([]string, error) {
time.Sleep(d/2 + time.Duration(rand.Int63n(int64(d/2))))
return f(name)
}
}
func failure(name string) ([]string, error) {
return nil, fmt.Errorf("unknown host %v", name)
}
Challenge: Make channel communication scale
Research question: lock-free channels?
14An interface defines a set of methods.
package io type Writer interface { Write(data []byte) (n int, err error) }
A type implements the interface by implementing the methods.
package bytes type Buffer struct { ... } func (b *Buffer) Write(data []byte) (n int, err error) { ... }
An implementation of an interface can be assigned to a variable of that interface type.
package fmt func Fprintf(w io.Writer, format string, args ...interface{})
// +build ignore,OMIT
package main
import (
"bytes"
"fmt"
"io"
"os"
)
var _ = io.Copy
func main() {
b := new(bytes.Buffer) var w io.Writer w = b fmt.Fprintf(w, "hello, %s\n", "world") os.Stdout.Write(b.Bytes())
}
The source of all generality in the Go language.
20How do you make method dispatch efficient?
b := new(bytes.Buffer) var w io.Writer w = b fmt.Fprintf(w, "hello, %s\n", "world") ... w.Write(text) // what happens here?
At w.Write call, how does the runtime find the method to call?
21How do you make method dispatch efficient?
b := new(bytes.Buffer) var w io.Writer w = b // do the work here! fmt.Fprintf(w, "hello, %s\n", "world") ... w.Write(text) // plain indirect function call
Interface holds two words: "itable" and actual value (or pointer to value).
Itable contains type information plus list of function pointers for methods in interface.
w.itable.fn[1](w.data, text)
Conversion sites usually trivial to cache.
22package sort type Interface interface { Len() int // return number of elements, len(x) Less(i, j int) bool // report whether x[i] < x[j] Swap(i, j int) // x[i], x[j] = x[j], x[i] } func Sort(data Interface)
Requires some boilerplate for each use:
type bySubject []Thread func (x bySubject) Less(i, j int) bool { return x[i].Subject < x[j].Subject } func (x bySubject) Swap(i, j int) { x[i], x[j] = x[j], x[i] } func (x bySubject) Len() int { return len(x) } sort.Sort(bySubject(threads))
func Sort(data []T, less func(x, y *T) bool) sort.Sort(threads, func(x, y *Thread) bool { return x.Subject < y.Subject })
Research question: what's a reasonable semantics?
Research question: what's a reasonable implementation?
Do you want slow programmers, slow compilers and bloated binaries, or slow execution?
24Garbage collection simplifies APIs.
Fundamental to concurrency: too hard to track ownership otherwise.
Fundamental to interfaces: memory management details do not bifurcate otherwise-similar APIs.
Of course, adds cost, latency, complexity in run time system.
26Observation: garbage collection is a service, and like any service it can be overloaded, oversubscribed.
Go lets you limit allocation by controlling memory layout.
type Point struct { X, Y int } type Rectangle struct { Min, Max Point }
Language decision: interior pointers are allowed, as are foreign pointers
Allocator: objects are allocated in pages with other objects of the same size.
Current GC: stop the world, parallel mark, start the world, background sweep.
Research question: how to make collector lower latency, possibly incremental?
28
Go programs can be parsed without context (unlike C and C++).
Go ships with a standard program formatter.
Makes automated edits indistinguishable from manual edits.
$ cat x.go package main var b bytes.Buffer $ gofmt -r 'bytes.Buffer -> bytes.Writer' x.go package main var b bytes.Writer $
More advanced rewrites: "go fix" for API adjustments.
30Research Question: What about translating other programs to Go?
Exploring the conversion of C programs to Go today.
What about other languages?
31Plenty of research questions about how to implement Go well.
Russ Cox