Go Dynamic Tools
Gophercon 2015, July 9, 2015
Dmitry Vyukov
Google
dvyukov@
* Video
A video of this talk was recorded at GopherCon in Denver.
.link https://www.youtube.com/watch?v=a9xrxRsIbSU Watch the talk on YouTube
* About me
Did a bunch of work on Go:
- Scalable goroutine scheduler
- Integrated network poller
- Parallel GC, concurrent sweeping
- Memory allocator speed/space improvements
- Sync primitives
- Race detector
- Blocking profile
- 800+ commits, filed 500+ bugs
But actually on dynamic testing tools team:
- Thread/Address/MemorySanitizer
* Agenda
- Data race detector
- Go-fuzz, randomized testing system
- Execution tracer
* Data race detector
.image dynamic-tools/philosoraptor.png
* What is a data race?
A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.
*All*bets*are*off*!*
Any data race can destroy the memory/type-safety of a Go program.
* There are no "benign" data races
// goroutine 1 // goroutine 2
m[k1] = v1 m[k2] = v2
Bad!
// goroutine 1 // goroutine 2
stat++ stat++
Also bad!
Compilers assume race-free programs and do aggressive optimizations
based on that assumption (e.g. assume "ownership" over written-to variables).
Races are non-deterministic and hard to debug.
* Usage
$ go test -race mypkg // to test the package
$ go run -race mysrc.go // to run the source file
$ go build -race mycmd // to build the command
$ go install -race mypkg // to install the package
That's it!
* Example
package main
func main() {
m := make(map[int]int)
go func() {
m[1] = 1
}()
m[2] = 2
}
* Example report
WARNING: DATA RACE
Write by goroutine 5:
runtime.mapassign1()
runtime/hashmap.go:411 +0x0
main.main.func1()
race.go:6 +0x60
Previous write by main goroutine:
runtime.mapassign1()
runtime/hashmap.go:411 +0x0
main.main()
race.go:8 +0xb6
Goroutine 5 (running) created at:
main.main()
race.go:7 +0x76
* Achievements
- 70+ bugs in std lib
- 350+??? bugs in Google internal code base
- ??? bugs found in the wild
* Instrumentation
Compiler instrumentation pass enabled by -race.
func foo(p *int) {
*p = 1
}
Becomes:
func foo(p *int) {
runtime.funcenter(caller_pc)
runtime.racewrite(p)
*p = 1
runtime.funcexit()
}
* Run-time module
Handles:
- memory accesses (to catch racy accesses)
- synchronization (to not produce false reports)
- function calls (to collect stack traces)
- goroutine creation/exit (to keep track of live goroutines)
Algorithm is based on dynamic modelling of happens-before relation:
- no false positives
- false negatives are possible
* Usage tips
Dynamic tools are only as good as your tests are.
- write good *concurrent* tests
- have continuous build with race detector
- run integration tests
- run race-enabled canaries in production
* Go-fuzz
.image dynamic-tools/go-fuzz.png
* Randomized testing
A different approach to testing that finds [lots of] bugs that other testing approaches do not. Intended mostly for programs that parse complex inputs.
Generate random blob -> feed into program -> see if it crashes -> profit!
- cheap to use
- does not have any bias
Completely random blobs won't uncover lots of bugs.
How can we generate diverse but meaningful inputs that will trigger
nil derefs, off-by-ones, etc?
* Coverage-guided fuzzing
Genetic algorithms to the rescue!
Instrument program for code coverage
Collect initial corpus of inputs
for {
Randomly mutate an input from the corpus
Execute and collect coverage
If the input gives new coverage, add it to corpus
}
* Example
The following code wants "ABCD" input:
if input[0] == 'A' {
if input[1] == 'B' {
if input[2] == 'C' {
if input[3] == 'D' {
panic("input must not be ABCD")
}
}
}
}
Corpus progression:
""
"", "A"
"", "A", "AB"
"", "A", "AB", "ABC"
"", "A", "AB", "ABC", "ABCD"
* Game over
CRC32 checksum verification in `image/png/reader.go`
func (d *decoder) verifyChecksum() error {
if binary.BigEndian.Uint32(d.tmp[:4]) != d.crc.Sum32() {
return FormatError("invalid checksum")
}
return nil
}
Probability that random mutations will alter input in an interesting way and
guess CRC32 at the same time is basically ZERO.
* Sonar
Don't need to guess, program knows it!
+ v1 := binary.BigEndian.Uint32(d.tmp[:4])
+ v2 := d.crc.Sum32()
+ __go_fuzz.Sonar(v1, v2)
if v1 != v2 {
return FormatError("invalid checksum")
}
Then, find v1 in the input and replace it with v2. Done!
* Game over 2
Mutations and sonar do low-level changes ("bit-flipping"):
Original:
`- 100
`
Mutated:
`- 100<
- `
Also want high-level changes!
* Versifier
Versifier reverse-engineers [text] protocol and learns its _structure_.
abc -> alphanum token
123, 1e-2 -> number
"..." -> quoted
[...] -> parenthesized
...,...,... -> list
...\n...\n -> lines
Then, applies _structural_ mutations to inputs.
* Versifier example
Original:
`
- 100
`
Versified (all valid xml):
- =
- -026023767521520230564132665e0333302100
- 510-9e-07036514
- prop name="p"/}01e-6
- 100
8
* Algorithm
.image dynamic-tools/algo.png
* Achievements
- 115 bugs in std lib (66 fixed)
- 43 bugs in golang.org/x/... (24 fixed)
- 134 elsewhere
* Achievements
fmt.Sprintf("%.[]")
panic: runtime error: index out of range
regexp.MustCompile("((0(0){0}))").ReplaceAllString("00000", "00$00")
panic: runtime error: slice bounds out of range
ioutil.ReadAll(flate.NewReader(strings.NewReader("4LJNIMK\a\x00\x00\xff..\xff.....\xff")))
runs forever
var x = 1/"."[0]
crashes compiler
archive/tar: hang
archive/zip: cap out of range
encoding/gob: stack overflow
encoding/asn1: index out of range
image/jpeg: Decode hangs
image/png: nil deref
math/big: incorrect string->Float conversion
crypto/x509: division by zero
...
* Usage
- go get github.com/dvyukov/go-fuzz/...
- write test:
func Fuzz(data []byte) int {
gob.NewDecoder(bytes.NewReader(data))
return 0
}
- build
$ go-fuzz-build github.com/dvyukov/go-fuzz/examples/gob
- collect corpus
- run
$ go-fuzz -bin=gob-fuzz.zip -workdir=examples/gob
* Execution tracer
.image dynamic-tools/tracer.png
* Execution tracer
Gives insight into dynamic execution of a program.
Captures with nanosecond precision:
- goroutine creation/start/end
- goroutine blocking/unblocking
- network blocking
- system calls
- GC events
* Execution tracer
.image dynamic-tools/trace.png 450 _
* Recap
- race detector: always use for testing (-race)
- go-fuzz: parsing of complex inputs (github.com/dvyukov/go-fuzz)
- execution tracer: deep dive into execution (-trace)