Go Dynamic Tools
Gophercon 2015, July 9, 2015
Dmitry Vyukov
Dmitry Vyukov
A video of this talk was recorded at GopherCon in Denver.
2Did a bunch of work on Go:
But actually on dynamic testing tools team:
A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.
All bets are off !
Any data race can destroy the memory/type-safety of a Go program.
6// goroutine 1 // goroutine 2 m[k1] = v1 m[k2] = v2
Bad!
// goroutine 1 // goroutine 2 stat++ stat++
Also bad!
Compilers assume race-free programs and do aggressive optimizations
based on that assumption (e.g. assume "ownership" over written-to variables).
Races are non-deterministic and hard to debug.
7$ go test -race mypkg // to test the package $ go run -race mysrc.go // to run the source file $ go build -race mycmd // to build the command $ go install -race mypkg // to install the package
That's it!
8package main func main() { m := make(map[int]int) go func() { m[1] = 1 }() m[2] = 2 }
WARNING: DATA RACE Write by goroutine 5: runtime.mapassign1() runtime/hashmap.go:411 +0x0 main.main.func1() race.go:6 +0x60 Previous write by main goroutine: runtime.mapassign1() runtime/hashmap.go:411 +0x0 main.main() race.go:8 +0xb6 Goroutine 5 (running) created at: main.main() race.go:7 +0x76
Compiler instrumentation pass enabled by -race.
func foo(p *int) { *p = 1 }
Becomes:
func foo(p *int) { runtime.funcenter(caller_pc) runtime.racewrite(p) *p = 1 runtime.funcexit() }
Handles:
Algorithm is based on dynamic modelling of happens-before relation:
Dynamic tools are only as good as your tests are.
A different approach to testing that finds [lots of] bugs that other testing approaches do not. Intended mostly for programs that parse complex inputs.
Generate random blob -> feed into program -> see if it crashes -> profit!
Completely random blobs won't uncover lots of bugs.
How can we generate diverse but meaningful inputs that will trigger
nil derefs, off-by-ones, etc?
Genetic algorithms to the rescue!
Instrument program for code coverage Collect initial corpus of inputs for { Randomly mutate an input from the corpus Execute and collect coverage If the input gives new coverage, add it to corpus }
The following code wants "ABCD" input:
if input[0] == 'A' { if input[1] == 'B' { if input[2] == 'C' { if input[3] == 'D' { panic("input must not be ABCD") } } } }
Corpus progression:
"" "", "A" "", "A", "AB" "", "A", "AB", "ABC" "", "A", "AB", "ABC", "ABCD"
CRC32 checksum verification in image/png/reader.go
func (d *decoder) verifyChecksum() error { if binary.BigEndian.Uint32(d.tmp[:4]) != d.crc.Sum32() { return FormatError("invalid checksum") } return nil }
Probability that random mutations will alter input in an interesting way and
guess CRC32 at the same time is basically ZERO.
Don't need to guess, program knows it!
+ v1 := binary.BigEndian.Uint32(d.tmp[:4]) + v2 := d.crc.Sum32() + __go_fuzz.Sonar(v1, v2) if v1 != v2 { return FormatError("invalid checksum") }
Then, find v1 in the input and replace it with v2. Done!
20Mutations and sonar do low-level changes ("bit-flipping"):
Original:
`<item name="foo"><prop name="price">100</prop></item>`
Mutated:
`<item name="foo"><prop name="price">100</prop><<item>`
Also want high-level changes!
21Versifier reverse-engineers [text] protocol and learns its structure.
abc -> alphanum token 123, 1e-2 -> number "..." -> quoted [...] -> parenthesized ...,...,... -> list ...\n...\n -> lines
Then, applies structural mutations to inputs.
22Original:
`<item name="foo"><prop name="price">100</prop></item>`
Versified (all valid xml):
<item name="rb54ana"><item name="foo"><prop name="price"></prop><prop/></item></item> <item name=""><prop name="price">=</prop><prop/> </item> <item name=""><prop F="">-026023767521520230564132665e0333302100</prop><prop/></item> <item SN="foo_P"><prop name="_G_nx">510</prop><prop name="vC">-9e-07036514</prop></item> <item name="foo"><prop name="c8">prop name="p"</prop>/}<prop name="price">01e-6</prop></item> <item name="foo"><item name="foo"><prop JY="">100</prop></item>8<prop/></item>
fmt.Sprintf("%.[]") panic: runtime error: index out of range regexp.MustCompile("((0(0){0}))").ReplaceAllString("00000", "00$00") panic: runtime error: slice bounds out of range ioutil.ReadAll(flate.NewReader(strings.NewReader("4LJNIMK\a\x00\x00\xff..\xff.....\xff"))) runs forever var x = 1/"."[0] crashes compiler archive/tar: hang archive/zip: cap out of range encoding/gob: stack overflow encoding/asn1: index out of range image/jpeg: Decode hangs image/png: nil deref math/big: incorrect string->Float conversion crypto/x509: division by zero ...
func Fuzz(data []byte) int { gob.NewDecoder(bytes.NewReader(data)) return 0 }
$ go-fuzz-build github.com/dvyukov/go-fuzz/examples/gob
$ go-fuzz -bin=gob-fuzz.zip -workdir=examples/gob
Gives insight into dynamic execution of a program.
Captures with nanosecond precision: