Text file talks/2015/dynamic-tools.slide

     1  Go Dynamic Tools
     2  Gophercon 2015, July 9, 2015
     3  
     4  Dmitry Vyukov
     5  Google
     6  dvyukov@
     7  
     8  
     9  * Video
    10  
    11  A video of this talk was recorded at GopherCon in Denver.
    12  
    13  .link https://www.youtube.com/watch?v=a9xrxRsIbSU Watch the talk on YouTube
    14  
    15  
    16  * About me
    17  
    18  Did a bunch of work on Go:
    19  
    20  - Scalable goroutine scheduler
    21  - Integrated network poller
    22  - Parallel GC, concurrent sweeping
    23  - Memory allocator speed/space improvements
    24  - Sync primitives
    25  - Race detector
    26  - Blocking profile
    27  - 800+ commits, filed 500+ bugs
    28  
    29  But actually on dynamic testing tools team:
    30  
    31  - Thread/Address/MemorySanitizer
    32  
    33  * Agenda
    34  
    35  - Data race detector
    36  - Go-fuzz, randomized testing system
    37  - Execution tracer
    38  
    39  * Data race detector
    40  
    41  .image dynamic-tools/philosoraptor.png
    42  
    43  
    44  * What is a data race?
    45  
    46  A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.
    47  
    48  *All*bets*are*off*!*
    49  
    50  Any data race can destroy the memory/type-safety of a Go program.
    51  
    52  * There are no "benign" data races
    53  
    54  	// goroutine 1       // goroutine 2
    55  	m[k1] = v1           m[k2] = v2
    56  
    57  Bad!
    58  
    59  	// goroutine 1       // goroutine 2
    60  	stat++               stat++
    61  
    62  Also bad!
    63  
    64  Compilers assume race-free programs and do aggressive optimizations
    65  based on that assumption (e.g. assume "ownership" over written-to variables).
    66  
    67  Races are non-deterministic and hard to debug.
    68  
    69  * Usage
    70  
    71  	$ go test -race mypkg    // to test the package
    72  	$ go run -race mysrc.go  // to run the source file
    73  	$ go build -race mycmd   // to build the command
    74  	$ go install -race mypkg // to install the package
    75  
    76  That's it!
    77  
    78  * Example
    79  
    80  	package main
    81  
    82  	func main() {
    83  		m := make(map[int]int)
    84  		go func() {
    85  			m[1] = 1
    86  		}()
    87  		m[2] = 2
    88  	}
    89  
    90  * Example report
    91  
    92   WARNING: DATA RACE
    93   Write by goroutine 5:
    94     runtime.mapassign1()
    95         runtime/hashmap.go:411 +0x0
    96     main.main.func1()
    97         race.go:6 +0x60
    98  
    99   Previous write by main goroutine:
   100     runtime.mapassign1()
   101         runtime/hashmap.go:411 +0x0
   102     main.main()
   103         race.go:8 +0xb6
   104  
   105   Goroutine 5 (running) created at:
   106     main.main()
   107         race.go:7 +0x76
   108  
   109  * Achievements
   110  
   111  - 70+ bugs in std lib
   112  - 350+??? bugs in Google internal code base
   113  - ??? bugs found in the wild
   114  
   115  * Instrumentation
   116  
   117  Compiler instrumentation pass enabled by -race.
   118  
   119  	func foo(p *int) {
   120  		*p = 1
   121  	}
   122  
   123  Becomes:
   124  
   125  	func foo(p *int) {
   126  		runtime.funcenter(caller_pc)
   127  		runtime.racewrite(p)
   128  		*p = 1
   129  		runtime.funcexit()
   130  	}
   131  
   132  * Run-time module
   133  
   134  Handles:
   135  
   136  - memory accesses (to catch racy accesses)
   137  - synchronization (to not produce false reports)
   138  - function calls (to collect stack traces)
   139  - goroutine creation/exit (to keep track of live goroutines)
   140  
   141  Algorithm is based on dynamic modelling of happens-before relation:
   142  
   143  - no false positives
   144  - false negatives are possible
   145  
   146  * Usage tips
   147  
   148  Dynamic tools are only as good as your tests are.
   149  
   150  - write good *concurrent* tests
   151  - have continuous build with race detector
   152  - run integration tests
   153  - run race-enabled canaries in production
   154  
   155  * Go-fuzz
   156  
   157  .image dynamic-tools/go-fuzz.png
   158  
   159  * Randomized testing
   160  
   161  A different approach to testing that finds [lots of] bugs that other testing approaches do not. Intended mostly for programs that parse complex inputs.
   162  
   163  Generate random blob -> feed into program -> see if it crashes -> profit!
   164  
   165  - cheap to use
   166  - does not have any bias
   167  
   168  Completely random blobs won't uncover lots of bugs.
   169  
   170  How can we generate diverse but meaningful inputs that will trigger
   171  nil derefs, off-by-ones, etc?
   172  
   173  * Coverage-guided fuzzing
   174  
   175  Genetic algorithms to the rescue!
   176  
   177  	Instrument program for code coverage
   178  	Collect initial corpus of inputs
   179  	for {
   180  		Randomly mutate an input from the corpus
   181  		Execute and collect coverage
   182  		If the input gives new coverage, add it to corpus
   183  	}
   184  
   185  * Example
   186  
   187  The following code wants "ABCD" input:
   188  
   189  	if input[0] == 'A' {
   190  		if input[1] == 'B' {
   191  			if input[2] == 'C' {
   192  				if input[3] == 'D' {
   193  					panic("input must not be ABCD")
   194  				}
   195  			}
   196  		}
   197  	}
   198  
   199  Corpus progression:
   200  
   201  	""
   202  	"", "A"
   203  	"", "A", "AB"
   204  	"", "A", "AB", "ABC"
   205  	"", "A", "AB", "ABC", "ABCD"
   206  
   207  * Game over
   208  
   209  CRC32 checksum verification in `image/png/reader.go`
   210  
   211  	func (d *decoder) verifyChecksum() error {
   212  		if binary.BigEndian.Uint32(d.tmp[:4]) != d.crc.Sum32() {
   213  			return FormatError("invalid checksum")
   214  		}
   215  		return nil
   216  	}
   217  
   218  Probability that random mutations will alter input in an interesting way and
   219  guess CRC32 at the same time is basically ZERO.
   220  
   221  * Sonar
   222  
   223  Don't need to guess, program knows it!
   224  
   225  	+ v1 := binary.BigEndian.Uint32(d.tmp[:4])
   226  	+ v2 := d.crc.Sum32()
   227  	+ __go_fuzz.Sonar(v1, v2)
   228  	if v1 != v2 {
   229  		return FormatError("invalid checksum")
   230  	}
   231  
   232  Then, find v1 in the input and replace it with v2. Done!
   233  
   234  * Game over 2
   235  
   236  Mutations and sonar do low-level changes ("bit-flipping"):
   237  
   238  Original:
   239  
   240  	`<item name="foo"><prop name="price">100</prop></item>`
   241  
   242  Mutated:
   243  
   244  	`<item name="foo"><prop name="price">100</prop><<item>`
   245  
   246  Also want high-level changes!
   247  
   248  * Versifier
   249  
   250  Versifier reverse-engineers [text] protocol and learns its _structure_.
   251  
   252  	abc          -> alphanum token
   253  	123, 1e-2    -> number
   254  	"..."        -> quoted
   255  	[...]        -> parenthesized
   256  	...,...,...  -> list
   257  	...\n...\n   -> lines
   258  
   259  Then, applies _structural_ mutations to inputs.
   260  
   261  * Versifier example
   262  
   263  Original:
   264  
   265  	`<item name="foo"><prop name="price">100</prop></item>`
   266  
   267  Versified (all valid xml):
   268  
   269  	<item	name="rb54ana"><item  name="foo"><prop name="price"></prop><prop/></item></item>
   270  	<item name=""><prop name="price">=</prop><prop/> </item>
   271  	<item name=""><prop F="">-026023767521520230564132665e0333302100</prop><prop/></item>
   272  	<item SN="foo_P"><prop name="_G_nx">510</prop><prop name="vC">-9e-07036514</prop></item>
   273  	<item name="foo"><prop name="c8">prop name="p"</prop>/}<prop name="price">01e-6</prop></item>
   274  	<item name="foo"><item name="foo"><prop JY="">100</prop></item>8<prop/></item>
   275  
   276  * Algorithm
   277  
   278  .image dynamic-tools/algo.png
   279  
   280  * Achievements
   281  
   282  - 115 bugs in std lib (66 fixed)
   283  - 43 bugs in golang.org/x/... (24 fixed)
   284  - 134 elsewhere
   285  
   286  * Achievements
   287  
   288  	fmt.Sprintf("%.[]")
   289  	panic: runtime error: index out of range
   290  
   291  	regexp.MustCompile("((0(0){0}))").ReplaceAllString("00000", "00$00")
   292  	panic: runtime error: slice bounds out of range
   293  
   294  	ioutil.ReadAll(flate.NewReader(strings.NewReader("4LJNIMK\a\x00\x00\xff..\xff.....\xff")))
   295  	runs forever
   296  
   297  	var x = 1/"."[0]
   298  	crashes compiler
   299  
   300  	archive/tar: hang
   301  	archive/zip: cap out of range
   302  	encoding/gob: stack overflow
   303  	encoding/asn1: index out of range
   304  	image/jpeg: Decode hangs
   305  	image/png: nil deref
   306  	math/big: incorrect string->Float conversion
   307  	crypto/x509: division by zero
   308  	...
   309  
   310  * Usage
   311  
   312  - go get github.com/dvyukov/go-fuzz/...
   313  - write test:
   314  
   315  	func Fuzz(data []byte) int {
   316  		gob.NewDecoder(bytes.NewReader(data))
   317  		return 0
   318  	}
   319  
   320  - build
   321  
   322  	$ go-fuzz-build github.com/dvyukov/go-fuzz/examples/gob
   323  
   324  - collect corpus
   325  - run
   326  
   327  	$ go-fuzz -bin=gob-fuzz.zip -workdir=examples/gob
   328  
   329  * Execution tracer
   330  
   331  .image dynamic-tools/tracer.png
   332  
   333  * Execution tracer
   334  
   335  Gives insight into dynamic execution of a program.
   336  
   337  Captures with nanosecond precision:
   338  
   339  - goroutine creation/start/end
   340  - goroutine blocking/unblocking
   341  - network blocking
   342  - system calls
   343  - GC events
   344  
   345  * Execution tracer
   346  
   347  .image dynamic-tools/trace.png 450 _
   348  
   349  * Recap
   350  
   351  - race detector: always use for testing (-race)
   352  - go-fuzz: parsing of complex inputs (github.com/dvyukov/go-fuzz)
   353  - execution tracer: deep dive into execution (-trace)
   354  
   355  
   356  

View as plain text