
#_This presentation was the closing keynote of the Heroku Waza conference in January 2012.
#_It has been slightly modified here for clarity and for use in the "present" format; the original
#_used a precursor to that tool.

Concurrency is not Parallelism
Waza Jan 11, 2012

Rob Pike
r@golang.org

* Video

This talk was presented at Heroku's Waza conference in January 2012.

.link http://vimeo.com/49718712 Watch the talk on Vimeo

* The modern world is parallel

Multicore.

Networks.

Clouds of CPUs.

Loads of users.

Our technology should help.
That's where concurrency comes in.

* Go supports concurrency

Go provides:

- concurrent execution (goroutines)
- synchronization and messaging (channels)
- multi-way concurrent control (select)

* Concurrency is cool! Yay parallelism!!

NO! A fallacy.

When Go was announced, many were confused by the distinction.

"I ran the prime sieve with 4 processors and it got slower!"

* Concurrency

Programming as the composition of independently executing processes.

(Processes in the general sense, not Linux processes. Famously hard to define.)

* Parallelism

Programming as the simultaneous execution of (possibly related) computations.

* Concurrency vs. parallelism

Concurrency is about dealing with lots of things at once.

Parallelism is about doing lots of things at once.

Not the same, but related.

Concurrency is about structure; parallelism is about execution.

Concurrency provides a way to structure a solution to solve a problem that may (but not necessarily) be parallelizable.

* An analogy

Concurrent: Mouse, keyboard, display, and disk drivers.

Parallel: Vector dot product.

* Concurrency plus communication

Concurrency is a way to structure a program by breaking it into pieces that can be executed independently.

Communication is the means to coordinate the independent executions.

This is the Go model and (like Erlang and others) it's based on CSP:

C. A. R. Hoare: Communicating Sequential Processes (CACM 1978)

* Gophers

This is too abstract. Let's get concrete.

* Our problem

Move a pile of obsolete language manuals to the incinerator.

.image waza/gophersimple1.jpg

With only one gopher this will take too long.

* More gophers!

.image waza/gophersimple3.jpg

More gophers are not enough; they need more carts.

* More gophers and more carts

.image waza/gophersimple2.jpg

This will go faster, but there will be bottlenecks at the pile and incinerator.
Also need to synchronize the gophers.
A message (that is, a communication between the gophers) will do.

* Double everything

Remove the bottleneck; make them really independent.

.image waza/gophersimple4.jpg

This will consume input twice as fast.

* Concurrent composition

.image waza/gophersimple4.jpg
The concurrent composition of two gopher procedures.

* Concurrent composition

This design is not automatically parallel!

What if only one gopher is moving at a time?
Then it's still concurrent (that's in the design), just not parallel.

However, it's automatically parallelizable!

Moreover, the concurrent composition suggests other models.

* Another design

.image waza/gophercomplex0.jpg

Three gophers in action, but with likely delays.
Each gopher is an independently executing procedure,
plus coordination (communication).

* Finer-grained concurrency

Add another gopher procedure to return the empty carts.

.image waza/gophercomplex1.jpg

Four gophers in action for better flow, each doing one simple task.

If we arrange everything right (implausible but not impossible), that's four times faster than our original one-gopher design.

* Observation

We improved performance by adding a concurrent procedure to the existing design.

More gophers doing more work; it runs better.

This is a deeper insight than mere parallelism.

* Concurrent procedures

Four distinct gopher procedures:

- load books onto cart
- move cart to incinerator
- unload cart into incinerator
- return empty cart

Different concurrent designs enable different ways to parallelize.

* More parallelization!

We can now parallelize on the other axis; the concurrent design makes it easy. Eight gophers, all busy.

.image waza/gophercomplex2.jpg

* Or maybe no parallelization at all

Keep in mind, even if only one gopher is active at a time (zero parallelism), it's still a correct and concurrent solution.

.image waza/gophercomplex2.jpg

* Another design

Here's another way to structure the problem as the concurrent composition of gopher procedures.

Two gopher procedures, plus a staging pile.

.image waza/gophercomplex3.jpg

* Parallelize the usual way

Run more concurrent procedures to get more throughput.

.image waza/gophercomplex4.jpg

* Or a different way

Bring the staging pile to the multi-gopher concurrent model:

.image waza/gophercomplex5.jpg

* Full-on optimization

Use all our techniques. Sixteen gophers hard at work!

.image waza/gophercomplex6.jpg

* Lesson

There are many ways to break the processing down.

That's concurrent design.

Once we have the breakdown, parallelization can fall out and correctness is easy.

* Back to Computing

In our book transport problem, substitute:

- book pile => web content
- gopher => CPU
- cart => marshaling, rendering, or networking
- incinerator => proxy, browser, or other consumer

It becomes a concurrent design for a scalable web service.
Gophers serving web content.

* A little background about Go

Not the place for a tutorial, just quick highlights.

* Goroutines

A goroutine is a function running independently in the same address space as other goroutines.

.code waza/snippets /f.runs/

.code waza/snippets /f.starts.running/,/return/

Like launching a function with shell's `&` notation.
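
A self-contained sketch of the same idea (the snippet files aren't reproduced here; the names below are invented for illustration):

```go
package main

import "fmt"

// f stands in for the slide's f(x); it reports back on a channel.
func f(done chan<- string) {
	done <- "f ran"
}

func main() {
	done := make(chan string)
	go f(done)          // like "f &" in the shell: f runs while main continues
	fmt.Println(<-done) // receiving also waits for f's send
}
```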

* Goroutines are not threads

(They're a bit like threads, but they're much cheaper.)

Goroutines are multiplexed onto OS threads as required.

When a goroutine blocks, that thread blocks but no other goroutine blocks.

* Channels

Channels are typed values that allow goroutines to synchronize and exchange information.

.code waza/snippets /make.*chan/,/completedAt/

* Select

The `select` statement is like a `switch`, but the decision is based on the ability to communicate rather than on equal values.

.code waza/snippets /select/,/}/
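
A small sketch: the chosen case is the one whose channel is ready to communicate.

```go
package main

import "fmt"

func main() {
	ch1 := make(chan int, 1)
	ch2 := make(chan int, 1)
	ch1 <- 42 // only ch1 has something to receive
	// select runs whichever case can proceed, not whichever value matches
	select {
	case v := <-ch1:
		fmt.Println("received from ch1:", v)
	case v := <-ch2:
		fmt.Println("received from ch2:", v)
	}
}
```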

* Go really supports concurrency

Really.

It's routine to create thousands of goroutines in one program.
(Once debugged a program after it had created 1.3 million.)

Stacks start small, but grow and shrink as required.

Goroutines aren't free, but they're very cheap.

* Closures are also part of the story

They make some concurrent calculations easier to express.

They are just local functions.
Here's a non-concurrent example:

.code waza/snippets /Compose/,/sin,/

* Some examples

Learn concurrent Go by osmosis.

* Launching daemons

Use a closure to wrap a background operation.

This copies items from the input channel to the output channel:

.code waza/snippets /copy.input/,/^}/

The `for` `range` operation runs until the channel is drained.
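
A runnable version of that daemon (wrapping it in a named function, and closing the output channel, are additions here so the example terminates):

```go
package main

import "fmt"

// startCopier launches a background goroutine that forwards every item
// from input to output, then closes output. The name is invented.
func startCopier(input <-chan int, output chan<- int) {
	go func() {
		for val := range input { // runs until input is closed and drained
			output <- val
		}
		close(output)
	}()
}

func main() {
	input := make(chan int)
	output := make(chan int)
	startCopier(input, output)
	go func() {
		for i := 1; i <= 3; i++ {
			input <- i
		}
		close(input)
	}()
	for v := range output {
		fmt.Println(v)
	}
}
```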

* A simple load balancer (1)

A unit of work:

.code waza/load1 /type/,/^}/

* A simple load balancer (2)

A worker task:

.code waza/load1 /worker/,/^}/

Must make sure other workers can run when one blocks.

* A simple load balancer (3)

The runner:

.code waza/load1 /Run/,/^}/

Easy problem but also hard to solve concisely without concurrency.
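
The load1 file isn't shown in full here; a self-contained sketch in the same spirit (the `Work` fields, the computation, and the collection logic are assumptions):

```go
package main

import "fmt"

// Work is a unit of work; the field names and the computation below
// are invented for illustration.
type Work struct {
	x, y, z int
}

const NumWorkers = 4

// worker processes items from in until it is closed, sending results on out.
func worker(in <-chan *Work, out chan<- *Work) {
	for w := range in {
		w.z = w.x * w.y // stand-in for the real computation
		out <- w
	}
}

// Run starts NumWorkers workers, feeds them the jobs, and collects results.
func Run(jobs []*Work) []*Work {
	in := make(chan *Work, len(jobs))
	out := make(chan *Work, len(jobs))
	for i := 0; i < NumWorkers; i++ {
		go worker(in, out)
	}
	for _, w := range jobs {
		in <- w
	}
	close(in)
	results := make([]*Work, 0, len(jobs))
	for range jobs {
		results = append(results, <-out)
	}
	return results
}

func main() {
	for _, w := range Run([]*Work{{x: 2, y: 3}, {x: 4, y: 5}}) {
		fmt.Println(w.x, "*", w.y, "=", w.z)
	}
}
```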

* Concurrency enables parallelism

The load balancer is implicitly parallel and scalable.

`NumWorkers` could be huge.

The tools of concurrency make it almost trivial to build a safe, working, scalable, parallel design.

* Concurrency simplifies synchronization

No explicit synchronization needed.

The structure of the program is implicitly synchronized.

* That was too easy

Let's do a more realistic load balancer.

* Load balancer

.image waza/gopherchart.jpg

* Request definition

The requester sends Requests to the balancer.

.code waza/load2 /^type.Request/,/^}/

Note the return channel inside the request.
Channels are first-class values.

* Requester function

An artificial but illustrative simulation of a requester: a load generator.

.code waza/load2 /^func.requester/,/^}/

* Worker definition

A channel of requests, plus some load-tracking data.

.code waza/load2 /type.Worker/,/^}/

* Worker

The balancer sends each request to the most lightly loaded worker.

.code waza/load2 /^func.*work.*done/,/^}/

The channel of requests (`w.requests`) delivers requests to each worker. The balancer tracks the number of pending requests as a measure of load.
Each response goes directly to its requester.

Could run the loop body as a goroutine for parallelism.

* Balancer definition

The load balancer needs a pool of workers and a single channel to which requesters can report task completion.

.code waza/load2 /type.Pool/,/^}/

* Balancer function

Easy!

.code waza/load2 /func.*balance/,/^}/

Just need to implement `dispatch` and `completed`.

* A heap of channels

Make `Pool` an implementation of the `Heap` interface by providing a few methods such as:

.code waza/load2 /func.*Less/,/^}/

Now we balance by making the `Pool` a heap tracked by load.

* Dispatch

All the pieces are in place.

.code waza/load2 /Send.Request/,/^}/

* Completed

.code waza/load2 /Job.is.complete/,/^}/

* Lesson

A complex problem can be broken down into easy-to-understand components.

The pieces can be composed concurrently.

The result is easy to understand, efficient, scalable, and correct.

Maybe even parallel.

* One more example

We have a replicated database and want to minimize latency by querying all the replicas and returning the first response to arrive.

* Query a replicated database

.code waza/snippets /func.Query/,/^}/

Concurrent tools and garbage collection make this an easy solution to a subtle problem.

(Teardown of late finishers is left as an exercise.)
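
The shape of that snippet can be made runnable with a stand-in connection type (treating `Conn` as a function is an assumption for illustration):

```go
package main

import "fmt"

// Conn stands in for a connection to one replica.
type Conn func(query string) string

// Query asks every replica concurrently and returns the first answer.
// The buffer of 1 plus select/default lets losing goroutines drop their
// results instead of blocking forever.
func Query(conns []Conn, query string) string {
	ch := make(chan string, 1)
	for _, conn := range conns {
		go func(c Conn) {
			select {
			case ch <- c(query):
			default: // a result already arrived; discard this one
			}
		}(conn)
	}
	return <-ch
}

func main() {
	replicas := []Conn{
		func(q string) string { return "result from replica 1" },
		func(q string) string { return "result from replica 2" },
	}
	fmt.Println(Query(replicas, "SELECT ..."))
}
```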

* Conclusion

Concurrency is powerful.

Concurrency is not parallelism.

Concurrency enables parallelism.

Concurrency makes parallelism (and scaling and everything else) easy.

* For more information

Go: golang.org

Some history: swtch.com/~rsc/thread/

A previous talk (video): tinyurl.com/newsqueak1

Parallelism is not concurrency (Harper): tinyurl.com/pincharper

A concurrent window system (Pike): tinyurl.com/pikecws

Concurrent power series (McIlroy): tinyurl.com/powser

And finally, parallel but not concurrent:
research.google.com/archive/sawzall.html
   454  
