Go, from C to Go
GopherCon
25 Apr 2014
Russ Cox
GopherCon
25 Apr 2014
Russ Cox
A video of this talk was recorded at GopherCon in Denver.
280,000+ lines of C.
4Programming in Go is fun.
Programming in C is not.
5Writing a Go compiler requires Go expertise.
Writing any program in C requires C expertise.
Writing a Go compiler in C requires Go and C expertise.
6Write the Go compiler in Go.
7Why not write the Go compiler in Go on day one?
Why do it today?
Crazy idea: mechanical conversion.
“One big gofix.”
10do...while
, for
, switch
, while
goto
#define
, #include
Automated conversion of our C code to Go.
Target: our C code, not all C code.
go/src/cmd/gc/go.h
struct Val { short ctype; union { short reg; // OREGISTER short bval; // bool value CTBOOL Mpint* xval; // int CTINT, rune CTRUNE Mpflt* fval; // float CTFLT Mpcplx* cval; // float CTCPLX Strlit* sval; // string CTSTR } u; };
go/include/link.h
struct Addr { short type; union { char sval[8]; float64 dval; Prog* branch; // for 5g, 6g, 8g } u; ... };
#define
struct
union
/*
Great
space
saver
*/
#define
union
struct
/*
legal
in
C!
*/
And anyway, there are only two.
23Can't just expand during parsing.
24Not many.
/* * defined macros * you need super-gopher-guru privilege * to add this list. */ #define nelem(x) (sizeof(x)/sizeof((x)[0])) #define nil ((void*)0) ...
Extend parser to recognize special cases.
25Annotate some.
#define BOM 0xFEFF /*c2go enum { BOM = 0xFEFF }; */
Rewrite others.
enum { BOM = 0xFEFF, };
Can't just discard during parsing.
/* * If the new process paused because it was * swapped out, set the stack level to the last call * to savu(u_ssav). This means that the return * which is executed immediately after the call to aretu * actually returns from the last routine which did * the savu. * * You are not expected to understand this. */ if(rp->p_flag&SSWAP) { rp->p_flag =& ~SSWAP; aretu(u.u_ssav); }
Record precise source locations.
case OMAPLIT: n->esc = EscNone; // until proven otherwise e->noesc = list(e->noesc, n); n->escloopdepth = e->loopdepth; // Keys and values make it to memory, lose track. for(ll=n->list; ll; ll=ll->next) { escassign(e, &e->theSink, ll->n->left); escassign(e, &e->theSink, ll->n->right); } break;
Whole-line comments attach to syntax immediately following (or EOF).
Suffix comments attach to syntax immediately before.
Syntax carries comments if it moves.
28“27. Horrors! goto’s and labels
C has a goto statement and labels, so you can branch about the way you used to. But most of the time goto’s aren’t needed... The code can almost always be more clearly expressed by for/while, if/else, and compound statements.
30One use of goto’s with some legitimacy is in a program which contains a long loop, where a while(1) would be too extended. Then you might write
mainloop: ... goto mainloop;
Another use is to implement a break out of more than one level of for or while. goto’s can only branch to labels within the same function.”
— Kernighan, Programming in C – A Tutorial
31. if x { goto Done } y := f() print(y) Done: close(c) return
. var y int if x { goto Done } y = f() print(y) Done: close(c) return
if bad { Bad: printError() return err } ... if other bad thing { goto Bad }
switch x { case 1: F() goto Common; case 2: G() goto Common case 3: Common: H() }
1032 goto statements
241 labels
35 indented labels
18 switch case
6 multilevel break/continue
5 ‘else’ statement
4 cleanup/error labels
1 loop
1 difficult to explain
switch(r->op) { case OINDEXMAP: n->op = OAS2MAPR; goto common; case ORECV: n->op = OAS2RECV; goto common; case ODOTTYPE: n->op = OAS2DOTTYPE; r->op = ODOTTYPE2; common: ... }
switch r.op { case OINDEXMAP, ORECV, ODOTTYPE: switch r.op { case OINDEXMAP: n.op = OAS2MAPR case ORECV: n.op = OAS2RECV case ODOTTYPE: n.op = OAS2DOTTYPE r.op = ODOTTYPE2 } ... }
Baker, An Algorithm for Structuring Flowgraphs, JACM 1977
But we don't need it.
Handle trivial rewrites in converter.
Rewrite problematic gotos by hand.
General question: what type to use in the Go translation?
Build graph of “assigned” value flow and extract clusters.
x = y; int f(void) { return x; } w = f(); void g(int z); g(x); g(y);
Apply to entire compiler (all files). Exclude some functions.
43int islvalue(Node *n) { switch(n->op) { case OINDEX: if(isfixedarray(n->left->type)) return islvalue(n->left); if(n->left->type != T && n->left->type->etype == TSTRING) return 0; // fall through case OIND: case ODOTPTR: case OCLOSUREVAR: return 1; case ODOT: return islvalue(n->left); case ONAME: if(n->class == PFUNC) return 0; return 1; } return 0; }
int islvalue(Node *n) { ... return islvalue(n->left); ... return 0; ... return 1; ... return islvalue(n->left); ... return 0; ... return 1; ... return 0; }
cluster types: int values: return from islvalue 0 1 islvalue(n) islvalue(n->left) islvalue(n->right) contexts: bool condition /* if(islvalue(n)), if(!islvalue(n)), ... */
Translation: bool.
46cluster types: int values: return from checksliceconst 0 -1 contexts: checksliceconst(lo, hi) < 0 checksliceconst(lo, mid) < 0 checksliceconst(mid, hi) < 0
Translation: bool or error.
47cluster types: Val* values: var Val *v va_arg(fp->args, Val*) contexts: v->ctype v->u
Translation: pointer.
48cluster types: long* values: var long* a1 &a->a[0] contexts: *a1 a1++
Translation: slice.
49Cluster statistics
Clustering does not rely on C type information at all.
50By the way, please try the Go 1.3 beta!
51GopherCon
25 Apr 2014
Russ Cox