scm.go, a Scheme interpreter in Go, as in SICP and lis.py

This post is intended for people who know already a little bit of Go, and a little bit of Lisp. scm.go is a Scheme interpreter in Golang, inspired by implementations of Abelson and Sussman in Structure and Interpretation of Computer Programs (SICP) and Norvig in lis.py. Thousands of other Lisp interpreters already exist, though not so many in Go. Few of them seemed to be designed with the simplicity of SICP or accessibility of lis.py in mind, and I wanted something for easy tinkering.

The goal was scm.go to be as simple and idiomatic Golang as possible. I hoped that clarity would then come for free, but feel that some documentation is still appropriate.


The interpreter consists of three parts: a parser for Lisp expressions, a metacircular evaluator, and a Lisp environment (constructed on top of Go objects).

The parser is stolen from Peter Norvig’s lis.py. It takes an expression and parses it into a list, number, symbol, or any combination of those. Just as lispy’s parser it expects one line of syntactically perfect Lisp from the user. It first splits the input in tokens, then parses the tokens into an internal representation of the expression. Expressions are Lisp objects all the way down, but in scm.go, constant literals (numbers) are typed by float64, and variable references (symbols) by string. Although all these types share the common (empty) interface scmer, so differently typed values (in Go) can internally be stored in a list or passed along in the interpreter. Lists of Lisp objects are stored as slices of scmer.

The boring parser code can be found over here.

Once the expression is parsed and exists in memory, the metacircular evaluator can evaluate its contents. A case analysis in eval specifies how to determine an expressions value. If the expression is a list, and the first symbol in the list is not one of the six predefined special forms (quote, if, set!, define, lambda, begin), the interpreter assumes the expression is application. All subexpressions are first evaluated, then apply is called with the value of the first subexpression as operator (procedure), the others as operands (arguments).

/*
 Eval / Apply
*/
 
func eval(expression scmer, en *env) (value scmer) {
	switch e := expression.(type) {
	case number:
		value = e
	case symbol:
		value = en.Find(e).vars[e]
	case []scmer:
		switch car, _ := e[0].(symbol); car {
		case "quote":
			value = e[1]
		case "if":
			if eval(e[1], en).(bool) {
				value = eval(e[2], en)
			} else {
				value = eval(e[3], en)
			}
		case "set!":
			v := e[1].(symbol)
			en.Find(v).vars[v] = eval(e[2], en)
			value = "ok"
		case "define":
			en.vars[e[1].(symbol)] = eval(e[2], en)
			value = "ok"
		case "lambda":
			value = proc{e[1], e[2], en}
		case "begin":
			for _, i := range e[1:] {
				value = eval(i, en)
			}
		default:
			operands := e[1:]
			values := make([]scmer, len(operands))
			for i, x := range operands {
				values[i] = eval(x, en)
			}
			value = apply(eval(e[0], en), values)
		}
	default:
		log.Println("Unknown expression type - EVAL", e)
	}
	return
}


In apply, it is first determined if the procedure is a primitive or a compound procedure. The procedure is then applied to the list of arguments. A primitive procedure is a Go function with signature func(...scmer) scmer, that can be called there and then.

func apply(procedure scmer, args []scmer) (value scmer) {
	switch p := procedure.(type) {
	case func(...scmer) scmer:
		value = p(args...)
	case proc:
		en := &env{make(vars), p.en}
		switch params := p.params.(type) {
		case []scmer:
			for i, param := range params {
				en.vars[param.(symbol)] = args[i]
			}
		default:
			en.vars[params.(symbol)] = args
		}
		value = eval(p.body, en)
	default:
		log.Println("Unknown procedure type - APPLY", p)
	}
	return
}

The compound procedure case is a bit more involved. Because Go doesn’t have lambda expressions like Python, I had to create a struct to represent a Lisp procedure: proc. Every proc object contains the expressions of the functions parameters and body, and a pointer to the environment in which it was created.

type proc struct {
	params, body scmer
	en           *env
}

When applying such a compound procedure, a new Lisp environment is constructed in which the parameters are variables bound to the arguments. The actual execution of the procedure will be the evaluation of its body in that environment. Because other symbols than the formal parameters can occur in the procedure’s body, that environment points to the creators environment, where other symbols can be found.
Parameter-to-argument binding depends on the form of the parameter expression. If it is a list, parameters are bound to the arguments element by element. Otherwise, the parameter expression can be a symbol, then bound to the entire list of arguments, allowing e.g. defining list as (lambda x x).

That leaves us with implementing environments, where Lisp objects can be bound to variable names (symbols). Each environment consists of a map[symbol]scmer named vars that provides this binding, and a pointer to its outer environment. The Find method returns the first environment where a given symbol is in the map, following the chain of outer environments.

/*
 Environments
*/
 
type vars map[symbol]scmer
type env struct {
	vars
	outer *env
}
 
func (e *env) Find(s symbol) *env {
	if _, ok := e.vars[s]; ok {
		return e
	} else {
		return e.outer.Find(s)
	}
}

For example expression (define x 3) parses to ["define" "x" 3], which eventually evaluates to a call en.vars["x"] = 3, effectively binding 3 to symbol "x" in the current environment.
Eventually, because in the define section of eval, there is another call to eval for the third element of the expression. In this case, that subexpression is number 3, which evaluates to itself.
Evaluating x will now find symbol "x" in the current environment and return value 3. To illustrate outer environments, let’s define procedure plusx by evaluating (define plusx (lambda (y) (+ y x))). Each time plusx is applied, e.g. evaluating (plusx 4), a new environment is created in which only "y" is bound (to 4), having as outer environment the current one. Upon evaluating the body (+ y x) in that environment, Find will not see "x" there, so looks for it in the outer environment, there finding the value 3 defined above.

Note that the symbol "+" is also not found in the new environment of the previous example… It is however among the symbols already in the current environment, because I put it there.

This environment, called globalenv, is initialized in init() at the start of the Go package, and contains definitions of a set of basic procedures: +, -, equal?, cons, car, cdr, all primitive procedures. An exception is list, defined as evaluating and parsing "(lamda z z)", but you could just as well define it as a compound procedure proc{symbol("z"), symbol("z"), &globalenv}.

/*
 Primitives
*/
 
var globalenv env
 
func init() {
	globalenv = env{
		vars{ //aka an incomplete set of compiled-in functions
			"+": func(a ...scmer) scmer {
				v := a[0].(number)
				for _, i := range a[1:] {
					v += i.(number)
				}
				return v
			},
			"-": func(a ...scmer) scmer {
				v := a[0].(number)
				for _, i := range a[1:] {
					v -= i.(number)
				}
				return v
			},
			"*": func(a ...scmer) scmer {
				v := a[0].(number)
				for _, i := range a[1:] {
					v *= i.(number)
				}
				return v
			},
			"/": func(a ...scmer) scmer {
				v := a[0].(number)
				for _, i := range a[1:] {
					v /= i.(number)
				}
				return v
			},
			"<=": func(a ...scmer) scmer {
				return a[0].(number) <= a[1].(number)
			},
			"equal?": func(a ...scmer) scmer {
				return reflect.DeepEqual(a[0], a[1])
			},
			"cons": func(a ...scmer) scmer {
				switch car := a[0]; cdr := a[1].(type) {
				case []scmer:
					return append([]scmer{car}, cdr...)
				default:
					return []scmer{car, cdr}
				}
			},
			"car": func(a ...scmer) scmer {
				return a[0].([]scmer)[0]
			},
			"cdr": func(a ...scmer) scmer {
				return a[0].([]scmer)[1:]
			},
			"list": eval(read(
				"(lambda z z)"),
				&globalenv),
		},
		nil}
}

You can directly evaluate a fine number of expressions in this environment, or implement your own Lisp starting from the core provided here. So let’s not stop yet and give the user an interactive shell, a read-eval-print loop (REPL). A simple one, where input expressions are line per line read and evaluated. A string representation of the result is printed for each evaluation, after which the loop starts reading the input again.

/*
 Interactivity
*/
 
func String(v scmer) string {
	switch v := v.(type) {
	case []scmer:
		l := make([]string, len(v))
		for i, x := range v {
			l[i] = String(x)
		}
		return "(" + strings.Join(l, " ") + ")"
	default:
		return fmt.Sprint(v)
	}
}
 
func Repl() {
	scanner := bufio.NewScanner(os.Stdin)
	for fmt.Print("> "); scanner.Scan(); fmt.Print("> ") {
		fmt.Println("==>", String(eval(read(scanner.Text()), &globalenv)))
	}
}

You can find the code as a whole in this github gist

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s