Most tutorials on Go tooling (and probably most other tooling) tend to focus on the happy path - the input code is perfectly valid and contains no errors. This is often a reasonable assumption, because we tend to run tools on existing code bases, and these are likely to compile correctly in their steady state. However, sometimes knowing how to handle errors is important, and good tooling infrastructure will enable that.
In this quick post I'll discuss how Go tooling handles potentially erroneous code. The full code is available on GitHub.
Our sample tool starts with the basic setup described in my previous post (Step 0), and for now just a bare-bones processPackage that dumps the package's AST:
func processPackage(pkg *packages.Package) {
	for _, fileAst := range pkg.Syntax {
		ast.Print(fset, fileAst)
	}
}
Let's look at an input file with some errors in it:
package main
func util(x int) {
}
func main() {
	util(5 6)
	var s string = 5.5
}
Time for a brief self-evaluation :-) Most Go programmers will immediately identify two errors in this code; somewhat more experienced gophers will also find the third. Give yourself a few moments to play the compiler here - the answers are further down in the post.
Despite the errors in this code, if we run our tool on it, it will happily dump an AST with no complaints! If you're new to Go tooling, this can be somewhat surprising; wouldn't you expect the tool to error out somehow?
If you look closely at the dumped AST you'll notice that it's somewhat malformed, which is reasonable given the errors in the code. The call util(5 6) is syntactically incorrect: it has two arguments with no separating comma. The AST dumped for this call is:
X: *ast.CallExpr {
. Fun: *ast.Ident {
. . NamePos: sample-module/main.go:9:2
. . Name: "util"
. . Obj: *(obj @ 23)
. }
. Lparen: sample-module/main.go:9:6
. Args: []ast.Expr (len = 1) {
. . 0: *ast.BasicLit {
. . . ValuePos: sample-module/main.go:9:7
. . . Kind: INT
. . . Value: "5"
. . }
. }
The parser chose to include the first argument and ignore the second. Why does it bother to produce an AST at all here, even for code that doesn't parse correctly?
The answer is that tooling is often used in things like IDEs. Imagine you have a long file open in an IDE and there's an error somewhere in the middle [1]; had the IDE stopped at the error, it would not syntax highlight the rest of the code, or offer any intellisense features for it like "jump to definition". This would not be great for ergonomics. Therefore tools attempt to recover from compilation errors and continue churning through the code, to provide maximal utility.
This isn't always possible and some syntax errors confuse the tool entirely; we're all familiar with what happens when we forget a closing ) or } somewhere.
Error reports from Go tooling
Back to the original goal of the post... given a tool we're writing, is there a way to know that there were errors in the code before we start analyzing the AST?
Yes. The packages.Package structure has a couple of fields set by XTGP (the golang.org/x/tools/go/packages package) for this purpose. The most important one is Errors. Let's add this code to our processPackage function:
if len(pkg.Errors) > 0 {
	fmt.Printf("package %v has %v errors\n", pkg.PkgPath, len(pkg.Errors))
	for _, e := range pkg.Errors {
		var errtype string
		switch e.Kind {
		case packages.ListError:
			errtype = "listing/driver"
		case packages.ParseError:
			errtype = "parser"
		case packages.TypeError:
			errtype = "type checker"
		default:
			errtype = "unknown"
		}
		fmt.Printf("Error [%v]: %s\n", errtype, e)
	}
}
When invoked on the erroneous module, we'll see:
package example.com has 3 errors
Error [parser]: sample-module/main.go:9:9: missing ',' in argument list
Error [type checker]: sample-module/main.go:11:17: cannot use 5.5 (untyped float constant) as string value in variable declaration
Error [type checker]: sample-module/main.go:11:6: s declared but not used
<AST dump, if still enabled>
Each error has the type packages.Error. An additional field that's set (but only if the packages.NeedTypes flag is set while configuring the tool) is IllTyped:
	// IllTyped indicates whether the package or any dependency contains errors.
	// It is set only when Types is set.
	IllTyped bool
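For some intuition about where the "type checker" entries come from, here's a small sketch that drives go/types from the standard library directly. It's an illustration under simplifying assumptions (a single file with no imports), not a copy of what XTGP does internally, and checkLenient and illTypedSrc are hypothetical names. When the Error callback of types.Config is set, the checker reports each error through it and keeps going instead of stopping at the first one:

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// illTypedSrc (hypothetical name) is syntactically valid but does not
// type-check; it reuses the bad declaration from the post's input file.
const illTypedSrc = `package main

func main() {
	var s string = 5.5
}
`

// checkLenient (hypothetical helper) type-checks one import-free file and
// returns every error the checker reported. The second result is non-nil
// only if the file could not be parsed at all.
func checkLenient(src string) ([]error, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "main.go", src, 0)
	if err != nil {
		return nil, err
	}
	var typeErrs []error
	conf := types.Config{
		// With Error set, each type error is delivered here and checking
		// continues past it.
		Error: func(err error) { typeErrs = append(typeErrs, err) },
	}
	// Check also returns the first error it found; the callback already
	// collected all of them, so that return value is ignored here.
	_, _ = conf.Check("main", fset, []*ast.File{file}, nil)
	return typeErrs, nil
}

func main() {
	typeErrs, err := checkLenient(illTypedSrc)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, e := range typeErrs {
		fmt.Println("type error:", e)
	}
	// A rough stand-in for the IllTyped idea: did anything fail to type-check?
	fmt.Println("ill-typed:", len(typeErrs) > 0)
}
```

The exact wording of the messages depends on the Go version, so it's safer to rely on their presence than on their text.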
How should this be used?
OK, so now we know how to see what errors were encountered when parsing the input module. What's next?
What you do with this information is entirely dependent on the nature of your tool. As mentioned earlier, most tools assume that the code they're working on is already correct; in this case, erroring out if len(pkg.Errors) > 0 may be a good idea since we don't want to produce phantom results from incorrect code. Other tools will specifically not care about errors and will try to churn through as much AST as possible, even for partially incorrect code. YMMV.
[1] Why would there be an error in the middle? Consider that IDEs often repeatedly parse the file, even while we're typing, to be able to provide intellisense features on the fly. While the code is being typed in, it's often un-parsable or has type errors (think of being in the middle of a parameter list of a function call). This also highlights an interesting topic in compiler frontends: error recovery. While classical compilers can be forgiven for printing out a few errors and bailing out, tools really do need to recover as quickly as possible to be able to process the rest of the code correctly, even if there's some issue in the middle.