Digging into the Swift compiler: Nerdcamp is the Shovel

Mark Dalrymple

Jan 29, 2018

iOS

One of the perks of working at Big Nerd Ranch is that we can attend one of our bootcamps. Not only do you learn a lot, they’re a lot of fun. But what can you do if you’ve already taken All The Things? A sleep-away Nerdcamp! A Nerdcamp is an immersive learning experience like our bootcamps, but self-directed on a topic of our choosing.

In January 2018, I participated in a Nerdcamp with my colleague, Step Christopher. The topic was “Digging into the Swift compiler”. Step and I were having a chat one day and I mentioned that I wanted to attain some more Swift proficiency, and brought up the compiler. One thing led to another, so we carved out a week in January. I definitely wanted a week away from my remote office (Böb the cat is very cute, but can be demanding at times) rather than, say, staying at our respective home bases and doing the telepresence thing.

The Plan

Not only is Swift open source, there is a public bug tracker (https://bugs.swift.org). Our plan was to grab some starter bugs and go from there.

This was attractive to me because I’ve had zero formal exposure to compiler innards. The small college I attended didn’t have a compiler design course. I’ve skimmed through the first half of the Dragon Book a couple of times. That’s about it. So, a big win for me would be navigating around the Swift compiler’s code base well enough to locate where the bug is, understand what’s going on, and perhaps fix it.

The Venue

Step found us an Airbnb cabin up near Ellijay, GA, north of Atlanta. We knew we’d have necessities at hand such as a Walmart, a Waffle House, a BBQ joint, and a big farmer’s market that sold fried pies. The cabin was down near the Coosawattee River along a scenically twisty / windy / no-guard-raily / scary / somewhat-gravelly road. There was a kitchen, sleeping areas, and a large living room. Most of the work occurred around the kitchen table, or sprawled on the gigantic sofa.

Starter Bug

I asked Jordan Rose (@UINT_MIN on Twitter) from the Swift compiler team at Apple if there were any interesting starter bugs.  He recommended two – Step picked one regarding NSProxy subclassing, and I grabbed SR-1557 – Unused function-typed return values result in a hard error

The bug: given a function that returns a function and the return value is unused, it should be a warning (like other unused function values) rather than an error:

thingie() // yields error "expression resolves to an unused function"
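To make the scenario concrete, here is a minimal sketch of the pattern the bug describes. The definition of `thingie` is an assumption, since only the call appears above; any function that returns a function should behave the same way.

```swift
// Assumed definition: only the call to thingie() is shown in the bug summary.
func thingie() -> (String) -> String {
    return { name in "Hello, \(name)!" }
}

// Swift 4 rejected a bare call whose function-typed result is ignored:
// thingie()   // error: expression resolves to an unused function

// Explicitly discarding the result sidesteps the diagnostic.
_ = thingie()

// Using the returned function is fine too.
let greet = thingie()
print(greet("Böb"))   // prints "Hello, Böb!"
```

Assigning to `_` is the usual way to say "yes, I meant to throw this away", which is why a warning feels like the right severity for the bare call.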

After reproducing the bug with the current official release (Swift 4), I started hacking on a 4.1 branch, expecting it’d be somewhat more stable than master. I figured out all the important stuff like how to build all the things, how to rebuild all the things, how to run all the tests, how to run individual tests, how to run my custom compiler, and how to learn to accept my laptop fans running 24/7. After a fair amount of code spelunking and breakpointing in lldb, I came to the place in the code where that error was being generated. This was my favorite part of the process – the detective work.

It was with great satisfaction that I ended up at the same place a commenter on the bug had suggested looking. Removing an explicit error diagnostic generation removed the error and provided decent enough warnings.

Then came unit test whack-a-mole, a cycle of “run the test suite, watch it fail, fix the failing tests (usually by changing the expectation of errors/warnings), then repeat.” I found this process frustrating. After getting to the point of an error in a test file I couldn’t find, I declared victory-enough: I could find things in the code if I wanted, understand what was going on around it, and even kibitz on Step’s bug. I kind of stopped learning new things at this point. And learning was the point of the week. Time to pivot.

LibSyntax

I’d been curious about some of Swift’s auxiliary tools such as SourceKit. The Swift team recently opened up a new thing: libSyntax. Harlan Haskins has a great introductory video, Improving Swift Tools with libSyntax.

I was hoping this was a tool for taking Swift code and extracting All The Information from it, but it’s primarily a whitespace-preserving lexer with a nice API. You can feed it Swift code and it’ll generate a parse tree from it: “Here is a struct-keyword-token with three spaces after it. Here’s a curly brace token with a newline in front of it. Here is an attribute-start token.” Given one of these parse trees, you can regenerate the original source text byte-for-byte. Goals of the project include using it for the compiler’s internal parsing, helping to support editor tooling, and making a swift-format tool.
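The "whitespace-preserving" part can be sketched with a toy model. This is not libSyntax's actual data structure or API; the `Token` type and `lex` function below are made up for illustration, and the lexer only splits on whitespace. The point is that when every whitespace character is stored as trivia on some token, concatenating the tokens gives back the exact input.

```swift
// Toy model only: these are not libSyntax's real types or API.
struct Token {
    var leadingTrivia: String   // whitespace in front of the token
    var text: String            // the token's own characters
    var trailingTrivia: String  // whitespace after the token

    // The exact source text this token covers.
    var source: String { return leadingTrivia + text + trailingTrivia }
}

// Splits source on whitespace, but stores every whitespace character as
// trivia instead of throwing it away.
func lex(_ source: String) -> [Token] {
    var tokens: [Token] = []
    var leading = "", text = "", trailing = ""

    for ch in source {
        if ch.isWhitespace {
            // Whitespace before any text is leading trivia; after text, trailing.
            if text.isEmpty { leading.append(ch) } else { trailing.append(ch) }
        } else if trailing.isEmpty {
            text.append(ch)
        } else {
            // The whitespace run has ended, so the previous token is complete.
            tokens.append(Token(leadingTrivia: leading, text: text, trailingTrivia: trailing))
            leading = ""
            text = String(ch)
            trailing = ""
        }
    }
    if !(leading.isEmpty && text.isEmpty) {
        tokens.append(Token(leadingTrivia: leading, text: text, trailingTrivia: trailing))
    }
    return tokens
}

let source = "struct   Foo\n{ }\n"
let tokens = lex(source)
let rebuilt = tokens.map { $0.source }.joined()
print(rebuilt == source)   // prints "true"
```

Here the first token is `struct` with three spaces of trailing trivia, much like the "struct-keyword-token with three spaces after it" described above.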

Once you’ve got one of these libSyntax trees, you can walk it looking for things like function or protocol declarations. You can also do surgery like renaming functions, cleaning up whitespace, or inserting other tokens.

In the course of exploring libSyntax, I learned about installing custom toolchains and getting Xcode and its UI affordances, like documentation lookup, to work with said custom toolchains. Safety tip: always try restarting Xcode a couple of times before you start reverse-engineering the swiftdoc file format to get at the library’s documentation.

To start learning the API, I ported the C++ sample code from the libSyntax README to Swift. It’s really easy to use, if a bit tedious.

For example, getting libSyntax to serialize out this statement:

@greeble(bork) typealias Element = Int

involves making a syntax tree piece by piece:

import SwiftSyntax

// Goal: build the tree for `@greeble(bork) typealias Element = Int`

let typeAliasKeyword = SyntaxFactory.makeTypealiasKeyword(leadingTrivia: .spaces(1),
                                                                 trailingTrivia: .spaces(1))
let elementID = SyntaxFactory.makeIdentifier("Element", leadingTrivia: .zero, trailingTrivia: .spaces(1))
let equal = SyntaxFactory.makeEqualToken(leadingTrivia: Trivia.zero, trailingTrivia: .spaces(1))
let intType = SyntaxFactory.makeTypeIdentifier("Int", leadingTrivia: .zero, trailingTrivia: .zero)
let initializer = SyntaxFactory.makeTypeInitializerClause(equal: equal, value: intType)

let openParen = SyntaxFactory.makeLeftParenToken()
let closeParen = SyntaxFactory.makeRightParenToken()

let balancedTokens = [openParen, SyntaxFactory.makeIdentifier("bork", leadingTrivia: .zero,
                                                              trailingTrivia: .zero), closeParen]
let balancedTokenSyntax = SyntaxFactory.makeTokenList(balancedTokens)

let atsign = SyntaxFactory.makeAtSignToken(leadingTrivia: .zero, trailingTrivia: .zero)
let attributeName = SyntaxFactory.makeIdentifier("greeble", leadingTrivia: .zero, trailingTrivia: .zero)

let attribute = SyntaxFactory.makeAttribute(atSignToken: atsign,
                                            attributeName: attributeName,
                                            balancedTokens: balancedTokenSyntax)
let attributes = SyntaxFactory.makeAttributeList([attribute])

let typeAlias = SyntaxFactory.makeTypealiasDecl(attributes: attributes,
                                                accessLevelModifier: nil,
                                                typealiasKeyword: typeAliasKeyword,
                                                identifier: elementID,
                                                genericParameterClause: nil,
                                                initializer: initializer)

This isn’t stuff you’d write a lot of by hand. But, it’s a great candidate for being driven by something else.  For example, invent a “generate your borkerplate” domain-specific language, and then make an interpreter that calls the libsyntax SyntaxFactory methods to blort out generated code.
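As a rough sketch of that idea, here is a tiny made-up "DSL" and an interpreter for it. Everything in it (`Spec`, `render`, the case names) is hypothetical, and to stay self-contained it emits plain strings; a real version would presumably build syntax nodes through SyntaxFactory instead.

```swift
// Hypothetical mini-DSL: each case describes one declaration to generate.
enum Spec {
    case alias(name: String, type: String)
    case constant(name: String, type: String, value: String)
}

// Stand-in for an interpreter that would call SyntaxFactory methods.
// Here it just produces source text directly.
func render(_ spec: Spec) -> String {
    switch spec {
    case let .alias(name, type):
        return "typealias \(name) = \(type)"
    case let .constant(name, type, value):
        return "let \(name): \(type) = \(value)"
    }
}

let specs: [Spec] = [
    .alias(name: "Element", type: "Int"),
    .constant(name: "limit", type: "Element", value: "42"),
]
let generated = specs.map(render).joined(separator: "\n")
print(generated)
// typealias Element = Int
// let limit: Element = 42
```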

You can also process existing code with libSyntax, as discussed in Harlan’s libSyntax video.

This syntax rewriter:

class Renamer: SyntaxRewriter {
    static let nospaceTrivia = Trivia.spaces(0)
    static let spaceTrivia = Trivia.spaces(1)

    let bork = SyntaxFactory.makeIdentifier("bork",
                                            leadingTrivia: nospaceTrivia,
                                            trailingTrivia: spaceTrivia)

    override func visit(_ node: StructDeclSyntax) -> DeclSyntax {
        return super.visit(node.withIdentifier(bork))
    }

    override func visit(_ node: ClassDeclSyntax) -> DeclSyntax {
        return super.visit(node.withIdentifier(bork))
    }

    override func visit(_ node: FunctionDeclSyntax) -> DeclSyntax {
        return super.visit(node.withIdentifier(bork))
    }
}

Will rename struct, class, and function names to bork.  You can see I stick to very practical examples when exploring new tools.
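The shape of that rewriter (override a visit method, swap in a modified node, then call super so the walk continues into the children) can be shown with a self-contained toy. The `Node`, `Rewriter`, and `ToyRenamer` types below are invented for this sketch and are not SwiftSyntax's real types.

```swift
// Toy tree: these are not SwiftSyntax's real node types.
enum Node: Equatable {
    case structDecl(name: String, members: [Node])
    case classDecl(name: String, members: [Node])
    case funcDecl(name: String)
    case other(String)
}

// Base rewriter: rebuilds each node unchanged, recursing into its children.
class Rewriter {
    func visit(_ node: Node) -> Node {
        switch node {
        case let .structDecl(name, members):
            return .structDecl(name: name, members: members.map { self.visit($0) })
        case let .classDecl(name, members):
            return .classDecl(name: name, members: members.map { self.visit($0) })
        case .funcDecl, .other:
            return node
        }
    }
}

// Subclass: swaps the name, then lets the base class continue into the children.
final class ToyRenamer: Rewriter {
    override func visit(_ node: Node) -> Node {
        switch node {
        case let .structDecl(_, members):
            return super.visit(.structDecl(name: "bork", members: members))
        case let .classDecl(_, members):
            return super.visit(.classDecl(name: "bork", members: members))
        case .funcDecl:
            return .funcDecl(name: "bork")
        case .other:
            return node
        }
    }
}

let tree = Node.structDecl(name: "Point", members: [
    .funcDecl(name: "distance"),
    .other("let x = 0"),
])
let renamed = ToyRenamer().visit(tree)
let expected = Node.structDecl(name: "bork", members: [
    .funcDecl(name: "bork"),
    .other("let x = 0"),
])
print(renamed == expected)   // prints "true"
```

Because the base class calls `self.visit` on each child, the subclass's override is what runs for nested declarations, which is why the method inside the struct gets renamed too.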

I had a ball with this stuff, even given my general distaste for tools that generate reams of code. I have some ideas for tools that might be fun, such as one that could help with our course material development pipeline for marking up source code, or perhaps something that’ll look for adoption of classes or protocols of a given name. For that I usually reach for a find command or a search regex in Xcode.

Decompression

Knowing the cabin was kind of out of the way (and that snow was coming), we went to Walmart and got provisions for the week. We even made healthy choices – cheese sticks and goldfish crackers were as naughty as I got, avoiding the giant tubs of cheesypoofs. Figuring we’d be snowed in, we got meal makings for the week.

We also brought some board games, so in the evenings after our brains were exhausted we played Forbidden Island, Settlers of Catan (two-player variant), and Small World. Step brought a Nintendo Switch (which I hadn’t seen before), so I got to experience Rocket League, Splatoon, the new Mario, and Zelda.

Fin

I had a good time – it was a nice time-away-from-everything similar to Big Nerd Ranch bootcamps. My brain was generally hurting by the end of the day, just like our bootcamps.  The bootcamp time dilation effect was in full force as well.  This is where the first day seems to go on for-ev-er, and by the end of the week time is just flying by.  I learned some things: compiler development isn’t my cup of tea – I’m definitely happier in app-land.  I can navigate effectively in a big foreign code base.  C++ still produces gigantic error messages. LibSyntax is pretty neat.  I’d totally do something like this again.

Mark Dalrymple

Author Big Nerd Ranch

MarkD is a long-time Unix and Mac developer, having worked at AOL, Google, and several start-ups over the years.  He’s the author of Advanced Mac OS X Programming: The Big Nerd Ranch Guide, over 100 blog posts for Big Nerd Ranch, and an occasional speaker at conferences. Believing in the power of community, he’s a co-founder of CocoaHeads, an international Mac and iPhone meetup, and runs the Pittsburgh PA chapter. In his spare time, he plays orchestral and swing band music.
