dispatch_once upon a time

Mark Dalrymple

Grand Central Dispatch, a.k.a. libdispatch and usually referred to as GCD, is a low-level API known for performing asynchronous background work. dispatch_async is its poster child: "Throw this block on a background thread to do some work, and inside of that block toss another block on the main thread to update the UI."
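For reference, that poster-child pattern looks roughly like the sketch below. RunInBackgroundThenUpdateUI, slowWork, and updateUI are hypothetical names made up for illustration, and the code assumes ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: slowWork and updateUI are placeholder blocks
// supplied by the caller, not part of GCD.
static void RunInBackgroundThenUpdateUI (NSString *(^slowWork)(void),
                                         void (^updateUI)(NSString *result)) {
    dispatch_queue_t background =
        dispatch_get_global_queue (DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    // Throw the slow work onto a background queue...
    dispatch_async (background, ^{
        NSString *result = slowWork ();

        // ...then toss another block onto the main queue to update the UI.
        dispatch_async (dispatch_get_main_queue (), ^{
            updateUI (result);
        });
    });
}
```

The inner block only runs once the main thread is servicing its queue, which an app's main run loop normally does for you.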

Not all of GCD is asynchronous, though. There's dispatch_sync to do some work synchronously. There's also dispatch_once that's used to guarantee that something happens exactly once, no matter how violent the program's threading becomes. It's actually a very simple idiom:

    static dispatch_once_t onceToken;

    dispatch_once (&onceToken, ^{
        // Do some work that happens once
    });

You first declare a static or global variable of type dispatch_once_t. This is an opaque type that stores the "done did run" state of something. It's important that your dispatch_once_t be a global or a static. If you forget the static, each call gets a fresh, uninitialized token on the stack, and you may have weird behavior at run time.

Then you pass that dispatch_once_t token to dispatch_once, along with a block. GCD will guarantee that the block will run no more than one time, no matter how many threads you have contending for this one spot.
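One way to convince yourself of that guarantee is to hammer a once-block from many concurrent iterations and count how often it actually ran. This is only a sketch: gRunCount, DoSetupOnce, and HammerSetup are hypothetical names invented for the demonstration.

```objc
#import <Foundation/Foundation.h>

static int gRunCount = 0;   // how many times the once-block actually ran

static void DoSetupOnce (void) {
    static dispatch_once_t onceToken;

    dispatch_once (&onceToken, ^{
        gRunCount++;        // only ever touched inside the once-block
    });
}

static int HammerSetup (size_t callCount) {
    dispatch_queue_t queue =
        dispatch_get_global_queue (DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    // dispatch_apply runs the block callCount times, possibly in parallel,
    // and does not return until every iteration has finished.
    dispatch_apply (callCount, queue, ^(size_t i) {
        DoSetupOnce ();
    });

    return gRunCount;
}
```

Calling HammerSetup (100) should report a single run of the block, no matter how the iterations get spread across threads.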

The usual example you see for dispatch_once is creating shared instances, such as the object returned from calls like -[NSFileManager defaultManager]. Wrap the allocation and initialization of your shared instance in a dispatch_once, and return it. Done.
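A minimal sketch of that shared-instance idiom, assuming ARC. WordStore and sharedStore are hypothetical names used only for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, named here only for illustration.
@interface WordStore : NSObject
+ (WordStore *) sharedStore;
@end

@implementation WordStore

+ (WordStore *) sharedStore {
    static WordStore *shared;
    static dispatch_once_t onceToken;

    // Allocation and initialization happen at most once, even if
    // several threads call +sharedStore at the same moment.
    dispatch_once (&onceToken, ^{
        shared = [[WordStore alloc] init];
    });

    return shared;
}

@end
```

Every caller of [WordStore sharedStore] gets the same object back.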

Recently, though, I had an opportunity to use dispatch_once outside of a sharedBlah situation. Another Rancher and I were working on some sample code for a class. It populated a scrolling view with Lots And Lots Of Stuff. Rather than manually coming up with labels for everything, we used the list of words at /usr/share/dict/words to construct random names. Just pick a couple of words and string them together. The results were often nonsensical, but sometimes we'd get something delightfully random. Here's the function:

    static NSString *RandomName (int wordCount) {

        static NSArray *words;

        if (!words) {
            NSString *allTheWords =
                [NSString stringWithContentsOfFile: @"/usr/share/dict/words"
                          encoding: NSUTF8StringEncoding
                          error: nil];

            NSPredicate *shortWords = [NSPredicate predicateWithFormat:@"length < 8"];
            words = [[allTheWords componentsSeparatedByString:@"\n"]
                        filteredArrayUsingPredicate: shortWords];
        }

        NSMutableArray *nameParts = [NSMutableArray array];

        for (int i = 0; i < wordCount; i++) {
            NSString *word = [words objectAtIndex: random() % words.count];
            [nameParts addObject: word];
        }

        NSString *name = [nameParts componentsJoinedByString: @" "];

        return name;

    } // RandomName

Pretty straightforward. A static local variable that points to an NSArray of words. Make a check for nilness, then load the file and remove the long words. And it worked great.

Then we decided to emulate network latency by using dispatch_async and coded delays to act like words were dribbling in over a network connection. Performance took an insane nose-dive, as in "there is no way I am checking this in and keeping my job". A quick check with Instruments showed RandomName being the bottleneck. Every thread was running it. Whoa.

In retrospect, it's an obvious mistake: accessing global state unprotected in a threaded environment. Here's the scenario:

Thread A starts doing stuff. It goes to get a RandomName. It sees that words is nil, so it starts loading the words. When GCD sees a thread start blocking (say by going into the kernel reading a largish file), it realizes that it can start another thread running to keep those CPUs busy. So Thread B goes to get a RandomName. Thread A isn't done loading the words, so words is still nil. Therefore Thread B starts reading the words file. It blocks, and goes to sleep, and Thread C starts up. Eventually all of the reads complete, and they all start processing this 235,886-line file. That's a crazy amount of work.

It's pretty easy to fix. You can slap an @synchronized around it. Or use NSLock, pthread_mutex, etc. I didn't like those options because you do pay a locking price on each access. Granted, it's a toy app purely for demonstration purposes, but I still think about that stuff. You can also put stuff like that into +initialize (with the proper class check), knowing the limited circumstances +initialize would get called. That didn't excite me either. It was nice having RandomName being entirely self-contained and not dependent on some other entity initializing the set of words.
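For comparison, here is roughly what the mutex flavor of the fix might look like. This is a sketch of the road not taken, not code from the sample app: LockedShortWords is a hypothetical helper that pulls the word loading out of RandomName, and it assumes ARC.

```objc
#import <Foundation/Foundation.h>
#import <pthread.h>

// Hypothetical helper: same loading code as RandomName, guarded by a mutex.
static NSArray *LockedShortWords (void) {
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static NSArray *words;

    // Every caller takes the lock, even long after words has been loaded.
    pthread_mutex_lock (&lock);

    if (!words) {
        NSString *allTheWords =
            [NSString stringWithContentsOfFile: @"/usr/share/dict/words"
                      encoding: NSUTF8StringEncoding
                      error: nil];

        NSPredicate *shortWords = [NSPredicate predicateWithFormat:@"length < 8"];
        words = [[allTheWords componentsSeparatedByString:@"\n"]
                    filteredArrayUsingPredicate: shortWords];
    }

    NSArray *result = words;
    pthread_mutex_unlock (&lock);

    return result;
}
```

It's correct, but the lock and unlock calls sit on the path of every single access, which is the price mentioned above.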

Taking a step back and evaluating the problem: words needs to be loaded and initialized exactly once, and then used forever more. What's an existing library call that lets you do something exactly once? dispatch_once. We threw a dispatch_once at it, and now performance was decent again:

    static NSString *RandomName (int wordCount) {
        static NSArray *words;

        static dispatch_once_t onceToken;

        dispatch_once (&onceToken, ^{
            NSString *allTheWords =
                [NSString stringWithContentsOfFile: @"/usr/share/dict/words"
                          encoding: NSUTF8StringEncoding
                          error: nil];

            NSPredicate *shortWords = [NSPredicate predicateWithFormat:@"length < 8"];
            words = [[allTheWords componentsSeparatedByString:@"\n"]
                        filteredArrayUsingPredicate: shortWords];
        });

        NSMutableArray *nameParts = [NSMutableArray array];
        ...

We didn't even have to modify the code in the block. Performance was back to reasonable levels, and we could get back to demonstrating our concept.

So what's the point of all of this? Mainly that GCD is not just for running things concurrently - it's a small pile of useful concurrency tools. dispatch_once is one of those tools, and has applicability outside of making shared class instances. It's very low overhead, with dispatch_once_t being four or eight bytes, and not requiring a heavyweight lock every time it's run.
