I usually encounter two classes of bugs on a regular basis. The first is of the form “I think I know where this is” which won’t take long to find. The steps are pretty easy: Figure out how to reproduce it. Set a couple of breakpoints. Add some caveman debugging. Find the problem and fix it. These are my favorite kind of bugs because they’re over and done with quickly, I can get a quick hit of that “you done did good” glow from making a software system better, and then move on to some more interesting problem.
The other class of bug is of the form “whoa. Uh… I’m not sure how to even start approaching this problem.” These are the bugs where you know you’re in for a long haul. Maybe it’s one you know is going to be tough to track down because of threading. Maybe it’s an unfamiliar code base.
The fact is, you know it might be an hour or two. It might be a day or two. Or even longer. My first professional bug took a full month to track down. These are the kinds of bugs I start keeping a log of what I do. With the log you’re building a record, a stream of evidence about the bug. You might have jotted down something yesterday or the day before, which at the time didn’t mean much. But in light of fresh evidence, it becomes the key that unlocks the problem and points to a fix. It’s also really nice to have when you go talk to a colleague and they ask “ok, what have you tried so far?”
I’ve been keeping logs a long time. Back in the day, I used to use those hardbound notebooks you find in office supply stores:
This is from March of 1997, working on performance problems in AOLserver, a high throughput threaded webserver that powered most of AOL’s web content at the time. Here I’m tracking down some speed issues that happened on certain classes of HP machines. The most interesting things are the first question “Good Lord, where to start?”, and the colored annotations (hot pink for interesting data, orange for crude hyperlinks to other pages.) I had a really nice set of colored pens. I’m a total office supply nerd.
Since then I’ve also kept logs of day to day work life, which have come in handy during annual review time. But for debugging, I like having more than just a simple plain text log. I’ve been using Flying Meat’s VoodooPad Pro as my design and debugging notebook for years and years. It’s a wiki-in-a-document. I usually keep one document for each client, thereby isolating any sensitive information, and if the terms of the contract involve handing over engineering notebooks, I can give them the file, and delete it from my machine. Design notes, debugging notes, and technology research get dumped into VoodooPad for later reference if needed.
So what does this have to do with debugging? I’ve just finished fixing a couple of “Where do I even start?” bugs, and figured I’d pass on what I do when tracking down issues.
My logs take the form of a conversation. One voice in my head asks a question, and the other voice in my head answers it. Almost like rubber ducking with yourself (honestly, that’s not a dirty phrase) while keeping a record of it. By going back and forth between interesting quesions and their answers, I can explore the problem space. By using VoodooPad’s wiki nature, I can fork the discussion into multiple pages if there are a couple of different debugging paths I can go down.
What does one of these debugging discussions actually look like? Here’s a screen shot from my VoodooPad of a recent bug I tracked down in the Big Nerd Ranch Camp Where product.
The morbidly curious can view the complete unexpurgated log file for this bug. The questions are larger so it’s easier to skim down to get the general gist of the plan of attack, and then read the normal-sized text for the details.
I was tracking down an problem that only manifested itself on iOS6: the iOS6 simulator is on the left, and the iOS5 simulator is on the right. Notice that the iOS6 version doesn’t have the cool mountain-style background image behind the right-hand-side tableview.
That’s the bug. I kind of chewed on it when I had a couple of spare minutes here and there, but didn’t get any traction. I moved it to the long-haul bucket because I’m not totally familiar with the code. This is one of the older code bases floating around the Ranch. I carved out a big block of time and dove in.
For me, bug fixes tend to follow a pattern. First is just some flailing around trying to get a handle on the problem. Can I reproduce it easily? If something isn’t drawing right, is it actually there? (debuggers or caveman debugging is handy for that).
If there’s something promising, get more detail. Some view is there, should it be visible? (Add code to set its background color to something obnoxious, like [
UIColor purpleColor]). If my image is still not appearing, is there something weird with the whole
UIView containment hierarchy? (Stick a BlahView at the top of the view stack and see if it appears.)
And then by chewing away at the code, eliminating code paths or areas where the bug might be hiding, evidence eventually starts building up: “Hey, look here!”
The bug turned out to be
UITableView grew a new view living in the default background view in iOS6. With our nib files, this new view is opaque (but not in standalone tests I made, so there may be something up with our Xcode 3-era nibs causing the problem. A bug for another day.)
The original code was using
insertSubview:atIndex: 0 to insert the background image view, along with another image view that shows what kind of camp has been selected after a search. Inserting at index zero adds the view to the bottom of the stack, which ended up being underneath the new opaque view. Changing the code to insert on the top of the stack made things look right again.
The places where the log came in very handy for this particular bug was that it kept me on track. I’d feel some other ideas nibbling on my brain, but I kept going on this promising path because it was still yielding data. I also jotted down important data (what’s the class of that new tabieview background view again?) It also gave me a bookmark of sorts. In real life I had to interrupt the diagnosis for other Real Work. With the log there, I could pretty much pick up where I left off.