Checking Assumptions: -measuring isEqual[ToString]: performance

Mark Dalrymple

5 min read

Mar 28, 2012

iOS Mac

Checking Assumptions: -measuring isEqual[ToString]: performance

The last post about isEqual: vs isEqualToString: included some timings I made to test the performance of those two calls, along with compare:. That posting mentioned going down a rabbit hole, verifying commonly held beliefs of about isEqualToString:. The other rabbit hole I went down related to the performance tuning. A couple of commenters on the post asked some good questions relating to the timings, especially about literal strings.

(Edit: Added tests for strings of unequal length, thanks to a suggestion by Steve Weller, a.k.a. Bagelturf)

The First Test

All of the tests I run here have this pattern, with the interesting part in the middle:

#define LOOPAGE 10000000
NSString *thing1 = @"hi";
NSString *thing2 = @"hello";
time = BNRTimeBlock (^{
    for (int i = 0; i < LOOPAGE; i++) {
        <b>[thing1 isEqual: thing2];</b>
    }
});
printf ("equal: time: %fn", time);

Along with a loop for isEqualToString: and later with compare:. (Here’s the complete program)

My first test was just to compare two strings. Because I was lazy, I just copy and pasted two literal strings:

NSString *thing1 = @"hello";
NSString *thing2 = @"hello";

The comparison times were very close:

Same small literal
Time for         isEqual: 0.091905
Time for isEqualToString: 0.105689

So I tried longer, equal, literal strings:

NSString *thing1 = @"helloo...repeated 2000 times...o";
NSString *thing2 = @"helloo...repeated 2000 times...o";

And the times weren’t much different, both in the same ballpark:

Same large literal
Time for         isEqual: 0.128642
Time for isEqualToString: 0.153970

A bit slower. That makes sense because the strings were longer. It seems like it’s too short a time – string comparisons are usually an O(n) operation (runtime is proportional to the length of the string), so those numbers are actually kind of suspicious. But maybe I was unaware of a different string equality algorithm that’s not O(n).

Don’t Stop

I could have stopped here with my conclusions “justified”. Luckily I didn’t. I thought “I better throw in compare:”, and got these times:

Same large literal
Time for         isEqual: 0.128642
Time for isEqualToString: 0.153970
Time for         compare: 2.902745

Whoa! compare: is really awful! How could that happen!?

Then it dawned on me that Magical Stuff was happening with the literal strings. This makes sense when you think about it – literal strings are known by the compiler and linker. The tools can (and do) coalesce identical strings to point to the same chunks of memory or to the same objects. Therefore, the equal literal strings were pointing to the same physical object, and therefore a pointer comparison could be used to see if they were equal. This is not a fair test of the two equality methods.

Sure enough, printing out the addresses for the two strings shows that the pointers are the same:

thing1: 0x100416170 vs thing2: 0x100416170:

compare: doesn’t seem to have that optimization, so it is spinning over the entire string.â€¨
So, that set of timings were completely worthless for the purposes of determining the real performance of the two calls. Ugh.

Time to come up with some different scenarios. What about different literal strings? Different literal strings will live at different addresses, and you can’t reliably assume that two strings at different addresses are different and so you would need to check every character until you find a difference. But could there be some optimizations under the hood? Never underestimate the cleverness (or perverseness) of toolkit implementers.

I changed the last character of one of the small literal strings, and compared them again:

Late-difference, small literal
Time for         isEqual: 0.764268
Time for isEqualToString: 0.710544
Time for         compare: 0.824537

Definitely slower. (this was the case I used in the previous article, BTW). Changing the last character of one of the large literal strings also results in a correspondingly large run time:

Late-difference, big literal
Time for         isEqual: 2.677573
Time for isEqualToString: 2.714567
Time for         compare: 2.829126

My conclusion now is that the calls are visiting every character, even for literal strings. One more test to verify this – having two large literal strings that differ in the first character should have a short run time because it should find a difference between the strings very early on. That idea seems to hold due to short runtimes:

Early-difference, big literal
Time for         isEqual: 0.703624
Time for isEqualToString: 0.683351
Time for         compare: 0.807125

What about the Mutables?

A valid question to ask now, is “are there any difference between literal and mutable strings?” Could there be some secret information in the literals that won’t apply when using real-world strings?

I ran all the tests, using the same strings, but made them mutable by running them through this function before timing them:

NSMutableString *MakeMutable (NSString *string) {
    NSMutableString *mutableString = [NSMutableString string];
    [mutableString appendString: string];
    return mutableString;
} // makeMutable

Everything came out as expected: The three timings were very close together. Identical strings did not show the pointer-optimization because the mutable strings would be at different addresses. You can see the results in the table at the end.

What did you Forget?

(edit: this section is new since the first posting)

A friend pointed out in the comments that I missed a test case – different size strings. I didn’t think about it because it would have an obvious optimization because I can’t think of a way for two strings of unequal length to actually be equal. I added two additional tests – take five characters out of the equal big literal/mutable strings, and got these timings:

Different Size Big Literal
thing1: 0x10db21230 vs thing2: 0x10db21190:
Time for         isEqual: 0.638952
Time for isEqualToString: 0.636888
Time for         compare: 2.862555
Different Size Big Mutable
thing1: 0x10dc15450 vs thing2: 0x10dc15490:
Time for         isEqual: 0.620498
Time for isEqualToString: 0.610661
Time for         compare: 2.813782

And sure enough, there’s a marked difference between the the equality calls and the compare call. compare: takes so much longer because it is iterating over all the characters.

So, is it perfect?

So, after doing all of these tests, how confident am I in my conclusion that there is no real performance difference between isEqual: and isEqualToString: (the same-literal optimization notwithstanding). I’m pretty confident, seeing similar runtimes for all three of the comparison functions. Long strings take longer, short strings don’t take as long, and early differences have short runtimes too. (There is one case that I didn’t cover – non-mutable yet non-literal strings. Could there be a difference?)

The Results

Here are some values. This table was made from a different run than the earlier numbers, so the values won’t be identical to what you saw earlier, but everything is close enough to draw the same conclusions.

isEqual:isEqualToString:compare:

Same Small Literal

0.103232
0.115902
0.879067

Same Big Literal

0.129785
0.167012
2.778323

Different Small Literal

0.721744
0.714164
0.830643

Different Big Literal

2.782132
2.659704
2.762068

Early Different Big Literal

0.668607
0.690045
0.789696

Different Size Big Literal

0.645190
0.708712
2.955381

Same Small Mutable

0.631484
0.666148
0.833394

Same Big Mutable

2.708722
2.711567
2.890808

Different Small Mutable

0.640090
0.675951
0.809333

Different Big Mutable

2.703726
2.694859
2.844665

Early Different Big Mutable

0.617404
0.639799
0.784704

Different Size Big Mutable

0.620498
0.602333
2.836567

Like performance tuning? Advanced Mac OS X Programming: The Big Nerd Ranch Guide devotes a chapter on performance tuning, tools, and Instruments.

Mark Dalrymple

Author Big Nerd Ranch

MarkD is a long-time Unix and Mac developer, having worked at AOL, Google, and several start-ups over the years. He’s the author of Advanced Mac OS X Programming: The Big Nerd Ranch Guide, over 100 blog posts for Big Nerd Ranch, and an occasional speaker at conferences. Believing in the power of community, he’s a co-founder of CocoaHeads, an international Mac and iPhone meetup, and runs the Pittsburgh PA chapter. In his spare time, he plays orchestral and swing band music.