Inside the Bracket, part 2 – data invocation

Mark Dalrymple

Jun 5, 2013

Want to learn more about what’s really happening inside those square brackets? Read the entire Inside the Bracket series.

Last time we took a look at the motivations behind Objective-C message sending. It’s a layer of indirection that lets one chunk of code treat other, perhaps unrelated, chunks of code in a uniform manner. “I have a pile of views here, I shall draw them all with -drawRect:, and I don’t care if the views are Buttons, Sliders, or World Maps.” There’s a loop that hits a collection and sends the same message to a bunch of objects:

for (NSView *view in visibleviews) {
    [view drawRect: view.bounds];
}

Objective-C performs this magic by having a collection associated with each object. This collection has a mapping of names (like drawRect:) to chunks of code. The mapping of names to code happens at runtime, rather than compile or link time. This deferral comes at a slight performance penalty, but gives us access to interesting under-the-hood data at runtime, and also gives us a lot of flexibility.

What is this extra data? What is this flexibility? Glad you asked.

The Moving Pieces

Each object that’s floating around in memory is described by a class. In Objective-C, a class is an object too – it can receive messages. The class holds all the metadata that describes the instances of that class – things like the set of methods it implements, what protocols it adopts, any @properties, its instance variables, and so on. The class also has a reference to the one-and-only superclass of the class. Say UIButton inherits from UIView. This means that UIButton’s superclass is UIView.

An individual object is a dynamically allocated chunk of memory that contains the instance variables of the object, such as the view’s bounds or the background fill color for a layer. The object also has a pointer, called isa, at a predictable location (the first 4 or 8 bytes of the object). This points to the class. “isa” comes from “is-a”. This object in memory is-a button because its isa pointer points to the UIButton class. It’s now very easy for you, when given any random Objective-C object, to find out where its class lives. Just look at the first pointer.

Here’s how all the chunks of data relate:

[Figure: Object class]

What’s going on in this diagram? A button’s isa pointer points to the UIButton class. The class has (amongst other stuff) a reference to its map of methods. It has a reference to its superclass (UIView) as well. UIButton implements some button-specific methods like drawRect and style, as well as the mysterious “blah” method. You can see that UIView implements some generic housekeeping methods, like the frame handling, background color storage, and it has its own version of blah.

So, that method map. It’s a dictionary of names and methods.

The name is called the selector. What’s a selector then? It’s the key into that dictionary. A selector is actually a char *, so you can print one out in gdb or lldb. This is an implementation detail, but makes for a nice debugging feature. The current selector is passed in a hidden argument called _cmd, so you can “print _cmd” inside of your debugger. If you’re going to and from string representations of selectors for actual work (say to store some selector names in a Cocoa collection), use the NSStringFromSelector and NSSelectorFromString functions.

The selector is one side of the method map. The method implementation is on the other side. It’s the address of a function to call. Its type is IMP, a short (and shouty) name that stands for “implementation”. An IMP is a function pointer which points to a function that takes an id and a selector:

typedef id (*IMP)(id, SEL, ...);

Look familiar? That’s the same as the arguments to objc_msgSend. It’s also the same two hidden parameters to methods: self and _cmd.
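To make the moving pieces concrete, here's a toy model of them in plain C (which is also valid Objective-C). It is only a sketch: the real runtime's data structures are different and private, and every Toy-prefixed name below is made up for illustration. A selector is modeled as a C string, the method map as an array of name/function pairs, and lookup walks from the object's class up the superclass chain until it finds a match.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: these are NOT the real runtime's structures. */
typedef const char *ToySEL;                 /* selector: just a name */
typedef struct ToyObject ToyObject;
typedef const char *(*ToyIMP)(ToyObject *self, ToySEL _cmd);

typedef struct ToyMethod {
    ToySEL name;                            /* key: the selector */
    ToyIMP imp;                             /* value: the code to run */
} ToyMethod;

typedef struct ToyClass {
    const char *name;
    const struct ToyClass *superclass;      /* NULL for a root class */
    const ToyMethod *methods;               /* the method map */
    size_t methodCount;
} ToyClass;

struct ToyObject {
    const ToyClass *isa;                    /* first field, like isa */
};

/* Each implementation reports which one ran. */
static const char *viewFrame(ToyObject *self, ToySEL _cmd) { (void)self; (void)_cmd; return "UIView frame"; }
static const char *viewBlah(ToyObject *self, ToySEL _cmd) { (void)self; (void)_cmd; return "UIView blah"; }
static const char *buttonDrawRect(ToyObject *self, ToySEL _cmd) { (void)self; (void)_cmd; return "UIButton drawRect:"; }
static const char *buttonBlah(ToyObject *self, ToySEL _cmd) { (void)self; (void)_cmd; return "UIButton blah"; }

static const ToyMethod viewMethods[] = { { "frame", viewFrame }, { "blah", viewBlah } };
static const ToyMethod buttonMethods[] = { { "drawRect:", buttonDrawRect }, { "blah", buttonBlah } };

const ToyClass ToyUIView = { "UIView", NULL, viewMethods, 2 };
const ToyClass ToyUIButton = { "UIButton", &ToyUIView, buttonMethods, 2 };

/* Find the implementation for a selector, trying the class first and
   then each superclass in turn. Returns NULL if nobody implements it. */
ToyIMP toyLookup(const ToyClass *cls, ToySEL sel) {
    for (; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < cls->methodCount; i++) {
            if (strcmp(cls->methods[i].name, sel) == 0) {
                return cls->methods[i].imp;
            }
        }
    }
    return NULL;
}

/* A stand-in for a message send: look up, then jump through the pointer,
   passing self and _cmd as the two hidden arguments. */
const char *toySend(ToyObject *self, ToySEL sel) {
    ToyIMP imp = toyLookup(self->isa, sel);
    return imp ? imp(self, sel) : "does not respond";
}
```

With an object declared as ToyObject button = { &ToyUIButton }, sending "blah" finds the button's own version, while sending "frame" falls through to the one on the toy UIView. The linear strcmp search is the simplest thing that shows the idea, not a claim about how the real lookup is done.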

You can ask an object for that function pointer that’s behind a particular message. It’s a function pointer, so you can jump through it like any other function pointer. Here’s some code that takes a string, gets the function pointer behind -uppercaseString, and then jumps through it directly. (Some code at this gist)

NSString *string = @"Bork";
IMP uppercase = [string methodForSelector: @selector(uppercaseString)];
NSString *upcase = uppercase (string, @selector(flonknozzle));
NSLog (@"%@ -> %@", string, upcase);

And a sample run:

Bork -> BORK

Notice that I passed a nonsense selector as the second argument (_cmd). That shows that uppercase is just a function pointer and not actually vectoring through objc_msgSend.

Ordinarily you won’t be grabbing the IMP and jumping through it because it defeats the whole idea of polymorphism – you’re short-circuiting the method lookup process. But if you know that all the objects in a collection are the same class, you can get the IMP once and jump directly to the method implementation. Be sure you’ve profiled your app before doing this kind of micro-optimization. It could become the source of bugs if your supposedly homogeneous collection starts having different kinds of objects in it. “Why is UIButton's -drawRect: suddenly trying to draw sliders?”
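Here's a small C sketch of that hazard. It is a toy model, not the Objective-C runtime: Widget, WidgetClass, and both drawAll functions are hypothetical names invented for this example. One loop looks up the draw function for every object; the other looks it up once from the first object and reuses it, which is only correct when every object shares a class.

```c
#include <stddef.h>

/* Toy stand-ins for objects and classes; not the real runtime. */
typedef struct Widget Widget;
typedef const char *(*DrawIMP)(const Widget *self);

typedef struct WidgetClass {
    const char *name;
    DrawIMP draw;                       /* the one "method" in this model */
} WidgetClass;

struct Widget {
    const WidgetClass *isa;
};

static const char *drawButton(const Widget *self) { (void)self; return "button"; }
static const char *drawSlider(const Widget *self) { (void)self; return "slider"; }

const WidgetClass ButtonClass = { "Button", drawButton };
const WidgetClass SliderClass = { "Slider", drawSlider };

/* Dynamic dispatch: ask each object's class for its implementation. */
void drawAllDynamic(const Widget *widgets, size_t count, const char **drawnAs) {
    for (size_t i = 0; i < count; i++) {
        drawnAs[i] = widgets[i].isa->draw(&widgets[i]);
    }
}

/* Cached dispatch: look up once, reuse for everything. Faster, but it
   silently calls the wrong code if the collection is mixed. */
void drawAllCached(const Widget *widgets, size_t count, const char **drawnAs) {
    if (count == 0) return;
    DrawIMP cached = widgets[0].isa->draw;
    for (size_t i = 0; i < count; i++) {
        drawnAs[i] = cached(&widgets[i]);
    }
}
```

Feed both functions an array holding a button followed by a slider: the dynamic version records "button" then "slider", while the cached version records "button" twice. Nothing crashes in this toy, which is exactly why this kind of bug can be hard to spot.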

Signature Move

A “signature” is the term for the types that a method or function takes as parameters and what its return value is. The signature of NSData’s dataWithContentsOfFile: is “Takes a string (path) and returns an object (an NSData of the contents of the file at the path)”.

Method signatures are part of a class’s metadata. You can ask an object for the signature of a method using methodSignatureForSelector:, or ask a class about its instances’ methods with instanceMethodSignatureForSelector:. The signature is encapsulated in an NSMethodSignature object that you can then poke around. Here’s an NSString method with some parameters:

- (NSRange) rangeOfCharacterFromSet: (NSCharacterSet *) aSet
                            options: (NSStringCompareOptions) mask
                              range: (NSRange) searchRange;

This method takes an object, a bit mask, and an NSRange struct. It returns an NSRange. You get the signature by asking the class:

NSMethodSignature *signature =
    [NSString instanceMethodSignatureForSelector: @selector(rangeOfCharacterFromSet:options:range:)];

Now poke around the signature:

// numberOfArguments is an NSUInteger, so cast and print as unsigned long.
NSLog (@"%lu arguments", (unsigned long)[signature numberOfArguments]);
for (NSUInteger i = 0; i < [signature numberOfArguments]; i++) {
    NSLog (@"%lu -> %s", (unsigned long)i, [signature getArgumentTypeAtIndex: i]);
}
NSLog (@"returning %s", [signature methodReturnType]);

Running this yields this extremely illuminating output:

5 arguments
0 -> @
1 -> :
2 -> @
3 -> Q
4 -> {_NSRange=QQ}
returning {_NSRange=QQ}

Ummm… Yeah. Moving on then!

Type Encodings

These character strings are “type encodings”. They’re character sequences that describe individual types. You can ask the compiler for a type’s encoding string by using the @encode directive: (this stuff is at this gist)

NSLog (@"int:        %s", @encode(int));
NSLog (@"CGRect:     %s", @encode(CGRect));
NSLog (@"NSString *: %s", @encode(NSString *));

This prints out:

int:        i
CGRect:     {CGRect={CGPoint=dd}{CGSize=dd}}
NSString *: @

Lower-case i for an int. CGRect is a {struct} containing two structs, each of which has two doubles. @ is an object pointer. You can see the list of encodings, or you can also ask Uncle Google for “objective-C runtime programming guide type encodings” for when Apple breaks this documentation link. The particular characters and what they correspond to are an implementation detail, so don’t go hardcoding “{_NSRange=QQ}”. You can use @encode to get the proper encoding string in a robust manner.

Here, again, is the signature for rangeOfCharacterFromSet…, annotated:

5 arguments
0 -> @                       object pointer
1 -> :                       selector
2 -> @                       object pointer
3 -> Q                       unsigned long long
4 -> {_NSRange=QQ}           NSRange struct, with two unsigned long longs
returning {_NSRange=QQ}      NSRange struct, with two unsigned long longs

And now things should make more sense. The first two arguments are, you guessed it, self and _cmd. Then follow the three arguments to the method – an object pointer (to an NSCharacterSet), a big int used for a bit mask, and an NSRange. It returns an NSRange.

There’s one caveat: the type encodings don’t handle variable argument lists. You can’t tell with NSMethodSignature if something is a varargs method or not.

Armed with this, you can figure out at run time what the calling convention is for an arbitrary method so long as you know its selector. Sure, that’s interesting trivia, but can it be useful information?

Equivoinvocations

If you know the signature for a method, and are savvy with the platform ABI, you can package up message-sends into an object. In essence, you’re creating the potential for a message send. Then, in the future, you can take this package and cause it to actually send a message. You’re freeze-drying a method invocation for later thawing. Apple gives us a class to do this, hiding the grody details: NSInvocation.

Be warned, NSInvocation is kind of a pain to deal with. And its performance is terrible. Mike Ash has a test program (that you can run for yourself) which times various common operations. Here’s a subset, with the time for each operation in nanoseconds.

IMP-cached message send     0.7
C++ virtual method call     1.1
Objective-C message send    4.9
NSInvocation message send  77.3

The timings of the first three are unsurprising – an IMP-cached message send is just a function pointer call, so it’s very fast. A C++ virtual method call is a pointer + offset (find the vtable) followed by a pointer + offset (find the function pointer in the vtable) and then a function pointer call. It’s a little more work so takes a bit more time. An Objective-C message send does a fair amount of work as you’ve seen already.

Invoking an already-existing NSInvocation object takes about 15 times longer than sending the message directly. Still, 77 nanoseconds is actually not a long time, so don’t avoid invocations if they can lead to elegant designs.

Making an NSInvocation is a multi-step process. Before showing the invocation, here’s some code that uses rangeOfCharacterFromSet:… that we’ll invocationize.

//  (character indexes             11111111112222
//   in the string)      012345678901234567890123
NSString *baseString = @"Why hello there, Hoover.";
//  (randomRange)             |-------------|
NSRange randomRange = (NSRange){5, 15};
NSCharacterSet *set = [NSCharacterSet whitespaceCharacterSet];
NSStringCompareOptions options = NSBackwardsSearch;
NSRange lastSpace =
    [baseString rangeOfCharacterFromSet: set
                options: options
                range: randomRange];

This is saying “given this string, look in the range of [5,20) for the first whitespace character, but start searching from the end”. In other words, what is the last whitespace character in that range? This call returns {16, 1}, which is the space right before Hoover.

First you need a method signature, otherwise how do you know what arguments to send to the IMP that backs rangeOfCharacterFromSet:…?

NSMethodSignature *rangeSignature =
    [NSString instanceMethodSignatureForSelector:
                  @selector(rangeOfCharacterFromSet:options:range:)];

Then make an invocation:

NSInvocation *spaceFinder =
    [NSInvocation invocationWithMethodSignature: rangeSignature];

Next, tell the invocation to retain any object arguments. Before ARC you would retain any objects that you put into invocations (and remembered to release them when done). We can’t do that with ARC. So tell the invocation to retain its arguments so they don’t disappear:

[spaceFinder retainArguments];

Set the target (self) and selector (_cmd):

[spaceFinder setTarget: baseString];
[spaceFinder setSelector: @selector(rangeOfCharacterFromSet:options:range:)];

Then set the three arguments for the method. Start with argument index 2.

[spaceFinder setArgument: &set  atIndex: 2];  // target=0, selector=1
[spaceFinder setArgument: &options  atIndex: 3];
[spaceFinder setArgument: &randomRange  atIndex: 4];

And you’re done! Invoke it to cause the message send to happen.

[spaceFinder invoke];

And then print the return value.

Uh… Where did the return value go? It gets stuffed into the invocation:

NSRange anotherLastSpace;
[spaceFinder getReturnValue: &anotherLastSpace];

This also returns {16, 1}, so life is good. You can re-use the invocation and point it at different strings by using setTarget:

//                                   1111111111
//                         01234567890123456789
NSString *secondString = @"I seem to be a verb!";
//                              |-------------|
[spaceFinder setTarget: secondString];
[spaceFinder invoke];
NSRange yetAnotherLastSpace;
[spaceFinder getReturnValue: &yetAnotherLastSpace];

This returns a range of {14, 1}, which is the space right before “verb”.

You can reuse an invocation any number of times.

What’s neat about NSInvocation is it takes message-sends, which are fundamentally verb-like in nature, and converts them into objects, which are fundamentally noun-like in nature. You can put these invocations into collections, where they sit, lurking, until called into action.

NSUndoManager is fundamentally a couple of NSArrays filled with NSInvocations. You can use invocations to make C callback handling easier. Rather than writing a thunk function that casts a context pointer to an object and then calls a hard-coded method, you could instead use an NSInvocation as the context pointer, and have a single generic callback. They’re also used under some circumstances when messages are sent to objects, as you’ll see in the next installment. You can also use an invocation as an operation by putting an NSInvocationOperation onto an NSOperationQueue.
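If the freeze-drying idea feels abstract, here's a bare-bones C sketch of it. This is not how NSInvocation is implemented; ToyInvocation, lastSpaceInRange, and the other names are invented for illustration, and the toy only handles one fixed signature where the real class handles any. It stores a function pointer, a target, and the arguments in a struct, runs them later, and parks the return value in the struct, much like getReturnValue:. It also shows the generic-callback trick, with the invocation riding along as the context pointer.

```c
#include <stddef.h>

/* The "method" being frozen: index of the last space in
   target[start, start + length), or -1 if there isn't one.
   The caller must make sure the range lies inside the string. */
typedef long (*ToySearchIMP)(const char *target, size_t start, size_t length);

long lastSpaceInRange(const char *target, size_t start, size_t length) {
    for (size_t i = start + length; i > start; i--) {
        if (target[i - 1] == ' ') {
            return (long)(i - 1);
        }
    }
    return -1;
}

/* A message send turned into a noun: everything needed to make the
   call later, plus a slot for the result. */
typedef struct ToyInvocation {
    ToySearchIMP imp;
    const char *target;
    size_t start;
    size_t length;
    long returnValue;
} ToyInvocation;

/* Thaw it: make the call and stash the result in the invocation. */
void toyInvoke(ToyInvocation *invocation) {
    invocation->returnValue = invocation->imp(invocation->target,
                                              invocation->start,
                                              invocation->length);
}

/* One generic C callback: the context pointer *is* the invocation,
   so there's no per-callback thunk to write. */
void toyGenericCallback(void *context) {
    toyInvoke((ToyInvocation *)context);
}
```

Fill in a ToyInvocation with lastSpaceInRange, the Hoover string, and the range {5, 15}, invoke it, and returnValue holds 16. Swap the target for the verb string, invoke again, and it holds 14, the same answers the real invocation gives above.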

One more observation. Notice that the setArgument:atIndex: calls take addresses of stuff, like the address of the search options mask, or the address of the range to limit the character search in. There are no sizeofs anywhere to let NSInvocation know how many bytes to grab from memory. There’s no need to – all that information is in the NSMethodSignature!
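To see why the encoding string is enough, here's a rough C sketch that works out a byte count from one. It is a simplification under loud assumptions: it understands only the handful of codes that appear in this post (i, Q, d, @, :, and structs), it ignores alignment padding, and toySizeOfEncoding is a made-up name. Foundation ships a real function for this job, NSGetSizeAndAlignment, which is what you'd reach for in actual code.

```c
#include <stddef.h>

/* Size in bytes of the type whose encoding starts at *cursor.
   Advances *cursor past what it read. Returns 0 for codes this
   sketch doesn't know. Assumes no padding between struct fields. */
static size_t sizeOfNext(const char **cursor) {
    char code = **cursor;
    if (code == '\0') return 0;
    (*cursor)++;
    switch (code) {
    case 'i': return sizeof(int);
    case 'Q': return sizeof(unsigned long long);
    case 'd': return sizeof(double);
    case '@': return sizeof(void *);        /* object pointer */
    case ':': return sizeof(void *);        /* selector */
    case '{': {
        /* Skip the struct's name, which runs up to the '='. */
        while (**cursor != '\0' && **cursor != '=' && **cursor != '}') {
            (*cursor)++;
        }
        if (**cursor == '=') (*cursor)++;
        /* Add up the fields until the closing brace. */
        size_t total = 0;
        while (**cursor != '\0' && **cursor != '}') {
            size_t fieldSize = sizeOfNext(cursor);
            if (fieldSize == 0) return 0;   /* unknown field code */
            total += fieldSize;
        }
        if (**cursor == '}') (*cursor)++;
        return total;
    }
    default:
        return 0;
    }
}

/* Convenience wrapper: size of the single type described by encoding. */
size_t toySizeOfEncoding(const char *encoding) {
    const char *cursor = encoding;
    return sizeOfNext(&cursor);
}
```

Hand it "{_NSRange=QQ}" and you get the size of two unsigned long longs; hand it the CGRect encoding from earlier and you get the size of four doubles. That's the number of bytes to copy from the address you passed in, with no sizeof from the caller.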

Data Today, Methods Tomorrow

Here ends the tour of some of the bits of information you can get at run-time given Objective-C’s rich metadata. Next time, a tour of some of the methods you can use to put this metadata to good use.

Mark Dalrymple

Author Big Nerd Ranch

MarkD is a long-time Unix and Mac developer, having worked at AOL, Google, and several start-ups over the years.  He’s the author of Advanced Mac OS X Programming: The Big Nerd Ranch Guide, over 100 blog posts for Big Nerd Ranch, and an occasional speaker at conferences. Believing in the power of community, he’s a co-founder of CocoaHeads, an international Mac and iPhone meetup, and runs the Pittsburgh PA chapter. In his spare time, he plays orchestral and swing band music.
