File it Away

Mark Dalrymple

Sometimes you stumble across a file. It might be something random in your Documents folder. It might be something a parent or a client sent you. Unfortunately, you have no idea what it might be. Files don’t have to have extensions on the Mac, so there’s not much hint what “Flongnozzle-2012” might contain. But if you’re comfortable in the Terminal, you have some built-in tools to help you identify files.

File it Away

file is my go-to command for this kind of work. file examines the contents of the file and tries to figure out what it is:

% file launchHandler.m
launchHandler.m: ASCII C++ program text

Well, it’s Objective-C text, but file came pretty close, identifying it as a file of code. “But MarkD, can’t it just look at the file extension?” That’s certainly one of the tools file can use, but it’s not necessary:

% cp launchHandler.m splunge
% file splunge
splunge: ASCII C++ program text

No file extension, but we figured out what the file is. Point file at something that could contain executable code, and it will tell you the included architectures:

% file /bin/ls
/bin/ls: Mach-O 64-bit executable x86_64

You can tell if you have a fat binary (a.k.a. Universal App in its original sense):

% file /Applications/Reason/Reason.app/Contents/MacOS/Reason
Reason.app/Contents/MacOS/Reason: Mach-O universal binary with 2 architectures
Reason.app/Contents/MacOS/Reason (for architecture i386): Mach-O executable i386
Reason.app/Contents/MacOS/Reason (for architecture x86_64): Mach-O 64-bit executable x86_64

Point it at an image file to see the image dimensions:

% file Flongnozzle-2012
Flongnozzle-2012: PNG image data, 1932 x 904, 8-bit/color RGB, non-interlaced
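
How does file manage this without an extension? Much of it comes down to comparing a file's leading bytes against a catalog of known signatures ("magic numbers"). Here is a toy sketch of that idea as a shell function. The name sniff and the tiny signature table are assumptions picked for illustration; the real file command knows far more formats and runs much smarter tests.

```shell
# sniff FILE: guess a file's type from its first 8 bytes (toy example).
sniff() {
    # Hex of the first 8 bytes, with all whitespace stripped.
    sig=$(od -An -tx1 -N8 "$1" | tr -d '[:space:]')
    case "$sig" in
        89504e470d0a1a0a) echo "PNG image" ;;
        62706c6973743030) echo "binary property list" ;;   # "bplist00"
        464f524d*)        echo "IFF container" ;;           # "FORM"
        cffaedfe*)        echo "Mach-O 64-bit (little-endian)" ;;
        cafebabe*)        echo "Mach-O universal binary (or Java class file)" ;;
        *)                echo "unknown" ;;
    esac
}
```

Running sniff Flongnozzle-2012 should report a PNG image for the file above. The cafebabe entry shows why signatures alone can be ambiguous: fat binaries and Java class files start with the same four bytes.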

OBTW, here’s a handy Terminal tip: You can drag icons from the Finder into a Terminal window. This will paste in the full path to the file or folder you drag in.

Peering Inside

Sometimes file lets you down, or maybe you want more information about a file. You can always try Quick Look in the Finder. If that doesn’t work, you can use hexdump to show a file’s bytes. Pass the -C option to show an ASCII translation as well.

For example, back to that image file:

% hexdump -C Flongnozzle-2012 | head
00000000  89 50 4e 47 0d 0a 1a 0a  00 00 00 0d 49 48 44 52  |.PNG........IHDR|
00000010  00 00 07 8c 00 00 03 88  08 02 00 00 00 a2 e0 9b  |................|
00000020  61 00 00 0c 45 69 43 43  50 49 43 43 20 50 72 6f  |a...EiCCPICC Pro|
00000030  66 69 6c 65 00 00 48 0d  ad 57 77 54 53 c9 17 be  |file..H..WwTS...|
00000040  af 24 81 90 84 12 88 80  94 d0 9b 28 bd 4a ef 82  |.$.........(.J..|

Not much useful in the data area, but you can see PNG in all its glory.
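
The second row of that dump is also where the image dimensions live. In a PNG, the 8-byte signature is followed by the IHDR chunk, and the width and height sit at byte offsets 16 through 23 as big-endian 32-bit integers. A minimal sketch that decodes them with od and shell arithmetic (png_size is a made-up helper name, and it does not check that the file really is a PNG):

```shell
# png_size FILE: print "WIDTH x HEIGHT" decoded from the PNG IHDR chunk.
png_size() {
    # Read bytes 16-23 as eight unsigned decimal values into $1..$8.
    set -- $(od -An -tu1 -j16 -N8 "$1")
    # Combine each group of four bytes, most significant byte first.
    w=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    h=$(( ($5 << 24) | ($6 << 16) | ($7 << 8) | $8 ))
    echo "$w x $h"
}
```

For the dump above, the bytes 00 00 07 8c and 00 00 03 88 work out to 1932 and 904, the same dimensions file reported.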
Some files have more string content in them. Here’s a hexdump from a patch file from the Reason digital audio workstation:

% hexdump -C CV-Spy--md.cmb
00000000  46 4f 52 4d 00 00 03 d8  50 54 43 48 43 41 54 20  |FORM....PTCHCAT |
00000010  00 00 00 04 52 45 46 53  43 4f 49 4e 00 00 00 06  |....REFSCOIN....|
00000020  bc 01 00 00 00 01 43 41  54 20 00 00 00 fc 44 45  |......CAT ....DE|
00000030  56 4c 46 4f 52 4d 00 00  00 f0 44 45 56 49 44 45  |VLFORM....DEVIDE|
00000040  53 43 00 00 00 47 bc 02  01 00 00 07 00 00 00 10  |SC...G..........|
00000050  00 00 00 12 43 56 20 56  61 6c 75 65 73 20 28 30  |....CV Values (0|
00000060  2d 3e 32 35 36 29 00 00  00 00 00 00 00 00 00 00  |->256)..........|
00000070  00 00 16 44 44 4c 20 44  69 67 69 74 61 6c 20 44  |...DDL Digital D|
00000080  65 6c 61 79 20 4c 69 6e  65 00 00 00 04 00 50 41  |elay Line.....PA|
...

If you’ve used Reason, the terms “CV Values” and “DDL Digital Delay Line” will be familiar.

The strings command extracts string-like sequences of bytes from a file:

% strings CV-Spy--md.cmb
FORM
PTCHCAT
REFSCOIN
CAT
DEVLFORM
DEVIDESC
CV Values (0->256)
DDL Digital Delay Line
...
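
To get a feel for what "string-like" means, here is a rough approximation of strings: turn every non-printable byte into a newline, then keep only the runs that are at least four characters long (four is the default minimum for strings). The name rough_strings is hypothetical, and the real tool is more careful about what counts as a string.

```shell
# rough_strings FILE: print runs of 4 or more printable ASCII characters.
rough_strings() {
    # LC_ALL=C makes tr treat the input as raw bytes rather than UTF-8 text.
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" | awk 'length($0) >= 4'
}
```

If the output is too noisy, strings itself takes -n to raise the minimum length, as in strings -n 8 CV-Spy--md.cmb.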

Property Values

Property lists are a standard Mac and iOS file type, constructed of structured data of predictable types. Most property lists you’ll come across on the system are in a compact binary format that is fast to load. User preferences are stored as plists:

% pwd
/Users/markd/Library/Preferences
% file com.apple.iphonesimulator.plist
com.apple.iphonesimulator.plist: Apple binary property list

Unfortunately, the binary plist file is kind of hard to read:

% hexdump -C com.apple.iphonesimulator.plist
00000000  62 70 6c 69 73 74 30 30  dc 01 02 03 04 05 06 07  |bplist00........|
00000010  08 09 0a 0b 0c 0d 0e 0f  10 11 12 13 14 15 16 17  |................|
00000020  18 5e 53 69 6d 75 6c 61  74 65 44 65 76 69 63 65  |.^SimulateDevice|
00000030  5f 10 2f 4e 53 57 69 6e  64 6f 77 20 46 72 61 6d  |_./NSWindow Fram|
00000040  65 20 69 50 68 6f 6e 65  53 69 6d 75 6c 61 74 6f  |e iPhoneSimulato|
00000050  72 57 69 6e 64 6f 77 2e  32 2e 30 2e 37 35 30 30  |rWindow.2.0.7500|
00000060  30 30 5f 10 2f 4e 53 57  69 6e 64 6f 77 20 46 72  |00_./NSWindow Fr|
00000070  61 6d 65 20 69 50 68 6f  6e 65 53 69 6d 75 6c 61  |ame iPhoneSimula|
...

Luckily, there is the plutil utility that will convert between this binary format and something more human readable:

% plutil -convert xml1 com.apple.iphonesimulator.plist
% head !$
head com.apple.iphonesimulator.plist
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
    "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
        <key>CurrentDeviceUDID</key>
        <string>66E762FE-C171-481D-8DCF-BE908BB2945B</string>
        <key>LocationMode</key>
        <integer>3102</integer>
        <key>NSWindow Frame iPhoneSimulatorWindow.0.0.500000</key>
        <string>1264 312 368 716 0 0 1680 1028 </string>

(The !$ shortcut grabs the last argument from the previous command.)
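
Note that plutil -convert rewrites the file in place, so the preferences file is now XML on disk. If you only want to look, one option is to send the converted output to stdout with -o - and leave the original alone. A small sketch, assuming a wrapper named plist_peek (the name is made up; the plutil flags are the real ones):

```shell
# plist_peek FILE: show a property list as XML without modifying FILE.
plist_peek() {
    if ! command -v plutil >/dev/null 2>&1; then
        echo "plutil not found (it ships with macOS)" >&2
        return 1
    fi
    # "-o -" writes the converted plist to stdout instead of overwriting FILE.
    plutil -convert xml1 -o - "$1"
}
```

For example, plist_peek com.apple.iphonesimulator.plist | head. If a file has already been converted in place, plutil -convert binary1 should turn it back into the binary format.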

Spotlight

The OS may know more about a particular file than you might think. Spotlight’s job is to index files on your disk and make it easy to Find Stuff by querying metadata. There’s command-line access to this metadata via the mdls command, so you can ask Spotlight what goods it has about a file:

% mdls launchHandler.m
kMDItemContentCreationDate     = 2014-07-02 19:22:02 +0000
kMDItemContentModificationDate = 2014-07-02 19:23:58 +0000
kMDItemContentType             = "public.objective-c-source"
kMDItemContentTypeTree         = (
    "public.objective-c-source",
    "public.source-code",
    "public.plain-text",
    "public.text",
    "public.data",
    "public.item",
    "public.content"
)
...
kMDItemKind                    = "Objective-C Source"
kMDItemLastUsedDate            = 2014-07-02 19:32:46 +0000
kMDItemLogicalSize             = 1443
kMDItemPhysicalSize            = 4096
kMDItemUseCount                = 2
kMDItemUsedDates               = (
    "2014-07-02 10:00:00 +0000"

Here mdls tells you that this file is Objective-C source code, along with other UTIs (Uniform Type Identifiers) that describe the data. It is indeed source code, and plain text. There’s also some interesting data such as how much space on disk it actually consumes, vs how many bytes comprise the file.
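
When you only care about one attribute, mdls can be asked for it by name rather than dumping everything. A minimal sketch, assuming a helper called content_type (a hypothetical name) that prints just the UTI:

```shell
# content_type FILE: print Spotlight's content type (UTI) for FILE.
content_type() {
    if ! command -v mdls >/dev/null 2>&1; then
        echo "mdls not found (it ships with macOS)" >&2
        return 1
    fi
    # -name limits the output to one attribute; -raw prints only its value.
    # -raw omits the trailing newline, so add one on success.
    mdls -name kMDItemContentType -raw "$1" && echo
}
```

Running content_type launchHandler.m should print the kMDItemContentType value from the listing above. A file that Spotlight has not indexed may come back as (null).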

Launch Services

Another system database of information is maintained by Launch Services, which has the last word on what program will open which file. Double-click on a file to open it? The Finder asks Launch Services. Use open to open a file from the command-line? It too uses Launch Services to figure out who to actually launch.

lsappinfo is a utility that uses Launch Services (as well as Core Application Services) to give you information about currently running applications. This is tangential to figuring out what a file actually is, but you can learn some cool stuff with it. Try lsappinfo sharedmemory to get some shared memory information, or lsappinfo visibleProcessList for a list of visible applications in front-to-back window order.

The other Launch Services features are accessed either through the API, or through lsregister, a well-known but fundamentally undocumented utility. lsregister lives in the Support directory of the LaunchServices framework that lives inside of the CoreServices framework, most likely at this path on your machine:

/System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support/lsregister

lsregister is primarily used by the OS to register the files that’ll be handled by a particular application, but you can get a dump of its database by using

% lsregister -dump > services-db.txt

(To run this command, you’ll need to extend your PATH with that Support directory).
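
One way to do that for the current shell session, using the path shown above (LSREG_DIR is just a variable name picked for this sketch):

```shell
# Directory that holds lsregister, copied from the path given earlier.
LSREG_DIR=/System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/LaunchServices.framework/Versions/A/Support

# Append it to PATH so the shell can find lsregister by name.
PATH="$PATH:$LSREG_DIR"
export PATH
```

This only lasts until you close the Terminal window. Calling lsregister by its full path works too, if you would rather not touch PATH.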

This produced about 61,000 lines of output, so it’s a bit unwieldy to use on a day-to-day basis, but it can be fun to poke around.

But some real power comes from a single call: LSCopyApplicationURLsForURL. Give this call a URL to a file, and it will return the list of applications that can deal with it. There are different query modes, like “What are all of the applications that can open this file?” or “What are all of the applications that can edit this file?” Launch Services doesn’t actually introspect the files like file does. Instead it uses file extensions, creator codes and the like to match files to eligible applications.

Here’s a little utility that takes a filename on the command line, calls LSCopyApplicationURLsForURL and prints out an array of matching applications. You can find the code at this gist.

@import Foundation;
@import CoreServices;

// clang -g -fobjc-arc -fmodules launchHandler.m -o launchHandler

int main (int argc, const char *argv[]) {

    // Rudimentary argument checking.
    if (argc != 2) {
        printf ("usage: %s filename\n", argv[0]);
        return -1;
    }

    const char *filename = argv[1];

    // Get a string of the full path of the file, using realpath() as the workhorse
    char pathbuffer[MAXPATHLEN];
    char *fullpath = realpath (filename, pathbuffer);
    if (fullpath == NULL) {
        fprintf (stderr, "could not find %s\n", filename);
        return -1;
    }

    NSURL *url = [NSURL fileURLWithPath: @( fullpath )];

    // Ask launch services for the different apps that it thinks could edit this file.
    // This is usually a more useful list than what can view the file.
    LSRolesMask roles = kLSRolesEditor;
    CFArrayRef urls = LSCopyApplicationURLsForURL((__bridge CFURLRef)url, roles);
    NSArray *appUrls = CFBridgingRelease(urls);

    // Extract the app names and sort them for prettiness.
    NSMutableArray *appNames = [NSMutableArray arrayWithCapacity: appUrls.count];

    for (NSURL *url in appUrls) {
        [appNames addObject: url.lastPathComponent];
    }
    [appNames sortUsingSelector: @selector(compare:)];

    // Finally emit to the user.
    for (NSString *appName in appNames) {
        printf ("%s\n", appName.UTF8String);
    }

    return 0;

} // main

The main interesting parts are using the realpath() library call to turn the command-line argument into a full path (so you don’t have to worry if the user specified a relative, full, or ~-relative path), and then feeding that into LSCopyApplicationURLsForURL. The kLSRolesEditor is used because it returns the most reasonable list of applications. Sometimes the candidate applications can give you a clue as to what a file is.

% ./launchHandler launchHandler.m
TextEdit.app
Xcode-4.6.app
Xcode-5.0.2.app
Xcode.app
Xcode6-Beta2.app

% ./launchHandler someGraphic.png
Acorn.app
ColorSync Utility.app
Preview.app

% ./launchHandler ./Flongnozzle-2012
%

Unfortunately, it didn’t help out with the Flongnozzle case because there is no file extension or any other creator / file type information available.

Other Utilities

The set of available command-line tools is remarkably vast, so I have probably missed one or two or a dozen other tools that might help you identify random files. If you have a favorite trick, please leave a comment!