Browsing source code with Spotlight and QuickLook
Post date: Aug 15, 2010 9:25:14 PM
I enjoy reading the source code written by good programmers — seeing how they solved problems, composed libraries, and implemented algorithms. The Mac has convenient tools and software for doing this. BBEdit and TextMate both have directory-browsing UIs that let me click on source files and view syntax-highlighted code. Xcode projects are good for exploring Mac code.
But even easier is using the QuickLook feature of the Finder. I can navigate through the directory tree, press the space bar to read a file, and scroll through the code with my Magic Mouse. When I want to find some text in particular, I can enter a search term in the Finder window, have Spotlight bring up all the files with that word, and use QuickLook to pop open the files. It's a very agile process.
There are some glitches in this system, though. It works well for popular languages like Objective-C and Ruby, but not for obscure languages. Because of quirks in the filename-extension handling of Mac OS X, apps tend to process only files with "known" extensions. For example, when BBEdit doesn't recognize an extension, it won't open the file from the browser. And when the Mac OS doesn't know the extension (or if the file lacks an extension, like a Makefile), it won't display the file in QuickLook or index it in Spotlight.
Snow Leopard does not make it easy to fix the issues with file extensions. Unlike on Windows or even classic Mac OS, there's no global mapping of file extensions to their types. The metadata comes from the Info.plist files of apps and plugins installed on the computer, not from the files themselves. And if some app has bad information in its Info.plist, you need to edit that app's Info.plist to fix the problem. (For example, I have been working with Standard ML, whose source files use the suffix .sml. When I downloaded RealPlayer, all the .sml files suddenly became RealPlayer movie files. Even though I deleted RealPlayer before long, my Standard ML files were still listed as RealPlayer files, and QuickLook would refuse to display them.)
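Since the mappings live in each bundle's Info.plist, one way to track down who has claimed an extension is to scan those files. The following is a rough sketch, not a complete tool: the function names and the default search folders are my own choices for illustration, it only looks at top-level bundles in each folder, and it assumes extensions are declared either in UTI declarations or in the older CFBundleDocumentTypes entries.

```python
import plistlib
from pathlib import Path

# Info.plist keys under which a bundle can declare UTIs.
DECLARATION_KEYS = ("UTExportedTypeDeclarations", "UTImportedTypeDeclarations")


def claimed_extensions(info):
    """Yield (extension, label) pairs found in a parsed Info.plist dict.

    The label is the UTI identifier for UTI declarations, or the document
    type name for older CFBundleDocumentTypes entries.
    """
    for key in DECLARATION_KEYS:
        for decl in info.get(key, []):
            tags = decl.get("UTTypeTagSpecification", {})
            exts = tags.get("public.filename-extension", [])
            if isinstance(exts, str):  # a single extension may be a bare string
                exts = [exts]
            for ext in exts:
                yield ext.lower(), decl.get("UTTypeIdentifier", "?")
    for doc_type in info.get("CFBundleDocumentTypes", []):
        for ext in doc_type.get("CFBundleTypeExtensions", []):
            yield ext.lower(), doc_type.get("CFBundleTypeName", "?")


def find_claimants(extension, roots=("/Applications", "/Library/QuickLook")):
    """Return (plist path, label) pairs for bundles that mention the extension."""
    wanted = extension.lower().lstrip(".")
    hits = []
    for root in roots:
        if not Path(root).is_dir():
            continue
        # Only top-level bundles are checked; nested bundles are skipped.
        for plist_path in Path(root).glob("*/Contents/Info.plist"):
            try:
                with open(plist_path, "rb") as f:
                    info = plistlib.load(f)
            except Exception:
                continue  # unreadable or malformed plist: skip it
            for ext, label in claimed_extensions(info):
                if ext == wanted:
                    hits.append((str(plist_path), label))
    return hits


if __name__ == "__main__":
    for path, label in find_claimants("sml"):
        print(label, path)
```

Note that this only finds bundles that are still installed. It would not have explained my leftover RealPlayer association after the app was deleted.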
Here are my workarounds for better source code browsing:
1. Set the classic Mac type code of the files to 'TEXT'.
This works best for files without extensions (like Makefile) and files with unique extensions (like the .tig files I'm creating while studying Modern Compiler Implementation in ML). After setting the type code to 'TEXT', these files will get indexed by Spotlight and appear in QuickLook as plain text.
I wrote a simple Automator service that runs the following shell command (SetFile is installed with Apple's developer tools):
SetFile -t TEXT "$@"
You can download the Automator file from the Attachments on this page. Install the workflow in ~/Library/Services/. It's accessed in the Finder by right-clicking on one or more files.
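For a whole source tree, right-clicking file by file gets tedious. Here is a minimal sketch of a batch version that picks out files with no extension, or with extensions I list myself, and passes them to the same SetFile command. The function names are hypothetical, and the script assumes SetFile is on the PATH. It does not inspect file contents, so an extensionless binary would be marked as text too.

```python
import subprocess
import sys
from pathlib import Path


def needs_text_type(path, extra_extensions=frozenset()):
    """True for extensionless files and for files with a listed extension.

    Hidden files (names starting with a dot, like .DS_Store) are skipped.
    extra_extensions holds lowercase extensions without the leading dot.
    """
    path = Path(path)
    if path.name.startswith("."):
        return False
    suffix = path.suffix.lstrip(".").lower()
    return suffix == "" or suffix in extra_extensions


def collect_candidates(root, extra_extensions=frozenset()):
    """Walk a directory tree and return the files that qualify."""
    return [
        p
        for p in sorted(Path(root).rglob("*"))
        if p.is_file() and needs_text_type(p, extra_extensions)
    ]


def mark_as_text(paths):
    """Run the same command as the Automator service: SetFile -t TEXT files..."""
    if paths:
        subprocess.run(["SetFile", "-t", "TEXT", *map(str, paths)], check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python mark_text.py <directory> [extension ...]
    extras = frozenset(e.lower().lstrip(".") for e in sys.argv[2:])
    mark_as_text(collect_candidates(sys.argv[1], extras))
```

For example, running it with a project directory and the argument tig would cover both the Makefile and the .tig files mentioned above.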
2. Add a QuickLook plugin for displaying source code files, and modify it to handle any extra file extensions.
This helps solve cases like my .sml problem where the wrong program has "claimed" the extension to the OS. Even when I set the classic file type to 'TEXT', these files would not appear in QuickLook until I installed new software that re-claimed the extension. It also handles cases like .hs files that Spotlight doesn't know how to index.
I'm using the QLColorCode plugin. The underlying display logic handles many languages, but its Info.plist only registers a few of them.
When registering a new file extension like .sml, one does three things:
Declare the ID of the UTI and give it a description. For example, Apple defines a UTI with the ID "public.c-source" and the description "C source code".
Declare what the UTI conforms to. This is like "is-a" inheritance in object-oriented programming. For example, Apple declares that "public.c-source" conforms to "public.source-code".
Map a set of file extensions to that UTI. For example, Apple maps ".c" to "public.c-source".
This registration is done in the Info.plist of an application or QuickLook plugin.
The following is the entry I added for handling SML/NJ files. It declares two UTIs: "org.standardml.ml-source" and "org.smlnj.cm-file", says that they conform to "public.source-code", and maps the extensions .sml, .sig, and .cm. The XML goes inside the UTImportedTypeDeclarations array of the Info.plist in QLColorCode.qlgenerator.
<dict>
    <key>UTTypeConformsTo</key>
    <array>
        <string>public.source-code</string>
    </array>
    <key>UTTypeDescription</key>
    <string>Standard ML Source File</string>
    <key>UTTypeIdentifier</key>
    <string>org.standardml.ml-source</string>
    <key>UTTypeReferenceURL</key>
    <string>http://www.standardml.org/</string>
    <key>UTTypeTagSpecification</key>
    <dict>
        <key>public.filename-extension</key>
        <array>
            <string>sml</string>
            <string>sig</string>
        </array>
    </dict>
</dict>
<dict>
    <key>UTTypeConformsTo</key>
    <array>
        <string>public.source-code</string>
    </array>
    <key>UTTypeDescription</key>
    <string>Standard ML Compilation Manager File</string>
    <key>UTTypeIdentifier</key>
    <string>org.smlnj.cm-file</string>
    <key>UTTypeReferenceURL</key>
    <string>http://www.smlnj.org/</string>
    <key>UTTypeTagSpecification</key>
    <dict>
        <key>public.filename-extension</key>
        <array>
            <string>cm</string>
        </array>
    </dict>
</dict>
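Editing the plist by hand works, but the same entries can also be added with a script. The following is a sketch using Python's plistlib, under the assumption that the plugin's Info.plist is writable and is passed as the only command-line argument. The helper names make_declaration and add_imported_type are mine, not part of QLColorCode.

```python
import plistlib
import sys


def make_declaration(identifier, description, extensions, url=None,
                     conforms_to=("public.source-code",)):
    """Build one UTI declaration dict with the same keys as the XML above."""
    decl = {
        "UTTypeConformsTo": list(conforms_to),
        "UTTypeDescription": description,
        "UTTypeIdentifier": identifier,
        "UTTypeTagSpecification": {
            "public.filename-extension": list(extensions),
        },
    }
    if url:
        decl["UTTypeReferenceURL"] = url
    return decl


def add_imported_type(info, declaration):
    """Append the declaration to UTImportedTypeDeclarations.

    Returns False and leaves info unchanged if a declaration with the same
    identifier is already present, so the script can be run more than once.
    """
    decls = info.setdefault("UTImportedTypeDeclarations", [])
    wanted = declaration["UTTypeIdentifier"]
    if any(d.get("UTTypeIdentifier") == wanted for d in decls):
        return False
    decls.append(declaration)
    return True


def main(plist_path):
    with open(plist_path, "rb") as f:
        info = plistlib.load(f)
    new_types = [
        make_declaration("org.standardml.ml-source", "Standard ML Source File",
                         ["sml", "sig"], "http://www.standardml.org/"),
        make_declaration("org.smlnj.cm-file",
                         "Standard ML Compilation Manager File",
                         ["cm"], "http://www.smlnj.org/"),
    ]
    # Build a list first so every declaration is attempted, then test it.
    changed = any([add_imported_type(info, d) for d in new_types])
    if changed:
        with open(plist_path, "wb") as f:
            plistlib.dump(info, f)


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

It is worth keeping a backup copy of the original Info.plist before running anything like this. After the file changes, QuickLook may need to reload its generators (qlmanage -r) before the new types take effect.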
Addendum
I had this all working on my main Mac: browsing source with QuickLook and searching with Spotlight. Then I realized that Spotlight wasn't finding terms in source code stored on a file server. On the file server, a Haskell file like List.hs had the UTI "public.item"; copying the same file to my Mac immediately added UTIs for "public.source-code" and "org.haskell.haskell-source". So Spotlight on the file server was ignoring the source code files.
The solution was to repeat my steps on the file server: install QLColorCode.qlgenerator in /Library/QuickLook, then run "mdimport /". A few moments later, Spotlight refreshed my search results to include Haskell code.
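To check which UTIs a machine assigns to a file, the mdls command can print the file's kMDItemContentTypeTree attribute. The sketch below wraps that call. It assumes mdls prints each UTI as a double-quoted string, which is what I have seen but have not verified across OS versions. The function names are made up for illustration.

```python
import re
import subprocess


def parse_content_type_tree(mdls_output):
    """Pull the quoted UTI strings out of mdls output, in order."""
    return re.findall(r'"([^"]+)"', mdls_output)


def content_type_tree(path):
    """Ask Spotlight's metadata store for the UTIs of one file (Mac only)."""
    result = subprocess.run(
        ["mdls", "-name", "kMDItemContentTypeTree", path],
        capture_output=True, text=True, check=True,
    )
    return parse_content_type_tree(result.stdout)


def looks_like_source(utis):
    """True if the type tree includes the generic source-code UTI."""
    return "public.source-code" in utis
```

Running content_type_tree on the same Haskell file on both machines would show the difference described above: "public.source-code" appears on the Mac with the plugin installed, but not on the file server before the fix.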
References
QLColorCode: QuickLook extension to display source code with syntax highlighting
Articles on file metadata by Matt Neuburg, John Siracusa, Peter Hosey