Tom Insam

Shelf - Context for MacOS

I really miss Dashboard. It was an effort to display some context around whatever person you were interacting with at any given moment - look at an email from Paul, or open an IM chat with him and you'd see things that he'd blogged or uploaded to Flickr recently. Genius. From the screenshots, it looks practically magic, tying into incoming SMS messages, IM conversations, the RSS feed reader, etc.

Alas, I never had a fully working Dashboard setup locally, mostly because applications had to actively participate in the process - they sent things called 'cluepackets' to the dashboard application containing hints about the current context. Because of this design, every app involved needed its source code patched and recompiled. This was a complete pain. Obviously, had everything gone to plan, the patches would have been merged and everyone would have been happy. I presume that Dashboard failed because the bootstrapping process was so hard that no-one used it.

Anyway, inspired by both Dashboard and Aaron's obsession with the address book, I've had a stab at doing it again, but worse.


Shelf will look at the current foreground application, and try to figure out if what you're looking at corresponds to a person in your Address Book. Then it'll tell you things about them.

Update 2008/01/08: I have downloadable versions of Shelf now. Go to the project page and download one.

Shelf screenshot

It's for MacOS, because on MacOS I have OSA - I can interrogate most (well-written) applications about their state in a beautiful, language-agnostic and fast manner. I can ask the mail client for the email address on the current message. I can ask Safari for the URL of the foreground window. I can ask Adium for the account details of the current chat. I can ask NetNewsWire for the homepage URL of the current subscription. And I can ask the system which app is in the foreground. I can also interrogate the system address book via its Cocoa bindings and find out which people have that email address, or URL, or AIM screen name. Then I can take all the other information in their address book entry and figure out some context. Oh, and the thing's written in Ruby, because the Ruby scripting bridge is a thing of serious beauty and should be played with by everyone.
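The lookup step described above (take an email address, URL or AIM name from the foreground app and find the matching card) could be sketched roughly like this. It is only an illustration: `Person`, `Clue`, `normalise` and `people_for` are made-up names, and the in-memory list stands in for the real Address Book, which the app reads through the Cocoa bindings.

```ruby
# Hypothetical in-memory stand-in for Address Book cards. The real app would
# read these through the Cocoa Address Book bindings instead.
Person = Struct.new(:name, :emails, :urls, :aim_names)

# A "clue" is whatever the foreground app could tell us: a kind and a value.
Clue = Struct.new(:kind, :value)

# Crude normalisation so that case and trailing slashes do not block a match.
# (Lower-casing a whole URL is lossy; good enough for a sketch.)
def normalise(kind, value)
  v = value.to_s.strip.downcase
  kind == :url ? v.sub(%r{/+\z}, "") : v
end

# Return every person whose card contains the clue's value.
def people_for(clue, people)
  field = { email: :emails, url: :urls, aim: :aim_names }.fetch(clue.kind)
  wanted = normalise(clue.kind, clue.value)
  people.select do |person|
    person[field].any? { |v| normalise(clue.kind, v) == wanted }
  end
end

people = [
  Person.new("Paul", ["paul@example.com"], ["http://example.com/paul/"], ["paulaim"]),
  Person.new("Aaron", ["aaron@example.org"], [], [])
]

matches = people_for(Clue.new(:email, "Paul@Example.com"), people)
puts matches.map(&:name).inspect
```

Returning a list rather than a single person leaves room for the case where two cards share an address.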

Good thing

So, advantages. I don't have the bootstrapping problem, because most MacOS applications already have enough of a scripting interface that I can extract information from them. Firefox is proving to be a serious problem, alas, but I've hit no other apps I can't get something useful out of.

Once I have an Address Book record as context, I can update the interface with a picture of the person and their name/company (direct from the address book, so easy). As a 'will this work?' experiment, I'm parsing every referenced URL in the address book card for RSS feeds, and displaying those as context. And (because I work there) I have special-case Dopplr support that tells me where the person is in the world and where they're going next. This means that when someone IMs me, a window pops up and tells me where they are, when they're back, and what they've blogged recently. Awesome.

addressbook screenshot

The system address book is great - it has multiple email addresses and URLs for people, so I'm indicating things like Dopplr username by just putting the URL of my traveller page in my address book entry. I can parse the username out later and use it to call the API. This has the advantage that if I visit my Dopplr page in Safari, hey, wow, that URL is in the address book, and it knows that it's me again. Flickr is the next obvious choice for special-casing, but the principle extends to anything.
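Pulling a username back out of a card URL might look something like the sketch below. The `/traveller/<username>` path shape is an assumption made for illustration, not something taken from Dopplr's documentation, and both function names are hypothetical.

```ruby
require "uri"

# Assumption: traveller pages live at a path like /traveller/<username>.
# Returns the username, or nil if the URL is not a Dopplr traveller page.
def dopplr_username(url)
  uri = URI.parse(url)
  return nil unless uri.host && uri.host.sub(/\Awww\./, "") == "dopplr.com"
  match = uri.path.match(%r{\A/traveller/([^/]+)/?\z})
  match && match[1]
rescue URI::InvalidURIError
  nil
end

# Pick the first URL on a card that yields a username, if any.
def dopplr_username_from_card(urls)
  urls.each do |url|
    name = dopplr_username(url)
    return name if name
  end
  nil
end

card_urls = ["http://example.com/blog/", "http://www.dopplr.com/traveller/someone"]
puts dopplr_username_from_card(card_urls).inspect
```

The same shape (one small parser per service, tried against every URL on the card) would extend to Flickr or anything else that puts the username in the URL.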

Bad thing

Disadvantages. Firstly, urgh, I'm polling. Every 2 seconds, I ask the system for the foreground application, then ask that application (if I know how) for context. This is probably a little heavy (is it? I'm guessing). Secondly, I have to do explicit work for every app out there. The huge advantages of Dashboard's cluepacket approach over mine were that packets were pushed instantly on a change of context, and that each new application was responsible for sending its own cluepackets.

Actually, this is easy to fix. My app should have a 'change context' OSA method that other applications can call. Smart apps can tell me when their context changes, and I'll just poll everyone else. Once I've taken over the world, everyone will be pushing messages to me, and I can deprecate the poll interface. Genius.
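A rough sketch of that push-plus-poll idea, leaving out the OSA plumbing entirely. `ContextTracker` and its methods are hypothetical names; `change_context` stands in for whatever entry point smart apps would call, and `tick` is what a 2-second timer would invoke.

```ruby
# Tracks the current context. Apps that can push call change_context;
# everything else is polled on each tick.
class ContextTracker
  attr_reader :current

  # foreground: callable returning the name of the foreground app.
  # pollers: hash of app name => callable returning that app's context.
  def initialize(foreground:, pollers: {})
    @foreground = foreground
    @pollers = pollers
    @pushed = {}      # app name => last context that app pushed to us
    @current = nil
    @listeners = []
  end

  # Register a block to run whenever the context actually changes.
  def on_change(&block)
    @listeners << block
  end

  # Push entry point: remember what the app said, and apply it right away
  # if that app is the one in the foreground.
  def change_context(app, context)
    @pushed[app] = context
    update(context) if @foreground.call == app
  end

  # Poll entry point: prefer pushed state, fall back to asking the app.
  def tick
    app = @foreground.call
    context = if @pushed.key?(app)
                @pushed[app]
              elsif @pollers.key?(app)
                @pollers[app].call
              end
    update(context) if context
  end

  private

  # Only notify listeners when something really changed.
  def update(context)
    return if context == @current
    @current = context
    @listeners.each { |listener| listener.call(context) }
  end
end

front = "Mail"
tracker = ContextTracker.new(
  foreground: -> { front },
  pollers: { "Mail" => -> { "paul@example.com" } }
)
tracker.on_change { |context| puts "context is now #{context}" }
tracker.tick
```

Apps with an entry in the pushed table are never polled, so the poll table can shrink as more apps learn to push.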

Recently, most of the crazy apps I've put here have been labelled as 'proof of concept'. This one is different. This one probably won't even build on your computer. I'm putting things up here as a way of musing about technique. For instance, Dashboard had a far better design than this app. It had a nice pipeline thing going for it, whereas I just have a class per foreground application. Each class must produce an Address Book record, and then I interrogate every context producer for information and display it. This is silly - if I'm looking at Paul's Flickr photos page, I don't need my app showing me the thumbnails again; I might be much more interested in where he is right now. Hell, in a perfect world, it would work out the dates of the photos I'm looking at, and show me where he was at that time.
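One small step away from "ask every producer, always" would be for the per-application class to also say what is already on screen, so the matching producer can be skipped. This is a sketch of that idea only, not of Shelf's actual classes; `Sighting`, `Producer` and `context_for` are invented names.

```ruby
# What a per-application class might hand back: who we are looking at, and
# which kind of content is already on screen (nil if none).
Sighting = Struct.new(:person, :already_showing)

# A context producer: a kind, plus a callable that looks things up for a person.
Producer = Struct.new(:kind, :lookup)

# Ask every producer for context, except the one whose output would just
# repeat what the user is already looking at.
def context_for(sighting, producers)
  producers
    .reject { |producer| producer.kind == sighting.already_showing }
    .map { |producer| [producer.kind, producer.lookup.call(sighting.person)] }
    .to_h
end

producers = [
  Producer.new(:flickr, ->(person) { "#{person}'s recent photos" }),
  Producer.new(:location, ->(person) { "where #{person} is right now" })
]

# Looking at Paul's Flickr page: skip the thumbnails, keep the location.
puts context_for(Sighting.new("Paul", :flickr), producers).inspect
```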


Clever things I could (and want to) do:

  • If the foreground URL doesn't belong to a user, look for hCard markup in the source HTML and try to derive a person from that. Right now, for instance, I'll only recognise your Flickr page as belonging to you if it's one of the URLs against your address book card. But Flickr pages are marked up with enough hCard that I should just be able to figure it out.

  • More intelligence around context - as above, if I'm looking at a blog of a friend, I want to see other things, not their blog again.

  • Remembering connections - if I figure out a local person from a Flickr page via hCard markup rather than an Address Book URL, why not remember their Flickr username and display their photos when they email me?

Many of these features are difficult, mostly because of my core design right now - I derive an Address Book entry from the current application, then derive context from that entry. This hampers cleverness somewhat - I really need to pass around a lot more information about how I derived this person, and keep a local cache of conclusions about them. Maybe the person isn't in my address book - I get email from people I don't know! But their email address might correspond to a Gravatar so I could show a picture of them. Maybe the mail has some URLs in the .sig and I could find their blog. Maybe they've commented on my blog in the past and I'd like links to the comments. Likewise, if I find, via hCard in the source of a page, that a page is about someone I know, should I update Address Book and add URLs for them? Probably not a good idea. So I need a local store of connections as well.
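A local store of connections that also records how each one was derived might start out as small as this. Everything here is a hypothetical sketch (`Connection`, `ConnectionStore`, the `:derived_from` labels), kept in memory rather than on disk. The Gravatar helper relies on Gravatar's long-standing scheme of hashing the trimmed, lower-cased address with MD5.

```ruby
require "digest/md5"

# A remembered conclusion: this identifier belongs to this person, plus a
# note on how we worked that out.
Connection = Struct.new(:person, :kind, :value, :derived_from)

class ConnectionStore
  def initialize
    @connections = []
  end

  # Record a connection once; repeats of the same fact are ignored.
  def remember(person, kind, value, derived_from)
    known = @connections.any? do |c|
      c.person == person && c.kind == kind && c.value == value
    end
    @connections << Connection.new(person, kind, value, derived_from) unless known
  end

  # Everything we have concluded about one person.
  def for_person(person)
    @connections.select { |c| c.person == person }
  end

  # Reverse lookup: who does this identifier belong to, if anyone?
  def person_for(kind, value)
    found = @connections.find { |c| c.kind == kind && c.value == value }
    found && found.person
  end
end

# Avatar URL for an email address, for senders who are not in the address book.
def gravatar_url(email)
  hash = Digest::MD5.hexdigest(email.strip.downcase)
  "https://www.gravatar.com/avatar/#{hash}"
end

store = ConnectionStore.new
store.remember("Paul", :flickr_username, "paul_on_flickr", :hcard)
puts store.person_for(:flickr_username, "paul_on_flickr")
puts gravatar_url("stranger@example.com")
```

Keeping the derivation alongside each fact means a guess from page markup can be treated with less trust than a URL typed into the address book by hand.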

Now what?

I don't know. It's very tempting to rewrite the thing in Python before it gets any more complex. Partially this is because the Ruby feedparser dependencies are a bugger, but mostly it's because I don't want my Python sk1llz to atrophy down to nothing. Recently everything I do is in Ruby, and I don't like that. Shelf also desperately needs some work done to make it asynchronous and to cache things - when I look at an email right now, it'll hang for 5 minutes while it goes off and fetches 20 RSS feeds, every time I change the person I'm looking at. Not exactly pleasant. But the 'find out about a person' part is really just a trivial example of the sort of things you can do once you know who they are. The 'derive context from current machine state' side of things is much more interesting.
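The caching and asynchronous fetching mentioned above could begin with something like the following sketch. `FeedCache` is a hypothetical name, the fetcher is injected rather than tied to any real feed library, and one thread per URL is the simplest thing that could work rather than a recommendation.

```ruby
# Caches feed fetch results for a while, so that switching between people
# does not refetch everything. The fetcher is injected; in the real app it
# would be whatever downloads and parses a feed.
class FeedCache
  def initialize(ttl_seconds:, fetcher:, clock: -> { Time.now })
    @ttl = ttl_seconds
    @fetcher = fetcher
    @clock = clock
    @entries = {}       # url => [fetched_at, result]
    @lock = Mutex.new
  end

  # Return the cached result if it is fresh enough, otherwise fetch again.
  def get(url)
    cached = @lock.synchronize { @entries[url] }
    return cached[1] if cached && @clock.call - cached[0] < @ttl
    result = @fetcher.call(url)
    @lock.synchronize { @entries[url] = [@clock.call, result] }
    result
  end

  # Start one background thread per URL and return the threads immediately.
  # Each thread's value is a [url, result] pair; the caller decides when
  # (or whether) to wait for them.
  def get_all(urls)
    urls.map { |url| Thread.new { [url, get(url)] } }
  end
end

cache = FeedCache.new(
  ttl_seconds: 300,
  fetcher: ->(url) { "pretend feed contents for #{url}" }
)
threads = cache.get_all(["http://example.com/a.rss", "http://example.com/b.rss"])
threads.each { |thread| puts thread.value.inspect }
```

With this shape the interface can show the person's name and picture straight away and fill in each feed as its thread finishes.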