Tom Insam

Brent Simmons wrote a plea for Baked Weblogs the other week. It resonated - I'm the sort of nerd who obsessively re-writes his blogging engine more often than he actually uses it to blog things with, so I've been through a lot of solutions, and I keep coming back to baking.

Anyway. Brent wrote another piece in which he mentions,

I still get to write using MarsEdit, by the way. It talks to WEBrick running on my laptop.

Now, currently I blog in Tumblr, but I have the next generation version all set up and ready to cut across to when I feel like it, and it's based on Jekyll, like the rest of my site is. I want to be able to post to it from MarsEdit! How hard could it be to build something that will let me?

Actually, it turns out to be really annoying. I'm extremely unimpressed by the MetaWeblog API. HOWEVER, finally today I have a releasable, working version of jekyll-metaweblog, a stand-alone Ruby WEBrick server that will expose a Jekyll source tree via MetaWeblog and let you post, edit, delete, upload images, etc, etc, from MarsEdit (and hopefully anything else that supports MetaWeblog).

Get the code and play.
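For a sense of what a client like MarsEdit sends to a server like this, here's a minimal sketch of a metaWeblog.newPost call, built with Python's standard xmlrpc.client. It isn't code from jekyll-metaweblog; the blog id, credentials and post fields are placeholders I've made up, and nothing goes over the network.

```python
import xmlrpc.client

# Placeholder values: the blog id, username, password and post content are invented.
post = {
    "title": "Hello from MarsEdit",
    "description": "<p>Body of the post, as HTML.</p>",
    "categories": ["jekyll"],
}
# metaWeblog.newPost takes: blogid, username, password, struct, publish
params = ("1", "tom", "secret", post, True)

# Serialise the call the way an XML-RPC client would before POSTing it.
request_xml = xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")

# Decode it again to show what the server has to unpack on the other side.
decoded_params, method = xmlrpc.client.loads(request_xml)
print(method)
print(decoded_params[3]["title"])
```

The server's job is then to turn that struct into a file in the Jekyll source tree, which is where most of the fiddly decisions live.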

An aside - Talking about baking, Brent writes

[Aaron] also wrote that he doesn’t care about performance. If getting fireballed were a thing back in 2002, he might have cared about performance. If he had seen system X go down for a day, he might have cared about performance. It’s interesting that performance — or robustness — arguably wasn’t an issue in 2002, but it is now.

The Wikipedia page for 'slashdot effect' goes back to at least September 2001. Performance did matter. Aaron's position is actually a lot closer to mine:

Honestly, I don’t care about performance. I don’t care about performance! I care about not having to maintain cranky AOLserver, Postgres and Oracle installs. I care about being able to back things up with scp. I care about not having to do any installation or configuration to move my site to a new server. I care about being platform and server independent. I care about full-featured HTTP implementations, including ETags, Content-Negotiation and If-Modified-Since. (And I know that nobody else will care about it enough to actually implement it in a frying solution.)

Baking has many problems, of course, but it has (for me) one huge overriding advantage - if I get bored of my codebase and want to build something else (this happens a lot), my blog doesn't go away. It just stops getting new content. Much safer. It's easy to build a dynamic site that'll cope with being Fireballed and still host it on a single system. It's hard to have to host 50 megs of mongrel process for the rest of time because you thought it would be a good idea to build some part of your site in Rails and now you can't turn it off.

titles as metadata

Are titles on blog entries good things or not? I'm feeling an obligation to give things I write a title. But nowadays this is mostly because it forms a useful bit of text to use as a link target. Without a title, I have to excerpt the first few words of a piece, which always feels a little out of place. If I put a photo up here without a title, linking to it gets really hard.

Metadata is good, no denying it. But titles aren't metadata in the same way that, say, geolocation on a photo is. You are at a location when you take a photo - not writing the geodata down at the time doesn't mean it wasn't there, it just means you're using a bad camera. But a title isn't an inherent truth of a blog entry, it's a thing I have to add.

OS release schedules

Google delays Honeycomb tablet OS; what if that was Apple

Can you imagine if it were Apple delaying a software release. What would the press say if Apple admitted it took shortcuts with its OS to keep up with Google and now they couldn’t release it? The press would have a field day with that story.

You're right. Apple would never release, say, an OS with multitasking in it for only their new phone platform, and not make it available for their tablets for months. They'd get ROASTED.

OH WAIT.

Footnotes

[..] linking to sources is such an easy thing to do and the motivations for avoiding links are so dubious, I've detected myself using a new rule of thumb: if you don't link to primary sources, I just don't trust you.

-- Ben Goldacre

I was asked today: when some scientist makes a claim that we're told is based on stuff too complicated to understand, but we're supposed to believe it anyway, how are we supposed to be able to trust them? How do we know they're not making it up?

I think it probably boils down to "scientists use footnotes".

Opinions on REST

I wrote a thing about what is and isn't REST. That was (what I believe are) the facts. This bit is opinion.

Things that actually matter:

  • You've shipped something.
  • Your API has client libraries, even if they're trivial wrappers (because people will complain otherwise, or write bad ones).

REST is a nice dream. But I'm not personally a fan. RPC is how programmers' minds work, is the problem. Call method, receive bacon.

Here are some top-of-the-head problems with a pure REST API:

  • Result pagination - suppose /books/ returns 1 million books? How do I know how many there are? Maybe it returns a list of pages? Suppose there are a million pages? Does it merely return the first page and a link to the next one? How do I get page 100? Stack Overflow mentions the Range header, but I can't just magically use this as a client - the documentation will have to tell me what values are valid for this header. (Does the Range header even go in the OAuth signature? It seems not. Is this a security problem?)

  • How do I even find the links in the first place? Look for any string in the returned data structure that matches ^http://? Presumably the response format needs to be documented, and responses are going to differ based on the sort of object I'm requesting, so in practice, you're going to have documentation that says 'responses for URLs under /books/ look like this', and your URL structure is exposed again.

  • Am I allowed to make verbs up? Suppose I want to be able to flag a particular resource as offensive. Do I POST to /books/4/offensive (how do I discover this URL?), or POST offensive=1 to /books/4 (but I'm not changing the state of a book, so that doesn't seem right either)?

These aren't big problems. They have obvious solutions, even. But the solutions aren't covered by just saying 'It's REST' - you need to document them.
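To make the pagination case concrete, here's a minimal sketch of a client that walks a collection by following "next" links. The in-memory FAKE_API, the URLs and the "items" and "next" field names are all invented for illustration; no real service is being described.

```python
# A fake, in-memory "server". Every URL and field name here is made up.
FAKE_API = {
    "/books/": {"items": ["/books/1", "/books/2"], "next": "/books/?page=2"},
    "/books/?page=2": {"items": ["/books/3"], "next": None},
}

def get(url):
    """Stand-in for an HTTP GET that returns a parsed document."""
    return FAKE_API[url]

def all_book_urls(start="/books/"):
    """Walk the whole collection by following each page's 'next' link."""
    url = start
    while url is not None:
        page = get(url)
        yield from page["items"]
        url = page["next"]

print(list(all_book_urls()))  # ['/books/1', '/books/2', '/books/3']
```

The client never builds a URL itself, which is the RESTful bit. But it only works because someone told it that pages have "items" and "next" keys, and it still has no way to jump straight to page 100.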

As a new developer, the main thing a purely REST API is going to do differently is that, when I want to do pagination, the docs won't say 'add a page parameter', they'll start talking about Range headers. When I want to get JSON back instead of XML, I'll have to look at Accept headers. When performing operations on objects, I'll need to work out which verb is appropriate. I'll need to store complete URLs to your objects in my database rather than just IDs. (In practice, I'll just reverse-engineer your ID-to-URL mapping and store the IDs, then complain when you change something and my code breaks.)

Or maybe the developer will be using a high-level client that hides all of this, in which case there is no difference, except that your client libraries have to be a lot more complicated. But you've also hidden most of the benefits.

In practice, you will still end up documenting all of your different resource types, what URLs you can find them under, how to parse their representations, and how to find other objects based on those representations. You'll just have to document them using lots of highly-specific HTTP language, which means your documentation is going to be much harder to use for all the people using clients that hide all this stuff.

The best flickr clients (to my mind) are the ones with a call_flickr( method, params ) function, that takes one of the Flickr API methods, adds the parameters to it, does the authentication signing dance, and returns you the response, parsed into some in-memory data structure. To do this requires some knowledge of the Flickr API, sure, and I can't just point this client to some other web service's endpoint and get data out. But REST doesn't solve this problem either, and I believe it makes the simple things harder.
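Here's a minimal sketch of that shape of client. The signing step follows Flickr's older, pre-OAuth scheme as I understand it (md5 of the shared secret followed by the sorted name/value pairs); treat that, the helper names and the choice of JSON output as assumptions rather than a description of any particular library.

```python
import hashlib
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ENDPOINT = "https://api.flickr.com/services/rest/"

def sign(params, secret):
    """Assumed legacy scheme: md5(secret + name1value1name2value2...), names sorted."""
    base = secret + "".join(f"{name}{params[name]}" for name in sorted(params))
    return hashlib.md5(base.encode("utf-8")).hexdigest()

def build_flickr_url(method, params, api_key, secret):
    """Add the method and key to the parameters, sign the lot, return the full URL."""
    query = dict(params, method=method, api_key=api_key,
                 format="json", nojsoncallback="1")
    query["api_sig"] = sign(query, secret)
    return ENDPOINT + "?" + urlencode(sorted(query.items()))

def call_flickr(method, params, api_key, secret):
    """Make the call and hand back the response as plain dicts and lists."""
    with urlopen(build_flickr_url(method, params, api_key, secret)) as response:
        return json.loads(response.read().decode("utf-8"))

# Building the URL needs no network access; "KEY" and "SECRET" are placeholders.
print(build_flickr_url("flickr.test.echo", {"foo": "bar"}, "KEY", "SECRET"))
```

Everything service-specific lives in about twenty lines, and the caller only ever thinks in terms of method names and parameters.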

REST

There have been a couple of things I've been linked to recently about how some APIs claim that they are RESTian, but aren't really - Gareth and Jens. To my mind, there's not a lot of clarity here. So.

Things that make your web service RESTian:

  • Resources are represented by URLs.
  • Resources can be cached according to their caching headers.
  • You perform operations on those things using HTTP verbs (GET, DELETE, PUT, etc).
  • You discover other resource URLs by examining other resources - for instance GETting the /books/ resource might return a document that contains the URLs of all the books.
  • You can request a representation of a resource in different formats using the Accept header.
  • Your API is stateless. (Presumably allowing for things like rate-limiting, of course)
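A few of the points above, sketched as a toy in-memory handler: resources live at URLs, the HTTP verb picks the operation, the collection links to its members, and the Accept header picks the representation. The BOOKS data, the handle function and the two supported formats are all made up for illustration, and caching is left out entirely.

```python
import json

# Invented data: each resource is identified by its URL.
BOOKS = {"/books/1": {"title": "Dune"}, "/books/2": {"title": "Emma"}}

def handle(verb, url, accept="application/json"):
    """Toy stand-in for a server. Everything it needs arrives with the request."""
    if verb == "GET" and url == "/books/":
        body = {"books": sorted(BOOKS)}  # the collection links to its members
    elif verb == "GET" and url in BOOKS:
        body = BOOKS[url]
    elif verb == "DELETE" and url in BOOKS:
        del BOOKS[url]
        return 204, ""
    else:
        return 404, ""
    if accept == "text/plain":
        return 200, "\n".join(f"{key}: {value}" for key, value in body.items())
    return 200, json.dumps(body)

# Discover a book's URL from the collection rather than constructing it.
status, listing = handle("GET", "/books/")
first = json.loads(listing)["books"][0]
print(handle("GET", first, accept="text/plain"))
```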

Things that mean your API is NOT RESTian:

  • Your documentation describes how to construct URLs based on object IDs (/books/{id}), and that is the only way of finding these URLs.
  • Your API has a single endpoint and you pass the method as another parameter.
  • HTTP verb use is restricted to just POST, or POST and GET.

Things that have no bearing on the RESTiness of your API:

  • Your URLs look meaningful (like /books/{id}).
  • You return JSON. Or XML. Or anything in particular.
  • You use the word REST in the documentation a lot.

Things that aren't magically true just because your API is RESTian:

  • Writing clients is easy.
  • You don't even need a client, you can just derive everything from a single endpoint.
  • Your API maps properly onto your business objects and therefore makes sense.
  • Your API will scale properly.
  • Your API is easy to extend.
  • You won't get support requests from people who didn't read the documentation.

Hopefully this should help things a little.

If you disagree with anything above, mail me. I'm genuinely interested if I've misunderstood something here.

I also have my own opinions on REST vs non-REST. But I'm trying to be factual here.

I bake a lot of my site - everything except the blog, in fact, which is hosted on Tumblr. I do, however, also pull all the blog pages down and render them to another domain, as an emergency "Tumblr has died again" measure. I can repoint a DNS record and I'm entirely stand-alone. In theory.

Brent Simmons talks about how he bakes his blog, and it's very similar to mine. However, he does seem to have a bit of cleverness in there that I'm jealous of:

I still get to write using MarsEdit, by the way. It talks to WEBrick running on my laptop.

I have an idea for this sketched out - clearly I have to finish implementing it. The obvious thing to do is to just expose the folder full of raw blog post sources via a MetaWeblog API. But it's such a horrible protocol to write for. Not that it's complicated. And I love XML-RPC. But it's just grown so organically that there's not really one authoritative source of "these are the methods you need". You just need to keep adding extensions from various people till all your clients work.
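The read side of "expose a folder" might look roughly like this: a function that could sit behind metaWeblog.getRecentPosts and return one struct per file. This is only a sketch under loud assumptions. The file layout (date-prefixed filenames, title on the first line) is invented, and authentication and the XML-RPC plumbing are left out.

```python
import os
import tempfile

def get_recent_posts(folder, number_of_posts):
    """Return the newest posts in `folder` as MetaWeblog-style structs.

    Assumes filenames start with a YYYY-MM-DD date, so that a reverse
    filename sort puts the newest first, and that line one is the title.
    """
    names = sorted(os.listdir(folder), reverse=True)[:number_of_posts]
    posts = []
    for name in names:
        with open(os.path.join(folder, name), encoding="utf-8") as handle:
            title, _, body = handle.read().partition("\n")
        posts.append({"postid": name, "title": title, "description": body.strip()})
    return posts

# Try it against a throwaway folder holding two made-up posts.
with tempfile.TemporaryDirectory() as folder:
    samples = [("2011-01-01-first.txt", "First\nHello."),
               ("2011-02-01-second.txt", "Second\nAgain.")]
    for name, text in samples:
        with open(os.path.join(folder, name), "w", encoding="utf-8") as handle:
            handle.write(text)
    recent = get_recent_posts(folder, 1)

print(recent)
```

Using the filename as the postid keeps editing and deleting simple, since the id maps straight back to a file.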

Also, I want to make life difficult for myself and write it in PHP, so it's easy to deploy.

When Quicksilver went away I paid for a version of LaunchBar and moved on with my life. It’s not cheap, at $35, but that money gives the developer a reason to stick with development — it becomes a real business instead of just an elaborate hobby.

Fragility of Free — The Brooks Review