I tend to reimplement the CMS that drives jerakeen.org more often than I add content to it, but the current Django based incarnation seems to have decent sticking power. A lot of this is Django’s magic admin interface middleware. When I add, say, a tagging engine to the site, I only need to worry about the object model and presenting it on the site itself. All the boring and much harder to write admin pages to add and remove tags just write themselves. But the other reason I’m staying with it is that I’ve now added so many features to it (because it’s easy!) that a re-write in another language would be a huge amount of effort.
This weekend, for instance, I’ve added an implementation of the metaweblog API to the site, using the excellent code on allyourpixel as a base. The main source of pain is the persistent weirdness of implementing the Movable Type extensions to the metaweblog extensions to the Blogger XMLRPC API. How can you call something a metaweblog API and not allow for post excerpts, for instance? So annoying.
While implementing it, I found the TextPattern API reference to be far more useful than the official spec, mostly because it covers everything up to the Movable Type extensions, which you need if you want to edit page excerpts. The other problem I encountered was that Ecto won’t talk to an endpoint over HTTPS with a self-signed certificate unless the SSL cert is in the local machine X509 database. The way it fails is incredibly unhelpful and annoying, too. The simplest way to fix it (assuming a recent macos) is to visit the endpoint in Safari. It’ll complain about the certificate - click the ‘always trust this site’ box, and it’ll stop.
I’ve put this off way too long. But Fernando Herrera has found a bug with python-daap and Tangerine (a very cool app, although subject to an annoying variety of disconnect bugs). The fix for this, combined with various safer handing of non-utf8 ID3 tags, is easily enough to encourage a 0.4 release.
I gave a talk on E4X. In a Just and Decent world, I wouldn’t have to write a blog entry on this, because there would be a nice front page to jerakeen.org that listed all the recent things I’ve done, with the option to subscribe to RSS (or whatever) feeds of various subsets. But I’ve been too lazy to write this so far, so I’ll just link to it here until I get django to do what I want.
E4X is a lovely extension to JS (well, compared to messing with the DOM, and it’s in core, so embedded users get it too), despite its crazy inconsistent syntax and annoying brokeness in Firefox. Fortunately, I don’t have to care about web browser-based JS implementations, so I get to use it, and you don’t..
=== operator (approximately, ‘is this the same object’) gives:
js> "a" === "a";
js> "a" + "b" === "ab";
js> "ab".replace(/./, "c") === "cb";
js> new String("a") === new String("a");
If strings were to magically upgrade themselves to objects, they’d change behaviour - previously equivalent strings would suddenly not be equivalent. Likewise, suppose this worked:
var a = "string";
var b = "string";
a === b; # true
a.foo = 1;
a still be equivalent to
b? If not,
a clearly isn’t immutable, as we’ve changed it. But if it is, then we’ve chanaged
b at a distance - it’s grown a
Still all very annoying, of course, but I understand why now.
Recently, I mentioned a peculiar difference between uneval and toSource. Specifically (using the SpiderMonkey JS console):
new String("") are different types of objects. The first is the basic string type, and only really has a value. The second is a full Object, that happens to have a value. However, it turns out that if you treat a basic string type as an Object, say by putting ‘.’ after it in an expression, the SpiderMonkey runtime will implicitly promote the string to a String. Hence,
"".toSource() promotes the string object, then calls toSource on the new String object.
Annoyingly, the String Object doesn’t hang around, it’ll get thrown away as soon as you’re done with it. This leads to the weird case that you can set attributes on a basic string type (because it’ll get promoted to an Object, and Objects have attributes) but they don’t stay set (because the Object you’ve set them on gets thrown away as soon as the set call finishes).
By the way, all of this applies very specifically to the current CVS trunk SpiderMonkey. I don’t know what most web browser engines do with strings, so don’t assume this applies in, say, Internet Explorer. But I’d be interested if someone wants to find out and tell me…
Blog comment spam, the scourge of the internet. Having written yet another CMS to power jerakeen.org, I wanted comments on pages again. Django rocks hard - adding commenting was easy. And a day later, I have comment spam. Bugger.
From a purely abstract point of view, I find this interesting. There must be a spider looking for forms that look like things that can take comments. And the robots must be reasonably flexible - it’s not like my CMS is an off-the-shelf. But from a more concrete, ‘spam bad’ point of view, it’s bloody annoying.
So begins my personal battle against spam. Others have fought this battle, but of course the downside of rolling your own site is that you can’t use anything off the shelf. My plan was to forget trying to recognise and filter spam, and preferably I don’t want to have to moderate anything - I don’t want the spam to be submitted at all. And this really can’t be that hard. Unless there’s a human surfing for blogs and typing in the spam themselves, this should really just be a measure of my ability to write a Turing test. Right?
Or perhaps they figure it out once, and use a replay attack to hit every page? Not really a good assumption with hindsight, but never mind. I added prefixes to the form fields that were generated from the current time, and checked at submit time that the fields weren’t more than an hour old. This saves me from having to store state anywhere, and gains me forms that exipre after a while, unless you reverse-engineer the timestamp format. But I’m premising the existence of some automated tool, perhaps with a little human interaction. I don’t need to be perfect, I merely need to be not as bad as everyone else… But no, this failed too.
More playing with JSON and Spidermonkey has revealed yet another incredibly annoying fact (I hate those guys). Spidermonkey provides a lovely
uneval() function, that does the exact opposite of
eval() - turns JS objects into strings. It works on almost everything, and make life very very nice. There’s also
Object.toSource() which does something similar (but not the same - try
But the strings that
' as a valid string delimiter. And guess what delimiter
uneval produces? Yay. So all the parsers are fine, and it’s just SpiderMonkey that’s broken.
Fortunately, Mochikit provides a nice serializeJSON() function.