Things my colo does

Been looking into getting rid of my colo. It costs a non-trivial amount of money, and philosophically I rather feel I should be able to exist in the cloud by now.

  • It hosts my SVN repository and clones of some of my git repositories.
  • It serves
  • I run irssi in a screen session as an IRC client. This will be hard to replace.
  • I run some perl-based IRC bots from it.
  • I back up my mail to it using offlineimap
  • Some friends have accounts and host simple web content / mail / irssi sessions on it. If you run a colo, don’t give friends accounts ever, for any reason. This one right here is probably the biggest barrier to me getting rid of it. But don’t take it personally if that’s you..

Things it doesn’t do that I’m thankful for

  • Host my mail. Tuffmail rocks. But everyone I know already knows this.


Don’t be late for the singularity. Once your mates get their brains uploaded, if you wait a week they’ll have experienced years of virtual time, and will have entirely forgotten about you. And you won’t know any of the in-jokes.

Don’t be early, either. I expect we’ll get the ability to upload brains (let’s assume it’s possible) well before we’ve invented computers that are independent and self-repairing enough that you’d trust them with your newly-immortal self. I can certainly see several singularities starting and burning themselves out almost immediately. Pick the wrong one and your best-case-scenario is to be stuck in obsolete and unreliable hardware for the rest of time.

Sure, the desktop revolution was inevitable. But that doesn’t help the people who invested in Be. Or those that invested in Microsoft in 2001.

Apparently, designers shouldn’t be programmers - should, in fact, actively suppress any programming ability they have when considering user interface design.

Personally, I disagree. Design is in large part about tradeoffs, clarity/information density/whatever. (I just write the code.) If you’re not aware of the technical constraints of what you’re trying to do, you’ll not be able to make intelligent tradeoffs, you’ll just have to guess what’s possible.

For instance, my designer is Matt Jones, who is never fazed by trivial implementation details.

Reading a RWW article on non-relational databases, I came across the term ‘Eventual Consistency’ which is something I’ve seen a couple of times recently. I immediately and loudly demanded that mattb tell me what it meant. He proceeded to dump waaay too much reading material on me almost instantly, which tells me that I’m onto something. I hereby relay the following, so that I don’t lose the links:

  • Eventually Consistent - Revisited by Werner Vogels (Amazon CTO). A nice overview of what the term means, including a list of things that you take for granted and aren’t guaranteed, things you don’t take for granted and aren’t guaranteed, and things that it never occurred to you to doubt, that aren’t guaranteed. For instance…

  • ..suppose you wanted to sync with a database traveling in a different relativistic frame? Well, ok, maybe slightly less serious than that, but things can still disagree on what the time is. Time, Clocks and the Ordering of Events in a Distributed System.

  • Amazon S3 Availability Event: July 20, 2008 - Amazon S3 fell over a few months ago. I remember this because all my twitter icons went away. Anyway, this is why. Interesting in the context of the other two.
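The "things can still disagree on what the time is" point is what logical clocks are for. As a rough sketch of the idea (the class and method names are mine, made up for illustration), a Lamport clock orders events with counters instead of trusting anyone's wall clock:

```python
class LamportClock:
    """Minimal logical clock; names here are illustrative assumptions."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Any local event advances the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, message_time):
        # On receipt, jump past whatever the sender had already seen.
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                   # a is now at 1
stamp = a.send()           # a is now at 2, and the message carries 2
b_time = b.receive(stamp)  # b jumps to max(0, 2) + 1
```

The receive always ends up with a later timestamp than the send that caused it, which is the only ordering guarantee this gives you. Events that never exchange messages can still end up in either order.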

I play World of Warcraft. Oh, the shame. But I play it because I’m in a fun guild - we do science!. Well, actually they do science. I’m still at the ‘cleaning the glassware afterwards’ stage, but a tauren can dream..

Anyway, I code. It’s what I do. So once WotLK came out and half the guild went completely insane and started chasing the really silly achievements, it was clear we were going to need an RSS feed of the things. So I built one. It’s based on the Armory, like most WoW tools, and is a complete kludge, like most of my tools. But here are my notes anyway.

The trick to scraping the Armoury is pretending to be Firefox. If you visit as a normal web browser, they serve you a traditional HTML page with some Ajax, and it’s all quite normal and boring. If you visit the Armoury in Firefox, they return an XML document with an XSL stylesheet referenced in the header that transforms the XML into a web page. Why are they doing this? It must be a huge amount of work compared to just serving HTML, I don’t get it. Let’s ignore that. Fake a Firefox user agent, and you can fetch lovely XML documents that describe things! There’s no ‘guild achievement’ page, alas, so let’s start by fetching the page that lists the people in the guild. Using Python.

import urllib, urllib2
opener = urllib2.build_opener()
# Pretend to be firefox
opener.addheaders = [ ('user-agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.0; en-GB; rv: Gecko/20070515 Firefox/') ]
url = ""%( urllib.quote(realm,''), urllib.quote(guild,'') )
req = urllib2.Request(url)
data = opener.open(req).read()

(This is the EU armoury, because that’s where I am). The armoury is a really unreliable site, so in practice I put lots more error handling round this. But error handling makes for very hard-to-read example code. The XML looks like this:

<page globalSearch="1" lang="en_us" requestUrl="/guild-info.xml">
  <guildKey factionId="1" name="unassigned variable" nameUrl="unassigned+variable" realm="Nordrassil" realmUrl="Nordrassil" url="r=Nordrassil&amp;n=unassigned+variable"/>
      <members filterField="" filterValue="" maxPage="1" memberCount="66" page="1" sortDir="a">
        <character achPoints="2685" class="Hunter" classId="3" gender="Male" genderId="0" level="80" name="Munchausen" race="Tauren" raceId="6" rank="0" url="r=Nordrassil&amp;n=Munchausen"/>
        <character achPoints="1175" class="Paladin" classId="2" gender="Male" genderId="0" level="80" name="Jonadin" race="Blood Elf" raceId="10" rank="1" url="r=Nordrassil&amp;n=Jonadin"/>

I parse XML using xmltramp, because I’m very lazy and it works. I use xmltramp for all my XML parsing needs. It’s old, and there might be something better, but I don’t really care. This is a toy.

import xmltramp
xml = xmltramp.seed( data )
toons = xml['guildInfo']['guild']['members']['character':]

That gets us a list of people in the guild. The rendered web page has pagination, but the underlying XML seems to have all characters in a single document, so no messing around fetching multiple pages here. (I’ve tried this on a guild of 350ish people. Maybe it paginates beyond that. Don’t use this script on a guild that big, it won’t make you happy.)
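If you'd rather not depend on xmltramp, the same step can probably be done with the standard library. This is a sketch in Python 3 using xml.etree.ElementTree. The sample document is hand-made and trimmed, and its nesting (guildInfo, guild, members) is an assumption read off the xmltramp path above rather than copied from a real response.

```python
import xml.etree.ElementTree as ET

# Trimmed, hand-made sample; the nesting is assumed from the xmltramp path.
SAMPLE = """<page globalSearch="1" lang="en_us" requestUrl="/guild-info.xml">
  <guildInfo>
    <guild>
      <members memberCount="2">
        <character achPoints="2685" class="Hunter" level="80" name="Munchausen" rank="0" url="r=Nordrassil&amp;n=Munchausen"/>
        <character achPoints="1175" class="Paladin" level="80" name="Jonadin" rank="1" url="r=Nordrassil&amp;n=Jonadin"/>
      </members>
    </guild>
  </guildInfo>
</page>"""


def guild_members(xml_text):
    """Return one attribute dict per <character> element."""
    root = ET.fromstring(xml_text)
    path = "./guildInfo/guild/members/character"
    return [dict(element.attrib) for element in root.findall(path)]


members = guild_members(SAMPLE)
names = [member["name"] for member in members]
```

Each entry is a plain dict, so member["name"] here plays the role of character('name') in the xmltramp version.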

Alas, the next thing we have to do is loop over every character and fetch their achievements page (that’s why you shouldn’t run this script over a large guild). This is extremely unpleasant and slow.

for character in toons:
    char_url = ""%( urllib.quote(realm,''), urllib.quote(character('name'),'') )
    char_req = urllib2.Request(char_url)
    char_data = opener.open(char_req).read()
    char_xml = xmltramp.seed( char_data )
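This loop is where the Armoury's unreliability hurts most, since one bad response can kill the whole run. Here is a sketch of the sort of error handling I left out, written in Python 3. The function name, the retry count and the delay are all assumptions for illustration, not what my script does. The fetch itself is passed in as a callable so the retry logic can be tried without touching the network.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on OSError; return None if every attempt fails."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError:
            # In Python 3, urllib.error.URLError is a subclass of OSError.
            if attempt < attempts - 1:
                sleep(delay)
    return None


# A fake fetcher that fails twice and then succeeds.
calls = []


def flaky(url):
    calls.append(url)
    if len(calls) < 3:
        raise OSError("armoury fell over")
    return "<page/>"


result = fetch_with_retries(flaky, "http://example.invalid/", sleep=lambda s: None)
```

Returning None rather than raising lets the loop skip one broken character and carry on with the rest of the guild.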

The achievement XML looks like this:

<achievement categoryId="168" dateCompleted="2009-02-08+01:00" desc="Defeat Shade of Eranikus." icon="inv_misc_coin_07" id="641" points="10" title="Sunken Temple"/>
<achievement categoryId="168" dateCompleted="2009-01-31+01:00" desc="Defeat the bosses in Gundrak." icon="achievement_dungeon_gundrak_normal" id="484" points="10" title="Gundrak"/>
<achievement categoryId="155" dateCompleted="2009-01-31+01:00" desc="Receive a Coin of Ancestry." icon="inv_misc_elvencoins" id="605" points="10" title="A Coin of Ancestry"/>

My biggest annoyance here is that there’s no timestamp on these things better than ‘day’, so you don’t get very good ordering when you combine them later. I could solve this by storing some state myself, remembering the first time I see each new entry, etc, etc, but I’m trying to avoid keeping any state here, so I don’t do that. The XML also lists only 5 achievements per character, and getting more involves fetching a lot more pages, so the final feed includes only the 5 most recent achievements per character. Again, something I could solve with local storage.
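For what it's worth, combining the per-character lists with only day-level dates can be sketched like this in Python 3. The function name and the input shape (a dict of character name to a list of attribute dicts) are assumptions, and which character earned which achievement in the sample data is made up.

```python
from datetime import date


def merge_achievements(per_character):
    """Flatten {name: [achievement dicts]} into (day, name, title) tuples, newest first."""
    merged = []
    for name, achievements in per_character.items():
        for achievement in achievements:
            # dateCompleted looks like "2009-02-08+01:00"; only the day part is usable.
            day = date.fromisoformat(achievement["dateCompleted"][:10])
            merged.append((day, name, achievement["title"]))
    # Ties within a day fall back to name and title, so the order is at
    # least the same from one run to the next.
    merged.sort(key=lambda item: (-item[0].toordinal(), item[1], item[2]))
    return merged


sample = {
    "Munchausen": [
        {"dateCompleted": "2009-02-08+01:00", "title": "Sunken Temple"},
        {"dateCompleted": "2009-01-31+01:00", "title": "Gundrak"},
    ],
    "Jonadin": [
        {"dateCompleted": "2009-01-31+01:00", "title": "A Coin of Ancestry"},
    ],
}
titles = [title for _, _, title in merge_achievements(sample)]
```

The tie-break is arbitrary. It doesn't recover the real order within a day, it just stops entries shuffling around between runs.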

Anyway, now I have a list of everyone in the guild, and their last 5 achievements. It’s pretty trivial building a list of these and outputting Atom or something. I do it using ‘print’ statements, myself, because I’m inherently evil. You can’t deep-link to the achievement itself on the Armoury, so I link to the wowhead page for individual achievements.
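If you want something slightly less evil than bare print statements, the main thing to get right is escaping. A sketch in Python 3, where LINK_BASE is a placeholder I made up (not the real wowhead URL format) and atom_entry is a hypothetical helper:

```python
from xml.sax.saxutils import escape, quoteattr

# Placeholder only; substitute whatever per-achievement URL you link to.
LINK_BASE = "http://example.invalid/achievement/"


def atom_entry(character, achievement):
    """Render one Atom <entry> from a character name and an achievement attribute dict."""
    link = LINK_BASE + achievement["id"]
    title = "%s: %s" % (character, achievement["title"])
    # Only the day is known, so the entry is stamped at midnight and the
    # "+01:00" offset from the source is dropped.
    day = achievement["dateCompleted"][:10]
    return (
        "<entry>"
        "<title>%s</title>"
        "<link href=%s/>"
        "<id>%s</id>"
        "<updated>%sT00:00:00Z</updated>"
        "<summary>%s</summary>"
        "</entry>"
    ) % (escape(title), quoteattr(link), escape(link), day, escape(achievement["desc"]))


entry = atom_entry("Munchausen", {
    "id": "641",
    "title": "Sunken Temple",
    "desc": "Defeat Shade of Eranikus.",
    "dateCompleted": "2009-02-08+01:00",
})
```

A complete feed also needs the surrounding feed element with its own title, id and updated fields. This only covers the per-achievement part.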

Because the Armoury is unreliable, and my script is slow, I don’t use this thing to generate the feed on demand. I have a crontab call the script once an hour, and if it doesn’t explode, it copies the result into a directory served by my web server. If it does explode, then meh, I’ll try again in an hour. The feed isn’t exactly timely, but we’re not controlling nuclear power stations here, we’re tracking a computer game. It’ll do.
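The "only publish if it didn't explode" part can also live inside the script rather than in the crontab line. A sketch in Python 3, assuming a hypothetical publish helper and a made-up feed.atom file name: write to a temporary file next to the destination, then rename it into place.

```python
import os
import tempfile


def publish(feed_text, destination):
    """Write feed_text to a temp file beside destination, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(destination))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(feed_text)
        # mkstemp creates the file private to the owner; a web server
        # usually needs to be able to read it.
        os.chmod(tmp_path, 0o644)
        # Renaming within one filesystem is atomic, so readers never see
        # a half-written feed.
        os.replace(tmp_path, destination)
    except BaseException:
        # Leave the previous feed alone and clean up the partial file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


demo_dir = tempfile.mkdtemp()
feed_path = os.path.join(demo_dir, "feed.atom")
publish("<feed/>", feed_path)
```

If anything goes wrong before the rename, last hour's feed stays where it was, which is the same "meh, try again in an hour" behaviour.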

The code I actually run to generate the feed can be found in my repository here, and the resulting feed (assuming you care, which you shouldn’t, you’re not in the guild..) is here. Feel free to steal the code and do your own guild feeds.