the movable type import format

In a previous life, I was trying to import content from a Movable Type blog into Hayfever. Then I wanted to write an importer from Hayfever into WordPress. And wow, the MT import format is nasty. Things that have annoyed me, in no particular order:

  • There are no charset considerations in the spec. I care deeply about explicit charsets nowadays. I’m sure the implementation does something with them, but what?

  • The DATE atom is an annoying US-style date (month first, then day), with no timezone information.

  • The whole serialization format is just nasty. The WordPress importer, for instance, splits records on ‘--------\nAUTHOR:’, which is presumably much more reliable than splitting on the separator line alone (in case there are lines of ‘-’ in the data), but it is a fairly nasty assumption, and it bit me quite badly when writing my own importer.

  • The PING and COMMENT atoms seem to contain nested sets of atoms, but this isn’t indicated in any sort of general way – you just have to know that PING is special.
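
To make the grumbling concrete, here is a rough sketch of what a reader for the format ends up looking like. It is not the Movable Type or WordPress implementation, just my reading of the format, with these assumptions: entries end with a line of eight hyphens, sections end with a line of five, DATE is MM/DD/YYYY hh:mm:ss with an optional AM/PM, and COMMENT and PING sections start with a handful of known keys before their body text. All the function and variable names are made up for illustration.

```python
from datetime import datetime

ENTRY_SEP = "--------"    # assumed: eight hyphens alone on a line end an entry
SECTION_SEP = "-----"     # assumed: five hyphens alone on a line end a section
NESTED = {"COMMENT", "PING"}  # nothing in the data marks these; you just know
NESTED_KEYS = {"AUTHOR", "EMAIL", "URL", "IP", "DATE", "TITLE", "BLOG NAME"}


def parse_date(value):
    """Parse a US-style DATE value into a naive datetime (no timezone exists)."""
    value = value.strip()
    for fmt in ("%m/%d/%Y %I:%M:%S %p", "%m/%d/%Y %H:%M:%S"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised DATE value: %r" % value)


def split_on(lines, separator):
    """Group lines into chunks, breaking on lines equal to the separator."""
    chunks, current = [], []
    for line in lines:
        if line == separator:
            chunks.append(current)
            current = []
        else:
            current.append(line)
    if any(line.strip() for line in current):
        chunks.append(current)
    return [chunk for chunk in chunks if any(l.strip() for l in chunk)]


def split_entries(text):
    # Fragile by design: body text containing a bare line of eight hyphens
    # would be mistaken for an entry boundary.
    return split_on(text.splitlines(), ENTRY_SEP)


def parse_entry(lines):
    entry = {"meta": {}, "text": {}, "nested": []}
    for index, section in enumerate(split_on(lines, SECTION_SEP)):
        while section and not section[0].strip():
            section.pop(0)  # drop leading blank lines
        if index == 0:
            # First section: plain "KEY: value" lines (repeated keys overwrite).
            for line in section:
                key, sep, value = line.partition(":")
                if sep:
                    entry["meta"][key.strip()] = value.strip()
            if "DATE" in entry["meta"]:
                entry["meta"]["DATE"] = parse_date(entry["meta"]["DATE"])
            continue
        name = section[0].partition(":")[0].strip()
        rest = section[1:]
        if name not in NESTED:
            entry["text"][name] = "\n".join(rest).strip()
            continue
        # Nested section: consume leading known keys, the remainder is the body.
        fields, body_start = {}, 0
        for n, line in enumerate(rest):
            key, sep, value = line.partition(":")
            if not sep or key.strip() not in NESTED_KEYS:
                break
            fields[key.strip()] = value.strip()
            body_start = n + 1
        if "DATE" in fields:
            fields["DATE"] = parse_date(fields["DATE"])
        entry["nested"].append((name, fields, "\n".join(rest[body_start:]).strip()))
    return entry
```

Usage would be something like `[parse_entry(e) for e in split_entries(text)]`. Note that every complaint above shows up as a special case: the caller has to guess an encoding before `text` even exists, the datetimes come back naive, and the `NESTED` set is hard-coded knowledge rather than anything the format tells you.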

I’m just grouchy, I guess.