Tom Insam

Local food


Two-thirds of the social costs of the food distribution system have nothing directly to do with the environment at all: They are attributable to accidents and congestion. More than half of those costs are caused by driving to the shops.

Economist.com Free Exchange

Thumbnailing

My syndicated links thumbnails seemed like such a good idea at the time. I loved what ma.gnolia was doing, but I didn’t want to switch bookmarking services just for that. And I wanted all my content on one site anyway. But the pain, oh the pain.

The script that actually does the thumbnailing uses python and gtkmozembed. It requires an X server (for the mozilla) and I run it on my colo, which is headless, so I call it from a wrapper script that starts up an xvfb headless server and runs it. All a little fragile, I’m afraid. But it works surprisingly well. I call it from a cronned delicious syndication script (danger! ugly!) that pulls my delicious links into /links here every so often.

It was based a long time ago on mattb’s script/fragment and basically it’s all a bit nasty. I’ve tried to re-write it in various languages, but you’ll quickly find that the gtkmozembed bindings are awful - on most platforms they flat-out don’t work. My wrapper script has to munge LD_LIBRARY_PATH to get it to work under Debian, my server platform. Having to run an X-server is a pain as well. And because you’re wrapping mozilla, and not the underlying renderer, you can’t automatically bypass the security stuff, so it’s impossible to thumbnail sites with bad SSL certificates: you’ll just get a screenshot of the security confirmation dialog. I also can’t find a way of getting a ‘page loaded’ callback, so I have to sleep some arbitrary amount of time before blindly taking a screenshot of whatever I’ve managed to render so far. That sort of thing.
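For the curious, the wrapper arrangement described above might look roughly like this in Python. This is only a sketch, not my actual wrapper: the display number `:99`, the screen geometry, the `/usr/lib/mozilla` library directory and the two-second startup wait are all assumptions picked for illustration, and the command you pass in would be whatever runs your thumbnailer.

```python
import os
import subprocess
import time


def build_env(base_env, display, extra_lib_dir):
    """Return a copy of base_env pointing at the headless display, with
    extra_lib_dir put at the front of LD_LIBRARY_PATH."""
    env = dict(base_env)
    env["DISPLAY"] = display
    existing = env.get("LD_LIBRARY_PATH", "")
    if existing:
        env["LD_LIBRARY_PATH"] = extra_lib_dir + os.pathsep + existing
    else:
        env["LD_LIBRARY_PATH"] = extra_lib_dir
    return env


def run_under_xvfb(command, display=":99", extra_lib_dir="/usr/lib/mozilla",
                   startup_wait=2.0):
    """Start Xvfb, run command against it, then shut Xvfb down again.

    The display, library directory and wait time are illustrative guesses.
    """
    xvfb = subprocess.Popen(["Xvfb", display, "-screen", "0", "1024x768x24"])
    try:
        # No 'server ready' signal is used here, so just wait a moment.
        time.sleep(startup_wait)
        env = build_env(os.environ, display, extra_lib_dir)
        return subprocess.call(command, env=env)
    finally:
        # Always clean up the X server, even if the command blew up.
        xvfb.terminate()
        xvfb.wait()
```

You would call it as something like `run_under_xvfb(["python", "thumbnail.py", url])`, where `thumbnail.py` is a stand-in name. Note that it still sleeps blindly while the server starts, which is exactly the sort of fragility I’m complaining about.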

At the time I wrote this script, I wanted an automated solution. I looked at various 3rd-party thumbnailing services but most of them just thumbnailed the root path of the domain you asked for, not the page itself. Most services also want you to deep-link their thumbnails rather than pulling them to a local server and serving from there, and they charge/quota by number of thumbnails served, rather than generated. I don’t know why. And I wanted thumbnails taken at the time that I bookmarked the page, not at the time you looked at the page. Picky, I know.

Were I to do it again, I think I’d seriously consider 3 alternatives:

Doing the thumbnailing on a Mac

Paul Hammond has a webkit2png script that does the same as my script, but using Webkit on a Mac. It’s almost as annoying, because you still need a windowing server, but the overheads are smaller - there are sensible callbacks so you can thumbnail faster, and it’s a more reliable environment - the bindings work. Of course, there are downsides - you need a Mac, for a start. And if you want an automated solution you’ll need a Mac connected to the internet and turned on 24 hours a day. But I have one of those under my telly now, so it’s tempting. Not sure if I’ll be able to solve the SSL certificate problem for this one, but it’s not a deal breaker.

WebThumb

Simon Willison found a thumbnailing service that doesn’t suck for Oxford Geeks. It’ll actually thumbnail the page itself (or at least, it did last time I poked at it) so if you don’t want the overhead of running your own server, this might work.

A simpler version

I now have a pure-C version of the thumbnailer script. I don’t use this version, but only because I wrote it as a thought experiment some time after I got everything working, and I don’t want to mess with something that works. I see no reason why the C version won’t do just as well, and it avoids most of the bindings pain. It’ll still need the wrapper, but dropping the python side of things might help.

Recently I got dropped in the deep end and had to learn both Ruby and Rails very quickly. I didn’t think this would be a problem - everyone raves about how easy Rails is to pick up, right? - and it wasn’t. The problem is actually arriving now, as I start trying to use Ruby for things other than Rails applications. And I can’t, because I’ve learned all sorts of nice Ruby tricks that looked like they were core language features but actually turn out to be added to the built-in Ruby objects by Rails.

For instance, I really like the 3.days convention for turning numbers into time intervals. That’s added by this extension. In fact, in digging for this, I found out just how many things Rails adds to core Ruby. I’m scared.

I’m torn. I’d like to consider messing with the built-in objects confusing and dangerous. And I’ve been bitten by this before. I’ve also had problems where one module’s patching of a Ruby builtin interferes with another module’s patching of the same object. Lovely.

At the same time, though, I love it. I love both the huge convenience and readability of being able to write Time.now + 3.days, and the fact that the language lets me do this. All languages should be this consistent - none of this ‘some types are special’ crap.

There are trade-offs. I love Python, but I hate that map is a global function and not a method on arrays, and I hate that certain types are special and immutable. But I’m sure there are scary speed benefits from doing things this way.
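To make the Python side of that concrete, here is a small sketch. The `Days` class and the `try_patch_builtin` helper are names I’ve made up for illustration; the point is only that `map` lives at the top level, that CPython refuses to let you bolt new attributes onto `int`, and that the nearest you get to `3.days` is a subclass you have to opt into.

```python
from datetime import datetime, timedelta


class Days(int):
    """Hypothetical int subclass that knows how to turn itself into an
    interval, standing in for what Rails adds to Ruby's numbers."""

    @property
    def days(self):
        return timedelta(days=int(self))


def try_patch_builtin():
    """Try to add a 'days' attribute to the built-in int type and return
    the exception raised, or None if the patch was accepted."""
    try:
        int.days = property(lambda self: timedelta(days=self))
    except TypeError as exc:
        return exc
    return None


# map is a global function that takes the list, not a method on the list.
doubled = list(map(lambda n: n * 2, [1, 2, 3]))

print(doubled)
print(try_patch_builtin())
print(datetime.now() + Days(3).days)
```

So `Days(3).days` works, but a bare `3` never will, which is the ‘some types are special’ thing I was grumbling about.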

I wonder if part of the reason that Ruby has this ‘just for Rails’ reputation is because, having learned Ruby for Rails, you can’t use that Ruby for anything else without unlearning a stack of habits?

Thanks to the sterling work of the Times Online I now know that 54% of all computer users have ‘illegally logged on to someone else’s wi-fi connection’. Wow. That’s.. a lot of people. Further down the article I find that this statistic comes from Sophos, and some googling (it’s really hard to google for ‘54%’ and ‘wireless’) gets the Exciting Press Release proper.

Where apparently, in an on-line survey (paid for by the Times Online), they asked the single question ‘Have you ever used someone else’s Wi-Fi connection without their permission?’. And 54% of people said ‘Yes’. There’s an exciting quote from.. a Sophos employee. And there’s a nice

Disclaimer: Please bear in mind that this poll is not scientific and is provided for information purposes only. Sophos makes no guarantees about the accuracy of the results other than that they reflect the choices of the users who participated.

Gosh, that was worth making news about, wasn’t it?



Paul Mison pointed me at this very odd twitter. It’s from a phone, so it seems unlikely that it was supposed to look like this. It contains ‘@s@t@r@a@ @e’ - that was supposed to be ‘Straße’? So this is probably a Windows Mobile phone falling back to UTF-16 / UCS-2 to send non-ASCII. Every other phone I’ve ever seen falls back to UTF-8 to do this.
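You can reproduce the garbling with a few lines of Python. This is a sketch of my guess at what happened, not a confirmed diagnosis: it assumes the text went out as big-endian UTF-16 and that something downstream displayed it one byte at a time, showing each zero byte as ‘@’ and anything outside printable ASCII as a space. The function name is made up.

```python
def render_utf16_bytewise(text):
    """Encode text as big-endian UTF-16, then display it byte by byte.

    Assumption: zero bytes show up as '@' and bytes outside printable
    ASCII show up as a space.
    """
    out = []
    for byte in text.encode("utf-16-be"):
        if byte == 0:
            out.append("@")
        elif 0x20 <= byte < 0x7F:
            out.append(chr(byte))
        else:
            out.append(" ")
    return "".join(out)


print(render_utf16_bytewise("straße"))  # prints @s@t@r@a@ @e
```

Every character below U+0100 gets a zero high byte in UTF-16, which is where all the ‘@’s come from, and the ß (byte 0xDF) is the lone space.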

Don’t you just love windows?

Updated: this one worked..

Language choice

Why do people writing server-side code, where limited CPU and memory resources must be shared between hundreds of users, use ‘high-level’ scripting languages, whereas those writing client-side code, running on a machine where CPU and memory are much cheaper, use C and other lower-level languages?