Thursday, August 07, 2003

I'm sure we've all heard the comparisons of political candidates, especially popular in Presidential elections, of the form:

[Candidate X] is the new [Current or past office holder of note]!

Along these lines, Howard Dean is apparently the new... well, it's multiple choice.

A growing pastime, born of GPS technology, is striking out to see just what there is at places where latitude and longitude meet, and a whole website, Confluence.org, documents this fascinating subset of world travel.

Tuesday, August 05, 2003

Ever since I started publicly listing my email address on the web[1] I have been steadily receiving more and more spam. Nowadays I get between forty and fifty a day on average. If I signed up to each and every one of these offers, I tell you boys and girls, my penis would have been lengthened to twice the circumference of the Earth by now! (I just thought you'd like to know that.)

Initially I screened spam before downloading it from the server by looking at the subject lines or sender addresses, but eventually the task became too laborious, so I switched to a spam filter, SpamAssassin (set up as an email proxy, SAProxy, which ran locally on my Windows box). The results were pretty good, but after a while I noticed that quite a lot of emails were managing to circumvent it.

Recently I switched from using IE/Outlook Express to Mozilla/Mail. The Mozilla project admittedly took a bit of a while to get ready for Prime Time, but I'm pleased to report that it's there now and that I'm no longer dependent on a Microsoft product for doing such an important job[2].

It took me a little time after I started using it to realise that the program actually has quite a nice little spam filter already built into it. The filter is a Bayesian one and it employs the statistical algorithm described in an August 2002 article by Paul Graham: "A Plan for Spam".

Unlike conventional spam filters, which recognize certain tell-tale features of spam (such as keywords: sex, teens, free, credit, etc.), Bayesian filters start with no preconceptions of what spam should look like at all. When a spam email arrives you simply mark it with the "junk" flag (and optionally have it moved automatically to a "junk" folder). After a while - and it takes at least fifty messages before it really starts to get a handle on things - the filter will start marking certain messages as junk all by itself.
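For the curious, here is a rough Python sketch of the idea, loosely following the approach in "A Plan for Spam". It is a toy under stated assumptions and not Mozilla's actual code: the class name `TinyBayesFilter`, the tokenizer, the 0.4 score for unseen words, the 0.01/0.99 clamps and the top-15 cutoff are my own illustrative choices, and the per-token formula is a simplification that leaves out the article's extra weighting of legitimate mail.

```python
import re
from collections import Counter


class TinyBayesFilter:
    """Toy Bayesian junk filter (hypothetical; for illustration only)."""

    def __init__(self):
        self.spam_counts = Counter()  # token -> occurrences in junk mail
        self.ham_counts = Counter()   # token -> occurrences in good mail
        self.spam_msgs = 0
        self.ham_msgs = 0

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase runs of letters, digits, $, ' and -.
        return re.findall(r"[a-z0-9$'-]+", text.lower())

    def train(self, text, is_spam):
        """What happens when you mark (or unmark) a message as junk."""
        tokens = self.tokenize(text)
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_msgs += 1
        else:
            self.ham_counts.update(tokens)
            self.ham_msgs += 1

    def token_probability(self, token):
        """Estimate how strongly a single token suggests junk."""
        b = self.spam_counts[token]
        g = self.ham_counts[token]
        if b + g == 0:
            return 0.4  # assumed neutral-ish score for never-seen words
        spam_rate = min(1.0, b / max(1, self.spam_msgs))
        ham_rate = min(1.0, g / max(1, self.ham_msgs))
        p = spam_rate / (spam_rate + ham_rate)
        return max(0.01, min(0.99, p))  # never be completely certain

    def spam_probability(self, text, top_n=15):
        """Combine the most telling tokens into one overall score."""
        probs = [self.token_probability(t) for t in set(self.tokenize(text))]
        # Keep the tokens whose scores sit furthest from the neutral 0.5.
        probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
        spam_product = 1.0
        ham_product = 1.0
        for p in probs[:top_n]:
            spam_product *= p
            ham_product *= 1.0 - p
        return spam_product / (spam_product + ham_product)
```

To try it, call `train(text, True)` for each message you flag as junk and `train(text, False)` for each good one, then treat a message as junk when `spam_probability(text)` comes out above some threshold such as 0.9 (again, an assumed cutoff). With only a handful of training messages the scores swing wildly, which fits with the filter needing fifty or so messages before it settles down.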

In this early stage you may need to unmark some of these emails; in my case it erroneously marked a few HTML newsletters, one or two mail server notification messages, etc. But after quite a few weeks of using this filter, I've found that it does an excellent job. Now only a few spams a day manage to slip through to my inbox, and while I still check the junk folder for false positives, I haven't found any more legitimate emails wrongly put in there.

So how about you? What's your daily dose of spam like and what strategies have you been using to deal with it?

1 - the Collaboratory is coming up for its first birthday in two weeks btw
2 - I'm also using OpenOffice, which is quite a respectable and free replacement for Microsoft Office

Monday, August 04, 2003

The University of North Carolina - Chapel Hill is raising quite the debate by asking incoming students to read "Nickel and Dimed: On (Not) Getting by in America". I know that at least Jaq and Jason have read this book (as have I). Is this fair game as required reading or is it too biased towards an agenda? Can politically charged books make for good reading for students or should they seek these out on their own time? Would you still feel the same if it were a Bill O'Reilly book?

Sunday, August 03, 2003

John Hardy fixed the sidebar. Three cheers for John!