Interesting insights from Cascade Insights, a company specializing in competitive intelligence. The goal of the study “Web Mail Provider: SPAM Filtering Effectiveness Research” [full-PDF] was to quantify and compare the spam filtering capabilities of Hotmail, Gmail, and Yahoo! Mail.
- Creation: Test-addresses were identically created – same local-part, same day of creation et cetera;
- Seeding: they were also seeded exactly the same way on the web for both spam and permission-based newsletters;
- Operational definition of spam:
  - (1) unsolicited bulk emails (regardless of message quality) sent by parties that were not directly given the email address, plus
  - (2) any further emails from senders after unsubscribing from their communication.
- Field time: over 5 weeks, all emails delivered to the inbox were reviewed and sorted into spam and ham folders.
About 49% of all emails delivered to Hotmail’s and Gmail’s inboxes were missed spam; at Yahoo, as much as 58% of all inbox mail was falsely classified as ham. That is not much better than leaving arriving mail completely unfiltered (64% spam in the experiment).
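To put those inbox shares in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes (this is my assumption, not something the study states) that 64% of all arriving mail was spam for every provider and that no legitimate mail was removed from the inbox. Under those assumptions, the inbox spam share translates into the fraction of spam a filter must have caught. The function name `implied_catch_rate` is made up for illustration.

```python
def implied_catch_rate(inbox_spam_share, incoming_spam_share):
    """Fraction of spam removed by the filter, ASSUMING no ham was removed."""
    ham = 1.0 - incoming_spam_share
    # Spam remaining in the inbox (per unit of incoming mail) that would
    # produce the observed inbox spam share next to the untouched ham.
    spam_left = inbox_spam_share * ham / (1.0 - inbox_spam_share)
    return 1.0 - spam_left / incoming_spam_share

# Inbox spam shares quoted above; 0.64 is the unfiltered share from the experiment.
for name, share in [("Hotmail/Gmail", 0.49), ("Yahoo", 0.58)]:
    print(name, round(implied_catch_rate(share, 0.64), 2))
# Hotmail/Gmail 0.46
# Yahoo 0.22
```

If legitimate mail was also junked, the real catch rates would differ, so treat these figures as an illustration of the arithmetic only.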
At first sight it seems HoTMaiL does a good job. One might say “no wonder” – it was the first freemailer and therefore must have a lead over the others. But of course that would be nonsense. Today’s filters use complex statistical models to classify emails, not just hard content-based rules, and therefore play by completely different rules. It is more likely that Microsoft’s SmartScreen filter is simply working really well. That would not be surprising if you consider the many things running in the background, such as the “Junk email protection program” (= panel data) and especially the enormous global reach (Hotmail is still No. 1, plus Exchange and Outlook data et cetera) that feeds those statistical antispam models.
Two things that made me frown:
- Yahoo! Mail is classed as an “also ran”, but earlier findings from the Fraunhofer Institute showed Yahoo! Mail outperforming every other mail provider in filtering spam (PDF, 2010-03-25, German).
- Another thing to keep in mind: if you want to compare the effectiveness of different filters, false negatives (spam that reaches the inbox) are only one side of the coin. It would have been interesting to also see the false positive rates, i.e. legitimate emails that were misclassified as junk. Of course false negatives produce potentially higher costs and are therefore more important. Nevertheless, without a full confusion matrix the comparison seems somewhat incomplete, especially when I think of Gmail, which aggressively junks several newsletters I subscribed to.
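
To make the confusion-matrix point concrete, here is a minimal Python sketch that treats spam as the positive class. The class name `ConfusionMatrix` and all counts are invented for illustration; none of the numbers come from the study.

```python
from dataclasses import dataclass

@dataclass
class ConfusionMatrix:
    true_pos: int   # spam correctly moved to the junk folder
    false_neg: int  # spam that reached the inbox (missed spam)
    false_pos: int  # legitimate mail wrongly junked
    true_neg: int   # legitimate mail delivered to the inbox

    def false_negative_rate(self):
        # Share of all spam that slipped through the filter.
        return self.false_neg / (self.false_neg + self.true_pos)

    def false_positive_rate(self):
        # Share of all legitimate mail that was junked.
        return self.false_pos / (self.false_pos + self.true_neg)

# Made-up counts, purely hypothetical.
example = ConfusionMatrix(true_pos=60, false_neg=40, false_pos=5, true_neg=95)
print(example.false_negative_rate())  # 0.4
print(example.false_positive_rate())  # 0.05
```

Reporting only the inbox view covers `false_neg` and `true_neg`; the other two cells require looking into the junk folder as well.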
Terry Zink talks about another problem – the tiny sample size that reduces the informational value of the study.