The Spam Diaries

News and musings about the fight against spam.
 by Edward Falk

Sunday, April 09, 2006

97% of email is spam?

From a private discussion among system administrators and spam fighters: anti-spam filters are rejecting on the order of 97% of incoming email as spam. Other admins report similar results.

Sample results:

Check               Count      %
Failed_Rcptto     6221280  79.23
spamhaus           722754   9.2
njabl              337640   4.3
smtp-delay         177016   2.25
rDNS                72675   0.93
DNS_MF              50419   0.64
bhnc.njabl           9831   0.13
Bad_Helo             9682   0.12
Invalid_Relay        2667   0.03
Virus_Infected        924   0.01

Total rejected    7851995
Accepted           247107
Reject %            96.95

My correspondent tells me that a great deal of the Failed_Rcptto count (undeliverable addresses) is caused by a relative handful of poorly behaved systems that keep retrying delivery. The worst of them appears to be trying about 6 times per second to deliver a message to one non-existent user.

Ignoring the Failed_Rcpttos, they'd still be rejecting 84% of mail as spam.
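For anyone who wants to redo the arithmetic on their own logs, here is a minimal Python sketch using the counts from the sample results. The variable and function names are mine, not my correspondent's, and the assumption that "reject %" means rejected / (rejected + accepted) is a guess at how the figure was computed. The sketch works the numbers two ways, since the listed categories do not add up to the reported total.

```python
# Counts copied from the sample results; names here are illustrative only.
rejected_by_check = {
    "Failed_Rcptto": 6221280,
    "spamhaus": 722754,
    "njabl": 337640,
    "smtp-delay": 177016,
    "rDNS": 72675,
    "DNS_MF": 50419,
    "bhnc.njabl": 9831,
    "Bad_Helo": 9682,
    "Invalid_Relay": 2667,
    "Virus_Infected": 924,
}
reported_total_rejected = 7851995
accepted = 247107


def reject_pct(rejected: int, accepted: int) -> float:
    """Percentage of delivery attempts rejected (assumed definition)."""
    return 100.0 * rejected / (rejected + accepted)


# Sum of the ten categories that are actually listed.
listed_rejected = sum(rejected_by_check.values())

# Reading 1: take the reported "Total rejected" at face value.
pct_reported = reject_pct(reported_total_rejected, accepted)

# Reading 2: count only the listed categories as rejections.
pct_listed = reject_pct(listed_rejected, accepted)

# Drop the undeliverable-address rejections from the listed categories.
without_rcptto = listed_rejected - rejected_by_check["Failed_Rcptto"]
pct_without_rcptto = reject_pct(without_rcptto, accepted)

print(f"listed categories sum to {listed_rejected}")
print(f"reject % (reported total):      {pct_reported:.2f}")
print(f"reject % (listed categories):   {pct_listed:.2f}")
print(f"reject % without Failed_Rcptto: {pct_without_rcptto:.2f}")
```

The listed categories sum to 7,604,888, which is exactly the reported total minus the accepted count, so the "Total rejected" line may really be total delivery attempts. Under that reading the reject rate comes out slightly lower, near 96.85%, and the rate without Failed_Rcptto lands a little under 85%, in line with the 84% above. I can't tell from the numbers alone which reading was intended.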

What results are you getting?

1 Comment:

Anonymous said...

Roughly two-thirds is spam on my two very different setups (two privately run servers for a small number of private and non-profit domains; one corporate setup for 10,000 users worldwide).

But, such "raw" numbers are next to meaningless. For example, there are a number of newsletters that *some* of my users regard as spam, while some others *want* them. So, there should be a differentiation between outright spam, a certain gray area, and clearly "white".

Next, what are we actually counting? The number of SMTP Transactions? The number of recipients? And if you are using blocklists to reject before the SMTP transaction, how do you find out the number of recipients?

How do you differentiate between viruses, virus "leftovers" (e.g. bounces), and spam?

That's why, IMO, sentences like "x% of all email is Y" are not only meaningless, but are actually hiding important facts that are necessary for the effective and efficient use of filtering tools.

Yes, journalists and CIOs may have a harder time understanding that there is more to it than a simple single number.

Tough luck.

-- Matthias

11:52 PM  
