GFI/SORBS considered harmful, part 2
Act 1 • Act 2 • Intermezzo • Act 3 • Act 4 • Act 5
Management Summary, Redistributable Documents and Links
Yesterday I talked about GFI responsiveness to queries and delisting requests about SORBS listings. Today I’m going to look at data accuracy.
The two issues are tightly intertwined – a blacklist that isn’t responsive to reports of false positive listings will end up with a lot of stale or inaccurate data, and a blacklist that has many false positives will likely be overwhelmed with complaints and delisting requests, and won’t be able to respond to them – leading to a spiral of dissatisfaction and inaccurate data feeding off each other.
Because it is so difficult to remove an IP address from the list, SORBS as a blacklist produces many false positives.
XS4ALL also consider SORBS harmful
GFI/SORBS maintains nine different IP address based blacklists, but they’re usually bundled together and treated as a single “Don’t accept email from this address” blacklist. Each of the nine lists have somewhat different listing policies, though there’s been some scope creep and blurring of the lines over the years.
I’m going to focus on just one of them, dul.dnsbl.sorbs.net. This is intended to list “Dynamic IP Address ranges” – consumer internet connections, such as DSL lines, cable modems and dialup modem pools where the IP address assigned to a user will change over time. These systems don’t typically send legitimate email directly to recipients (rather they send mail via their ISPs smarthost) and often contain a lot of consumer windows machines, which tend to get infected and send viruses and spam, so declining to accept mail from this sort of address pool is a fairly sensible decision. (Spamhaus maintain a list with similar goals, the PBL, as do Trend Micro).
GFI/SORBS have had a number of database accidents that have repeatedly caused false listings in a number of their lists, but because the DUL zone tends to list large ranges of IP addresses, data handling mistakes there tend to cause more visible problems.
the SORBS DUHL list has become badly broken, flagging thousands maybe millions of static IP’s as dynamic. This setting will flag as spam email from numerous legitimate sources incorrectly. Numerous attempts by mail admins the world over have failed to get Sorbs to fix the mess yet.SmarterMail Support Forum
ISPs tell me that GFI/SORBS also refuse to accept notifications about false positive listings in their DUL zone. Or they do update their database, but then reload the bad data a few weeks later. And if the ISP asks GFI for a status update about a false listing, their policy is to move that request to “the bottom of the pile”, ensuring that the inaccurate data that’s causing noticeable problems continues to be published.
If an ISP has reported to SORBS that a CIDR is no longer dynamic, and the (repeat) notifications have been ignored for 6-12 months… at what point does it go from lack of responsiveness, to data quality, to negligence, to willful malice?frustrated anonymous system administrator
The dul list was originally seeded based on data acquired from the dynablock list in 2003. I’m told that stale data, possibly dating back to 2003, is repeatedly being loaded into the GFI/SORBS DUL list, leading to a huge number of false positives. I can’t tell whether that is the case, but there’s certainly a lot of bad data leading to false listings.
Your problem in researching bad data in SORBS is not going to be finding examples of false listings, it’s going to be whittling that forest down to a manageable stack of wood.Comment from IRC
Very true. Lets choose a particular example: “n2.bullet.mail.sp2.yahoo.com” aka 67.195.134.51. This is one of the mailservers for Yahoo Groups, and sends a lot of mailing list mail. There’s nothing at all to suggest that it’s an end-user, dynamically assigned address machine. Just the opposite, it’s listed at dnswl.org as a Yahoo server that shouldn’t be blacklisted. It’s listed by ARIN as part of a /16 (65536 addresses) assigned to Yahoo. There’s nothing in the hostname to suggest it’s dynamically assigned, it even has the word “mail” in the hostname, a common sign of a legitimate mailserver. McAfee TrustedSource list it as a clean mailserver with a history of sending significant volumes of email, as do SenderBase.
And yet it’s listed in the dnsbl.sorbs.net zone, with a return value of 127.0.0.10 meaning GFI/SORBS are claiming it’s a dynamic IP address.
Looking up that IP address on the SORBS website was a fairly painful exercise (I’ll go into more detail about that, and other SORBS operational problems on Monday) but this is what I found:That IP address is categorized, wrongly, by GFI/SORBS as a dynamic address.
Why did I choose this particular server as an example, rather than one of the countless other false positives I could have picked? Well, it’s not just a single IP address that’s listed as a false psitive: GFI/SORBS are listing all of 67.194.0.0/15 – that’s 131,072 Yahoo servers that are categorized wrongly. And i know that GFI staff were explicitly notified about that particular listing early yesterday morning. Yet GFI are still publishing that data (as well as at least dozens of other false positive listings of similar size).
An outage usually means someone works quickly to resolve it. Having it still be an issue after 5 days is gross negligence. #sorbsTwitter
A blacklist should have checks in place that make it unlikely that badly wrong data is published, though even the best blacklists will very occasionally have a problem and publish bad data. How they respond to false positives is really important. If a blacklist is notified of false positive listings of this magnitude the safe thing to do is to pull all dubious listings from the published blacklist data (or if it can’t be narrowed down, pull all listings) until the problem is resolved.
That will eliminate the loss of legitimate email to the blacklists customers (and most of the spam the blacklist might have stopped will likely be blocked or filtered by other parts of the spam filters they use). GFI/SORBS have not done this, rather they’re following the same practice they’ve used during previous database catastrophes – continuing to publish known bad data.
I do not doubt that there are mail admins rationally fearing for their jobs this week. I am lucky enough to no longer be in the sort of pathological enterprise where a burst of excess false positives is a risk to an admin’s employment, but I am sure that not everyone who was using the SORBS DUHL until this week is so fortunate.Senior Security Consultant
More on Monday.