Do system administrators have too much power?

Yesterday, Laura brought a thread from last week to my attention, and the old-school ISP admin and mail geek in me felt the need to jump up and say something in response to Paul’s comment. My text here is all my own, and is based on my personal experience as well as that of my friends. That said, I’m not speaking on their behalf, either. 🙂
I found Paul’s use of the word ‘SysAdmin’ to be a mighty wide (and — in my experience — probably incorrect) brush to be painting with, particularly when referring to operations at ISPs with any significant number of mailboxes. My fundamental opposition to use of the term comes down to this: It’s no longer 1998.
The sort of rogue (or perhaps ‘maverick’) behavior to which you refer absolutely used to be a thing, back when a clean 56k dial-up connection was the stuff of dreams and any ISP that had gone through the trouble to figure out how to get past the 64k user limit in the UNIX password file was considered both large and technically competent. Outside of a few edge cases, I don’t know many system administrators these days who are able to (whether by policy or by access controls) — much less want to — make such unilateral deliverability decisions.
While specialization may be for insects, it’s also inevitable whenever a system grows past a certain point. When I started in the field, there were entire ISPs that were one-man shows (at least on the technical side). This simply doesn’t scale. Eventually, you start breaking things up into departments, then into services, then teams assigned to services, then parts of services assigned to teams, and back up the other side of the mountain, until you end up with a whole department whose job it is to run one component of one service.
For instance, let’s take inbound (just inbound) email. It’s not uncommon for a large ISP to have several technical teams responsible for the processing of mail being sent to their users:

  • the aforementioned system administrators (who are responsible for running the operating system and base-level support applications — but not the hardware; that’s handled by the hardware team),
  • the application administrators (who manage the process(es) that handle the actual SMTP transactions),
  • and the database administrators (mail at this volume is most likely not handled strictly with flat files). (Oh, and the database hosts have separate hardware administrators and system administrators, too.)

You’ll note that none of these groups have taken responsibility for saying what actually gets to be delivered. In most cases, at most large ISPs, it’s simply not their job. Their jobs are simple: Keep their piece of the machine up and performing within acceptable technical parameters.
The decision to block (or unblock) messages (whether by IP, domain, content, or any other criterion) is made by an entirely different team, with a different set of marching orders. At many ISPs, these people don’t even report into the same organization as the guys listed above. These folks go by many names, but I’ll use the term ‘Postmaster’ here. Postmaster teams have a weird job that’s both science and art.
The science part is pretty easy, as it’s supported by data provided by users. For instance: how many complaints are we getting about a sender, and what percentage of the messages we receive from that sender result in complaints? (It’s important to remember that those are two very different things.) Since this part of the job is so data-driven, it is often at least partially automated, or at least streamlined via tools. A dashboard displaying messages from suspected senders or those containing suspected content, with links to or copies of the supporting data, would allow the Postmaster team to proactively review inbound messages that may not have yet triggered automated systems, but that are likely to in the near future. (Oh, and this dashboard was probably written by a tools team (still not the system administrators!) who have access to the hosts where the logs are kept (… need I even say it?))
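To make the "two very different things" point concrete, here is a minimal sketch in Python of how a tool might track both numbers per sender. Everything in it is an assumption for illustration: the `SenderStats` class, the `flag_for_review` function, the sender names, and especially the thresholds are hypothetical, and a real Postmaster team would set its own.

```python
from dataclasses import dataclass


@dataclass
class SenderStats:
    """Per-sender counters over some reporting window (hypothetical structure)."""
    sender: str
    delivered: int    # messages accepted from this sender
    complaints: int   # "this is spam" reports from users

    @property
    def complaint_rate(self) -> float:
        # Fraction of delivered messages that drew a complaint.
        return self.complaints / self.delivered if self.delivered else 0.0


def flag_for_review(stats, max_complaints=500, max_rate=0.003):
    """Return senders that exceed EITHER threshold.

    Both thresholds are made-up illustration values, not industry figures.
    """
    return [
        s.sender
        for s in stats
        if s.complaints >= max_complaints or s.complaint_rate >= max_rate
    ]


senders = [
    # Many complaints in absolute terms, but a tiny rate.
    SenderStats("bulk.example", delivered=2_000_000, complaints=1_000),
    # Few complaints in absolute terms, but a very high rate.
    SenderStats("tiny.example", delivered=400, complaints=20),
    # Low on both measures.
    SenderStats("quiet.example", delivered=50_000, complaints=5),
]

flagged = flag_for_review(senders)
print(flagged)
```

The first two senders get flagged for opposite reasons, which is the whole point: a raw count and a rate answer different questions. What to actually do with a flagged sender is the part a human still has to decide.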
The ‘art’ part is where things get tricky, and is one reason actual system administrators shy away (or run screaming) from this type of work. There are potential legal ramifications to this sort of thing, particularly when blocking on message content (as opposed to message source). What percentage of messages that generate complaints is acceptable? What if nobody (for whatever reason) ever complains, but the message contains illegal content? What happens if a small, but vocal, number of people complain? Do we have a partnership (it happens) with the sender that in any way changes our response to complaints about their messages? Do we attempt to work with the sender prior to imposing a block, or simply block the incoming stream without warning? What is the best way to block this sender? Should we attempt to block similar messages? If so, how do we identify these messages?
These are all squishy, unquantifiable things, many of which can be (more or less) legally defined, but not necessarily implemented via a particularly elegant piece of code. Human intervention is required. Actual people actually looking at messages and complaints, taking them in context, comparing that context against the minimal guidelines provided by the various technical (host, system, database) admins, as well as the more nebulous legal and policy guidelines provided by any number of groups: sales, marketing, legal, etc. Calls (like, actual phone calls, not decisions) need to be made.
I have a number of years as a system administrator under my belt, most of them spent at large ISPs, and I speak from experience when I say I know of very, very few people who sign up to be a system administrator who want anything to do with this kind of work.
(All of that was my long way of saying that any ISP with a significantly large number of users is long past the point where such unilateral decisions can be made. Or, if they can be made, systems are assuredly in place to ensure that they cannot be made without being noticed. Any single player who decides to block all mail from ‘Opposition Candidate’ had best have technical- and policy-based reasoning at the ready, because actions will need to be justified, and in short order.)
