Network glitches and corrupted VMs

I had a bit of an interesting Friday. I was so glad it was finally the weekend. Saturday we did a bunch of errands, including a visit to our servers. See, we’ve been upgrading infrastructure to implement a second type of backup system. Saturday we were doing the last set of upgrades so we could install over the weekend.
Yes, we do all our own networking and racking.
Saturday evening Steve is installing the new backup software. This is awesome backup software. It backs up the entire virtual machine. If we lose a virtual machine, we can just reload the entire thing and it will be back again.
Except while installing the software, there is a weird network glitch. Said network glitch causes the system to crash. The system crashes hard. The system crash corrupts some of the data on disk. The data on disk is our virtual machine files. The filesystems are in read-only mode and won’t fsck automatically.
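(For the technically curious: on Linux, a filesystem that hits corruption is often remounted read-only by the kernel, and recovery starts with finding which volumes went read-only and then running fsck against the unmounted devices by hand. The little Python sketch below is only an illustration of that first step, not our actual recovery procedure; whatever devices it prints will be specific to your own system.)

```python
# Illustration only, not our actual recovery procedure.
# Scans /proc/mounts for filesystems mounted read-only, which is a quick
# way to spot volumes the kernel has dropped to "ro" after hitting errors.
# Any of these holding VM images would then need a manual, offline fsck
# (e.g. e2fsck -f run against the unmounted device).

def read_only_mounts(mounts_file="/proc/mounts"):
    """Return (device, mountpoint, fstype) for every read-only mount."""
    found = []
    with open(mounts_file) as fh:
        for line in fh:
            device, mountpoint, fstype, options = line.split()[:4]
            if "ro" in options.split(","):
                found.append((device, mountpoint, fstype))
    return found

if __name__ == "__main__":
    for device, mountpoint, fstype in read_only_mounts():
        print(f"{device} ({fstype}) is mounted read-only at {mountpoint}")
```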
We lose most of our production virtual machines. We’re off the air.
Possibly this was tragic, not ironic. I dunno, it’s been a long weekend.
We lost a bunch of production virtual machines to the disk corruption. We haven’t lost any data, but it’s taking some time to rebuild the machines, pull data from the other backup system, and get everything restored.
That means some of our websites and services, like tools.wordtothewise.com are down. It may mean you saw some bounces if you sent us mail over the weekend. Mail is back and we are communicating with the outside world again.
Steve’s working through our other services as fast as possible to get them back up and running.
(If massive server issues weren’t enough, one of the cats got a UTI so we’re having to pill her twice a day. Then last night she managed to puke so hard she passed out briefly. Poor thing. She’s doing better this morning.)

Related Posts

Unexpected break

Sorry for the unexpected break in blogging. Been dealing with some emergencies. Happy 4th to my fellow citizens. Happy belated Canada Day to all our northern friends. We’ll resume blogging next week.


And we're back

Happy New Year!
I am back and ready to talk email with folks.
December is always a busy time, both because of the holidays and all the associated personal stuff, and because of delivery consulting. There are senders who suddenly discover their email going to the bulk folder and need help. But now it’s January and email marketing gets a brief break.
The beginning of the new year and the lull after the Christmas season marketing storm is a good time for folks to think about marketing and email goals for the upcoming year. Many senders get so wrapped up in the day-to-day details of email that they fail to think strategically about email and their business.
It works much the same way for me. I hate it when my clients have bad delivery and do everything I can to fix their problems. If their mail isn’t getting to the inbox, then it’s as much my problem as theirs. I’m thinking and working to get to the root of their problem and come up with solutions to get their mail delivered. This sometimes means my own strategic planning gets pushed aside while I focus on client needs. January is a fun time of year for me, because it’s all a little more relaxed and I can look at the new year and figure out how to improve services and share more of my knowledge with folks.
You’ll start to see some of those improvements in the upcoming months. I’ll also be blogging regularly. We should be getting some research and white papers out over the next few months. I’ll be catching up on the Google privacy cases and updating on some other email-related lawsuits.
2014 is looking like a year of growth and excitement.


Your system; your rules

In the late 90s I was reasonably active in the anti-spam community and in trying to protect mailboxes. There were a couple of catchphrases that developed as a bit of shorthand for discussions. One of them was “my server, my rules.” The underlying idea was that someone owned each of the different systems on the internet, and as owners of those systems they had the right to make usage rules for them. Those rules can be about what system users can do (AUPs and terms of service) or about what other people can do (web surfers or email senders).
I think “my network, my rules” is still a decent guiding principle. I do believe that network owners can choose what traffic and behavior they will allow on their network. But these days it’s a little different than it was when my dialup was actually a PPP shell account and seeing a URL on a television ad was a major surprise.
But ISPs are not what they once were. They are publicly traded, global companies with billion dollar market caps. The internet isn’t just the playground of college students and researchers; just about anyone in the US can get online – even if they don’t own a computer, there is public internet access in many areas. Some of us have access to the internet in our pockets.
They still own the systems. They still make the rules. But the rules have to balance different constituencies, including users and stockholders. Budgets are bigger, but there’s still a limited amount of money to go around. Decisions have to be made. Those decisions translate into what traffic the ISP allows on the network, and they are implemented by employees. Sometimes they screw up. Sometimes they overstep. Sometimes they do the wrong thing. Implementation is hard, and it’s one of the things I really push with my clients: make sure processes do what you think they do.
That’s a long way of dancing around the idea that individual people can make policy decisions we disagree with on their own networks, and third parties have no say in them. But those policy decisions need to be made in accordance with internal policies and processes. People can’t just randomly block things; there should be consequences if they violate policies or block things that shouldn’t be blocked.
Ironically, today one of the major telcos managed to accidentally splash their 8xx number database. 8xx numbers are out all over the country while they search for backups to restore the database. This is business critical for thousands of companies, and is probably costing companies money right and left. Accidents can result in bigger problems than malice.
 
