When big organizations go bad.
Mar. 7th, 2008 06:18 pm
Backstory, which is probably necessary. A while back now - this story happened nearly two years ago - I lost my job practically overnight. Aware of the need to eat and pay bills, I took the first job I could find that would cover said bills, despite its crappiness (helpdesk in a call center). We were first-line support for the organization in question, ticket handlers, and generally treated like the muppets most of my coworkers were. I've also posted this story before over at TSC.
I can deal with run-of-the-mill stupidity. However, this problem sticks in my mind as the worst I've ever dealt with.
I first encountered the issue when an incoming email arrived in my slot in the call-handling software. The customer was angry. Very angry. By this stage, having been there about two months, I was thoroughly aware I worked for an organization full of muppets, so I delved into his past call notes: customer can't send email to a business associate at CompanyB. Various folks have checked the validity of the email address and confirmed that the customer can send email both internally and externally - it's just that mail to CompanyB doesn't get through. (I worked for CompanyA. The parent company of CompanyA also owned CompanyB.)
Getting to this point, between Helldesk troubleshooting and the customer going on leave for a week, took us three weeks.
The call got punted to the mail system admins. Nope, our Exchange servers are working fine, farkoff. The issue gets punted back to Helpdesk for reassignment. (Four weeks.)
Helpdesk punt the issue to the messaging team. (This team deals with communications between the company infrastructure and the rest of the world. I never really worked out what they did.) Said team say it's not their problem and return it to Helpdesk for reassignment. (Five weeks.)
Helpdesk punt the issue to the general server admins. The server admins work out the problem - a basic piece of information, freely available in the original bounce messages from CompanyB. CompanyA's mail server has no reverse DNS (PTR) records. CompanyB thus (quite justly) rejects its email. The customer sees this as "My email is broken when I email CompanyB, please fix it NOW." The server admins don't handle DNS support; a networking team do.
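A missing reverse DNS record is trivial to confirm from any machine. A minimal sketch of the check, in Python - the IP addresses are illustrative, and `has_reverse_dns` is a hypothetical helper name, not anything from the ticket:

```python
import socket

def ptr_name(ip: str) -> str:
    """Build the in-addr.arpa name that a reverse (PTR) lookup queries."""
    octets = ip.split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"

def has_reverse_dns(ip: str) -> bool:
    """True if the IP resolves back to a hostname via a PTR record."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
        return bool(hostname)
    except (socket.herror, socket.gaierror):
        # No PTR record: exactly the condition CompanyB's server rejected on.
        return False

# A PTR lookup for 192.0.2.10 queries this name:
print(ptr_name("192.0.2.10"))  # 10.2.0.192.in-addr.arpa
```

The same check from a shell is just `dig -x <ip>` or `host <ip>` - which is presumably all the server admins did once the ticket finally reached someone who read the bounce.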
We're now at week seven of this issue being known, and this was the point at which I received the customer's umpteenth follow-up mail. Given that I'd just left a job at a web-support place, I had a pretty functional understanding of DNS. I explained the issue as simply as I could to the customer, and received a glowing email in return.
"Hi Mahal. Thanks for your update - it's by far the most helpful response I've ever received from helpdesk. If there's a reward for sending humanized, non-templated, useful and explanatory emails, you should get it."

Apparently, this was the first time anyone had ever actually explained the issue to the caller. He was angry because he thought our admins were too lazy to fix a small problem with his personal email; if someone had explained that it was a larger problem, he would have been far less upset.
I neglected to explain to him that the problem could have been fixed immediately, had competent helpdesk staff, competent email admins, or competent network admins actually read the error message.
I was fuming at this point. The issue was obvious to anyone with the ability to read error messages and spend five minutes on Google. However, due to various bureaucrappic rules, I wasn't allowed to reassign the ticket to the correct team myself. Instead, I had to go to the helpdesk member who had said ticket in their queue and suggest the correct assignment. I tried to pass this on to the agent handling it. His response? "Well, if it doesn't belong with the group it's with now, they'll send it back. Don't worry!"
Of course, a new agent couldn't possibly know anything about technical issues. Never mind that he didn't know dick about what I'd worked in before being dumped in helldesk. I did try to explain what DNS was, and how I knew that was the problem. However, when he asked what an IP address was, I gave up and reassigned the damn ticket myself.
I left the job about six weeks later, for greener pastures. The ticket still wasn't resolved. CompanyA still didn't have reverse DNS for its mail servers. The customer still couldn't mail his customer/contact/coworker at CompanyB.
All because no one with the ability to fix it knew that a DNS record needed to be updated. One five-minute fix, plus maybe 24 hours of replication time at most, waiting on 13 weeks of cranio-rectal inversion.
no subject
Date: 2008-03-07 05:37 am (UTC)
Frequently, including the very people who are supposed to be doing it.
Most mailserver admins don't know jack shit about SMTP, either. Sigh.
no subject
Date: 2008-03-07 05:49 am (UTC)
no subject
Date: 2008-03-07 05:56 am (UTC)
I think the last time I had to update sub-ether's zone file was... 2006. And my reverse DNS is handled by my ISP, but it's for a single IP address, so that's no biggie there.
no subject
Date: 2008-03-07 05:58 am (UTC)
DNS I have something of a clue about - enough to implement it without breaking stuff, and to make sure I don't break other people's stuff.
no subject
Date: 2008-03-07 06:01 am (UTC)
Seriously, that's like 85% of what we do. (5% is "Just reboot the damned thing!")
no subject
Date: 2008-03-07 06:40 am (UTC)
But, like the OP said... 5 minutes max to correct the PTR (and A) records to fix the described problem...
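For context, the five-minute fix this commenter means is a single line in the reverse zone. A hypothetical BIND-style sketch - the zone, TTL, and hostname are all invented for illustration:

```
; Reverse zone 2.0.192.in-addr.arpa (example addresses from 192.0.2.0/24)
$TTL 86400
10   IN   PTR   mail.companya.example.
```

With that one PTR record in place (and the matching forward A record already existing), the receiving server's reverse-lookup check passes and the rejections stop - hence the "maybe 24 hours replication time, max" in the original post.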
no subject
Date: 2008-03-07 12:58 pm (UTC)
Worked for an ISP for 5 years... I still have nightmares about BIND, MX records, and A records.
no subject
Date: 2008-03-07 01:31 pm (UTC)
no subject
Date: 2008-03-07 06:02 am (UTC)
So sorry you had to go through that. I used to wonder why so many people were going postal/Columbine at their offices. After 10 years on a helpdesk, I no longer wonder.
no subject
Date: 2008-03-07 09:00 am (UTC)
no subject
Date: 2008-03-07 10:08 am (UTC)
The main problem is that they take XML-RPC (which I don't particularly like, but it's about the best RPC standard available at the moment) and "extend" it in ways that break all existing client libraries. Their error-reporting mechanism is crap, too: "X of the Y updates you requested succeeded."
Why can people not think before they do stuff like this?
no subject
Date: 2008-03-07 10:12 am (UTC)
no subject
Date: 2008-03-07 01:25 pm (UTC)
no subject
Date: 2008-03-07 12:03 pm (UTC)
However, the messaging team should have figured this out themselves from the bounces, or from a 30-second telnet to the server that wasn't accepting their mail (since you say the bounces were informative, the reject message over telnet would have been informative as well). So your messaging admins are lazy. And incompetent, since you don't set up a mail server without real hostnames for your mail gateway servers, OR proper PTR records that resolve to those hostnames (it frankly amazes me how many set-ups can't get those two simple steps right).
It's up to the mail admins to tell the server/DNS admins (however it's organised) what DNS records are needed - mail, the SMTP RFCs, and SMTP server best practice are their area of expertise, after all. If you don't know what's required for an internet-facing mail server (or any internet-facing service, actually), you shouldn't be administering one - it's not as if it's hard to learn about, either.
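The 30-second telnet check this commenter describes boils down to reading the server's numeric reply codes: a 5xx reply is a permanent rejection. A small sketch of classifying such a reply - the sample reject text is invented, but the 550 code and 5.7.x status class are standard SMTP:

```python
def classify_smtp_reply(line: str):
    """Split an SMTP reply line into (code, permanent_failure, text).

    5xx replies are permanent failures - the class a strict server sends
    when it rejects mail from a host with no reverse DNS.
    """
    code = int(line[:3])
    text = line[4:].strip()
    return code, 500 <= code <= 599, text

# Invented example of the kind of reject CompanyB might have sent:
reply = "550 5.7.1 Connection refused: no PTR record for 192.0.2.10"
code, permanent, text = classify_smtp_reply(reply)
print(code, permanent)  # 550 True
```

Anyone on the messaging team who telnetted to port 25, issued `HELO`/`MAIL FROM`, and read a reply like this would have had the answer in under a minute.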
no subject
Date: 2008-03-07 01:21 pm (UTC)
Oh yeah - ICON LOVE! *gank*
no subject
Date: 2008-03-07 05:01 pm (UTC)
no subject
Date: 2008-03-09 03:55 am (UTC)
I never knew Outlook could circumvent any security measures on servers that it has no control over to deliver your mail no matter what.
no subject
Date: 2008-03-09 04:31 pm (UTC)
no subject
Date: 2008-03-09 05:12 pm (UTC)
(World's most ridiculous setup, honestly.)