[identity profile] grayhawkfh.livejournal.com posting in [community profile] techrecovery
Yeah, it's been a while since I posted anything. But here's some fun I've been dealing with.

The room that our servers are currently in is...large. We had space for our server racks and bench space for our 2nd tier techs and all was fine with the world.

Then some twit decided that the servers for "System G" had to get out of where they were and go somewhere else. I guess they saw our space and said "This would be excellent". And so they told manglement that they needed to move 5 racks and could they move them into our space. Manglement, of course, being of the "clueless wonder" type, said "OK".

The 5 racks grew into 12 racks over the course of 3 weeks.

But wait, this gets better. As far as anyone can tell, the sum total of evaluating the space for cooling and power needs was a quick walk around. No heat load testing, no evaluation of power needs by anyone who even resembled an electrician in a former life.

So we reduce the amount of bench space for 2nd tier by 2/3, and they move their crap in, only to find that there's not enough power for all 12 racks of crap. Even better, when they fired up 5 of them, the cooling could barely keep up. They have 8 (I believe) running now, and it's a battle to keep the room below 90 degrees.

Of course, this is somehow our fault. Never mind that no one bothered to consult us. And, we're making unreasonable demands like the door has to remain locked (NO, you can't tape the door open!)

So now, we're looking to move our equipment out of there so they can deal with this shit. We have another room that will work, but it has to be reconfigured and the cooling has to be verified. But these same assmunching fucknuggets who FUBAR'd their own move are now thinking that "OH! That'll be an EASY move for you guys."

WHAT? You fuckers are living proof of the Douglas Adams quote: "Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so." Did you learn nothing? Oh wait, all the problems you had were OUR fault.

Fuckers.

UPDATE: Just saw an email from the HMFIC: "Please raise the temperature alarm threshold to 91 degrees, as maintenance feels the current alarm is untenable due to the fact that they cannot maintain stability at 84 degrees."

This not 20 minutes after an email went to this twit stating that 81 degrees was a generally accepted upper limit by IT...
(x-posted to TSC)

Date: 2011-06-02 05:31 pm (UTC)
From: [identity profile] sethb.livejournal.com
I trust you asked for a written commitment from a manager with sufficient financial authority to cover any costs due to hardware failure from over-temperature operation, with copious notes referencing the manufacturer's product manual, etc.

Date: 2011-06-04 03:31 pm (UTC)
From: [identity profile] merlin-t-wizard.livejournal.com
^^THIS!

Document, document, document! Someone will be looking for answers when things start falling apart. Make sure it isn't you, or your team. There's only so long that you can save things when the manglement above you won't listen. Sometimes you have to let it all crash in order for upper management to notice the incompetent. It sucks, but sometimes it's your only choice.

Date: 2011-06-05 10:31 am (UTC)
From: [identity profile] mudo.livejournal.com
^^ This again.
Document! And then when stuff fails, you won't be responsible.

Date: 2011-06-03 11:56 am (UTC)
From: [identity profile] japester.livejournal.com
um ....

get out?
Get out now?
and state that working with/for muppets and f*ckkn*ckles is the reason for departure?

I'm surprised that whoever manages your server room allowed those racks in without proper research. That's asking for the reaming that you are now paying for.

Yikes.

Double yikes when I do the Celsius conversion. I'm used to 70F being the utter maximum for server rooms. 90F, yeeeeeeesh.
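For anyone else doing the conversion, here is a minimal sketch that runs the temperatures quoted in this thread through the standard Fahrenheit-to-Celsius formula. The function name is just an illustrative choice.

```python
def f_to_c(deg_f: float) -> float:
    """Convert a temperature from Fahrenheit to Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


# Temperatures mentioned in the post and comments, in Fahrenheit.
for deg_f in (70, 81, 84, 90, 91, 104):
    print(f"{deg_f}F = {f_to_c(deg_f):.1f}C")
```

So the 90F room is a bit over 32C, and the proposed 91F alarm threshold is just under 33C.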

The last time I worked in a server room that got that hot, I told my manager that shorts and Hawaiian shirts would be the standard attire for attending that server room.
It was that hot because we had a scorcher of a summer and nobody's chillers were keeping up. (Multiple weeks of 104F+)


oooohhhh can you get commitment from anybody that the new kit gets powered down first when over temperature alarms happen?

Date: 2011-06-03 01:55 pm (UTC)
From: [identity profile] ghostdandp.livejournal.com
At 90F we start shutting off servers here. We have a couple dozen locations and we've had a few cases where cooling has shut off for whatever reason (bad water tower, cooling hardware failure, etc.) and have had to use this policy before.
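A policy like this can be written down as a rough sketch, combined with the earlier suggestion that the new kit gets powered down first. Only the 90F shutdown point comes from this thread. The rack names, the `shed_order` field, and the "one more rack per degree over" rule are all assumptions made up for illustration, not anyone's actual procedure.

```python
from dataclasses import dataclass

SHUTDOWN_F = 90.0  # shutdown point quoted in the comment above


@dataclass
class Rack:
    name: str        # hypothetical label
    shed_order: int  # assumption: lower number = powered down sooner


def racks_to_shed(temp_f: float, racks: list[Rack], step_f: float = 1.0) -> list[str]:
    """Return the names of racks to power down at a given room temperature.

    Below SHUTDOWN_F nothing is shed. At or above it, one rack is shed,
    plus one more for every full step_f degrees over the threshold,
    taking the lowest shed_order first.
    """
    if temp_f < SHUTDOWN_F:
        return []
    count = 1 + int((temp_f - SHUTDOWN_F) // step_f)
    ordered = sorted(racks, key=lambda r: r.shed_order)
    return [r.name for r in ordered[:count]]


if __name__ == "__main__":
    room = [Rack("ours-1", 10), Rack("system-g-1", 1), Rack("system-g-2", 2)]
    for temp in (89.9, 90.0, 91.5):
        print(temp, racks_to_shed(temp, room))
```

The point of writing the order down in advance is that nobody has to argue about whose racks go dark while the alarm is sounding.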

Date: 2011-06-03 01:53 pm (UTC)
From: [identity profile] ghostdandp.livejournal.com
81 is really pushing it for a limit. We keep ours at 65. Google and a few other companies have come out saying 80 is fine. Here's the problem with 80: the power fails and the AC cuts off. Your UPS is great but probably doesn't do cooling, and if it does it probably won't last for too long. How long can you sustain the data center when it's already 80 degrees in there? How quickly are you going to hit 90 degrees and start having hardware failure?
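To put a rough number on "how quickly", here is a back-of-envelope sketch. It assumes only the room air absorbs the heat, with no credit for the thermal mass of racks, walls, or floor, so it is a pessimistic lower bound and a real room will warm more slowly. The 40 kW load and 300 m^3 room volume are made-up illustrative figures, not numbers from this thread, and the air properties are approximate textbook values.

```python
AIR_DENSITY = 1.2          # kg/m^3, approximate for air at room conditions
AIR_SPECIFIC_HEAT = 1005   # J/(kg*K), approximate for dry air


def seconds_to_reach(start_f: float, limit_f: float,
                     heat_load_w: float, room_volume_m3: float) -> float:
    """Estimate seconds for room air to warm from start_f to limit_f.

    Assumes cooling has stopped entirely and all of heat_load_w goes
    into the air alone (worst case).
    """
    if heat_load_w <= 0:
        raise ValueError("heat_load_w must be positive")
    delta_k = (limit_f - start_f) * 5.0 / 9.0  # a Fahrenheit span in kelvin
    if delta_k <= 0:
        return 0.0
    heat_capacity_j_per_k = AIR_DENSITY * room_volume_m3 * AIR_SPECIFIC_HEAT
    return delta_k * heat_capacity_j_per_k / heat_load_w


if __name__ == "__main__":
    # Hypothetical room: 40 kW of equipment in 300 m^3 of air.
    for start in (65, 80):
        t = seconds_to_reach(start, 90, heat_load_w=40_000, room_volume_m3=300)
        print(f"starting at {start}F: about {t:.0f} s to reach 90F")
```

Whatever the real numbers are for a given room, the ratio holds under this model: starting at 65 gives 2.5 times the margin of starting at 80, because there are 25 degrees of headroom instead of 10.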

Date: 2011-06-06 05:55 am (UTC)
From: [identity profile] buckaction.livejournal.com
I hope you don't use my company's systems. I REALLY don't need a metric fuckton of "unexpectedly shutting down due to overtemp alarm" cases in my queue this week.
