(no subject)
Jul. 24th, 2007 05:53 pm
Hi techsupport, I'm new here :)
For my first post, less of a rant, more of a piece of advice:
Test your backups. It's not hard.
I did some work today for a small charity that bought a Small Business Server a year or so ago. Their head office told them they needed proper backups, so they dutifully bought a tape drive, a load of tapes, and Symantec Backup Exec. They wrote out a grandfather-father-son strategy, numbered the tapes, and signed them in and out so they always knew where each tape was. Today, I went in to fix some other issues and checked their backup log. Four months of 'backup failed' (that's as far back as it would go). The backups were failing because:
a) They'd been set to run as administrator, and admin's password had been changed
b) The tapes hadn't been allocated to media sets
c) The (cheap) external tape drive hadn't been set as the default device for the backup job
So for four months (probably more), they'd been meticulously signing out blank tapes. If the server had failed (not to mention that there have been four burglaries in that building so far this year), they would have been devastated. This is a charity that has an annual IT budget of £2,500 - to cover two sites and seven users. To find that their data were missing because backups hadn't been properly tested would have killed them.
But, I digress; at least their server had RAID-1. The place where I work now, in their 'Wow, it's time we got a full-time IT guy' moment, had an SBS server themselves. It was ordered with two identical disks and a RAID card, but RAID was never configured, so there was no drive mirroring. They too signed backup tapes in and out, but when Dell built the server, a bad IDE cable was used for the tape drive (one of the wires was poking out). Again, they'd spent tonnes of time and money on it, but hadn't checked the backup logs, and yes, a hard drive failed and they had to take it to a data recovery specialist because their tapes were blank.
Moral of the story:
Don't assume your backups work because you spent lots of money on them. Don't assume your backups work because it says 'complete' in the event log, or because the tape popped out.
Assume your backups work after your office has burnt down to the ground and you've bought a brand new server and restored everything without a hitch. Until then, do everything you can to prepare for that.
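If you want something more concrete than "test your backups", here is a minimal sketch of one way to check a test restore, assuming you have restored the backup to a scratch folder and can still read the original data. It hashes every file in both trees and reports what is missing, extra, or different. The function names (`file_digest`, `tree_digests`, `compare_restore`) are my own for illustration and have nothing to do with Backup Exec.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_restore(source: Path, restored: Path) -> dict[str, list[str]]:
    """Compare a restored tree against the source it was backed up from."""
    src = tree_digests(source)
    dst = tree_digests(restored)
    return {
        # Files in the source that never made it back from the backup.
        "missing": sorted(set(src) - set(dst)),
        # Files in the restore that the source does not have.
        "extra": sorted(set(dst) - set(src)),
        # Files present in both trees whose contents differ.
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

Files that were edited after the backup ran will legitimately show up under "changed", so read the report with that in mind. A restore from a blank tape would show every file as "missing", which is exactly the case the stories above needed to catch.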
no subject
Date: 2007-07-24 05:14 pm (UTC)
I used to check the Backup Exec logs daily, but after four months of no failures, I wrote it out of my morning routine to save time. Bad idea! One of the tapes wasn't allocated properly, which caused the daily backup job to cancel itself, and backups didn't run for something like three weeks. Man, did I feel like a moron.
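One way to avoid relying on a morning routine is to have a script complain when the last good backup is too old. This is only a sketch: `backup_is_stale` and the 26-hour default are assumptions for illustration, and how you obtain the timestamp of the last successful job (from Backup Exec or anything else) is deliberately left out, because that depends on your setup.

```python
from datetime import datetime, timedelta
from typing import Optional


def backup_is_stale(
    last_success: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(hours=26),  # assumed: daily job plus some slack
) -> bool:
    """Return True if no success is recorded or the last one is older than max_age."""
    if last_success is None:
        return True
    return now - last_success > max_age


if __name__ == "__main__":
    # Hypothetical values: the last good backup ran three weeks ago.
    now = datetime(2007, 7, 24, 9, 0)
    last_good = now - timedelta(weeks=3)
    if backup_is_stale(last_good, now):
        print("WARNING: no successful backup in the last 26 hours")
```

The point of treating "no record at all" as stale is that a job which silently cancels itself looks the same as a job that was never set up.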
no subject
Date: 2007-07-24 05:43 pm (UTC)
I'm always somewhat amused when I remember the phone call I got late one night from my boss at my last company, about 3 months after I started there. He was recounting the issues he was having with adding space to the primary file server.
Turns out he accidentally nuked the existing RAID array at some point whilst trying to back out of the disk upgrade. Fortunately, his backups were good, otherwise the entire company would have been SOL.
It brings a small grin to my face even now, after all that's happened since.
no subject
Date: 2007-07-24 06:10 pm (UTC)
1) RAID is not a backup solution.
2) RAID is not a backup solution.
I know that, technically, that's only one flaw, but it was such a big one I thought I'd mention it twice.
RAID is, at best, a way of surviving partial hardware failure, not a wetware failure (users deleting/overwriting the wrong file).
Also, "backing everything up" to an external hard drive, then removing the original (because it's now safe on another drive) is not actually a backup - it's moving the file to a different storage device which is prone to failure. You still only have one copy.
Also, also, printing off several hundred pages of text is, technically, a backup, but unless you're willing to re-input that data later on (and keep the pages somewhere safe until they're needed, and keep printing off any changes) it's useless.
no subject
Date: 2007-07-25 05:36 am (UTC)
A proper backup solution -includes- regular complete verification and, if you're anal, periodic failure testing. The type of testing you do by saying "ASSUME EVERYTHING HAS DIED. YOU HAVE THAT ONE SERVER, TAPE DRIVE AND A BOX OF TAPES. GET ME BACK THE SQL SERVER (space constraints notwithstanding); MAKE SURE THE HR SYSTEM IS FUNCTIONING. YOU HAVE FOUR HOURS."
So much effort goes into trying to squeeze more 9's out of production uptimes that people forget to look at how to recover when things actually fail.