Wednesday, February 6, 2008

Destroy and rebuild

I once worked on a project where we did the ultimate destructive test: rebuilding a server from scratch. The idea was to test how quickly a server could be rebuilt without using a backup tape.
So one day somebody issued the rm -rf command, and after that the rebuilding could start. The result... two months later we still did not have the server back.

The cause of this delay? In the big organisation we worked in, there were separate groups for unix, networking, storage, applications, etc. A combination of too few people and a lack of knowledge and commitment slowed the rebuild down. At one point there was no progress at all for two weeks. Finally, after three months, the server was up and running again.

In my opinion the result of the test was that the organisation was not able to rebuild a server using its regular processes. What was worse, none of the lessons learned from this test were implemented in the organisation. The next time a server had to be rebuilt from scratch, the result would probably be the same.

Luckily they make backups, although we never tested those...

