Inefficiently Effective

Being effective does not always mean being efficient.  Sometimes it takes longer to prepare for an infrastructure change than to actually do it.  When the change is critical, it pays to take the extra time to script as much of it as possible.

One of my bigger clients was recently going through an acquisition transition and needed to migrate a Microsoft SQL Server database from one server infrastructure to another.  This database migration was a small part of a bigger event that required company-wide downtime, which meant no customer bookings and an operational blackout.  Any delay could result in loss of revenue, service incidents, and possibly even the loss of multi-million-dollar accounts.

I was not directly involved in the preparation or at the beginning of the migration process, but I did eventually get a call after they ran into an issue.  The migration should’ve taken fifteen minutes, but over an hour later they were still struggling with what appeared to be a corrupt backup file.

The team had previously tested this same set of actions by hand: back up the database, FTP it from an external server to an internal server, then restore the backup.  It always worked!

During the actual migration they ended up using a different server to FTP the backup from, resulting in a corrupt backup file on each attempt.  They tested various other scenarios and eventually found that they were using ASCII transfer mode instead of BINARY.  Of course, BINARY was the default setting in the environments that they had used when testing.
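To see why ASCII mode is so destructive for a backup file, here is a small Python sketch.  It is only an illustration: the exact translation depends on the FTP client and server involved, and I am assuming one common case where a carriage return plus line feed is collapsed to a single line feed.  The function names and the sample bytes are made up for the example.

```python
import hashlib


def simulate_ascii_transfer(payload: bytes) -> bytes:
    """Mimic one common ASCII-mode translation (an assumption, not a
    reproduction of the client's setup): CRLF is rewritten as LF."""
    return payload.replace(b"\r\n", b"\n")


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(payload).hexdigest()


# Made-up stand-in for binary backup data that happens to contain the
# byte pair 0x0D 0x0A somewhere in the middle.
original = b"BAK\x00\x01" + b"\r\n" + b"\xff\xfe" * 4
mangled = simulate_ascii_transfer(original)

print("sizes match:    ", len(original) == len(mangled))
print("checksums match:", sha256_hex(original) == sha256_hex(mangled))
```

A single rewritten byte pair is enough to change both the size and the checksum, and a database backup of any real size is almost certain to contain that pair somewhere.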

The two hours of downtime that it actually took to transfer the file made it quite apparent that a scripted approach would’ve made a world of difference.  In this case the client was lucky, as the event occurred at a low-volume time.  There was minimal revenue impact and no service incidents.  Whew!

The moral of this story is that even a one-time transfer of a file can benefit from a little scripting love.  It would’ve probably taken a team member an extra half hour to write a script, but it would’ve saved two hours of troubleshooting time, eliminated undue stress, and nearly eliminated the risk of lost revenue.

Scripting a simple one-time FTP action may sound like overkill, but in the case of company-wide downtime you should limit the risk of failure as much as possible.  Limiting that risk comes down to having confidence in the tasks at hand.  A scripted approach to a task builds that confidence because the steps are written down once, tested, and then repeated exactly, with no manual steps to remember or document: just run the script!
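As a rough idea of what that half hour of scripting might have produced, here is a minimal Python sketch using the standard ftplib module.  This is not the script the team used (there wasn't one); the host, credentials, file names, and the idea of comparing against a SHA-256 checksum computed on the source server are all my assumptions for illustration.

```python
import hashlib
from ftplib import FTP


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_binary(ftp, remote_name, local_path):
    """Download remote_name to local_path in binary mode.

    retrbinary sets the transfer type to binary (TYPE I) itself, so the
    result does not depend on whatever default the environment has.
    """
    with open(local_path, "wb") as handle:
        ftp.retrbinary("RETR " + remote_name, handle.write)


def fetch_backup(host, user, password, remote_name, local_path, expected_sha256):
    """Fetch a backup over FTP and fail loudly if the checksum is wrong.

    expected_sha256 is assumed to have been computed on the source
    server right after the backup was taken.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        download_binary(ftp, remote_name, local_path)
    actual = sha256_of_file(local_path)
    if actual != expected_sha256:
        raise RuntimeError(
            "checksum mismatch for %s: expected %s, got %s"
            % (local_path, expected_sha256, actual)
        )
    return actual
```

On migration night the whole transfer would then be a single call to fetch_backup with the real host, credentials, and checksum filled in.  The transfer mode is pinned in the script rather than left to a default, and a bad file is caught by the checksum in seconds instead of by a failed restore.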

Yes, during the planning phase it would’ve been less efficient to write the script.  But during the execution phase, the time of the actual migration, the team would’ve been far more effective.