Today was NIGHTMARE-DAY! Globat.com just emerged from a major outage, the worst in company history, and everybody - customers and staff alike - still
feels extremely beaten up. Here's what happened:
At approximately 5:00am Pacific Time on Thursday, February 21, 2008 we
suffered a major network outage, which affected nearly all Globat.com
customers, our own Web sites and service infrastructure as well as our phone
Our primary network switch in our main datacenter in downtown Los Angeles
failed completely and despite tremendous efforts by our technical emergency
response team could not be brought back online. This switch is a major brand
name piece of equipment and as such it contains internal redundancy. We were able to switch over to the backup circuitry and begin bringing the system back online, at which time, for as yet unknown reasons, the entire device,
including the backup system, failed completely.
After a short deliberation we decided to rush a new switch from our vendor's
nearest warehouse to our data center. While waiting for the replacement we
initiated a workaround to at least get email service re-established. After
several attempts all email systems finally went live and back to normal at
approximately 10:00am Pacific Time.
When the new switch arrived we went through the installation and testing
process and as a result were able to bring all our services back online, but
not before we had suffered a 5-hour outage for e-mail services and an 8-hour
outage for Web-related services. In Globat's entire six-year history, we have
never experienced anything even remotely like this! The outage was so severe that it affected our own Web sites and all but one of our telephone lines, so we were unable to communicate with our customers to update anybody about the status of the incident. We decided to ask some of our staff members to post updates on well-known Internet forums, which helped a little in disseminating the information.
Now that everything is pretty much back to normal I would like to tell you
that we ABSOLUTELY KNOW you count on Globat.com to maintain a high degree of uptime to host your Web sites and that we take this trust very seriously.
Today, however, my team and I let you down and we all feel terrible because
of that. This has never happened before and we will do whatever it takes to
never let it happen again! I deeply and sincerely apologize for this outage
and the inconvenience and problems it may have caused. We promise to learn from the events of this dreadful day (today was also my wife's birthday, on top of everything).
We are now taking the following steps to prevent a recurrence of this type
of failure. This switch is the only piece of equipment at Globat.com that
could possibly be a single point of failure. This equipment is usually
highly stable and has built-in redundancy. In addition, it is monitored by
the vendor 24/7. We will now keep a second switch, with its own built-in
redundancy, in our data center to prevent a prolonged outage in the unlikely
event of another switch failure. In addition, we are working with our phone
vendor to create an automated failover system so calls will be routed
directly to remote support sites in the case of a failure of our primary
While nobody can completely prevent any form of failures in the future, with
these changes we can certainly minimize the effects of such failures and
reduce the impact on you, our valued customers.
Globat.com's Director of Customer Service, Tom Cox, stands by to provide
even more information, in case you would like to discuss this event further
or are still experiencing any problems. You can reach Tom at firstname.lastname@example.org.
As I write this message, all of our systems appear to be back
up and running and I can assure you that our network operations team will
continue to investigate this unfortunate event.
In closing I would like to thank the entire Globat.com team (especially
Chris, Lou, Tim, Don and Rene) for working nonstop, without sleep or
interruption to resolve this issue and, of course, our customers for their
many encouraging comments and especially their incredible patience. You are
certainly very much appreciated!
Ben R. Neumann
Chief Executive Officer