Microsoft cloud service goes Azure over tip
Microsoft’s cloud computing service Azure, currently in testing, has suffered a 22-hour outage. It’s an embarrassment, but some pundits believe it might be a wake-up call that improves the finished product.
The Azure system uses a special edition of Windows to run applications that are hosted online rather than on the user’s own computer. The idea is that users, particularly businesses, can drastically reduce the complexity of the hardware they need, as much of the load is handled by Microsoft’s servers rather than the user’s machine. It’s intended as an answer to similar services offered by Google and Amazon.
The system went down for 22 hours across Friday and Saturday, meaning users taking part in the test program were unable to access many of the applications. It doesn’t appear that any data was lost in the outage.
The problem arose during an upgrade to the system which caused networking problems and an overall slowdown. This slowdown caused ‘a large number’ of servers to stop responding because they had gone too long without receiving any data.
Azure has a built-in system to recover from such problems, but it’s a work in progress; Microsoft believes the outage may improve the system’s ability to cope with any future problems. The firm also discovered that the existing default process for recovering, designed to play it safe by restoring access to one application at a time, was too slow to be practical. Staff switched to restoring multiple applications at once, which may become the standard response in future.
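To see why restoring one application at a time can be impractically slow, here is a rough back-of-the-envelope sketch. It is not Microsoft's recovery process: the function names, the application count, the batch size and the minutes per round are all hypothetical, and it makes the idealized assumption that restoring a batch takes about as long as restoring a single application.

```python
import math


def restore_rounds(app_count: int, batch_size: int) -> int:
    """Rounds needed to restore app_count apps, batch_size apps per round."""
    if app_count < 0 or batch_size < 1:
        raise ValueError("app_count must be >= 0 and batch_size must be >= 1")
    return math.ceil(app_count / batch_size)


def restore_time_minutes(app_count: int, batch_size: int,
                         minutes_per_round: float) -> float:
    """Total restore time, assuming every round takes the same time
    regardless of how many apps it contains (an idealized assumption)."""
    return restore_rounds(app_count, batch_size) * minutes_per_round


# Hypothetical figures, chosen only for illustration.
one_at_a_time = restore_time_minutes(1000, 1, 5)   # 1000 rounds of 5 minutes
in_batches = restore_time_minutes(1000, 50, 5)     # 20 rounds of 5 minutes
```

With these made-up numbers the cautious approach needs 5,000 minutes (more than three days), while batches of 50 need 100 minutes. In practice batches would not scale this cleanly, and restoring many applications at once carries more risk, which is presumably why the one-at-a-time approach was the default.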
InformationWeek’s Dave Methvin has noted that such a lengthy outage at least came during a test period. He suggests that the fear of what might have happened had the outage struck once the service had paying subscribers will force Microsoft not only to improve the system to reduce outages, but also to come up with a better strategy for coping if things do go wrong.
Meanwhile, Dave Rosenberg of CNET believes the biggest problem was Microsoft’s lack of communication with testers during the outage. He says firms offering cloud computing services should not try to fool customers into believing there will never be problems, but should instead be open and honest about performance issues to earn trust and confidence.

