
On Tuesday, Google suffered an unexpected worldwide Gmail outage. Service was down for almost 100 minutes as Gmail's request routers dropped offline.
Ben Treynor, VP Engineering and Site Reliability Czar at Google, released a statement explaining that during a planned maintenance upgrade, email demand exceeded what was expected for the time, causing load-balancing servers to drop connections like a three-ring juggling clown on a trampoline, cruising down the highway at 55 MPH (admittedly, this would be a sight to see).
"The Gmail engineering team was alerted to the failures within seconds (we take monitoring very seriously). After establishing that the core problem was insufficient available capacity, the team brought a LOT of additional request routers online (flexible capacity is one of the advantages of Google's architecture), distributed the traffic across the request routers, and the Gmail web interface came back online."
Google assures users that it is taking additional steps to make sure the problem never happens again: "... Increasing request router capacity well beyond peak demand to provide headroom... [the servers] should just get slower instead of refusing to accept traffic and shifting their load."
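For the curious, here is a rough toy model of why "refusing traffic and shifting load" is so dangerous, and why "just get slower" is the safer behavior. This is only a sketch under made-up assumptions, not Google's actual design: the function names, the even split of traffic across routers, and all the numbers are hypothetical.

```python
def simulate_cascade(total_load, num_routers, capacity_per_router):
    """Toy model (hypothetical): an overloaded router goes offline and its
    traffic is spread evenly over the survivors. Returns how many routers
    are still online once things settle."""
    online = num_routers
    while online > 0 and total_load / online > capacity_per_router:
        online -= 1  # one router drops out; the rest inherit its load
    return online


def slowdown_factor(total_load, num_routers, capacity_per_router):
    """Toy model (hypothetical): routers never refuse traffic, they just
    run slower. Returns how much slower each request gets (1.0 = normal)."""
    per_router_load = total_load / num_routers
    return max(1.0, per_router_load / capacity_per_router)
```

In this model, ten routers that can each handle 110 units carry a load of 1000 comfortably. Take one away for maintenance and each survivor sees about 111 units, so `simulate_cascade(1000, 9, 110)` ends with every router offline, while `slowdown_factor(1000, 9, 110)` shows the same routers staying up and running only about 1% slower.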
Google plans to release a series of updates for the Gmail network over the next few weeks to correct the flaws in the system. Let's hope the Gmail network stays stable during these maintenance upgrades as well!
[via Google]