The Hardware Load Balancer Brick Wall
Last month at the Networked Systems Design and Implementation (NSDI) conference, Google lifted the covers off Maglev, its distributed network software load balancer (LB). Since 2008, Maglev has been handling traffic for core Google services like Search and Gmail. Not surprisingly, it is also the load balancer that powers Google Compute Engine and enables it to serve a million requests per second without any cache pre-warming. Impressive? Absolutely!
If you have been following application delivery in the era of cloud, say over the last six years, you will have noticed another significant announcement, made at SIGCOMM '13 by the Microsoft Azure networking team. Azure runs critical services such as blob, table, and relational storage on Ananta, its home-grown, cloud-scale software load balancer that runs on commodity x86 servers rather than on more traditional hardware load balancers.
Both Google and Microsoft ran headlong into what can best be described as “the hardware LB brick wall”, albeit at different times and along different paths in their cloud evolution. For Google, it started circa 2008, when the traffic and flexibility needs of its exponentially growing services and applications went beyond the capability of hardware LBs. For Azure, it was circa 2011, when the exponential growth of its public cloud led to the realization that hardware LBs do not scale, which forced the team to build its own software variant.
So, what is this “hardware LB brick wall” that these web-scale companies ran into?