This presentation by an AWS Distinguished Engineer, http://www.enterprisetech.com/2014/11/14/rare-peek-massive-scale-aws/, provides insight into why "networking" is the pain point.
Some interesting things from this post:
"
So, the answer is that AWS probably has somewhere between 2.8 million and 5.6 million servers across its infrastructure.
"
“Networking is a red alert situation for us right now,” explained Hamilton. “The cost of networking is escalating relative to the cost of all other equipment. It is Anti-Moore. All of our gear is going down in cost, and we are dropping prices, and networking is going the wrong way. That is a super-big problem, and I like to look out a few years, and I am seeing that the size of the networking problem is getting worse constantly. At the same time that networking is going Anti-Moore, the ratio of networking to compute is going up.”
More importantly, this seems to contradict one of my projections, "Death of SRIOV".
"
Now, let’s dive into a rack and drill down into a server and its virtualized network interface card. The network interface cards support Single Root I/O Virtualization (SR-IOV), which is an extension to the PCI-Express protocol that allows the resources on a physical network device to be virtualized. SR-IOV gets around the normal software stack running in the operating system and its network drivers and the hypervisor layer that they sit on. It takes milliseconds to wade down through this software from the application to the network card. It only takes microseconds to get through the network card itself, and it takes nanoseconds to traverse the light pipes out to another network interface in another server. “This is another way of saying that the only thing that matters is the software latency at either end,” explained Hamilton. SR-IOV is much lighter weight and gives each guest partition on a virtual machine its own virtual network interface card, which rides on the physical card.
"
It seems that AWS has started to use SR-IOV based Ethernet cards.
I guess this is feasible for AWS because it also provides AMIs with built-in drivers.
What I am not sure about is how they perform some functions (IP filtering, overlays, traffic probing and analytics) when the hypervisor does not get to see the traffic.
Currently, AWS seems to be using Intel IXGBE drivers, which suggests that Intel NICs are being used here. See: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html
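One way to check this from inside a Linux guest is to look at which kernel driver is bound to the network interface. The sketch below is only an illustration, not anything AWS documents in this form: it reads the standard sysfs `device/driver` symlink, and the helper names `nic_driver` and `looks_like_sriov_vf` are my own. It also assumes that an interface bound to the `ixgbevf` driver (the Intel virtual function driver) indicates an SR-IOV virtual function.

```python
import os

def nic_driver(iface, sysfs_root="/sys/class/net"):
    """Return the kernel driver name bound to iface, or None if it has none."""
    link = os.path.join(sysfs_root, iface, "device", "driver")
    if not os.path.islink(link):
        # Virtual interfaces such as "lo" have no backing device/driver.
        return None
    # The symlink points at .../drivers/<driver-name>; keep the last component.
    return os.path.basename(os.path.realpath(link))

def looks_like_sriov_vf(iface, sysfs_root="/sys/class/net"):
    # Assumption: ixgbevf is the Intel virtual-function driver, so seeing it
    # in a guest suggests the interface is an SR-IOV VF on an Intel NIC.
    return nic_driver(iface, sysfs_root) == "ixgbevf"

if __name__ == "__main__" and os.path.isdir("/sys/class/net"):
    for name in sorted(os.listdir("/sys/class/net")):
        print(name, nic_driver(name), looks_like_sriov_vf(name))
```

On a guest without SR-IOV, I would expect this to report a paravirtual driver instead, so the output is a hint about the data path rather than proof of what the underlying hardware is.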
Amazon bought Annapurna Labs in 2015. I wonder what role that plays. Could it be that the AWS folks ran into the issues detailed at http://netsecinfo.blogspot.com/2014/08/death-of-sriov-and-emergence-of-virtio.html and are therefore going after programmable smart-NICs?
Food for thought...