Saturday, November 19, 2022

AWS is now a semiconductor manufacturer: the remarkable development history that "mainframe-ized" the server (In-depth reading of GAFA by Atsushi Nakata)  2022.09.02

https://xtech.nikkei.com/atcl/nxt/column/18/00692/090100088/




Atsushi Nakata Nikkei Crosstech / Nikkei Computer




 Amazon Web Services (AWS) of the United States develops and manufactures in-house all of the several million or more servers used in its cloud. It has recently come to light that the internals of this hardware differ significantly from commercially available PC servers and resemble a mainframe in configuration.


 This does not mean that AWS servers are "IBM mainframe compatible." In 2012, AWS decided to develop dedicated processors for I/O and service management and to install them, alongside the CPU, inside its purpose-built server hardware. An AWS executive revealed in an online presentation in August 2022 that this approach of using dedicated processors was modeled on mainframes.


 The executive is AWS Senior Vice President (SVP) James Hamilton. Hamilton has previously revealed internal AWS specifications at the company's AWS re:Invent event. This time, in his talk "History of Silicon Innovation at AWS" at the AWS Silicon Innovation Day 2022 event held on August 3, 2022, he explained the development history of the proprietary Nitro chip, a semiconductor that has been installed in more than 20 million AWS servers.


Nitro chip takes over processing from the hypervisor and other software

 The role of the Nitro chip itself was first unveiled at AWS re:Invent in November 2017. Nitro chips are currently being installed in Amazon EC2 servers.


 Specifically, in addition to network packet processing and encryption, Nitro chips currently handle workloads such as cluster management, security management, and performance monitoring that would otherwise be handled by the hypervisor and various management software.


 The November 2017 announcement explained that by offloading various workloads to the Nitro chip, up to 12.5% more of the server's CPU power can be allocated to guest virtual machines than before. This means that Nitro chips help generate more service revenue from the same number of servers.


Bare Metal Realization Was a Challenge

 Peter DeSantis, a key member of the EC2 development team, came up with the idea of developing the Nitro chip in 2012. At the time, AWS was looking at ways to offer bare metal servers to its customers.


 At the time, AWS could not offer bare metal servers without a hypervisor, as opposed to virtual machines running on one, because the security features that prevent EC2 users from intruding into other customers' or AWS's own system areas were implemented in the hypervisor. "AWS is committed to security," Hamilton said.


 DeSantis came up with the idea of "a server within a server." A dedicated service management server would be built into the physical server, and all traffic to the physical server would be isolated by that dedicated service management server. In this way, server resources could be safely provided to users without the need for a hypervisor. At the time, Mr. DeSantis described this idea as "installing a dongle (dedicated device) in every computer."


Achieve the same system as a mainframe at one-tenth the cost.

 When Mr. Hamilton was approached by Mr. DeSantis, he immediately knew it was a worthwhile idea. The approach of building in network and service management servers, separate from the main server, had long been used on mainframes and had helped them achieve their high performance and RAS (reliability, availability, and serviceability).


 Hamilton is best known for leading SQL Server development at Microsoft before moving to AWS, and prior to Microsoft, he was part of the DB2 development team at IBM. Mr. Hamilton's knowledge of mainframes from IBM enabled him to judge the value of Mr. DeSantis's idea.


 In the mainframe, the dedicated service server was later transformed into a dedicated service chip. There is no doubt that introducing mainframe-like dedicated chips into EC2 servers would bring mainframe-like RAS to EC2. The challenge, however, is cost. AWS needed to provide cloud computing at a low cost, and it was unacceptable for the hardware to be as expensive as a mainframe, so the Nitro chip needed to be "one-tenth of the expected cost," Hamilton said.


 AWS decided to go further with its approach of developing its own dedicated servers in a vertically integrated manner, and even developed the chips in-house. First, they began developing their own chips in cooperation with Bigfoot Networks, a network management chip manufacturer that no longer exists, and Cavium, a processor manufacturer that also no longer exists.


Relying on ARM's "power of scale"

 In order to cut costs to one-tenth, Mr. Hamilton turned to the "power of scale." In the world of semiconductors, the larger the scale of production, the lower the cost. Hamilton chose ARM as the architecture for the Nitro chip because ARM processors are produced in the tens of billions, which makes that power of scale a force to be reckoned with. The first Nitro chip was realized in 2013, using Cavium's ARM processors.


 AWS then acquired Israeli semiconductor manufacturer Annapurna Labs in 2015 and began developing Nitro chips entirely in-house. Today, in addition to Nitro chips, the company develops its own semiconductors, including Graviton, an ARM server CPU; Inferentia, a dedicated machine learning inference chip; Trainium, a dedicated machine learning training chip; and SSD controllers. AWS is now also a semiconductor manufacturer.


Envisioning a single chip for servers, the company entered semiconductor development.

 According to Hamilton, Amazon.com decided in 2013 to commit fully to developing its own semiconductors. The decision was made together with founder Jeff Bezos and Andy Jassy, the current CEO, who was head of AWS at the time.


 The rationale behind the decision was that "in the future, all computers will be SoCs (System on Chip)." In other words, everything, including servers, will be on a single chip. At that time, smartphones were moving to SoCs, and the thinking was that it was only a matter of time before servers followed suit.


 In fact, as of 2013, servers developed in-house had become a source of competitiveness for Amazon. If the company was to continue developing servers in-house, it would inevitably have to develop chips as well, as servers moved toward a single chip. With this in mind, Amazon began to develop its own semiconductors.


 At first glance, the prospect of servers becoming a single chip may seem to contradict the current situation, in which AWS develops not only CPUs but also Nitro chips and dedicated machine learning chips, making its servers multi-chip. However, this may not be the case.


 Although Mr. Hamilton did not explicitly say so, the author believes that AWS may be secretly developing a "one-chip server" that integrates the CPU, Nitro chip, and dedicated machine learning chip into a single chip.


ARM server development has also been a secret for nearly 10 years.

 In fact, it was in 2009 that Hamilton declared on his blog that "ARM processors will become mainstream in the server world in the future." In contrast, it wasn't until 2018 that AWS actually announced Graviton and began offering ARM server services. For nearly a decade in between, the fact that AWS was developing ARM servers was officially kept under wraps.


 Therefore, given that Hamilton has declared that servers will become one-chip, the author assumes that AWS is developing a one-chip server. And the one-chip server that AWS will announce in the future can appropriately be called a "one-chip mainframe," given its mechanisms for ensuring high reliability and availability. That is what the author thinks.
