On Wednesday, August 7th, AMD launched the 7nm update to its Epyc processor family. These new chips don't merely match Intel in a single category; they bring huge improvements across the board. AMD has cut per-core pricing, increased IPC, and promises to deliver far more CPU cores than an equivalent Intel socket.
There has been only one other time when AMD came close to beating Intel this decisively: the introduction of the dual-core Opteron and the Athlon 64 X2 in 2005. This week's Epyc launch looks even more significant. In 2005, AMD's dual-core CPUs matched Intel's core counts and outperformed Intel clock-for-clock and core-for-core, but they carried premium prices. This time, AMD is going for the trifecta: higher performance, more cores, and lower per-core pricing. This is the most serious attack the company has ever mounted on Intel's high-end Xeon market.
Industry analysts have already predicted that AMD's server market share could double over the next 12 months, reaching 10 percent by the second quarter of 2020. A larger share of the data center market is a critical goal for AMD. Greater enterprise and data center share won't simply increase AMD's revenue; it will help stabilize the company's financial performance. One of AMD's critical weaknesses over the past two decades has been its reliance on low-end PCs and the retail channel, both of which tend to be recession-sensitive. The low-end PC market also offers the lowest revenue per socket and the thinnest margins, while enterprise buying cycles are less affected by downturns. AMD briefly achieved substantial enterprise market share in 2005-2006, when its server share topped 20 percent.
Enthusiasts like to focus on AMD's desktop performance, but outside of gaming, overall PC sales are in decline. Growth in niche categories such as 2-in-1s hasn't been enough to offset the overall sales slump. While no one expects the PC market to collapse, it clearly never regained the momentum it lost in the downturn that began in 2011. AMD still has good reason to fight for desktop and mobile market share, but it makes even more sense to fight for a share of the server space, where revenue and unit shipments have grown over the past eight years. 2019 may prove a difficult year for server sales, but the broader trend of migrating workloads to the cloud shows no signs of slowing.
In our coverage of Rome, we've focused primarily on the Epyc 7742. This graph, sourced from ServeTheHome, illustrates Epyc's performance against Xeon across more SKUs. Looking at the top of the stack:
A pair of AMD Epyc 7742s costs $13,900. A pair of Epyc 7502s (32C/64T, 2.5GHz base, 3.35GHz boost, $2,600) comes to $5,200. The Intel Xeon Platinum 8260 is a $4,700 CPU, but there are four of them in the highest-scoring Intel system, for a total cost of $18,800. In other words, $13,900 worth of AMD CPUs buys you 1.19x the performance of $18,800 worth of Intel CPUs. The comparison doesn't improve farther down the stack. Four E7-8890 v4s would cost nearly $30,000 at list price. A pair of Platinum 8280s costs $20,000. The 8276L is a $16,600 CPU at list price.
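To make the price/performance claim concrete, here's a quick Python sketch using the list prices and the ~1.19x performance figure quoted above (the prices come from this article; treating the four-socket Intel system as the 1.0x performance baseline is an assumption for illustration):

```python
# List prices and relative performance as quoted above
# (Intel four-socket system normalized to 1.0x performance).
systems = {
    "2x AMD Epyc 7742":            {"price_usd": 13_900, "rel_perf": 1.19},
    "4x Intel Xeon Platinum 8260": {"price_usd": 18_800, "rel_perf": 1.00},
}

def perf_per_dollar(system):
    """Relative performance delivered per dollar of list price."""
    return system["rel_perf"] / system["price_usd"]

amd = perf_per_dollar(systems["2x AMD Epyc 7742"])
intel = perf_per_dollar(systems["4x Intel Xeon Platinum 8260"])
advantage = amd / intel  # AMD delivers roughly 1.6x the performance per dollar
```

Run against these numbers, the AMD system comes out to about 1.61x the performance per list-price dollar, and that's before factoring in the lower platform costs of a dual-socket versus a quad-socket system.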
But it isn't just price, or even price/performance, where AMD holds the advantage. Intel segments its product features heavily and charges handsomely for the privilege. Consider, for example, the price differences between the Xeon Platinum 8276, 8276M, and 8276L. These three CPUs are identical except for the maximum amount of RAM each supports. Their prices are anything but.
In this case, "maximum memory" includes Intel Optane; the 4.5TB figure assumes 3TB of Optane installed alongside 1.5TB of DRAM. By comparison, the 7nm Rome CPUs support up to 4TB of RAM as a standard feature across the entire product stack, which simplifies both purchasing and future planning. AMD isn't just undercutting Intel on chip prices; it's taking aim at Intel's market segmentation strategy as well. Good luck justifying an $8,000 premium for additional RAM support when AMD will sell you 4TB of addressable capacity at the base price.
One of AMD's talking points for Epyc is how it delivers the benefits of a 2S system in a 1S configuration. This ServeTheHome table shows the differences:
AMD's advantage here is that it can hit several Intel weaknesses at once. Need a lot of PCIe lanes? AMD is better. Want PCIe 4.0? AMD is better. If your workloads scale well with core count, no one sells more cores per socket than AMD. Intel can still claim some advantages: it offers unified L3 caches much larger than AMD's (each AMD L3 cache is actually a 16MB slice, with 4MB per core), but those benefits will be limited to the specific applications that respond to them. Intel wants vendors to invest in building support for its Optane DC Persistent Memory, but it's unclear how many will. Current low NAND and DRAM prices have made Optane's competitive position in the market much more difficult.
The move to 7nm has also given AMD an edge in power consumption, especially when you factor in server consolidation at end of life. STH measured single-node power consumption for a Platinum Xeon 8180 system at ~430W at the wall, compared with around 340W at the wall for the AMD Epyc 7742 system. They note, however, that the high core counts of the latest AMD CPUs will allow customers to retire six to eight 2017-era Intel Xeon sockets (60 to 80 cores) and consolidate those workloads onto a single AMD Epyc system. The energy savings from removing three or four dual-socket servers dwarf the roughly 90W difference between the two platforms.
Features like DL Boost may give Intel a performance advantage in AI and machine learning, but the company will have to fight hard for it. So far, the data we've seen suggests these features help Intel match AMD rather than beat it.
The list prices quoted in this story are the official prices Intel publishes for Xeon CPUs in 1K-unit quantities. They are also notoriously inaccurate, at least as far as major OEMs are concerned. We don't know what Dell, HPE, and other vendors actually pay for Xeon CPUs, but we know it's often well below list price, which is typically paid only in the retail channel.
The gap between Intel's list prices and its actual prices may explain why Threadripper hasn't achieved much market penetration. Despite the fact that Threadripper CPUs have offered far more cores per dollar and better performance per dollar for two years, retailers that share sales data, like Mindfactory, report very weak sales for both Threadripper and Skylake-X. Yet Intel has shown no particular interest in cutting Core X prices. It continues to position the 10-core Core i9-9820X as a suitable competitor for chips like the Threadripper 2950X, despite AMD's superior performance in that matchup. This strongly implies that Intel has no particular trouble selling 10-core CPUs to the OEM partners that want them, Threadripper's superior price/performance ratio notwithstanding, and that AMD's share of the workstation market remains quite limited.
While Intel has cut its HEDT prices (the 10-core Core i7-6950X sold for $1,723 in 2016, versus $900 for a Core i9-9820X today), it has never tried to compete with Threadripper on price/performance. If that bulwark is going to crack, Rome is the CPU that will crack it. Ryzen and Threadripper will be seen as more credible workstation CPUs if Epyc starts to penetrate the server market.
Intel can cut prices to meet AMD in the short term. In the long term, it will have to challenge AMD directly. That means delivering more cores at lower prices, with larger amounts of memory supported per socket. Cooper Lake, which is built on 14nm and adds support for new AI-focused AVX-512 instructions, will arrive in the first half of next year. That chip will let Intel target some of the markets it wants to compete in, but it won't change the core-count differential between the two companies. Similarly, Intel may have trouble sustaining a $3,000 to $7,000 premium for 2TB to 4.5TB of RAM support when AMD is willing to address up to 4TB of memory on every CPU socket.
We don't yet know whether Intel will increase server core counts with Ice Lake, or what kinds of designs it will bring to market, but Ice Lake servers are at least a year away. By the time they're ready to ship, AMD's 7nm EUV designs should be ready as well. Having kicked off the mother of all refresh cycles with Rome, AMD's challenge over the next 12 to 24 months will be to demonstrate a sustained update cadence and continued performance improvements. If it can, it has a genuine chance to build the kind of stable enterprise market position it has wanted for decades.
When AMD launched the dual-core Opteron and its consumer counterpart, the Athlon 64 X2, it finally seemed as though the company had arrived. A little over a year later, Intel launched the Core 2 Duo, and AMD spent the next eleven years wandering in the wilderness. Executives would later admit that the company had taken its eye off the ball, distracted by the ATI acquisition. A series of stumbles followed.
The simplistic assumption that the Pentium 4 Prescott was a disaster Intel couldn't recover from proved inaccurate. Historically, attacking Intel has often been likened to hitting a rubber wall with a Sledgehammer (pun intended). Denting the wall is relatively easy; knocking it down altogether is a far harder task. With 7nm Epyc, AMD may have the best opportunity to take enterprise market share it has ever had, but building server share is a slow, deliberate process, not a wind sprint. If AMD wants to keep what it builds this time, it has to play its cards differently than it did in 2005-2006.
That said, I don't throw around phrases like "golden age" lightly. I'm using one now. While I won't make any predictions about how long it will last, the debut of 7nm Epyc makes it official: welcome to AMD's second golden age.
Intel may have launched Cascade Lake relatively recently, but another 14nm server refresh is already on the horizon. Intel lifted the veil on Cooper Lake today, offering new details on how the CPU fits into its product line alongside the 10nm Ice Lake server chips slated for deployment in 2020.
Cooper Lake's features include support for Google's bfloat16 format. It will also support up to 56 CPU cores in a socketed form factor, unlike Cascade Lake-AP, which scales to 56 cores only in a soldered BGA configuration. The new socket is expected to be known as LGA4189. There are reports that these chips could offer up to 16 memory channels (since Cascade Lake-AP and Cooper Lake both combine multiple dies in a single package, Intel could deliver up to 16 memory channels per socket with a dual-die version).
Bfloat16 support is a major addition to Intel's AI efforts. While the IEEE 754 standard defines a 16-bit half-precision floating-point format, bfloat16 changes the balance between the bits used for significant digits and those used for the exponent. The IEEE half-precision format prioritizes precision, with only five exponent bits; bfloat16 keeps float32's eight exponent bits, allowing a much larger range of values at the cost of precision. This trade-off is particularly useful for AI and deep learning calculations, and it's a major step in Intel's effort to improve AI and deep learning performance on its CPUs. Intel has published a white paper on bfloat16 if you're looking for more information on the topic. Google says that using bfloat16 instead of conventional half-precision floating point can deliver significant performance benefits. The company writes: "Some operations are memory-bandwidth-bound, which means the memory bandwidth determines the time spent in such operations. Storing inputs and outputs of memory-bandwidth-bound operations in the bfloat16 format reduces the amount of data that must be transferred, improving the speed of the operations."
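To see the trade-off in practice, here's a short Python sketch of bfloat16 conversion. This is a standard-library illustration, not Intel's or Google's implementation: bfloat16 is simply the top 16 bits of an IEEE float32, and truncation is used here for simplicity (real hardware typically rounds to nearest even).

```python
import struct

def float_to_bfloat16_bits(x):
    """Truncate a float32 to bfloat16 by keeping the top 16 bits:
    the sign bit, all 8 exponent bits, and 7 of the 23 mantissa bits."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 16

def bfloat16_bits_to_float(bits):
    """Widen a bfloat16 value back to float32 by zero-filling the low 16 bits."""
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]

# Range is preserved: 3.0e38 overflows IEEE half precision (max ~65504)
# but survives the round trip through bfloat16, because the full float32
# exponent field is retained.
big = bfloat16_bits_to_float(float_to_bfloat16_bits(3.0e38))

# Precision is not: only ~2-3 significant decimal digits survive.
pi_bf16 = bfloat16_bits_to_float(float_to_bfloat16_bits(3.14159265))  # 3.140625
```

The precision loss is usually acceptable for neural network weights and activations, while the preserved dynamic range avoids the overflow and underflow headaches that IEEE half precision can cause during training.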
The other notable aspect of Cooper Lake is that it will share a socket with the Ice Lake server CPUs arriving in 2020. One theoretically important distinction between the two families is that the 10nm Ice Lake server parts reportedly won't support bfloat16, while the 14nm Cooper Lake parts will. This could be further segmentation of Intel's product lines, though it's also possible it reflects the troubled development of 10nm.
The introduction of a 56-core socketed part suggests Intel expects Cooper Lake to reach more customers than Cascade Lake-AP was aimed at. It also raises questions about what kinds of Ice Lake server parts Intel will bring to market, and whether we'll see 56-core versions of those chips as well. To date, all of Intel's 10nm Ice Lake messaging has focused on servers or mobile. This may echo the strategy Intel used with Broadwell, where desktop versions of the CPU were scarce and mobile and server parts dominated the family; however, Intel later said that skipping a full Broadwell desktop launch was a mistake. Whether that means Intel still plans to launch an Ice Lake desktop part, or has decided to skip the desktop again, isn't yet clear.
Cooper Lake's focus on AI processing means it isn't necessarily intended to go head-to-head with AMD's upcoming 7nm Epyc. AMD hasn't said much about AI or machine learning on its CPUs, and while its 7nm chips add native 256-bit AVX2 execution, the company's CPU division has given no indication that the AI market is a particular target. AMD's efforts in this space remain GPU-based; its CPUs will certainly run AI code, but the company doesn't appear to be targeting that market to anywhere near the degree Intel is. Between adding new AI support to its existing Xeons, its Movidius and Nervana products, projects like Loihi, and its plans to enter the data center GPU market with Xe, Intel is working to protect its high-performance computing and high-end server business and to challenge Nvidia's current dominance of the industry.
In recent years, Intel has talked up its Cascade Lake servers' DL Boost capability (also called VNNI, Vector Neural Network Instructions). These new instructions are a subset of AVX-512 intended specifically to accelerate CPU performance in AI applications. Historically, many AI applications have favored GPUs over CPUs, because GPU architectures are a much better fit for their massive parallelism. CPUs offer more flexible thread-based execution resources, but even today's many-core CPUs are dwarfed by the parallelism available in a high-end GPU.
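To illustrate what DL Boost actually does, here is a pure-Python sketch of the arithmetic behind one 32-bit lane of the VNNI `VPDPBUSD` instruction. This is a simplified scalar model for illustration only: the real instruction processes 16 such lanes at once in a 512-bit register, fusing what previously took three AVX-512 instructions into one.

```python
def vpdpbusd_lane(acc, a, b):
    """Model one 32-bit lane of VPDPBUSD: multiply four unsigned 8-bit
    values (a) by four signed 8-bit values (b), sum the four products,
    and add the result to a 32-bit accumulator. (This sketch ignores
    saturation; the saturating variant is VPDPBUSDS.)"""
    assert all(0 <= x <= 255 for x in a)      # unsigned int8 operands
    assert all(-128 <= x <= 127 for x in b)   # signed int8 operands
    return acc + sum(x * y for x, y in zip(a, b))

# An int8 dot product, the core loop of quantized neural-net inference,
# then reduces to repeated fused multiply-accumulates per lane:
acc = vpdpbusd_lane(0, [1, 2, 3, 4], [10, 10, 10, 10])   # 10 * 10 = 100
acc = vpdpbusd_lane(acc, [5, 6, 7, 8], [1, -1, 1, -1])   # 100 + (5-6+7-8) = 98
```

Quantized inference spends most of its time in exactly this kind of int8 dot product, which is why collapsing the multiply-widen-accumulate sequence into one instruction yields a meaningful CPU speedup.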
Anandtech compared the performance of Cascade Lake, the Epyc 7601 (soon to be outclassed by AMD's 7nm Rome CPUs, but still AMD's leading server part today), and a Titan RTX. The article, written by the excellent Johan De Gelas, examines types of neural networks beyond the CNNs (convolutional neural networks) that are typically benchmarked, and explains how a key element of Intel's strategy is to compete with Nvidia in workloads where GPUs aren't as strong or can't yet meet emerging market needs: models constrained by memory capacity (GPUs still can't match CPUs here), "light" AI models that don't require long training times, and AI models built on non-neural-network statistical methods.
Growing data center revenue is a critical part of Intel's overall AI and machine learning strategy, while Nvidia is keen to protect a market in which it currently faces virtually no competition. Intel's AI strategy is broad and spans many products, from Movidius and Nervana to DL Boost on Xeon to the upcoming Xe GPUs. Nvidia wants to show that GPUs can handle AI calculations across a wider range of workloads, while Intel is building new AI features into its existing products, readying new hardware it hopes will shake up the market, and attempting to create its first serious GPU to challenge the work AMD and Nvidia have done in the consumer market.
Anandtech's benchmarks show that, overall, the gap between Intel and Nvidia remains wide, even with DL Boost. This graph shows the results of a recurrent neural network test built around a Long Short-Term Memory (LSTM) network. A type of RNN, an LSTM "selectively remembers" patterns over time. Anandtech tested three different configurations: out-of-the-box TensorFlow installed via conda, an Intel-optimized TensorFlow installed via PyPI, and a version of TensorFlow built from source with Bazel, using the latest TensorFlow release.
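For readers unfamiliar with how an LSTM "selectively remembers," here is a minimal single-unit LSTM timestep in plain Python. This uses scalar weights for readability (real implementations like the one Anandtech benchmarked use matrix operations over many units; the weight names here are illustrative, not TensorFlow's):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One timestep of a single-unit LSTM cell.

    The gates are what make the memory selective: the forget gate f
    scales down old cell state, the input gate i admits new information,
    and the output gate o controls what the cell exposes downstream.
    """
    i = sigmoid(w["Wi"] * x + w["Ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["Wf"] * x + w["Uf"] * h_prev + w["bf"])    # forget gate
    o = sigmoid(w["Wo"] * x + w["Uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["Wg"] * x + w["Ug"] * h_prev + w["bg"])  # candidate state
    c = f * c_prev + i * g   # updated cell state (the "memory")
    h = o * math.tanh(c)     # hidden state passed to the next timestep
    return h, c

# Run the cell over a short sequence with arbitrary fixed weights.
w = {k: 0.5 for k in ("Wi", "Ui", "bi", "Wf", "Uf", "bf",
                      "Wo", "Uo", "bo", "Wg", "Ug", "bg")}
h, c = 0.0, 0.0
for x in (1.0, 0.5, -0.25):
    h, c = lstm_step(x, h, c, w)
```

Each timestep depends on the previous one, so LSTMs are far less parallel than CNNs; that sequential dependency is part of why GPUs enjoy a smaller advantage here than in the convolutional benchmarks usually quoted.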
This pair of images captures the relative scaling of the CPUs as well as the comparison with the Titan RTX. Out-of-the-box performance on AMD was quite poor, though it improved with the optimized code. Intel's performance took off like a rocket with the source-optimized build, but even that didn't come particularly close to matching the Titan RTX. De Gelas notes: "Secondly, we were quite surprised that our Titan RTX was less than three times faster than our dual Xeon setup." The full article explains how these comparisons were performed.
DL Boost isn't enough to close the gap between Intel and Nvidia, but in fairness, it was probably never supposed to. Intel's goal is to improve Xeon's AI performance enough to make running these workloads plausible on servers that will mostly be used for other purposes, or when building AI models that don't fit within the memory constraints of a modern GPU. The company's long-term plan is to compete in the AI market with a range of hardware, not just Xeons. But with Xe not yet ready, competing in this space means competing with Xeon for now.
For those of you wondering about AMD: the company doesn't really talk about running AI workloads on Epyc CPUs, focusing instead on its ROCm initiative for bringing CUDA code to its own hardware. AMD doesn't say much about this side of its business, but Nvidia dominates the GPU market for AI and HPC applications. AMD and Intel both want a piece of that space; for the moment, both are still fighting for one.