{"id":1027985,"date":"2024-02-19T02:47:51","date_gmt":"2024-02-19T07:47:51","guid":{"rendered":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/cloud-native-efficient-computing-is-the-way-in-2024-and-beyond-servethehome.php"},"modified":"2024-02-19T02:47:51","modified_gmt":"2024-02-19T07:47:51","slug":"cloud-native-efficient-computing-is-the-way-in-2024-and-beyond-servethehome","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/cloud-computing\/cloud-native-efficient-computing-is-the-way-in-2024-and-beyond-servethehome.php","title":{"rendered":"Cloud Native Efficient Computing is the Way in 2024 and Beyond &#8211; ServeTheHome"},"content":{"rendered":"<p><p>    Today we wanted to discuss cloud native and efficient    computing. Many have different names for this, but it is going    to be the second most important computing trend in 2024, behind    the AI boom. Modern performance cores have gotten so big and    fast that there is a new trend in the data center: using    smaller and more efficient cores. Over the next few months, we    are going to be doing a series on this trend.  <\/p>\n<p>    As a quick note: We get CPUs from all of the major silicon    players. Also, since we have tested these CPUs in Supermicro    systems, we are going to say that they are all sponsors of    this, but it is our own idea and content.  <\/p>\n<p>    Let us get to the basics. Once AMD re-entered the server market    (and desktop) with a competitive performance core in 2017,    performance per core and core counts exploded almost as fast as    pre-AI boom slideware on the deluge of data. As a result, cores    got bigger, cache sizes expanded, and chips got larger. Each    generation of chips got faster.  
<\/p>\n<p>    Soon, folks figured out a dirty secret in the server industry:    faster per-core performance is good if you license software by    core, but there is a wide variety of applications that need    cores, but not fast ones. Today's smaller efficient cores tend    to be on the order of performance of a mainstream Skylake\/    Cascade Lake Xeon from 2017-2021, yet they can be packed more    densely into systems.  <\/p>\n<p>    Consider this illustrative scenario that is far too common in    the industry:  <\/p>\n<p>    Here, we have several apps built by developers over the years.    Each needs its own VM and each VM is generally between 2-8    cores. These are applications that need to be online 24\/7 but    are not ones that need massive amounts of compute. Good    examples are websites that serve a specific line-of-business    function but do not have hundreds of thousands of visitors.    Also, these tend to be workloads that are already in cloud    instances, VMs, or containers. As the industry has started to    move away from hypervisors with per-core licensing or    per-socket license constraints, scaling up to bigger, faster    cores that are going underutilized makes little sense.  <\/p>\n<p>    As a result, the industry realized it needed lower-cost    chips that chase density instead of per-core    performance. A great way to think about this is to imagine    trying to fit the maximum number of instances of those    small line-of-business applications developed over the years    that are sitting in 2-8 core VMs into as few servers as    possible. There are other applications like this as well that    are commonly shown, such as nginx web servers, redis servers,    and so forth. Another great example is that some online game    instances require one core per user in the data center, even if    that core is relatively meager. Sometimes just having more    cores is, well, more cores = more better.  
<\/p>\n<p>    Once the constraints of legacy hypervisor per-core\/ per-socket    licensing are removed, then the question becomes how to fit as    many cores on a package, and then how densely those packages can    be deployed in a rack. One other trend we are seeing is not    just more cores, but also lower clock speed cores. CPUs that    have a maximum frequency in the 2-3GHz range today tend to be    considerably more power efficient than P-core only server CPUs    in the 4GHz+ range and desktop CPUs now    pushing well over 5GHz. This is the voltage-frequency curve at    work. If your goal is to have more cores, but you do not need    maximum per-core performance, then lowering the performance per    core by 25% while decreasing the power by 40% or more means that    all of those applications are being serviced with less power.  <\/p>\n<p>    Less power is important for a number of reasons. Today, the    biggest reason is the AI infrastructure build-out. If you, for    example, saw our 49ers Levi's Stadium tour video, that is a    perfect example of a data center that is not going to expand in    footprint and can only expand cooling so much. It also is a    prime example of a location that needs AI servers for sports    analytics.  <\/p>\n<p>    That type of constraint, where the same traditional work needs    to get done in a data center footprint that is not changing    while adding more high-power AI servers, is a key reason    cloud-native compute is moving beyond the cloud. Transitioning    applications running on 2017-2021 era Xeon servers to modern    cloud-native cores with approximately the same performance per    core can mean 4-5x the density per system at ~2x the power    consumption. As companies release new generations of CPUs, the    density figures are increasing at a steep rate.  
<\/p>\n<p>    We showed this at play with the same era of servers and modern    P-core servers in our     5th Gen Intel Xeon Processors Emerald Rapids review.  <\/p>\n<p>    We also covered the consolidation just between P-core    generations in the accompanying video. We are going to have an    article with the current AMD EPYC Bergamo parts very soon in    a similar vein.  <\/p>\n<p>    If you are not familiar with the current players in the    cloud-native CPU market, which you can buy for your data    centers\/ colocation, here is a quick run-down.  <\/p>\n<p>    The     AMD EPYC Bergamo was AMD's first foray into cloud-native    compute. Onboard, it has up to 128 cores\/ 256 threads, and it is    currently the densest publicly available x86 server CPU.  <\/p>\n<p>    AMD reduced the L3 cache of its P-core design, lowered the    maximum all-core frequencies to decrease the overall power, and    did extra work to decrease the core size. The result is the    same Zen 4 core IP, with less L3 cache and less die area. Less    die area means more cores can be packaged together onto a CPU.  <\/p>\n<p>    Some stop with Bergamo, but AMD has another Zen 4c chip in the    market. The AMD EPYC 8004 series, codenamed Siena, also uses    Zen 4c, but with half the memory channels, less PCIe Gen5 I\/O,    and single-socket-only operation.  <\/p>\n<p>    Some organizations that are upgrading from popular dual 16-core    Xeon servers can move to single-socket 64-core Siena platforms    and stay within a similar power budget per U while doubling the    core count per U using 1U servers.  <\/p>\n<p>    AMD markets Siena as the edge\/ embedded part, but we need to    recognize this is in the vein of current-gen cloud-native    processors.  <\/p>\n<p>    Arm has been making a huge splash in the space. The only Arm    server CPU vendor out there for those buying their own servers    is Ampere, led by many former members of the Intel Xeon team.  
<\/p>\n<p>    Ampere has two main chips, the Ampere Altra (up to 80 cores)    and Altra Max (up to 128 cores). These use the same socket, so    most servers can support either. The Max just came out later    to support up to 128 cores.  <\/p>\n<p>    Here, the focus on cloud-native compute is even more    pronounced. Instead of having beefy floating-point compute    capabilities, Ampere is using Arm    Neoverse N1 cores that focus on low-power integer    performance. It turns out, a huge number of workloads, like    serving web pages, are mostly integer-performance driven. While    these may not be the cores you would pick to build a Linpack    Top500 supercomputer, they are great for web servers. Since the    cloud-native compute idea was to build cores and servers that    can run workloads with little to no compromise, but at lower    power, that is what Arm and Ampere built.  <\/p>\n<p>    Next up will be the AmpereOne. This is already shipping, but we    have yet to get one in the lab.  <\/p>\n<p>    AmpereOne uses a custom-designed core for up to 192 cores per    socket.  <\/p>\n<p>    Assuming you could buy a server with AmpereOne, you would get    more core density than an AMD EPYC Bergamo server (192 vs 128    cores), but you would get fewer threads (192 vs 256 threads). If    you had 1 vCPU VMs, AmpereOne would be denser. If you had 2    vCPU VMs, Bergamo would be denser. SMT has been a challenge in    the cloud due to some of the security surfaces it exposes.  <\/p>\n<p>    Next in the market will be the Intel Sierra Forest. Intel's new    cloud-native processor will offer up to 144\/     288 cores. Perhaps most importantly, it is aiming for a low    power-per-core metric while also maintaining x86 compatibility.  <\/p>\n<p>    Intel is taking its efficient E-core line and bringing it to    the Xeon market. 
We have seen massive gains in E-core    performance in both embedded and lower-power lines like    the Alder Lake-N, where we saw greater than 2x generational    performance per chip. Now, Intel is splitting its line into    P-cores for compute-intensive workloads and E-cores for    high-density scale-out compute.  <\/p>\n<p>    Intel will offer Granite Rapids as an update to the current 5th    Gen Xeon Emerald Rapids for all P-core designs later in 2024.    Sierra Forest will be the first-generation all E-core design    and is planned for the first half of 2024. Intel already has    announced that the next generation,     Clearwater Forest, will continue the all E-core line. As a    full disclosure, this is a launch I have been excited about for    years.  <\/p>\n<p>    We are going to quickly mention the NVIDIA Grace Superchip    here, with up to 144 cores across two dies packaged along with    LPDDR memory.  <\/p>\n<p>    While at 500W and using Arm    Neoverse V2 performance cores one would not think of this    as a cloud-native processor, it does have something really    different. The Grace Superchip has onboard memory packaged    alongside its Arm CPUs. As a result, that 500W is actually for    CPU and memory. There are applications that are primarily    memory-bandwidth bound, not necessarily core-count bound. For    those applications, something like a Grace Superchip can    actually end up being a lower-power solution than some of the    other cloud-native offerings. These are also not the easiest to    get, and are priced at a significant premium. One could easily    argue these are not cloud-native, but if our definition is    doing the same work in a smaller, more efficient footprint, then    the Grace Superchip might actually fall into that category for    a subset of workloads.  <\/p>\n<p>    If you were excited for our     2nd to 5th Gen Intel Xeon server consolidation piece, get    ready. 
To say that the piece we did in late 2023 was just the    beginning would be an understatement.  <\/p>\n<p>    While many are focused on AI build-outs, projects to shrink    portions of existing compute footprints by 75% or more are    certainly possible, making more space, power, and cooling    available for new AI servers. Also, just from a carbon    footprint perspective, using newer and significantly more    power-efficient architectures to do baseline application    hosting makes a lot of sense.  <\/p>\n<p>    The big question in the industry right now on CPU compute is    whether cloud-native, energy-efficient computing is going to be    25% of the server CPU market in 3-5 years, or if it is going to    be 75%. My sense is that it likely could be 75%, or perhaps    should be 75%, but organizations are slow to move. So at STH,    we are going to be doing a series to help overcome that    organizational inertia and get compute on the right-sized    platforms.  <\/p>\n<p><!-- Auto Generated --><\/p>\n<p>More: <\/p>\n<p><a target=\"_blank\" rel=\"nofollow noopener\" href=\"https:\/\/www.servethehome.com\/cloud-native-efficient-computing-is-the-way-in-2024-and-beyond-supermicro-intel-amd-ampere-nvidia\" title=\"Cloud Native Efficient Computing is the Way in 2024 and Beyond - ServeTheHome\">Cloud Native Efficient Computing is the Way in 2024 and Beyond - ServeTheHome<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Today we wanted to discuss cloud native and efficient computing. Many have different names for this, but it is going to be the second most important computing trend in 2024, behind the AI boom. Modern performance cores have gotten so big and fast that there is a new trend in the data center: using smaller and more efficient cores.  
<a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/cloud-computing\/cloud-native-efficient-computing-is-the-way-in-2024-and-beyond-servethehome.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[494695],"tags":[],"class_list":["post-1027985","post","type-post","status-publish","format-standard","hentry","category-cloud-computing"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/1027985"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=1027985"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/1027985\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=1027985"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=1027985"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=1027985"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}