{"id":227015,"date":"2017-07-11T10:54:18","date_gmt":"2017-07-11T14:54:18","guid":{"rendered":"http:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/ethernet-getting-back-on-the-moores-law-track-the-next-platform.php"},"modified":"2017-07-11T10:54:18","modified_gmt":"2017-07-11T14:54:18","slug":"ethernet-getting-back-on-the-moores-law-track-the-next-platform","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/moores-law\/ethernet-getting-back-on-the-moores-law-track-the-next-platform.php","title":{"rendered":"Ethernet Getting Back On The Moore&#8217;s Law Track &#8211; The Next Platform"},"content":{"rendered":"<p><p>    July 10, 2017 Timothy    Prickett Morgan  <\/p>\n<p>    It would be ideal if we lived in a universe where it was    possible to increase the capacity of compute, storage, and    networking at the same pace so as to keep all three elements    expanding in balance. The irony is that over the past two    decades, when the industry needed for networking to advance the    most, Ethernet got a little stuck in the mud.  <\/p>\n<p>    But Ethernet has pulls out of its boots and left them in the    swamp and is back to being barefoot again on much more solid    ground where it can run faster. The move from 10 Gb\/sec to 40    Gb\/sec was slow and costly, and if it were not for the    hyperscalers and their intense bandwidth hunger we might not    even be at 100 Gb\/sec Layer 2 and Layer 3 switching, much less    standing at the transition to 200 Gb\/sec and looking ahead to    the not-to-distant future when 400 Gb\/sec will be available.  <\/p>\n<p>    Bandwidth has come along just at the right moment, when    advances in CPU throughput are stalling as raw core performance    did a decade ago and as new adjunct processing capabilities,    embodied in GPUs, FPGAs, and various kinds of specialized    processors are coming to market to get compute back on the    Moores Law track. Storage, thanks to flash and persistent    flash-like and DRAM-like memories such as 3D XPoint from Intel    and Micron Technology, is also undergoing an evolution. It is a    fun time to be a system architect, but perhaps only because we    know that with these advanced networking options that bandwidth    is not going to be a bottleneck.  <\/p>\n<p>    The innovation that is allowing Ethernet to not leap ahead so    much as jump to where it should have already been is PAM-4    signaling. The typical non-return to zero, or NRZ, modulation    used with Ethernet switching hardware, cabling, and server    adapters can encode one bit on a signal. With pulse amplitude    modulation, or PAM, multiple levels of signaling can be    encoded, so multiple bits can be encoded in the signal. With    PAM-4, there are four levels of signaling which allow for two    bits of data to be encoded at the same time on the signal,    which doubles the effective bandwidth of a signal without    increasing the clock rate. And looking ahead down the road,    there is a possibility of stuffing even more bits in the wire    using higher levels of PAM, and the whiteboards of the    networking world are sketching out how to do three bits per    signal with PAM-8 encoding and four bits per signal with PAM-16    encoding.  <\/p>\n<p>    With 40 Gb\/sec Ethernet, we originally had 10 Gb\/sec lanes    aggregated. This was not a very energy efficient way to do 40    Gb\/sec, and it was even worse for early 100 Gb\/sec Ethernet    aggregation gear, which ganged up ten 10 Gb\/sec lanes. When the    hyperscalers nudged the industry along in July 2014 to backcast    this 25 GHz (well, really 28 GHz before encoding) to 25 Gb\/sec    and 50 Gb\/sec Ethernet switching with backwards compatibility    to run 10 Gb\/sec and 40 Gb\/sec, the industry did it. So we got    to affordable 100 Gb\/sec switching with four lanes running at    25 Gb\/sec, and there were even cheaper 25 Gb\/sec and 50 Gb\/sec    options for situations where bandwidth needs were not as high,    and at a much better cost. (Generally, you got 2.5X the    bandwidth for 1.5X to 1.8X the cost, depending on the switch    configuration.)  <\/p>\n<p>    With the 200 Gb\/sec Spectrum-2 Ethernet switching that Mellanox    Technologies is rolling out, and that other switch makers are    going to adopt, the signaling is still running at 25 GHz    effective, but with the Spectrum-2 gear Mellanox has just    unveiled, it is layering on PAM-4 modulation to double pump the    wires, so it delivers 50 Gb\/sec per lane even though it is    still running at the same speed as 100 Gb\/sec Ethernet lanes.    And to reach 400 Gb\/sec with Spectrum-2 gear, Mellanox is    planning to widen out to eight lanes running at this 25 GHz    (effective) while layering on PAM-4 modulation to get 100    Gb\/sec effective per lane. At some point, the lane speed will    have to increase to 50 GHz, but with PAM-8 modulation the    switching at eight lanes could be doubled again to 800 GB\/sec,    and with PAM-16 you could hit 1.6 TB\/sec. Adding in the 50 GHz    real signaling here would get us to 3.2 TB\/sec  something that    still probably seems like a dream and that is probably also    very far into the future.  <\/p>\n<p>    This all sounds a lot easier in theory than it will be to    actually engineer, Kevin Deierling, vice president of marketing    at Mellanox, tells The Next Platform. You can go to    PAM-8 and you can go to Pam-16, but when you do that, you are    starting to shrink the signal and it gets harder and harder to    discriminate from one level in the signal and the next. Your    signal-to-noise ratio goes away because you are shrinking your    signal. Some folks are saying lets go to PAM-8 modulation, and    other folks are saying that they need to use faster signaling    rates like 50 GHz. I think we will see a combination of both.  <\/p>\n<p>    The sweet thing about using PAM-4 to get to 200 Gb\/sec    switching is that the same SFP28 and QSFP28 adapters and cables    that were used for 100 Gb\/sec switching (and that are used for    the 200 Gb\/sec Quantum HDR InfiniBand     that was launched by Mellanox last year and that will start    shipping later this year) are used for the doubled up Ethernet    speed bump. You need better copper cables for Spectrum-2    because the signal-to-noise ratio is shrinking, and similarly    the optical transceivers need to be tweaked for the same    reason. But the form factors for the adapters and switch ports    remain the same.  <\/p>\n<p>    With the 400 Gb\/sec Spectrum-2 switching, the adapters have new    wider form factors, with Mellanox supporting the QSFP-DD (short    for double density) option instead of the OSFP (short for Octal    Small Form Factor) option for optical ports. Deierling says    Mellanox will let the market decide and support whatever it    wants  one, the other, or both  but it is starting with    QSFP-DD.  <\/p>\n<p>    The Spectrum-2 ASIC can deliver 6.4 Tb\/sec of aggregate    switching bandwidth, and it can be carved up in a bunch of    ways, including 16 ports at 400 Gb\/sec, 32 ports at 200 Gb\/sec,    64 ports at 100 Gb\/sec (using splitter cables), and 128 ports    running at 25 Gb\/sec or 50 Gb\/sec (again, using splitter    cables). The Spectrum-2 chip can handle up to 9.52 billion    packets per second, and has enough on chip SRAM to handle    access control lists (ACLs) that span up to 512,000; with one    of the 200 Gb\/sec ports and a special FPGA accelerator that is    designed to act as an interface to a chunk of external DRAM    next to the chip, the Spectrum-2 can handle up to 2    million additional routes on the ACL  what Deierling says is    the first internet-scale Ethernet switch based on a commodity    ASIC that is suitable for hyperscaler-class customers who want    to do Layer 3 routing on a box at the datacenter scale.  <\/p>\n<p>    As for latency, which is something that everyone is always    concerned with, the port-to-port hop on the Spectrum-2 switch    is around 300 nanoseconds, and this is about as low as the    Ethernet protocol, which imposes a lot of overhead, can go,    according to Deierling. The SwitchX-2 and Quantum InfiniBand    ASICs from Mellanox can push latencies down to 100 nanoseconds    or a tiny bit lower, but that is where InfiniBand hits a wall.  <\/p>\n<\/p>\n<p>    At any rate, Mellanox reckons that Spectrum-2 has the advantage    in switching capacity, with somewhere between 1.6X and 1.8X the    aggregate switching bandwidth compared to its competition  and    without packet loss  and somewhere on the order of 1.5X to    1.7X lower latency, too.  <\/p>\n<p>    At the moment, Mellanox is peddling four different    configurations of its Spectrum-2 switches, which are shown    below:  <\/p>\n<\/p>\n<p>    The Spectrum-2 switches are being made available in two    different form factors, two full width devices and two half    width devices. The SN3700 has a straight 32 ports running at    200 Gb\/sec for flat, Clos style networks, while the SN3410 has    48 ports running at 50 Gb\/sec with eight uplinks running at 200    Gb\/sec for more standard three tiered networks used in the    enterprise and sometimes on the edges of the datacenter at    hyperscalers. The SN3100 is a half-width switch that has 16    ports running at 200 Gb\/sec, and the SN3200 has 16 ports    running at 400 Gb\/sec.  <\/p>\n<p>    It is interesting that there is not a full width SN series    switch with 400 Gb\/sec ports. This is intentionally so and    based on the expected deployment scenarios. In scenarios where    a very high bandwidth switch is needed to create a storage    cluster or a hyperconverged storage platform, 16 ports in a    rack is enough and two switches at 16 ports provides redundant    paths between compute and storage or hyperconverged    compute-storage nodes to prevent outages.  <\/p>\n<\/p>\n<p>    There is even a scenarios that, using the VMS Wizard software    for the Spectrum-2 switch that converts a quad of the 2100    Gb\/sec and 400 Gb\/sec switches that creates a virtual modular    switch that with 64 of the SN3410 devices that can support up    to 3,072 ports in a single management domain. Take a look:  <\/p>\n<\/p>\n<p>    This Virtual Modular Switch is about 25 percent less expensive    than actual modular switches with the same port count and lower    bandwidth and higher latency.  <\/p>\n<p>    Programmability is a big issue with networking these days, and    the Spectrum-2 devices will be fully programmable and support    both a homegrown compiler and scripting stack created by    Mellanox as well as     the P4 compiler that was created by Barefoot Networks for its    Tofino Ethernet switch ASICs and that is being    standardized upon by some hyperscalers. Mellanox expects for    hyperscalers to want to do a lot of their own programming, but    that most enterprise customers will simply run the protocols    and routines that Mellanox itself codes for the machines. The    point is, when a new protocol or extension comes along,    Spectrum-2 will be able to adopt it and customers will not have    to wait until new silicon comes out. The industry waited far    too long for VXLAN to be supported in chips, and that will not    happen again.  <\/p>\n<p>    As for pricing, the more bandwidth you get, the more you pay,    but the cost per bit keeps coming down and will for the 200    Gb\/sec and 400 Gb\/sec speeds embodied in the Spectrum-2 lineup.    Pricing depends on volumes and on the cabling, of course, but    here is how it generally looks. With the jump from 40 Gb\/sec to    100 Gb\/sec switching (based on the 25G standard), customers got    a 2.5X bandwidth boost for somewhere between 1.5X and 1.8X the    price  somewhere around a 20 percent to 30 percent    price\/performance benefit. Today, almost two years later, 100    Gb\/sec ports are at price parity with 40 Gb\/sec ports back    then, and Deierling says that a 100 Gb\/sec port costs around    $300 for a very high volume hyperscaler and something like $600    per port for a typical enterprise customer. The jump to 200    Gb\/sec will follow a similar pattern. Customers moving from 100    Gb\/sec to 200 Gb\/sec switches (moving from Spectrum to    Spectrum-2 devices in the Mellanox lineup) will get 2X the    bandwidth for 1.5X the cost. Similarly, those jumping from 100    Gb\/sec to 400 Gb\/sec will get 4X the bandwidth per port for 3X    the cost.  <\/p>\n<p>    Over time, we expect that there will be price parity between    100 Gb\/sec pricing today and 200 Gb\/sec pricing, perhaps two    years hence, and that the premium for 400 Gb\/sec will be more    like 50 percent than 100 percent. But those are just guesses. A    lot depends on what happens in the enterprise. What we do know    is that enterprises are increasingly being forced by their    applications and the latency demands of their end user    applications to deploy the kind of fat tree networks that are    common at HPC centers and hyperscalers and they are moving away    from the over-subscribed, tiered networks of the past where    they could skimp on the switch devices and hope the latencies    were not too bad.  <\/p>\n<p>    Categories: Cloud, Connect, Enterprise, Hyperscale  <\/p>\n<p>    Tags: Ethernet, Mellanox, PAM-4,    Spectrum-2  <\/p>\n<p>    Parameter Encoding on FPGAs Boosts Neural Network    Efficiency OpenPower, Efficiency Tweaks Define Europes DAVIDE    Supercomputer  <\/p>\n<p><!-- Auto Generated --><\/p>\n<p>See original here: <\/p>\n<p><a target=\"_blank\" href=\"https:\/\/www.nextplatform.com\/2017\/07\/10\/ethernet-getting-back-moores-law-track\/\" title=\"Ethernet Getting Back On The Moore's Law Track - The Next Platform\">Ethernet Getting Back On The Moore's Law Track - The Next Platform<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> July 10, 2017 Timothy Prickett Morgan It would be ideal if we lived in a universe where it was possible to increase the capacity of compute, storage, and networking at the same pace so as to keep all three elements expanding in balance. The irony is that over the past two decades, when the industry needed for networking to advance the most, Ethernet got a little stuck in the mud. But Ethernet has pulls out of its boots and left them in the swamp and is back to being barefoot again on much more solid ground where it can run faster <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/moores-law\/ethernet-getting-back-on-the-moores-law-track-the-next-platform.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[14],"tags":[],"class_list":["post-227015","post","type-post","status-publish","format-standard","hentry","category-moores-law"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/227015"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=227015"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/227015\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=227015"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=227015"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=227015"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}