Championship 2016/17 table: Super Computer predicts end of season final places – *February update* – talkSPORT.com

Monday, February 20, 2017

With two-thirds of the Championship season played, the table is finally starting to take shape.

Many people expected surprise packages Huddersfield Town and Leeds United to drop off the pace, while big-spending Aston Villa have underachieved in their attempts to make an instant return to the Premier League.

Brighton and Newcastle have been fighting for top spot all season, while Norwich's recent defeat at Burton leaves a seven-point gap between sixth [the final play-off place] and seventh.

At the other end of the table, barring a miracle upturn in form like last season under Neil Warnock, Rotherham United look destined for the drop and it is a question of who will be joining them in May.

Birmingham City continue to slide under Gianfranco Zola, while Bristol City and Aston Villa are on poor runs.

talkSPORT has decided to submit this season's data into the Super Computer to see how the Championship table will look in May. At this stage, no one can say with any certainty, but our popular system has had a good go at predicting the final standings. Based on information available on February 20, 2017, here is how the Super Computer predicts the final standings will look, from 24th up to first.

The table, of course, should be taken with a pinch of salt! Anything can happen over the next three months, but it is always fun to speculate. Where do you think your team will finish? Share your thoughts by leaving a comment below...

China’s new supercomputer will be 10 times faster – Economic Times

BEIJING: China has started to build a new-generation supercomputer that is expected to be 10 times faster than the current world champion, a media report said.

This year, China is aiming for breakthroughs in high-performance processors and other key technologies to build the world's first prototype exascale supercomputer, the Tianhe-3, said Meng Xiangfei, the director of application at the National Super Computer Tianjin Centre, on Monday.

The prototype is expected to be completed by 2018, the China Daily reported.

"Exascale" means it will be capable of making a quintillion (1 followed by 18 zeros) calculations per second. That is at least 10 times faster than the world's current speed champ, the Sunway TaihuLight, China's first supercomputer to use domestically designed processors. That computer has a peak speed of 125 quadrillion (1 followed by 15 zeros) calculations per second, he said.

"Its computing power is on the next level, cementing China as the world leader in supercomputer hardware," Meng said.

It would be available for public use and "help us tackle some of the world's toughest scientific challenges with greater speed, precision and scope", he added.

Tianhe-3 will be made entirely in China, from processors to operating system. It will be stationed in Tianjin and fully operational by 2020, earlier than the US plan for its exascale supercomputer, he said.

Tianhe-1, China's first quadrillion-level supercomputer developed in 2009, is now working at full capacity, undertaking more than 1,400 assignments each day, solving problems "from stars to cells".

The exascale supercomputer will be able to analyse smog distribution on a national level, while current models can only handle a district, the daily said.

Tianhe-3 also could simulate earthquakes and epidemic outbreaks in more detail, allowing swifter and more effective government responses, Meng said.

The new machine also will be able to analyse gene sequences and protein structures at unprecedented scale and speed. That may lead to new discoveries and more potent medicine, he said.

Next-Generation TSUBAME Will Be Petascale Supercomputer for AI – TOP500 News

The Tokyo Institute of Technology, also known as Tokyo Tech, has revealed that the TSUBAME 3.0 supercomputer scheduled to be installed this summer will provide 47 half precision (16-bit) petaflops of performance, making it one of the most powerful machines on the planet for artificial intelligence computation. The system is being built by HPE/SGI and will feature NVIDIA's Tesla P100 GPUs.

Source: Tokyo Institute of Technology

For Tokyo Tech, the use of NVIDIA's latest P100 GPUs is a logical step in TSUBAME's evolution. The original 2006 system used ClearSpeed boards for acceleration, but was upgraded in 2008 with the Tesla S1040 cards. In 2010, TSUBAME 2.0 debuted with the Tesla M2050 modules, while the 2.5 upgrade included both the older S1050 and S1070 parts plus the newer Tesla K20X modules. Bringing the P100 GPUs into the TSUBAME lineage will not only help maintain backward compatibility for the CUDA applications developed on the Tokyo Tech machines for the last nine years, but will also provide an excellent platform for AI/machine learning codes.

In a press release from NVIDIA published Thursday, Tokyo Tech's Satoshi Matsuoka, a professor of computer science who is building the system, said, "NVIDIA's broad AI ecosystem, including thousands of deep learning and inference applications, will enable Tokyo Tech to begin training TSUBAME 3.0 immediately to help us more quickly solve some of the world's once unsolvable problems."

For Tokyo Tech's supercomputing users, it's a happy coincidence that the latest NVIDIA GPU is such a good fit with regard to AI workloads. Interest in artificial intelligence is especially high in Japan, given the country's manufacturing heritage in robotics and what seems to be almost a cultural predisposition to automate everything.

When up and running, TSUBAME 3.0 will operate in conjunction with the existing TSUBAME 2.5 supercomputer, providing a total of 64 half precision petaflops. That would make it Japan's top AI system, although the title is likely to be short-lived. The Tokyo-based National Institute of Advanced Industrial Science and Technology (AIST) is also constructing an AI-capable supercomputer, which is expected to supply 130 half precision petaflops when it is deployed in late 2017 or early 2018.

Although NVIDIA and Tokyo Tech are emphasizing the AI capability of the upcoming system, like its predecessors, TSUBAME 3.0 will also be used for conventional 64-bit supercomputing applications, and will be available to Japans academic research community and industry partners. For those traditional HPC tasks, it will rely on its 12 double precision petaflops, which will likely earn it a top 10 spot on the June TOP500 list if they can complete a Linpack run in time.

The system itself is a 540-node SGI ICE XA cluster, with each node housing two Intel Xeon E5-2680 v4 processors, four NVIDIA Tesla P100 GPUs, and 256 GB of main memory. The compute nodes will talk to each other via Intel's 100 Gbps Omni-Path network, which will also be extended to the storage subsystem.
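
As a back-of-the-envelope check on those figures, the sketch below estimates the cluster's GPU-only double precision peak from the node count quoted above. The per-GPU number (roughly 5.3 FP64 teraflops for a Tesla P100) is an assumption drawn from NVIDIA's published peak specifications, not from the TSUBAME 3.0 announcement itself.

```cpp
// back_of_envelope_tsubame.cpp
// Rough peak-FLOPS estimate for a GPU cluster from its node count.
#include <iostream>

int main() {
    const double nodes = 540;          // SGI ICE XA nodes (from the article)
    const double gpus_per_node = 4;    // Tesla P100s per node (from the article)
    const double tflops_per_gpu = 5.3; // assumed FP64 peak per P100 (SXM2)

    double gpu_petaflops = nodes * gpus_per_node * tflops_per_gpu / 1000.0;
    std::cout << "GPU-only FP64 peak: ~" << gpu_petaflops << " petaflops\n";
    // ~11.4 PF from the GPUs alone; the Xeon hosts account for most of the
    // remaining gap to the ~12 PF double-precision figure quoted above.
    return 0;
}
```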

Speaking of which, the storage infrastructure will be supplied by Data Direct Networks (DDN) and will provide 15.9 petabytes of Lustre file system capacity based on three ES14KX appliances. The ES14KX is currently DDN's top-of-the-line file system storage appliance, delivering up to 50 GB/s of I/O per enclosure. It can theoretically scale to hundreds of petabytes, so the TSUBAME 3.0 installation will be well within the product's reach.

Energy efficiency is also likely to be a feature of the new system, thanks primarily to the highly proficient P100 GPUs. In addition, the TSUBAME 3.0 designers are equipping the supercomputer with a warm water cooling system and are predicting a PUE (Power Usage Effectiveness) as low as 1.033. That should enable the machine to run at top speed without the need to throttle it back during heavy use. A top 10 spot on the Green500 list is all but assured.
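
For readers unfamiliar with the metric, PUE is total facility power divided by the power consumed by the computing equipment itself, so a value of 1.033 implies only about 3.3 percent overhead for cooling and power delivery. The toy calculation below spells that out; the 1 MW IT load is a hypothetical figure for illustration, not TSUBAME 3.0's actual draw.

```cpp
// pue_overhead.cpp
// PUE = total facility power / IT equipment power, so the cooling and
// power-delivery overhead is simply (PUE - 1) times the IT load.
#include <iostream>

int main() {
    const double pue = 1.033;           // value predicted for TSUBAME 3.0
    const double it_power_kw = 1000.0;  // hypothetical 1 MW IT load
    double overhead_kw = it_power_kw * (pue - 1.0);
    std::cout << "Facility overhead at PUE " << pue << ": "
              << overhead_kw << " kW per MW of compute\n";  // ~33 kW
    return 0;
}
```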

Inside the race to build the fastest ever supercomputer – EWN – Eyewitness News

The fastest supercomputer in the world, the Sunway TaihuLight, is about to lose its title with the Japanese planning to build something even faster.

When China unveiled the Sunway TaihuLight in June 2016, it became the fastest supercomputer in the world. It easily surpassed the previous record holder, Tianhe-2. It's almost three times as fast. But now, the title it has held for less than a year is under threat, with the Japanese planning to build something even faster.

The Ministry of Economy, Trade and Industry plans to invest 19.5 billion yen ($172 million) in the new machine, as part of an attempt to revitalise Japan's electronics industry and reassert Japan's technical dominance.

Recent years have seen Japan's lead challenged by competition from South Korea and China, but the Japanese government hopes to reverse that trend.

IMMENSE COMPUTING POWER

The new machine, called the AI Bridging Cloud Infrastructure, or ABCI, is designed to have a capacity of 130 petaflops. That means it will be able to perform 130 quadrillion calculations a second. Still confused? Well, for the sake of easy comparison, that's equal to the computing power of 70,652 PlayStation 4s.
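
That console comparison can be sanity-checked in a few lines: assuming a PlayStation 4 peaks at roughly 1.84 teraflops (its commonly quoted GPU figure, which is our assumption rather than something stated in the article), 130 petaflops works out to about 70,652 consoles.

```cpp
// ps4_equivalence.cpp
// Sanity check on the "70,652 PlayStation 4s" comparison.
#include <iostream>

int main() {
    const double abci_petaflops = 130.0;
    const double ps4_teraflops  = 1.84;  // assumed per-console peak
    double consoles = abci_petaflops * 1000.0 / ps4_teraflops;
    std::cout << "ABCI ~= " << consoles << " PS4s\n";  // ~70,652
    return 0;
}
```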

As well as out-computing the current Chinese machine, it will also be nearly ten times as fast as the Oakforest-PACS, the current fastest Japanese supercomputer, whose 13.6 petaflops will be dwarfed by those of the new machine. "As far as we know, there is nothing out there that is as fast," said Satoshi Sekiguchi, director general at Japan's National Institute of Advanced Industrial Science and Technology.

Perhaps the most ambitious aspect of the proposed machine, though, is its hyper-efficient power consumption. The computer's designers are aiming for a power consumption of less than three megawatts, about a fifth of what the TaihuLight draws and roughly the same as the Oakforest-PACS, whose output is ten times lower than ABCI's.

THE APPLICATION

While other countries have optimised their most powerful computers for processes such as atmospheric modelling or nuclear weapon simulations, AIST aims to use the new machine to accelerate advances in AI technology. ABCI could help companies improve driverless vehicles by analysing vast amounts of traffic data. According to Sekiguchi, the supercomputer could also be used to mine medical records to develop new services and applications.

The computer will also be made available to Japanese corporations for a fee, said Sekiguchi, alongside others involved in the project. Japanese companies currently outsource their major data-crunching to foreign firms such as Google or Microsoft.

THE FUTURE

Japan hopes that ABCI will be operational by 2018, whereupon it will take the top spot on the TOP500's ranking list of supercomputers.

It might not stay there for very long, though. Computer manufacturer Atos has already begun work on the Bull Sequana supercomputer for the French Alternative Energies and Atomic Energy Commission (CEA). This machine is projected to have a performance of one exaflop, meaning that it will be able to perform a quintillion (a billion billion) calculations a second - almost seven and a half times faster than the ABCI.

The French machine won't be operational until 2020, however, meaning ABCI should still enjoy a spot in the supercomputing limelight.

This article was republished courtesy of World Economic Forum.

Written by Robert Guy, content producer, Formative Content.

Nvidia Will Power Japan’s fastest AI Super Computer, Tsubame 3.0 Launching This Summer – SegmentNext

Nvidia will partner with the Tokyo Institute of Technology on Japan's fastest AI supercomputer, known as Tsubame 3.0. Tsubame 3.0 is said to be a big step up from its predecessor, Tsubame 2.5; Nvidia will provide the Pascal-based Tesla P100 GPU technology that accelerates Tsubame 3.0's performance.

More precisely, the new fastest AI supercomputer, Tsubame 3.0, is expected to be three times more efficient than its predecessor. This will be possible only with the latest GPU technology from Nvidia: the new GPUs are reported to deliver up to 12.2 petaflops of double precision performance.

Additionally, Nvidia says that the combination of Tsubame 3.0 and Tsubame 2.5 will pack a performance of 64.3 petaflops, which would place the combined system among the top 10 supercomputers in the world.

Tokyo Tech's Satoshi Matsuoka, a professor of computer science who is building the fastest AI supercomputer, further praised the partnership with Nvidia and said:

"Nvidia's broad AI ecosystem, including thousands of deep learning and inference applications, will enable Tokyo Tech to begin training Tsubame 3.0 immediately to help us more quickly solve some of the world's once unsolvable problems."

In parallel news, Nvidia's top tier yet reasonably priced GPU may be revealed soon at Nvidia's GDC event. The GTX 1080 Ti is rumored to be in production and may arrive in late March.

The graphics card is expected to launch between March 20th and 23rd. The custom designed AIB versions of Nvidia's GTX 1080 Ti are expected to come out in the market sometime after the launch. Since the GeForce GTX 1080 Ti will be slightly less powerful than the Titan X, its price tag is expected to fall in the range of $750 to $900 at launch.

The graphics card is based on Nvidia's GP102 silicon. Nvidia will host its own show at GDC 2017, and we look forward to learning more about the GeForce GTX 1080 Ti there.

Stampede Supercomputer Assists With Real-Time MRI Analysis – HPCwire (blog)

Feb. 17 One of the main tools doctors use to detect diseases and injuries in cases ranging from multiple sclerosis to broken bones is magnetic resonance imaging (MRI). However, the results of an MRI scan take hours or days to interpret and analyze. This means that if a more detailed investigation is needed, or there is a problem with the scan, the patient needs to return for a follow-up.

A new, supercomputing-powered, real-time analysis system may change that.

Researchers from the Texas Advanced Computing Center (TACC), The University of Texas Health Science Center (UTHSC) and Philips Healthcare, have developed a new, automated platform capable of returning in-depth analyses of MRI scans in minutes, thereby minimizing patient callbacks, saving millions of dollars annually, and advancing precision medicine.

The team presented a proof-of-concept demonstration of the platform at the International Conference on Biomedical and Health Informatics this week in Orlando, Florida.

The platform they developed combines the imaging capabilities of the Philips MRI scanner with the processing power of the Stampede supercomputer, one of the fastest in the world, using the TACC-developed Agave API Platform infrastructure to facilitate communication, data transfer, and job control between the two.

An API, or Application Program Interface, is a set of protocols and tools that specify how software components should interact. Agave manages the execution of the computing jobs and handles the flow of data from site to site. It has been used for a range of problems, from plant genomics to molecular simulations, and allows researchers to access cyberinfrastructure resources like Stampede via the web.
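
To give a feel for what "job control" looks like from the client side, here is a minimal sketch of submitting a compute job to a REST service such as Agave using libcurl. The endpoint URL, bearer token, and JSON fields are placeholders invented for illustration; they are not the real Agave job-submission schema, which is documented by the Agave/Tapis project.

```cpp
// submit_job_sketch.cpp
// Illustrative only: POST a JSON job description to a REST endpoint.
#include <curl/curl.h>
#include <string>
#include <iostream>

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL* curl = curl_easy_init();
    if (!curl) return 1;

    // Hypothetical job description: which app to run, where the input lives.
    const std::string job_json = R"({
        "name": "mri-analysis-demo",
        "appId": "example-mri-pipeline-1.0",
        "inputs": { "scan": "agave://example-storage/scan001.dcm" }
    })";

    struct curl_slist* headers = nullptr;
    headers = curl_slist_append(headers, "Content-Type: application/json");
    headers = curl_slist_append(headers, "Authorization: Bearer EXAMPLE_TOKEN");

    curl_easy_setopt(curl, CURLOPT_URL, "https://example.tenant.org/jobs/v2"); // placeholder URL
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, job_json.c_str());

    CURLcode rc = curl_easy_perform(curl);  // send the job request
    if (rc != CURLE_OK)
        std::cerr << "request failed: " << curl_easy_strerror(rc) << "\n";

    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return rc == CURLE_OK ? 0 : 1;
}
```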

"The Agave Platform brings the power of high-performance computing into the clinic," said William (Joe) Allen, a life science researcher for TACC and lead author on the paper. "This gives radiologists and other clinical staff the means to provide real-time quality control, precision medicine, and overall better care to the patient."

Source: Aaron Dubrow, TACC

Cheyenne Supercomputer Triples Scientific Capability with Greater Efficiency – Scientific Computing

The National Center for Atmospheric Research (NCAR) is launching operations this month of one of the world's most powerful and energy-efficient supercomputers, providing the nation with a major new tool to advance understanding of the atmospheric and related Earth system sciences.

Named "Cheyenne," the 5.34-petaflop system is capable of more than triple the amount of scientific computing performed by the previous NCAR supercomputer, Yellowstone. It also is three times more energy efficient.

Scientists across the country will use Cheyenne to study phenomena ranging from wildfires and seismic activity to gusts that generate power at wind farms. Their findings will lay the groundwork for better protecting society from natural disasters, lead to more detailed projections of seasonal and longer-term weather and climate variability and change, and improve weather and water forecasts that are needed by economic sectors from agriculture and energy to transportation and tourism.

"Cheyenne will help us advance the knowledge needed for saving lives, protecting property, and enabling U.S. businesses to better compete in the global marketplace," said Antonio J. Busalacchi, president of the University Corporation for Atmospheric Research. "This system is turbocharging our science."

UCAR manages NCAR on behalf of the National Science Foundation (NSF).

Cheyenne currently ranks as the 20th fastest supercomputer in the world and the fastest in the Mountain West, although such rankings change as new and more powerful machines begin operations. It is funded by NSF as well as by the state of Wyoming through an appropriation to the University of Wyoming.

Cheyenne is housed in the NCAR-Wyoming Supercomputing Center (NWSC), one of the nation's premier supercomputing facilities for research. Since the NWSC opened in 2012, more than 2,200 scientists from more than 300 universities and federal labs have used its resources.

"Through our work at the NWSC, we have a better understanding of such important processes as surface and subsurface hydrology, physics of flow in reservoir rock, and weather modification and precipitation stimulation," said William Gern, vice president of research and economic development at the University of Wyoming. "Importantly, we are also introducing Wyomings school-age students to the significance and power of computing."

The NWSC is located in Cheyenne, and the name of the new system was chosen to honor the support the center has received from the people of that city. The name also commemorates the upcoming 150th anniversary of the city, which was founded in 1867 and named for the American Indian Cheyenne Nation.

INCREASED POWER, GREATER EFFICIENCY

Cheyenne was built by Silicon Graphics International, or SGI (now part of Hewlett Packard Enterprise Co.), with DataDirect Networks (DDN) providing centralized file system and data storage components. Cheyenne is capable of 5.34 quadrillion calculations per second (5.34 petaflops, or floating point operations per second).

The new system has a peak computation rate of more than 3 billion calculations per second for every watt of energy consumed. That is three times more energy efficient than the Yellowstone supercomputer, which is also highly efficient.

The data storage system for Cheyenne provides an initial capacity of 20 petabytes, expandable to 40 petabytes with the addition of extra drives. The new DDN system also transfers data at the rate of 220 gigabytes per second, which is more than twice as fast as the previous file system's rate of 90 gigabytes per second.
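
Two quick derivations follow from the figures above: the peak power draw implied by 3 billion calculations per second per watt, and how long the 220 GB/s file system would need to write its full initial capacity. Both are rough, workload-independent estimates rather than measured values.

```cpp
// cheyenne_envelope.cpp
// Back-of-the-envelope numbers derived from the article's stated figures.
#include <iostream>

int main() {
    const double peak_flops = 5.34e15;    // 5.34 petaflops
    const double flops_per_watt = 3.0e9;  // 3 billion calc/s per watt
    std::cout << "Implied peak power: ~"
              << peak_flops / flops_per_watt / 1.0e6 << " MW\n";   // ~1.8 MW

    const double capacity_bytes = 20e15;  // 20 PB initial capacity
    const double bandwidth_bps  = 220e9;  // 220 GB/s aggregate I/O
    double hours = capacity_bytes / bandwidth_bps / 3600.0;
    std::cout << "Time to write 20 PB at 220 GB/s: ~" << hours << " hours\n"; // ~25 h
    return 0;
}
```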

Cheyenne is the latest in a long and successful history of supercomputers supported by the NSF and NCAR to advance the atmospheric and related sciences.

"We're excited to provide the research community with more supercomputing power," said Anke Kamrath, interim director of NCAR's Computational and Information Systems Laboratory, which oversees operations at the NWSC. "Scientists have access to increasingly large amounts of data about our planet. The enhanced capabilities of the NWSC will enable them to tackle problems that used to be out of reach and obtain results at far greater speeds than ever."

MORE DETAILED PREDICTIONS

High-performance computers such as Cheyenne allow researchers to run increasingly detailed models that simulate complex events and predict how they might unfold in the future. With more supercomputing power, scientists can capture additional processes, run their models at a higher resolution, and conduct an ensemble of modeling runs that provide a fuller picture of the same time period.

"Providing next-generation supercomputing is vital to better understanding the Earth system that affects us all, " said NCAR Director James W. Hurrell. "We're delighted that this powerful resource is now available to the nation's scientists, and we're looking forward to new discoveries in climate, weather, space weather, renewable energy, and other critical areas of research."

Some of the initial projects on Cheyenne include:

Long-range, seasonal to decadal forecasting: Several studies led by George Mason University, the University of Miami, and NCAR aim to improve prediction of weather patterns months to years in advance. Researchers will use Cheyenne's capabilities to generate more comprehensive simulations of finer-scale processes in the ocean, atmosphere, and sea ice. This research will help scientists refine computer models for improved long-term predictions, including how year-to-year changes in Arctic sea ice extent may affect the likelihood of extreme weather events thousands of miles away.

Wind energy: Projecting electricity output at a wind farm is extraordinarily challenging as it involves predicting variable gusts and complex wind eddies at the height of turbines, which are hundreds of feet above the sensors used for weather forecasting. University of Wyoming researchers will use Cheyenne to simulate wind conditions on different scales, from across the continent down to the tiny space near a wind turbine blade, as well as the vibrations within an individual turbine itself. In addition, an NCAR-led project will create high-resolution, 3-D simulations of vertical and horizontal drafts to provide more information about winds over complex terrain. This type of research is critical as utilities seek to make wind farms as efficient as possible.

Space weather: Scientists are working to better understand solar disturbances that buffet Earth's atmosphere and threaten the operation of satellites, communications, and power grids. New projects led by the University of Delaware and NCAR are using Cheyenne to gain more insight into how solar activity leads to damaging geomagnetic storms. The scientists plan to develop detailed simulations of the emergence of the magnetic field from the subsurface of the Sun into its atmosphere, as well as gain a three-dimensional view of plasma turbulence and magnetic reconnection in space that lead to plasma heating.

Extreme weather: One of the leading questions about climate change is how it could affect the frequency and severity of major storms and other types of severe weather. An NCAR-led project will explore how climate interacts with the land surface and hydrology over the United States, and how extreme weather events can be expected to change in the future. It will use advanced modeling approaches at high resolution (down to just a few miles) in ways that can help scientists configure future climate models to better simulate extreme events.

Climate engineering: To counter the effects of heat-trapping greenhouse gases, some experts have proposed artificially cooling the planet by injecting sulfates into the stratosphere, which would mimic the effects of a major volcanic eruption. But if society ever tried to engage in such climate engineering, or geoengineering, the results could alter the world's climate in unintended ways. An NCAR-led project is using Cheyenne's computing power to run an ensemble of climate engineering simulations to show how hypothetical sulfate injections could affect regional temperatures and precipitation.

Smoke and global climate: A study led by the University of Wyoming will look into emissions from wildfires and how they affect stratocumulus clouds over the southeastern Atlantic Ocean. This research is needed for a better understanding of the global climate system, as stratocumulus clouds, which cover 23 percent of Earth's surface, play a key role in reflecting sunlight back into space. The work will help reveal the extent to which particles emitted during biomass burning influence cloud processes in ways that affect global temperatures.

Baidu Joins ASC17 Supercomputer Competition with AI Challenge – HPCwire (blog)

The 2017 ASC Student Supercomputer Challenge (ASC17) has announced that the contest will include an AI traffic prediction application provided by the Baidu Institute of Deep Learning. Commonly used among unmanned vehicle technologies, this key application software assesses spatial and temporal relations to make reasonable predictions on traffic conditions, helping vehicles choose the most appropriate route, especially in times of congestion.

For the preliminary contest, Baidu will provide the teams with a set of actual data of traffic conditions in a certain city from the past 50 weekdays for training. Each team will conduct data training using Baidu's deep learning computing architecture, PaddlePaddle, to predict traffic every five minutes during the morning rush hour on the 51st weekday. Baidu will then judge each team on the accuracy of their traffic predictions.
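
The contest data format is not described in detail here, but one plausible way to frame the task as supervised learning is to slice each day into five-minute bins and build history-window/next-bin training pairs, which a framework such as PaddlePaddle could then fit. The sketch below illustrates only that framing; the data layout is hypothetical, not the actual contest format.

```cpp
// traffic_windows.cpp
// Build (history window -> next five-minute bin) training pairs for one
// road segment. Purely illustrative data preparation, framework-agnostic.
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>

struct Sample {
    std::vector<double> history;  // speeds for the last `window` five-minute bins
    double target;                // speed in the following bin
};

std::vector<Sample> make_samples(const std::vector<double>& readings,
                                 std::size_t window) {
    std::vector<Sample> samples;
    if (readings.size() <= window) return samples;
    for (std::size_t t = window; t < readings.size(); ++t) {
        Sample s;
        s.history.assign(readings.begin() + (t - window), readings.begin() + t);
        s.target = readings[t];
        samples.push_back(std::move(s));
    }
    return samples;
}

int main() {
    std::vector<double> day = {55, 54, 40, 35, 33, 38, 45, 52};  // toy speeds (km/h)
    auto samples = make_samples(day, 3);
    std::cout << samples.size() << " training pairs built\n";    // prints 5
    return 0;
}
```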

This year's ASC Student Supercomputer Challenge, the largest supercomputer contest in the world, is jointly organized by the Asia Supercomputer Community, Inspur, the National Supercomputing Center in Wuxi and Zhengzhou University. There are a total of 230 university teams from 15 countries and regions participating in the 2017 contest, with the finalists announced on March 13th and the final competition held April 24th-28th.

The contest aims to inspire innovation in supercomputer applications and cultivate young talent. The era of intelligent computing is here and it is being driven by AI. High performance computing is one of the main technologies supporting AI and is facing changes and new challenges. With this in mind, ASC has incorporated AI into the competition, in the hope that more young university students will engage with this fast-growing application area and cultivate their enthusiasm for innovation.

For more information on the ASC17 preliminary contest, please visit http://www.asc-events.org/ASC17/Preliminary.php

Exabyte Measures Linpack Performance Across Major Cloud Vendors – TOP500 News

Exabyte, a materials discovery cloud specialist, has published a study that compares Linpack performance on four of the largest public cloud providers. Although the study's methodology had some drawbacks, the results suggested that with the right hardware, HPC applications could not only scale well in cloud environments, but could also deliver performance on par with that of conventional supercomputers.

Overall, HPC practitioners have resisted using cloud computing for a variety of reasons, one of the more significant being the lack of performant hardware available in cloud infrastructure. Cluster network performance, in particular, has been found wanting in generic clouds, since conventional Ethernet, both GigE and 10GigE, does not generally have the bandwidth and latency characteristics to keep up with MPI applications running on high core-count nodes. As we'll see in a moment, it was the network that seemed to matter most in terms of scalability for these cloud environments.

The Exabyte study used high performance Linpack (HPL) as the benchmark metric, measuring its performance on four of the most widely used public clouds in the industry: Amazon Web Service (AWS), Microsoft Azure, IBM SoftLayer, and Rackspace. (Not coincidentally Exabyte, a cloud service provider for materials design, device simulations, and computational chemistry, employs AWS, Azure, SoftLayer and Rackspace as the infrastructure of choice for its customers.) Linpack was measured on specific instances of these clouds to determine benchmark performance across different cluster sizes and its efficiency in scaling from 1 to 32 nodes. The results were compared to those on Edison, a 2.5 petaflop (peak) NERSC supercomputer built by Cray. It currently occupies the number 60 spot on the TOP500 rankings.

To keep the benchmark results on as level a playing field as possible, it looks like the Exabyte team tried to use the same processor technology across systems, in this case, Intel Xeon processors of the Haswell or Ivy Bridge generation. However, the specific hardware profile -- clock speed, core count, and RAM capacity -- varied quite a bit across the different environments. As it turns out though, the system network was the largest variable across the platforms. The table below shows the node specification for each environment.

Source: Exabyte Inc.

As might be expected, Edison was able to deliver very good results, with a respectable 27-fold speedup as the Linpack run progressed from 1 to 32 nodes, at which point 10.44 teraflops of performance was achieved. That represents decent scalability and is probably typical of a system with a high-performance interconnect, in this case Cray's Aries network. Note that Edison had the highest core count per node (48), but one of the slower processor clocks (2.4 GHz) of the environments tested.

The AWS cloud test used the c4.8xlarge instance, but was measured in three different ways: one with hyperthreading enabled, one with hyperthreading disabled, and one with hyperthreading disabled and with the node placements optimized to minimize network latency and maximize bandwidth. The results didn't vary all that much between the three, with a maximum Linpack performance of 10.74 teraflops being recorded for the 32-node setup with hyperthreading disabled and optimal node placement. However, the speedup achieved for 32 nodes was just a little over 17 times that of a single node.

The Rackspace cloud instance didn't do nearly as well in the performance department, achieving only 3.04 teraflops on the 32-node setup. Even with just a single node, its performance was much worse than that of the AWS case, despite having more cores, a similar clock frequency, and an identical memory capacity per node. Rackspace did, however, deliver better than an 18-fold speedup as it progressed from 1 to 32 nodes -- slightly better than that of Amazon. That superior speedup is not immediately explainable, since the AWS instance provides more than twice the bandwidth of the Rackspace setup. It's conceivable the latter's network latency is somewhat lower than that of AWS.

IBM SoftLayer fared even worse, delivering just 2.46 Linpack teraflops at 32 nodes and a speedup of just over 4 times that of a single node. No doubt the relatively slow processor clock (2.0 GHz) and slow network speed (1 gigabit/sec) had a lot to do with its poor performance.

Microsoft's Azure cloud offered the most interesting results. Here the Exabyte team decided to test three instances: F16s, A9, and H16. The latter two instances were equipped with InfiniBand, the only platforms in the study where this was the case. The A9 instance provided 32 gigabits/sec and the H16 instance provided 54 gigabits/sec, nearly as fast as the 64 gigabits/sec of the Aries interconnect on Edison.

Not surprisingly, the A9 and H16 exhibited superior scalability for Linpack, specifically, more than a 28-fold speedup on 32 nodes compared to a single node. That's slightly better than the 27x speedup Edison achieved. In the performance area, the H16 instance really shined, delivering 17.26 Linpack teraflops in the 32-node configuration. That's much higher than any of the other environments tested, including the Edison supercomputer. It's probably no coincidence that the H16, which is specifically designed for HPC work, was equipped with the fastest processor of the bunch at 3.2 GHz. Both the A9 and H16 instances also had significantly more memory per node than the other environments.

One of the unfortunate aspects of the Edison measurement is that they enabled hyperthreading for the Linpack runs, something Intel explicitly says not to do if you want to maximize performance on this benchmark. With the exception of one of the AWS tests, none of the others ran the benchmark with hyperthreading enabled.

In fact, the poor Linpack yield on Edison, at just 36 percent of peak in the 32-node test run, suggests the benchmark was not devised very well for that system. The actual TOP500 run across the entire machine achieved a Linpack yield of more than 64 percent of peak, which is fairly typical of an HPC cluster with a high-performance network. The Azure H16 in this test had a 67 percent Linpack yield.
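
For reference, the two quantities this comparison leans on can be written out explicitly: scaling speedup (multi-node Rmax over single-node Rmax) and Linpack yield (measured Rmax as a fraction of theoretical peak Rpeak). The sample inputs below are illustrative numbers roughly consistent with the Azure H16 results described in the text, not exact values from the study.

```cpp
// linpack_metrics.cpp
// Speedup and yield, the two metrics quoted throughout this comparison.
#include <iostream>

double speedup(double rmax_n_nodes, double rmax_1_node) {
    return rmax_n_nodes / rmax_1_node;
}

double linpack_yield(double rmax, double rpeak) {
    return 100.0 * rmax / rpeak;  // percent of theoretical peak achieved
}

int main() {
    const double rmax_32  = 17.26;  // teraflops on 32 nodes (H16, from the text)
    const double rmax_1   = 0.61;   // assumed single-node result (illustrative)
    const double rpeak_32 = 25.8;   // assumed 32-node theoretical peak (illustrative)

    std::cout << "Speedup: " << speedup(rmax_32, rmax_1) << "x\n";          // ~28x
    std::cout << "Yield:   " << linpack_yield(rmax_32, rpeak_32) << "%\n";  // ~67%
    return 0;
}
```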

There's also no way to tell if other hardware variations -- things like cache size, memory performance, etc. -- could have affected the results across the different cloud instances. In addition, it was unclear if the benchmark implementations were optimized for the particular environments tested. Teams working on a TOP500 submission will often devote weeks to tweaking a Linpack implementation to maximize performance on a particular system.

It would have been interesting to see Linpack results on other InfiniBand-equipped clouds. In 2014, an InfiniBand option was added to SoftLayer, but the current website doesn't make any mention of such a capability. However, Penguin Computing On Demand, Nimbix, and ProfitBricks all have InfiniBand networks for their purpose-built HPC clouds. Comparing these to the Azure H16 instance could have been instructive. Even more interesting would be to see other HPC benchmarks, like the HPCG metric or specific application kernels, tested across these platforms.

Of course, what would be ideal would be some sort of cloud computing tracker that could determine the speed and cost of executing your code on a given cloud platform at any particular time. That's apt to require a fair amount of AI, not to mention a lot more transparency by cloud providers on how their hardware and software operates underneath the covers. Well, maybe someday...

High-resolution regional modeling (no supercomputer needed … – UCAR

Annual precipitation over Colorado as modeled by the low-resolution, global Community Earth System Model (top) compared to the high-resolution, regional Weather Research and Forecasting model (below). (Images courtesy Ethan Gutmann, NCAR.)

February 13, 2017 | In global climate models, the hulking, jagged Rocky Mountains are often reduced to smooth, blurry bumps.

It's a practical reality that these models, which depict the entire planet, typically need to be run at a relatively low resolution due to constraints on supercomputing resources. But the result, a virtual morphing of peaks into hills, affects the ability of climate models to accurately project how precipitation in mountainous regions may change in the future -- information that is critically important to water managers.

To address the problem, hydrologists have typically relied on two methods to "downscale" climate model data to make them more useful. The first, which uses statistical techniques, is fast and doesn't require a supercomputer, but it makes many unrealistic assumptions. The second, which uses a high-resolution weather model like the Weather Research and Forecasting model (WRF), is much more realistic but requires vast amounts of computing resources.

Now hydrologists at the National Center for Atmospheric Research (NCAR) are developing an in-between option: The Intermediate Complexity Atmospheric Research Model (ICAR) gives researchers increased accuracy using only a tiny fraction of the computing resources.

"ICAR is about 80 percent as accurate as WRF in the mountainous areas we studied," said NCAR scientist Ethan Gutmann, who is leading the development of ICAR. "But it only uses 1 percent of the computing resources. I can run it on my laptop."

How much precipitation falls in the mountains and when is vitally important for communities in the American West and elsewhere that rely on snowpack to act as a frozen reservoir of sorts. Water managers in these areas are extremely interested in how a changing climate might affect snowfall and temperature, and therefore snowpack, in these regions.

But since global climate models with low resolution are not able to accurately represent the complex topography of mountain ranges, they are unsuited for answering these questions.

For example, as air flows into Colorado from the west, the Rocky Mountains force that air to rise, cooling it and causing moisture to condense and fall to the ground as snow or rain. Once these air masses clear the mountains, they are drier than they otherwise would have been, so there is less moisture available to fall across Colorado's eastern plains.

Low-resolution climate models are not able to capture this mechanism -- the lifting of air over the mountains -- and so in Colorado, for example, they often simulate mountains that are drier than they should be and plains that are wetter. For a regional water manager, these small shifts could mean the difference between full reservoirs and water shortages.

"Climate models are useful for predicting large-scale circulation patterns around the whole globe, not for predicting precipitation in the mountains or in your backyard," Gutmann said.

Precipitation in millimeters over Colorado between Oct. 1 and May 1 as simulated by the Weather Research and Forecasting model (WRF), the Intermediate Complexity Atmospheric Research model (ICAR), and the observation-based Parameter-Elevation Regressions on Independent Slopes Model. (Images courtesy Ethan Gutmann.)

A simple statistical fix for these known problems may include adjusting precipitation data to dry out areas known to be too wet and moisten areas known to be too dry. The problem is that these statistical downscaling adjustments don't capture the physical mechanisms responsible for the errors. This means that any impact of a warming climate on the mechanisms themselves would not be accurately portrayed using a statistical technique.

That's why using a model like WRF to dynamically downscale the climate data produces more reliable results the model is actually solving the complex mathematical equations that describe the dynamics of the atmosphere. But all those incredibly detailed calculations also take an incredible amount of computing.

A few years ago, Gutmann began to wonder if there was a middle ground. Could he make a model that would solve the equations for just a small portion of the atmospheric dynamics that are important to hydrologists -- in this case, the lifting of air masses over the mountains -- but not others that are less relevant?

"I was studying statistical downscaling techniques, which are widely used in hydrology, and I thought, 'We should be able to do better than this,'" he said. "'We know what happens when you lift air up over a mountain range, so why dont we just do that?'"

Gutmann wrote the original code for the model that would become ICAR in just a few months, but he spent the next four years refining it, a process that's still ongoing.

Last year, Gutmann and his colleagues -- Martyn Clark and Roy Rasmussen, also of NCAR; Idar Barstad, of Uni Research Computing in Bergen, Norway; and Jeffrey Arnold, of the U.S. Army Corps of Engineers -- published a study comparing simulations of Colorado created by ICAR and WRF against observations.

The authors found that ICAR and WRF results were generally in good agreement with the observations, especially in the mountains and during the winter. One of ICAR's weaknesses, however, is in simulating storms that build over the plains in the summertime. Unlike WRF, which actually allows storms to form and build in the model, ICAR estimates the number of storms likely to form, given the atmospheric conditions, a method called parameterization.

Even so, ICAR, which is freely available to anyone who wants to use it, is already being run by teams in Norway, Austria, France, Chile, and New Zealand.

"ICAR is not perfect; it's a simple model," Gutmann said. "But in the mountains, ICAR can get you 80 to 90 percent of the way there at 100 times the speed of WRF. And if you choose to simplify some of the physics in ICAR, you can get it close to 1,000 times faster."

Title: The Intermediate Complexity Atmospheric Research Model (ICAR)

Authors: Ethan Gutmann, Idar Barstad, Martyn Clark, Jeffrey Arnold, and Roy Rasmussen

Journal: Journal of Hydrometeorology, DOI: 10.1175/JHM-D-15-0155.1

Funders: U.S. Army Corps of Engineers; U.S. Bureau of Reclamation

Collaborators: Uni Research Computing in Norway; U.S. Army Corps of Engineers

Writer/contact: Laura Snider, Senior Science Writer

Top Chinese Supercomputer Blazes Real-World Application Trail – The Next Platform

February 13, 2017 Jeffrey Burt

China's massive Sunway TaihuLight supercomputer sent ripples through the computing world last year when it debuted in the number-one spot on the Top500 list of the world's fastest supercomputers. Delivering 93,000 teraflops of performance and a peak of more than 125,000 teraflops, the system is nearly three times faster than the second supercomputer on the list (the Tianhe-2, also a Chinese system) and dwarfs the Titan system at Oak Ridge National Laboratory, a Cray-based machine that is the world's third-fastest system, and the fastest in the United States.

However, it wasn't only the system's performance that garnered a lot of attention. It also was the fact that the supercomputer was powered by Sunway's many-core SW26010 processors built in China rather than chips from well-known US players like Intel, AMD or Nvidia. As we've talked about before, the TaihuLight system and the SW26010 chips it runs on are part of a larger push by Chinese officials to have more components for Chinese systems made in China rather than by US vendors, an effort that is fueled by a number of factors, from national security issues to national competitive pride. Another part of that push is China's plan to spend $150 billion over 10 years to build out the country's chip-making capabilities.

The chip itself is not overly impressive by the numbers. Jack Dongarra of the University of Tennessee and Oak Ridge National Laboratory, who outlined the current state of the high-performance computing space and the challenges it faces, described the SW26010's size (built on 28-nanometer technology) and speed (1.45 GHz) as modest compared with what Intel, AMD and other vendors in the United States are coming out with. However, the supercomputer is powered by more than 10.6 million cores. By comparison, Tianhe-2 is running 3.12 million Intel Xeon E5-2692 cores.

The size and performance capabilities of the supercomputer, which is installed at the National Supercomputing Center in China, make it an attractive choice when running computationally intensive workloads like computational fluid dynamics (CFD), used to simulate occurrences in a broad range of scientific areas, including meteorology, aerodynamics and environmental sciences. A group of scientists from the Center for High Performance Computing at Shanghai Jiao Tong University in China and the Tokyo Institute of Technology in Japan recently released a paper outlining experiments they conducted running a hybrid implementation of the Open Source Field Operation and Manipulation (OpenFOAM) CFD application on the TaihuLight system. The researchers wanted to see if they could develop a hybrid implementation of the software to overcome a compiler incompatibility situation in the SW26010 processor. They called OpenFOAM one of the most popular CFD applications built on C++.

In their study, titled "Hybrid Implementation and Optimization of OpenFOAM on the SW26010 Many-core Processor," the researchers laid out the challenge presented by the chip when running C++ programs.

"The processor includes four core groups (CGs), each of which consists of one management processing element (MPE) and sixty-four computing processing elements (CPEs) arranged by an eight by eight grid," they wrote. "The basic compiler components on MPE support C/C++ programming language, while the compiler components on CPE only support C. The compilation incompatibility problem makes it difficult for C++ programs to exploit the computing power of the SW26010 processor."

In order to get high performance from the OpenFOAM program while running on the chip, the researchers -- Delong Meng, Minhua Wen, Jianwen Wei and James Lin -- not only used a mixed-language design for the application, but also leveraged several feature-specific optimizations for the SW26010 in the software. What they did with the OpenFOAM application can also be used with other complex C++ programs to ensure high performance when running on systems powered by the SW26010 processor.

Details of the study can be found here, but one of the key steps was developing a mixed-language programming model for OpenFOAM, in part by modifying the data storage format and reimplementing the kernel code in C. In addition, on the MPE, they put in a new compilation method for OpenFOAM in which they compile ThirdParty and OpenFOAM with GCC and swg++-4.5.3, respectively, and changed the linking mode of OpenFOAM, using the static library. The optimizations on the MPE included such areas as vectorization, data presorting and algorithm optimization.
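
The general shape of that mixed-language arrangement is worth sketching: the C++ solver code stays on the management core, the hot loop is rewritten as plain C that a C-only toolchain can compile, and the two halves meet across an extern "C" interface. The example below illustrates only that pattern; the function names are invented, and the real port also relies on Sunway-specific mechanisms (athread offload, DMA, register communication) that are not shown. In the real build the kernel would live in its own .c file compiled by the slave-core compiler; it is inlined here so the sketch compiles as a single file.

```cpp
// mixed_language_sketch.cpp
// C++ "driver" calling a C-compatible compute kernel through extern "C".
#include <iostream>
#include <vector>

extern "C" {
// C-style sparse matrix-vector product (CSR layout) -- the kind of kernel
// the authors rewrote in C so a C-only compiler could accept it.
void spmv_kernel(const double* vals, const int* cols, const int* row_ptr,
                 const double* x, double* y, int nrows) {
    for (int i = 0; i < nrows; ++i) {
        double sum = 0.0;
        for (int j = row_ptr[i]; j < row_ptr[i + 1]; ++j)
            sum += vals[j] * x[cols[j]];
        y[i] = sum;
    }
}
}  // extern "C"

int main() {
    // Tiny 2x2 system: the C++ side flattens its containers into the plain
    // arrays the C kernel expects (the "modified data storage format").
    std::vector<double> vals    = {2.0, 1.0, 3.0};
    std::vector<int>    cols    = {0, 1, 1};
    std::vector<int>    row_ptr = {0, 2, 3};
    std::vector<double> x = {1.0, 1.0}, y(2, 0.0);

    spmv_kernel(vals.data(), cols.data(), row_ptr.data(), x.data(), y.data(), 2);
    std::cout << "y = [" << y[0] << ", " << y[1] << "]\n";  // prints [3, 3]
    return 0;
}
```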

They also adapted OpenFOAM to run on the chip's CPE cluster, which only supports the C compiler, by using the master-slave cooperative algorithm of the PCG method and by modifying the library file. Optimizations of the CPE were done in such areas as data structure transformation, register communication, direct memory access (DMA), prefetching, double buffering and data reuse.

The study's authors then tested the software by running it on both a SW26010 processor and a 2.3GHz Xeon E5-2695 v3 in a test case involving what they described as a lid-driven cavity flow. The top boundary of the cube is a moving wall that moves in the x-direction, whereas the rest are static walls. In the tests comparing the performance of the MPE, the CPE cluster and the Intel chip, they found that after optimizing the CPE cluster, it delivered an 8.03-times performance increase over the optimized implementation on the MPE. In addition, the CPE cluster was 1.18 times faster than the single-core Intel chip. However, while the CPE cluster performance was better than that of the Intel processor, there were issues with efficiency. Those were due to the smaller cache and scratchpad memory (SPM) size of the SW26010, which meant data had to be repeatedly loaded into the SPM, hindering memory access. In addition, the DMA latency was high and the automatic optimizations applied by the SW26010's compiler were less efficient than with the Intel chip.

However, the researchers said they proved that the work they did with OpenFOAM to enable it to reach high performance in the SW26010 can be used with other C++ workloads.

"The implementation and results we present demonstrate how complex codes and algorithms can be efficiently implemented on such diverse architectures as hybrid MPE-CPEs systems," they wrote. "We can hide hardware-specific programming models into libraries and make them general purpose. OpenFOAM is now ready to effectively exploit the new supercomputing system based on the SW26010 processor."

Illinois Uses Supercomputer to Solve State Problems – MeriTalk (press release) (blog)

Foellinger Auditorium on the University of Illinois campus at Urbana-Champaign. (Photo: Shutterstock)

The University of Illinois supercomputer program is working to secure state-collected data and organize information in order to find solutions to state problems.

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign is partnering with the state to build new ways to protect and categorize individuals' data. The state collects data on health, business licenses, and credit card information, which will be analyzed to reduce problems such as traffic congestion and repeat offenders in the prison system.

In the first phase of the partnership, NCSA will work with the Illinois Department of Innovation and Technology (DoIT) to safeguard citizen data.

"A partnership between DoIT and NCSA will bring great benefits to Illinois businesses and citizens in the area of cybersecurity," said Kirk Lonbom, DoIT's chief information security officer. "The threat posed by cyberattackers grows exponentially by the day and collaborations such as these accelerate the pace of cybersecurity progress."

The NCSA plans to secure collected data, protect critical infrastructure systems, respond to threats, and provide for the integrity of their information systems.

"NCSA's Cybersecurity Division brings expertise in several areas to help the state with their strategic goals," said Bill Gropp, NCSA's acting director.

After the data is secured, the NCSA and the state will decide how to organize the data in order to improve the process of curation and usability. The NCSA plans to look into how the location of certain projects and local culture affect the data to develop solutions.

"We expect to learn lots! Especially things we aren't expecting," Gropp said. "Think of this as a step toward customized services for the citizens and visitors to the state. But the most exciting outcomes are the ones that you don't expect; we hope to make it easier not just for the state but for the public to explore public data and innovate."

Illinois launched its smart state initiatives in 2016. The state hosted workshops in April and December for private sector and government leaders to discuss how smart state projects will improve government efficiency, access to services, and promote the growth of business.

"Illinois is nationally recognized as the first U.S. state to have a vision and road map for becoming a smarter state," said state CIO Hardik Bhatt. "The goal is to use technology, Internet of Things, analytics, and cybersecurity to improve operational efficiency and find new and more cost-effective ways to serve our customers."

The NCSA was established in 1986 as one of the centers in the National Science Foundation's Supercomputer Centers Program. NCSA is funded by the state of Illinois, the University of Illinois, the National Science Foundation, and grants from other Federal agencies. The center provides resources on computing, software, data, networking, and cybersecurity to scientists and academics across the country.

The NCSA's goals for 2020 include working with government agencies to support the nation's research opportunities and to use data to address complex problems. This partnership with the state of Illinois could tackle issues in health care, the prison system, and the roadway system.

"At NCSA, University of Illinois faculty, staff, students, and collaborators from around the globe use advanced digital resources to address research grand challenges for the benefit of science, industry, and society," said Gropp. "We are excited about leveraging these resources to modernize infrastructure in order to better serve the citizens of Illinois and uplift the state's economy."

Boulder’s NCAR boasts powerful new supercomputer at Wyoming site – Boulder Daily Camera

In the fall of 2016, the new Cheyenne supercomputer was installed at the NCAR-Wyoming Supercomputing Center. This is a video screen grab.

Key projects at Boulder's National Center for Atmospheric Research will now be supported by a powerful new supercomputer capable of more than three times the amount of scientific computing performed by its predecessor.

The new cyber-wonder is called Cheyenne, and it takes its name from Cheyenne, Wyo., the city where it is housed within the NCAR-Wyoming Supercomputing Center.

Projects that will be supported by the new supercomputer include studies of wind energy; long-range, seasonal-to-decadal forecasting; extreme weather; climate engineering; and space weather, giving scientists a better understanding of solar disturbances that can affect the operation of power grids, satellites and global communications.

"Cheyenne will help us advance the knowledge needed for saving lives, protecting property and enabling U.S. businesses to better compete in the global marketplace," Antonio Busalacchi, president of the University Corporation for Atmospheric Research, said in a prepared statement. "This system is turbocharging our science."

NCAR is managed by UCAR on behalf of the National Science Foundation.

Cheyenne is one of the world's most energy-efficient and powerful supercomputers: a 5.34-petaflop system capable of more than three times the scientific computing performed by Yellowstone, NCAR's previous supercomputer, while being three times more energy efficient.

A petaflop is one quadrillion floating-point operations per second. Cheyenne is currently the 20th-fastest supercomputer in the world and the fastest in the Mountain West.
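To put that figure in perspective, a quick back-of-the-envelope calculation shows what a 5.34-petaflop peak rate implies. This is a minimal sketch; the workload size below is an invented figure used purely for illustration, not a real NCAR job.

```python
# Back-of-the-envelope arithmetic for Cheyenne's quoted peak rate.
# The "simulation_flops" workload is a made-up illustrative figure.

PETA = 10 ** 15                      # one quadrillion
cheyenne_peak = 5.34 * PETA          # floating-point operations per second (peak)

simulation_flops = 1e19              # hypothetical job: 10 quintillion operations
seconds_at_peak = simulation_flops / cheyenne_peak

print(f"Peak rate: {cheyenne_peak:.3g} FLOP/s")
print(f"Hypothetical 1e19-FLOP job at peak: {seconds_at_peak / 60:.1f} minutes")
```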

Since the 2012 opening of the NCAR-Wyoming Supercomputing Center, more than 2,200 scientists from over 300 universities and federal laboratories have used it, according to a news release.

Charlie Brennan: 303-473-1327, brennanc@dailycamera.com or twitter.com/chasbrennan

The rest is here:

Boulder's NCAR boasts powerful new supercomputer at Wyoming site - Boulder Daily Camera

NCSA Facilitates Performance Comparisons With China’s Top Supercomputer – HPCwire (blog)

Feb. 10 – China has topped supercomputer rankings on the international TOP500 list of fastest supercomputers for the past eight years. They have maintained this status with their newest supercomputer, Sunway TaihuLight, constructed entirely from Chinese processors.

While China's hardware has come into its own, as Foreign Affairs wrote in August, no one can say objectively at present how fast this hardware can solve scientific problems compared to other leading systems around the world. This is because the computer is new, having made its debut in June 2016.

Researchers were able to use seed funding provided through the Global Initiative to Enhance @scale and Distributed Computing and Analysis Technologies (GECAT) project, administered by the National Center for Supercomputing Applications (NCSA) Blue Waters Project, to port and run codes on leading computers around the world. GECAT is funded by the National Science Foundation's Science Across Virtual Institutes (SAVI) program, which focuses on fostering and strengthening interaction among scientists, engineers and educators around the globe. Shanghai Jiao Tong University and its NVIDIA Center of Excellence matched the NSF support for this seed project, and helped enable the collaboration to have unprecedented full access to Sunway TaihuLight and its system experts.

It takes time to transfer, or port, scientific codes built to run on other supercomputer architectures, but an international, collaborative project has already started porting one major code used in plasma particle-in-cell simulations, GTC-P. The accomplishments made and the road towards completion were laid out in a recent paper that won best application paper from the HPC China 2016 Conference in October.

"While LINPACK is a well-established measure of supercomputing performance based on a linear algebra calculation, real world scientific application problems are really the only way to show how well a computer produces scientific discoveries," said Bill Tang, lead co-author of the study and head of the Intel Parallel Computing Center at Princeton University. "Real @scale scientific applications are much more difficult to deploy than LINPACK for the purpose of comparing how different supercomputers perform, but it's worth the effort."
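For readers unfamiliar with the benchmark Tang mentions, LINPACK scores a machine by how fast it solves a dense system of linear equations. The sketch below is a toy, single-node illustration of that idea using NumPy; it is not the official HPL benchmark code, and the problem size n is an arbitrary choice.

```python
# Toy LINPACK-style measurement: time the solution of a dense linear
# system A x = b and convert the elapsed time into a FLOP/s estimate.
# Single-node illustration only, not the official HPL benchmark.
import time
import numpy as np

n = 4000                                  # arbitrary problem size
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

start = time.perf_counter()
x = np.linalg.solve(A, b)                 # LU factorization + triangular solves
elapsed = time.perf_counter() - start

flops = (2 / 3) * n ** 3 + 2 * n ** 2     # standard operation count for an LU solve
print(f"n={n}: {elapsed:.2f} s, ~{flops / elapsed / 1e9:.1f} GFLOP/s")
```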

The GTC-P code chosen for porting to TaihuLight is a well-traveled code in supercomputing, in that it has already been ported to seven leading systems around the world, a process that ran from 2011 to 2014 when Tang served as the U.S. principal investigator for the G8 Research Council's Exascale Computing for Global Scale Issues Project in Fusion Energy, or NuFuSE. It was an international high-powered computing collaboration between the US, UK, France, Germany, Japan and Russia.

A major challenge that the Shanghai Jiao Tong and Princeton Universities collaborative team has already overcome is adapting the modern language (OpenACC-2) in which GTC-P was written, making it compatible with TaihuLight's homegrown compiler, SWACC. An early result of the adaptation is that the new TaihuLight processors were found to be about three times faster than a standard CPU. Tang said the next step is to make the code work with a larger group of processors.

"If GTC-P can build on this promising start to engage a large fraction of the huge number of TaihuLight processors, we'll be able to move forward to show objectively how this impressive, new, number-one-ranking supercomputer stacks up to the rest of the supercomputing world," Tang said, adding that metrics like time to solution and associated energy to solution are key to the comparison.

"These are important metrics for policy makers engaged in deciding which kinds of architectures and associated hardware best merit significant investments," Tang added.

The top seven supercomputers worldwide on which GTC-P runs well all reflect diverse hardware investments. For example, NCSA's Blue Waters has more memory bandwidth than other U.S. systems, while TaihuLight has clearly invested most heavily in powerful new processors.

As Tang said recently in a technical program presentation at the SC16 conference in Salt Lake City, improvements in the GTC-P code have for the first time enabled delivery of new scientific insights. These insights show complex electron dynamics at the scale of the upcoming ITER device, the largest fusion energy facility ever constructed.

"In the process of producing these new findings, we focused on realistic cross-machine comparison metrics, time and energy to solution," Tang said. "Moving into the future, it would be most interesting to be able to include TaihuLight in such studies."

About the National Center for Supercomputing Applications (NCSA)

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign provides supercomputing and advanced digital resources for the nation's science enterprise. At NCSA, University of Illinois faculty, staff, students, and collaborators from around the globe use advanced digital resources to address research grand challenges for the benefit of science and society. NCSA has been advancing one third of the Fortune 50 for more than 30 years by bringing industry, researchers, and students together to solve grand challenges at rapid speed and scale.

Source: NCSA

Link:

NCSA Facilitates Performance Comparisons With China's Top Supercomputer - HPCwire (blog)

Global Supercomputer Market (2017-2021) – By End Users, Operating System & Processor Type – Growing Demand for … – Business Wire (press release)

DUBLIN--(BUSINESS WIRE)--Research and Markets has announced the addition of the "Global Supercomputer Market 2017-2021" report to their offering.

The global supercomputer market is expected to grow at a CAGR of 7.00% during the period 2017-2021.
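For context, a CAGR simply compounds the growth rate year over year. The sketch below shows what a 7.00% CAGR implies for relative market size over 2017-2021; it uses an arbitrary baseline of 1.0 because the report's absolute dollar figures are not reproduced here.

```python
# Compound annual growth rate (CAGR) arithmetic for the quoted 7.00%.
# The 1.0 starting value is an arbitrary baseline, not the report's
# actual market-size figure.
cagr = 0.07
size = 1.0

for year in range(2017, 2022):
    print(f"{year}: {size:.3f}x the 2017 market size")
    size *= 1 + cagr

# Total growth across the four compounding steps from 2017 to 2021:
print(f"2021 vs 2017: {(1 + cagr) ** 4:.3f}x")
```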

The TOP500 List is the benchmark attributed to the best and highest-performing systems in the world and commonly defines market dynamics. A study carried out by industry experts indicates that the share of companies already using supercomputers and unwilling to give up the systems was in the range of 95%-100%. Some applications of supercomputers include drug discovery and testing, data analytics in financial services, vehicle crash-collision testing, scientific research in physics and chemistry, and weather forecasting.

According to the report, the data that governments across the globe must handle ranges from ongoing and unmet security needs to cryptanalysis and data mining for rapid and precise analysis of information from a number of disparate sources. There are numerous computational challenges associated with the SIGINT (signals intelligence) mission of the NSA. In this program, the NSA needs to intercept and analyze the communication signals of foreign adversaries, most of which are protected by encoding and other countermeasures.

The mission targets capabilities, intentions, and activities of foreign countries such as Russia and China, thus playing an important role in offering counterintelligence for protection against espionage, sabotage, and assassinations conducted on behalf of foreign powers, organizations, and international terrorist groups.

Key vendors

Key Topics Covered:

Part 01: Executive summary

Part 02: Scope of the report

Part 03: Market research methodology

Part 04: Introduction

Part 05: Market landscape

Part 06: Market segmentation by end-users

Part 07: Market segmentation by operating system

Part 08: Market segmentation by processor type

Part 09: Geographical segmentation

Part 10: Market drivers

Part 11: Impact of drivers

Part 12: Market challenges

Part 13: Impact of drivers and challenges

Part 14: Market trends

Part 15: Vendor landscape

Part 16: Appendix

For more information about this report visit http://www.researchandmarkets.com/research/t9hn3m/global

See the original post here:

Global Supercomputer Market (2017-2021) - By End Users, Operating System & Processor Type - Growing Demand for ... - Business Wire (press release)

Chinese Firms Racing to the Front of the AI Revolution – TOP500 News

While US-based firms such as Google, Facebook and Microsoft still dominate the artificial intelligence space, Chinese counterparts like Baidu, Tencent, and Alibaba are quickly catching up to, and in some cases surpassing, their US competition. As a consequence, China appears to be on a path to reproduce in AI its success in supercomputing.

As should be apparent to anyone following this space, the technology duo of supercomputing and AI are not unrelated; the most recent example is the triumph of the Libratus poker-playing application over four of the best players in the game. Libratus's software was developed at Carnegie Mellon University but schooled at the Pittsburgh Supercomputing Center using the Bridges supercomputer. In fact, Libratus was tapping into Bridges at night during the poker tournament, refining its tactics while the human players slept. Indeed, all the technologies discussed below rely on some sort of HPC platform.

But while it's relatively straightforward, although not necessarily easy, to build supercomputing systems, developing AI software requires more cutting-edge talent. And until a few years ago, much of that talent resided inside US-based companies and universities. No more. In fact, a US government report determined that the number of academic papers published in China that mention deep learning now exceeds the number published by US researchers.

Another visible indication that the Chinese are catching up is the number of AI-related patents being submitted there. In an article published last week in Nikkei Asian Review, an analysis showed that Chinese patent applications in this segment rose to 8,410 over the five-year period between 2010 and 2014, representing a 186 percent increase. During that same timeframe, US-sourced AI patent applications reached 15,317, a rate of increase of only 26 percent. The article quotes Shigeoki Hirai, director general at the Japanese government-affiliated New Energy and Industrial Technology Development Organization, who believes the patent growth in China is not only quantitative, but also qualitative. "China's progress is remarkable in hot areas like deep learning," he said. "It's not like they are only growing in numbers."

Last month CNBC reported that venture capital investment in China is being spurred by AI, robotics and the internet-of-things. According to a study by tech auditing firm KPMG, VC investments there will move increasingly into artificial intelligence in 2017. The study noted that venture capital money in China reached a record high of $31 billion last year, despite a global slowdown in VC investment in 2016.

Some of that money is flowing into Chinese startups like iCarbonX, a company specializing in mining medical data and using machine learning analysis to optimize health outcomes. The company, which was founded in 2015 by Jun Wang, has since received a whopping $600 million in investment capital. Wang, who is an alum of Shenzhen-based genomics giant BGI, says he will be able to collect more data and do it much less expensively than US-based rivals working in this area. According to a write-up in Nature, he expects to get data from more than a million people over the next five years. That, he maintains, will allow the algorithms the company is developing to understand how this data correlates with disease states, and be able to dispense advice on lifestyle choices to improve the health of its users.

Other Chinese up-and-comers like iFLYTEK, a firm that focuses on speech and language recognition, and Uisee Technology, a self-driving car company, have also received attention, most recently in a New York Times article. While that report focused primarily on China's rapidly maturing AI-based military defense capabilities, it noted that much of the technology flows freely across borders. As a result, AI knowledge is rapidly assimilated in countries like China because much of that expertise originated with US-based multinationals and the academic community, neither of which holds a particular allegiance to US government interests.

More well-known Chinese firms like Tencent, the country's biggest provider of Internet services, and Alibaba, the country's largest e-retailer, are quickly ramping up their AI efforts. Last August, Alibaba announced a new AI suite, dubbed ET, which includes everything from audio transcription and video recognition to financial risk analysis and traffic forecasting. Tencent, meanwhile, has established an AI lab, which, while still relatively small (about 30 researchers) by Google standards, represents just the start of the company's push into this space. In an article published last December in MIT Technology Review, the lab's director, Xing Yao, said he thinks domestic companies have an advantage in acquiring AI talent. "Chinese companies have a really good chance, because a lot of researchers in machine learning have a Chinese background," he said. "So from a talent acquisition perspective, we do think there is a good opportunity for these companies to attract that talent."

To date, though, the biggest Chinese success story in artificial intelligence has to be Baidu, which commands the biggest Internet search platform in its home country. As one of the first firms to recognize the potential of AI technology, it opened a deep learning institute in Silicon Valley in 2013, a move designed to tap into US-based expertise and computing resources. The next year it expanded its investment, to the tune of $300 million, establishing the Silicon Valley AI Lab (SVAIL), which is now one of the premier AI research centers in the world.

Baidu's pioneering work in speech recognition, with its Deep Speech and Deep Speech 2 platforms, is considered the best in the business and is quickly closing the gap between human transcribers and automated speech recognition. At the same time, the company has moved forward on many other fronts, including autonomous driving, image recognition, ad matching, and language translation (especially Mandarin to English), many of which are now in production serving its domestic users.

Baidu also recently hired Qi Lu, a former Microsoft executive who was at the center of the software maker's move into AI and bots. Lu is now Baidu's chief operating officer (COO), tasked with overseeing the company's business and research operations. According to company founder Robin Li, Lu's immediate focus will be beefing up Baidu's search business with AI technologies. For his part, Li has said he intends to make Baidu a global leader in artificial intelligence and machine learning.

Even given all that, US-based AI is likely to remain dominant for some time. Multinationals like Google, Facebook and Microsoft still have a bigger audience, and thus a bigger data-collection pipeline and deployment potential, than the largest Chinese web-based companies. But not that much bigger. China's internet user base is estimated to be in the neighborhood of 800 million people, and if these companies can expand elsewhere in Asia or beyond, those numbers could quickly shift. In which case, that Mandarin-to-English translator is going to be especially useful.

See the original post:

Chinese Firms Racing to the Front of the AI Revolution - TOP500 News

Wrangler Supercomputer at TACC Supports Information Retrieval Projects – HPCwire

Feb. 7 – Much of the data of the World Wide Web hides like an iceberg below the surface. The so-called deep web has been estimated to be 500 times bigger than the surface web seen through search engines like Google. For scientists and others, the deep web holds important computer code and its licensing agreements. Nestled further inside the deep web, one finds the dark web, a place where images and video are used by traders in illicit drugs, weapons, and human trafficking. A new data-intensive supercomputer called Wrangler is helping researchers obtain meaningful answers from the hidden data of the public web.

The Wrangler supercomputer got its start in response to the question: can a computer be built to handle massive amounts of I/O (input and output)? The National Science Foundation (NSF) got behind this effort in 2013 and awarded the Texas Advanced Computing Center (TACC), Indiana University, and the University of Chicago $11.2 million to build a first-of-its-kind data-intensive supercomputer. Wrangler's 600 terabytes of lightning-fast flash storage enabled the speedy reads and writes of files needed to fly past big data bottlenecks that can slow down even the fastest computers. It was built to work in tandem with number crunchers such as TACC's Stampede, which in 2013 was the sixth-fastest computer in the world.

While Wrangler was being built, a separate project came together, headed by the Defense Advanced Research Projects Agency (DARPA) of the U.S. Department of Defense. Back in 1969, DARPA had built the ARPANET, which eventually grew to become the Internet, as a way to exchange files and share information. In 2014, DARPA wanted something new: a search engine for the deep web. They were motivated to uncover the deep web's hidden and illegal activity, according to Chris Mattmann, chief architect in the Instrument and Science Data Systems Section of the NASA Jet Propulsion Laboratory (JPL) at the California Institute of Technology.

"Behind forms and logins, there are bad things. Behind the dynamic portions of the web like AJAX and JavaScript, people are doing nefarious things," said Mattmann. They're not indexed because the web crawlers of Google and others ignore most images, video, and audio files. "People are going on a forum site and they're posting a picture of a woman that they're trafficking. And they're asking for payment for that. People are going to a different site and they're posting illicit drugs, or weapons, guns, or things like that to sell," he said.

Mattmann added that an even more inaccessible portion of the deep web, called the dark web, can only be reached through a special browser client and protocol called TOR, The Onion Router. On the dark web, said Mattmann, "they're doing even more nefarious things." They traffic in guns and human organs, he explained. "They're basically doing these activities and then they're tying them back to terrorism."

In response, DARPA started a program called Memex. Its name blends "memory" with "index" and has roots in an influential 1945 Atlantic magazine article penned by U.S. engineer and Raytheon founder Vannevar Bush. His futuristic essay imagined putting all of a person's communications (books, records, and even all spoken and written words) within fingertip reach. The DARPA Memex program sought to make the deep web accessible. "The goal of Memex was to provide search engines the information retrieval capacity to deal with those situations and to help defense and law enforcement go after the bad guys there," Mattmann said.

Karanjeet Singh is a University of Southern California graduate student who works with Chris Mattmann on Memex and other projects. "The objective is to get more and more domain-specific (specialized) information from the Internet and try to make facts from that information," Singh said. He added that agencies such as law enforcement continue to tailor their questions to the limitations of search engines. In some ways, the cart leads the horse in deep web search. "Although we have a lot of search-based queries through different search engines like Google," Singh said, "it's still a challenge to query the system in a way that answers your questions directly."

Once Memex users extract the information they need, they can apply tools such as named entity recognition, sentiment analysis, and topic summarization. This can help law enforcement agencies like the U.S. Federal Bureau of Investigation find links between different activities, such as illegal weapon sales and human trafficking, Singh explained.
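As an illustration of the kind of off-the-shelf extraction step described above, a minimal named-entity-recognition pass over crawled text might look like the sketch below. spaCy is used here purely as an example library, the sample sentence is invented, and this is not the actual Memex tooling.

```python
# Minimal named-entity-recognition sketch with spaCy, shown only to
# illustrate the kind of extraction step described above; it is not
# the Memex pipeline. Requires: pip install spacy, then
# python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

# Invented example text standing in for a crawled page.
text = "Contact JohnDoe99 in Houston about the shipment arriving Friday."
doc = nlp(text)

for ent in doc.ents:
    print(ent.text, ent.label_)   # e.g. 'Houston' GPE, 'Friday' DATE
```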

"Let's say that we have one system directly in front of us, and there is some crime going on," Singh said. "The FBI comes in and they have some set of questions or some specific information, such as a person with such hair color, this much age. Probably the best thing would be to mention a user ID on the Internet that the person is using. So with all three pieces of information, if you feed it into the Memex system, Memex would search in the database it has collected and would yield the web pages that match that information. It would yield the statistics, like where this person has been or where it has been sighted in geolocation, and also in the form of graphs and others."

"What JPL is trying to do is trying to automate all of these processes into a system where you can just feed in the questions and we get the answers," Singh said. For that he worked with an open source web crawler called Apache Nutch. It retrieves and collects web page and domain information from the deep web. The MapReduce framework powers those crawls with a divide-and-conquer approach to big data that breaks it up into small pieces that run simultaneously. The problem is that even the fastest computers like Stampede weren't designed to handle the input and output of millions of files needed for the Memex project.
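The MapReduce pattern described here (split the data, process the pieces in parallel, then merge the partial results) can be sketched in a few lines of plain Python. This is a conceptual illustration of the pattern only, not Hadoop or Apache Nutch code, and the sample pages are invented.

```python
# Conceptual MapReduce sketch: count words across "pages" in parallel.
# Illustrates the divide-and-conquer pattern only; not Hadoop or Nutch.
from collections import Counter
from multiprocessing import Pool

def map_page(page: str) -> Counter:
    """Map step: turn one page into partial word counts."""
    return Counter(page.lower().split())

def reduce_counts(partials) -> Counter:
    """Reduce step: merge the partial counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

if __name__ == "__main__":
    pages = [                                  # invented sample "crawl" output
        "deep web crawl results page one",
        "page two of the crawl results",
        "results from the deep web",
    ]
    with Pool() as pool:
        partials = pool.map(map_page, pages)   # the pieces run simultaneously
    print(reduce_counts(partials).most_common(3))
```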

The Wrangler data-intensive supercomputer avoids data overload by virtue of its 600 terabytes of speedy flash storage. What's more, Wrangler supports the Hadoop framework, which runs using MapReduce. "Wrangler, as a platform, can run very large Hadoop-based and Spark-based crawling jobs," Mattmann said. "It's a fantastic resource that we didn't have before as a mechanism to do research; to go out and test our algorithms and our new search engines and our crawlers on these sites; and to evaluate the extractions and analytics and things like that afterwards. Wrangler has been an amazing resource to help us do that, to run these large-scale crawls, to do these types of evaluations, to help develop techniques that are helping save people, stop crime, and stop terrorism around the world."


Source: Jorge Salazar, TACC

Excerpt from:

Wrangler Supercomputer at TACC Supports Information Retrieval Projects - HPCwire

Nvidia’s new graphics card turns your PC into a ‘supercomputer’ – TechRadar

Over at Solidworks World in Los Angeles, Nvidia has revealed a crop of new Pascal-based Quadro graphics cards led by the GP100, which the company claims will effectively grant supercomputing capabilities to a desktop workstation PC.

The Quadro GP100 is certainly a very tasty piece of hardware aimed at the likes of deep learning, engineering and simulation workloads, as well as VR content creation. Based on the SPECviewperf 12 benchmark, it boasts up to double the performance of the firm's previous-generation solution.

It also comes with 16GB of HBM2 (high-bandwidth) memory, and NVLink allows a pair of the GPUs to be combined for 32GB of HBM2 on tap in a single PC.

To throw some more juicy numbers at you, the GP100 offers "unprecedented double precision performance," Nvidia notes, topping 5 TFlops, which is nearly triple the speed of the Quadro K6000.

Nvidia says that single precision (FP32) performance is 10 TFlops, doubled to 20 TFlops when in half precision (FP16) mode.
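That FP32-to-FP16 doubling comes from packing two half-precision values into the space of one single-precision value. The short NumPy sketch below illustrates the storage and rounding side of that trade-off; it says nothing about the GP100's actual throughput.

```python
# Storage and precision trade-off between single (FP32) and half (FP16)
# precision, illustrated with NumPy. This shows the number formats only,
# not GPU performance.
import numpy as np

values = np.linspace(0.0, 1.0, 1_000_000)      # reference data in float64

fp32 = values.astype(np.float32)
fp16 = values.astype(np.float16)

print(f"FP32 array: {fp32.nbytes / 1e6:.1f} MB")   # 4 bytes per value
print(f"FP16 array: {fp16.nbytes / 1e6:.1f} MB")   # 2 bytes per value

# Half precision carries far fewer significand bits, so rounding error grows.
max_err = np.max(np.abs(fp16.astype(np.float64) - values))
print(f"Max FP16 round-off versus the float64 reference: {max_err:.2e}")
```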

The Quadro GP100 is joined by five other offerings, namely the P4000, P2000, P1000, P600 and P400, which boast different levels of power depending on what you need (although only the flagship GP100 has HBM2 video memory on board).

The company also said that its new Pascal-based Quadro cards are capable of rendering photorealistic images over 18 times faster than an Intel Xeon E5 2697 V3 (2.6GHz) processor with 14 cores (that's 720p footage with Iray).

The launch of this host of cards follows the release of the Quadro P6000 and P5000 last summer.

These powerful new GPUs will be available from as soon as next month, although the exact pricing is still to be confirmed. Obviously enough, the beefy GP100 is likely to carry a very weighty price tag.

Read the rest here:

Nvidia's new graphics card turns your PC into a 'supercomputer' - TechRadar

New supercomputer starts work in Cheyenne | Business | trib.com – Casper Star-Tribune Online

One of the world's most powerful supercomputers sprang into action last month in Wyoming's capital city.

The National Center for Atmospheric Research launched the machine, named Cheyenne, to help researchers better understand the world we live in, according to a news release. Scientists across the nation will use Cheyenne, currently the world's 20th-fastest supercomputer, to study a range of topics, from wildfires and earthquakes to wind.

Leaders hope the results of that research will lead to improvements in natural disaster protection and anticipation as well as strengthen long- and short-term weather and water forecasts. That could be good news for businesses and economic sectors that depend on that information, such as agriculture, energy, transportation and tourism.

"Cheyenne will help us advance the knowledge needed for saving lives, protecting property, and enabling U.S. businesses to better compete in the global marketplace," Antonio J. Busalacchi, president of the University Corporation for Atmospheric Research, said in the release. "This system is turbocharging our science."

Some of the topics researchers hope to tackle using the machine include forecasting of long-range weather patterns, wind energy, space weather, extreme weather, climate engineering and smoke and the global climate, the release said.

Cheyenne, the fastest supercomputer in the Mountain West, can handle more than three times the amount of scientific computing that its predecessor, Yellowstone, could do, according to the release, and is three times more energy-efficient as well. It lives in the NCAR-Wyoming Supercomputing Center, which opened in 2012. Since then, more than 2,200 scientists from more than 300 universities and labs have harnessed its power.

The supercomputer was named to thank the people of Cheyenne for their support of the center, as well as to celebrate the 150th anniversary of Wyoming's capital city, which was founded in 1867.

Read the rest here:

New supercomputer starts work in Cheyenne | Business | trib.com - Casper Star-Tribune Online