{"id":232278,"date":"2017-08-04T12:41:38","date_gmt":"2017-08-04T16:41:38","guid":{"rendered":"http:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/3x-performance-boost-using-intel-advisor-and-intel-trace-analyzer-in-astrophysics-simulations-insidehpc.php"},"modified":"2017-08-04T12:41:38","modified_gmt":"2017-08-04T16:41:38","slug":"3x-performance-boost-using-intel-advisor-and-intel-trace-analyzer-in-astrophysics-simulations-insidehpc","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/astro-physics\/3x-performance-boost-using-intel-advisor-and-intel-trace-analyzer-in-astrophysics-simulations-insidehpc.php","title":{"rendered":"3X Performance Boost Using Intel Advisor and Intel Trace Analyzer in Astrophysics Simulations &#8211; insideHPC"},"content":{"rendered":"<p><p>    Sponsored Post  <\/p>\n<p>    Few problems are more computationally intense than    magnetohydrodynamics (MHD) simulations for astrophysics. Even    with the best algorithms and hardware, some calculations can    take weeks to complete.  <\/p>\n<p>    Simulation (mathematical modeling) is used to discover the evolutionary    processes that created and continue to shape the universe.    Clearly, performing experiments in the laboratory here on Earth    is just not possible. But simulating these complex cosmic    processes at high resolution is possible and requires the most    powerful supercomputers.  <\/p>\n<p>    At Novosibirsk State University (NSU), a major research and    education center in Siberia, astrophysicists needed to optimize    performance of the AstroPhi project codes they were developing    for Intel Xeon Phi processor-based hardware. This    valuable project helps students learn to create numerical    simulation codes for massively parallel supercomputers.  <\/p>\n<p>    A key aspect of the AstroPhi project was optimizing the code    for maximum performance on the Intel Xeon Phi processors.    
Before optimization, the team had difficulty identifying vector    dependencies and choosing the best vector sizes. The goals for    optimizing the code were to remove vector dependencies that    inhibited optimization and to optimize memory load operations    by efficiently adapting vector and array sizes for the Intel    Xeon Phi architecture. To help achieve these goals, the team    turned to Intel Advisor and Intel Trace Analyzer and Collector,    tools that are part of Intel Parallel Studio XE.  <\/p>\n<p>    The NSU team co-designed a new solver for massively parallel    architectures based on Intel Xeon Phi processors. They based    the solver on Intel Advanced Vector Extensions 512 (Intel    AVX-512) instructions. These instructions deliver 512-bit SIMD    support and enable programs to pack eight double-precision or    16 single-precision floating-point numbers, or eight 64-bit    integers, or 16 32-bit integers within the 512-bit vectors.    This enables processing twice the number of data elements that    AVX\/AVX2 can process with a single instruction, and 4X that of    SSE.  <\/p>\n<p>    On today's processors, it is crucial to both vectorize (using    AVX or other SIMD instructions) and parallelize software to realize    the full performance potential of the processor. Using Intel    Advisor, part of Intel Parallel Studio XE, the team was able to    perform a roofline analysis to highlight poor-performing loops    and show performance headroom for each loop, identifying which    can be improved and which are worth improving.  <\/p>\n<p>    The team reported that Intel Advisor made it easier to identify    bottlenecks and determine the best optimization strategies by    forecasting performance gains in various scenarios, greatly    reducing wasted implementation time. 
Intel Advisor    provided the project team tips for effective vectorization    along with key data like trip counts, data dependencies, and    memory access patterns, to make vectorization safe and    efficient.  <\/p>\n<p>    Also, using the graphical Intel Trace Analyzer and Collector    increased the team's understanding of the application's MPI    communication behavior across nodes. Here too they were quickly    able to find bottlenecks, improve correctness, and maximize the    application's performance on Intel architecture. MPI    communications profiling and analysis features helped to    improve application scaling.  <\/p>\n<p>    By optimizing their applications with tools from Intel Parallel    Studio XE, and running on the latest Intel hardware, the NSU    team achieved a performance speed-up of 3X, cutting the    standard time for calculating one problem from one week to just    two days.  <\/p>\n<p>    Intel Parallel Studio XE is a comprehensive software    development suite of compilers and tools that gives developers    the ability to maximize application performance on today's and    future processors by taking advantage of the ever-increasing    processor core count and vector register width.  <\/p>\n<p>        Download your free 30-day trial of Intel Parallel Studio    XE  <\/p>\n<\/p>\n<p><!-- Auto Generated --><\/p>\n<p>Read this article:<\/p>\n<p><a target=\"_blank\" href=\"https:\/\/insidehpc.com\/2017\/08\/3x-performance-boost-using-intel-advisor-and-intel-trace-analyzer-in-astrophysics-simulations\/\" title=\"3X Performance Boost Using Intel Advisor and Intel Trace Analyzer in Astrophysics Simulations - insideHPC\">3X Performance Boost Using Intel Advisor and Intel Trace Analyzer in Astrophysics Simulations - insideHPC<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Sponsored Post Few problems are more computationally intense than magnetohydrodynamics (MHD) simulations for astrophysics. 
Even with the best algorithms and hardware, some calculations can take weeks to complete.  <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/astro-physics\/3x-performance-boost-using-intel-advisor-and-intel-trace-analyzer-in-astrophysics-simulations-insidehpc.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[22],"tags":[],"class_list":["post-232278","post","type-post","status-publish","format-standard","hentry","category-astro-physics"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/232278"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=232278"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/232278\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=232278"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=232278"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=232278"}],"curies":[{"name":"wp","hr
ef":"https:\/\/api.w.org\/{rel}","templated":true}]}}