{"id":227181,"date":"2017-07-12T11:45:03","date_gmt":"2017-07-12T15:45:03","guid":{"rendered":"http:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/case-study-more-efficient-numerical-simulation-in-astrophysics-insidebigdata.php"},"modified":"2017-07-12T11:45:03","modified_gmt":"2017-07-12T15:45:03","slug":"case-study-more-efficient-numerical-simulation-in-astrophysics-insidebigdata","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/astro-physics\/case-study-more-efficient-numerical-simulation-in-astrophysics-insidebigdata.php","title":{"rendered":"Case Study: More Efficient Numerical Simulation in Astrophysics &#8211; insideBIGDATA"},"content":{"rendered":"<p><p>Sponsored Post<\/p>\n<p>Novosibirsk State University is one of the major research and educational centers in Russia and one of the largest universities in Siberia. When researchers at the University were developing a software tool for numerical simulation of magnetohydrodynamics (MHD) problems with hydrogen ionization, part of an astrophysical objects simulation (AstroPhi) project, they needed to optimize the tool&#8217;s performance on Intel Xeon Phi processor-based hardware. The team turned to Intel Advisor and Intel Trace Analyzer and Collector. The result was a 3X performance speed-up, cutting the standard time for calculating one problem from one week to just two days.<\/p>\n<p>Mathematical modeling plays a key role in modern astrophysics. It is a universal tool for studying nonlinear evolutionary processes in the universe. Modeling complex astrophysical processes at high resolution requires the most powerful supercomputers. The University&#8217;s AstroPhi project develops astrophysical code for massively parallel supercomputers with Intel Xeon Phi processors.
This valuable project helps students learn to create numerical simulation code for massively parallel supercomputers. The students also learn about modern HPC hardware architectures, preparing them to develop tomorrow&#8217;s exascale supercomputers.<\/p>\n<p>&#8220;The use of Intel Advanced Vector Extensions for Intel Xeon Phi processors gave us the maximum code performance compared with other architectures available on the market,&#8221; said Igor Kulikov, Assistant Professor, Novosibirsk State University.<\/p>\n<p>Numerical Method<\/p>\n<p>The team designed the project using a numerical method shown in the figure below. The benefits of this high-order method included:<\/p>\n<p>The first three benefits are the key factors for realistic modeling of all the significant physical effects in astrophysical problems. The simplicity of the method, plus the small number of MPI send\/receive operations, provides efficient parallelization and potentially infinite weak scalability.<\/p>\n<\/p>\n<p>Massively Parallel Architecture<\/p>\n<p>The team co-designed the new solver for massively parallel architecture based on Intel Xeon Phi processors. Designed to help eliminate node bottlenecks and simplify code modernization, the bootable processors provided the power efficiency the team needed to handle the most demanding high-performance computing applications.<\/p>\n<p>The team based the solver on Intel Advanced Vector Extensions 512 (Intel AVX-512) instructions, which deliver 512-bit SIMD support and enable programs to pack eight double-precision or 16 single-precision floating-point numbers, or eight 64-bit or 16 32-bit integers, within the 512-bit vectors. This enables processing of 2X the number of data elements that AVX\/AVX2 can process with a single instruction, and 4X that of SSE.
<\/p>\n<p>&#8220;The use of Intel Advanced Vector Extensions 512 for Intel Xeon Phi processors gave us the maximum code performance compared with other architectures available on the market,&#8221; said Igor Kulikov, assistant professor at NSU.<\/p>\n<p>Optimizing the Code<\/p>\n<p>A key aspect of the AstroPhi project was optimizing the code for maximum performance on the Intel Xeon Phi processors. Before optimization, the team had some problems with vector dependencies and vector sizes. The goals for optimizing the code were to remove vector dependencies, optimize memory load operations, and adapt vector and array sizes to the Intel Xeon Phi architecture. The team used Intel Advisor and Intel Trace Analyzer and Collector, two tools that are part of Intel Parallel Studio XE, for the optimization.<\/p>\n<p>Intel Parallel Studio XE is a comprehensive software development suite that helps developers maximize application performance on today&#8217;s and future processors by taking advantage of the ever-increasing processor core count and vector register width.<\/p>\n<p>Intel Advisor is a software tool built on the premise that, on modern processors, software must be both vectorized (using AVX* or SIMD* instructions) and threaded to realize the processor&#8217;s full performance potential. Using this tool, the team was able to perform a roofline analysis highlighting poor-performing loops and showing performance headroom for each loop, identifying which can be improved and which are worth improving.<\/p>\n<p>&#8220;Intel Advisor made it easier to find the cause of bottlenecks and decide on next optimization steps,&#8221; explained Igor Chernykh, assistant professor at NSU. &#8220;It provided data to help us forecast the performance gain before we invested significant effort in implementation.&#8221;
<\/p>\n<p>Intel Advisor sorted loops by potential gain, made compiler reports easier to read by showing messages alongside the source code, and gave the project team tips for effective vectorization. It also provided key data, such as trip counts, data dependencies, and memory access patterns, to make vectorization safe and efficient.<\/p>\n<p>Intel Trace Analyzer and Collector also helped in optimizing the code. This graphical tool helped the team understand MPI application behavior, quickly find bottlenecks, improve correctness, and ultimately maximize the tool&#8217;s performance on Intel architecture. It includes MPI communications profiling and analysis features that helped to improve weak and strong scaling.<\/p>\n<p>Results<\/p>\n<p>After all the improvements and optimizations, the team achieved 190 GFLOPS performance and 0.3 FLOP\/byte arithmetic intensity, with 100 percent mask utilization and 573 GB\/s memory bandwidth.<\/p>\n<p>&#8220;Using Intel Advisor and Intel Trace Analyzer and Collector, we were able to remove vector dependencies, optimize load operations, and adapt vector and array size for the Intel Xeon Phi architecture,&#8221; explained Kulikov. &#8220;This optimization gave the opportunity to run 3X more variants of astrophysical tests.&#8221;
<\/p>\n<\/p>\n<p>    Download your free    30-day trial of Intel Parallel Studio XE  <\/p>\n<\/p>\n<p><!-- Auto Generated --><\/p>\n<p>Visit link: <\/p>\n<p><a target=\"_blank\" href=\"https:\/\/insidebigdata.com\/2017\/07\/11\/case-study-efficient-numerical-simulation-astrophysics\/\" title=\"Case Study: More Efficient Numerical Simulation in Astrophysics - insideBIGDATA\">Case Study: More Efficient Numerical Simulation in Astrophysics - insideBIGDATA<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Sponsored Post Novosibirsk State University is one of the major research and educational centers in Russia and one of the largest universities in Siberia.  <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/astro-physics\/case-study-more-efficient-numerical-simulation-in-astrophysics-insidebigdata.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[22],"tags":[],"class_list":["post-227181","post","type-post","status-publish","format-standard","hentry","category-astro-physics"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/227181"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=227181"}],"version-history":[{"count":0,"href":"https:\/\/www.e
uvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/227181\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=227181"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=227181"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=227181"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}