The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem





Presented at ISC 19 AHUG Workshop in Frankfurt in June 2019. Overview of A64FX Arm processor with SVE and some performance results. Also, presented Arm HPC Ecosystem information.
  • 1. The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem
      • Shinji Sumimoto, Ph.D., Next Generation Technical Computing Unit, FUJITSU LIMITED
      • Jun. 20th, 2019, AHUG@ISC19 Workshop
      • Copyright 2019 FUJITSU LIMITED
  • 2. Outline of This Talk
      • The First SVE Enabled Arm Processor: A64FX
      • A64FX: High Performance Arm CPU
      • Arm HPC Ecosystem Development
      • Arm HPC Software Topics
          • Activities with Arm, Linaro and OSS Community
          • OSS Application Porting Updates
  • 3. A64FX: High Performance Arm CPU
      • Inheriting Fujitsu HPC CPU technologies with commodity standard ISA
  • 4. High Performance Arm CPU "A64FX"
      • Architecture features
          • ISA: Armv8.2-A (AArch64 only), SVE (Scalable Vector Extension)
          • SIMD width: 512-bit
          • Precision: FP64/32/16, INT64/32/16/8
          • # of cores: 48 computing cores + 4 assistant cores (4 CMGs)
          • Memory: HBM2, peak B/W 1024 GB/s
          • Interconnect: TofuD, 28 Gbps x 2 lanes x 10 ports
      • Peak performance (chip-level, TOPS) by element size
          • SPARC64 VIIIfx (K computer): 64 bits 0.128; 32 bits 0.128; 16 bits N/A; 8 bits N/A
          • A64FX (Supercomputer "Fugaku"): 64 bits 2.7+; 32 bits 5.4+; 16 bits 10.8+; 8 bits 21.6+
          • Chart annotations: HPC, AI
      • CMG (Core Memory Group) specification
          • 13 cores
          • L2 cache: 8 MiB
          • Memory: 8 GiB, 256 GB/s
      • TofuD: 28 Gbps x 2 lanes x 10 ports
      • I/O: PCIe Gen3, 16 lanes
      • [Block diagram: four CMGs, each with cores, an L2 cache and an HBM2 interface to its HBM2 memory, joined by a ring-bus network on chip to the TofuD and PCIe controllers and interfaces]
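The figures on this slide are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that peak throughput scales linearly with the number of SIMD lanes (512 bits divided by the element size), and it treats the "2.7+" figure as a plain 2.7. The constant and function names are made up for this example.

```python
# Figures quoted on the slide.
SIMD_WIDTH_BITS = 512
CMGS = 4
CORES_PER_CMG = 13            # 48 computing + 4 assistant cores spread over 4 CMGs
HBM2_BW_PER_CMG_GBS = 256
PEAK_64BIT_TOPS = 2.7         # the slide says "2.7+", so treat this as a lower bound


def lanes(element_bits: int) -> int:
    """Number of elements that fit in one 512-bit SIMD register."""
    return SIMD_WIDTH_BITS // element_bits


def peak_tops(element_bits: int) -> float:
    """Peak TOPS for a given element size.

    Assumption (not stated on the slide): throughput scales linearly
    with the number of lanes, starting from the 64-bit figure.
    """
    return PEAK_64BIT_TOPS * lanes(element_bits) / lanes(64)


total_cores = CMGS * CORES_PER_CMG
total_bandwidth_gbs = CMGS * HBM2_BW_PER_CMG_GBS

print("cores per chip:", total_cores)
print("peak HBM2 bandwidth (GB/s):", total_bandwidth_gbs)
for bits in (64, 32, 16, 8):
    print(f"{bits:2d}-bit elements: {lanes(bits):2d} lanes, {peak_tops(bits):.1f}+ TOPS")
```

Under that scaling assumption the 32-, 16- and 8-bit results reproduce the 5.4+, 10.8+ and 21.6+ values listed for A64FX, and the per-CMG numbers add up to the chip-level core count and memory bandwidth.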
  • 5. 1 Peta FLOPS System: K computer vs. Post-K
      • K computer: 80x compute racks & 20x disk racks
      • Post-K (now Fugaku): 1x rack w/ SSDs
      • Comparison (K computer vs. Post-K)
          • CPU: SPARC64 VIIIfx + ICC vs. A64FX
          • Compute nodes: 7,680 (=96x80) vs. 384
          • IO nodes: 480 (=6x80) vs. (Post-K value not shown)
          • Footprint (m2): 128 (=4x32) vs. 1.1
          • OS: SPARC Linux vs. Arm Linux
      • Figure label: 2.2m
      • More applications as well as system software will come in collaboration with the Open Source Community
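A rough check of the "1 Peta FLOPS" framing, combining the node counts on this slide with the chip-level 64-bit peak figures given earlier in the deck (0.128 and 2.7+ TOPS). This is a sketch under one assumption that the slides do not state explicitly: each compute node carries a single CPU. All names are hypothetical.

```python
# Node counts and footprints from this slide; per-chip 64-bit peaks from the
# architecture slide. Assumption: one CPU per compute node.
K_COMPUTER = {"nodes": 7680, "tflops_per_node": 0.128, "footprint_m2": 128.0}
POST_K = {"nodes": 384, "tflops_per_node": 2.7, "footprint_m2": 1.1}


def system_peak_pflops(system: dict) -> float:
    """Aggregate peak in PFLOPS = nodes x per-node TFLOPS / 1000."""
    return system["nodes"] * system["tflops_per_node"] / 1000.0


def ratio(a: float, b: float) -> float:
    """How many times larger a is than b."""
    return a / b


k_peak = system_peak_pflops(K_COMPUTER)
post_k_peak = system_peak_pflops(POST_K)
node_ratio = ratio(K_COMPUTER["nodes"], POST_K["nodes"])
footprint_ratio = ratio(K_COMPUTER["footprint_m2"], POST_K["footprint_m2"])

print(f"K computer peak: {k_peak:.3f} PFLOPS")
print(f"Post-K peak:     {post_k_peak:.3f} PFLOPS")
print(f"Node count ratio: {node_ratio:.0f}x, footprint ratio: {footprint_ratio:.0f}x")
```

Both configurations land close to 1 PFLOPS, while the Post-K configuration uses a twentieth of the nodes and roughly a hundredth of the floor space.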
  • 6. A64FX CPU Performance Evaluation
      • Over 2.5x faster in HPC & AI benchmarks than SPARC64 XIfx
  • 7. A64FX Performance Comparison (1/2)
      • Himeno Benchmark (Fortran90)
      • † "Performance evaluation of a vector supercomputer SX-aurora TSUBASA", SC18
  • 8. A64FX Performance Comparison (2/2)
      • WRF: Weather Research and Forecasting model
          • Vectorizing loops that include IF-constructs is the key optimization
          • Source code tuning using directives promotes compiler optimizations
  • 9. Arm HPC Software Topics: Activities with Community
      • With Arm and Linaro
      • With OSS Community: SPACK, Open MPI and Lustre
  • 10. Strong Relationship with Arm HPC community
      • Arm
          • Established and contributed to the Arm HPC base, such as SVE support in Linux, GCC and OpenHPC
      • Linaro
          • Building binary portability on Arm HPC
              • Standardization of Arm basic system software (Linux kernel, glibc, GCC, etc.) and upstreaming to the OSS community
              • Developing and upstreaming SVE software to the OSS community
      • OpenHPC
          • Developing standard IA and Arm HPC software portability
          • Distribution
              • 2017/11: v1.3.3, the first Arm version, distributed
              • 2019/6: v1.3.8 for Arm distributed
  • 11. Activities with Arm and Linaro
      • LLVM SVE upstreaming and OSS porting with Arm
          • Variable Vector Length support for the LLVM community, in cooperation with Arm
      • OpenHPC with Linaro: Mr. Okamoto (Fujitsu) has been selected as a 2018-2019 TSC (Technical Steering Committee) member
      • Development status with Linaro
          • LLVM/Clang for aarch64 improvement: now ongoing
              • Register allocation, software pipelining support, vectorization/SIMDization
              • Pushing SVE support to the LLVM community in cooperation with Arm; Variable Vector Length support is the critical issue for introducing it into the LLVM tree
          • QEMU/SVE development: for building SVE software development
              • V4.0.0 released
  • 12. QEMU/SVE Development with Linaro
      • 2019/4/23: Version 4.0.0 released
  • 13. QEMU on Armv8+SVE with Fedora
      • Fedora/aarch64
          • Fedora 29 supports an SVE-enabled kernel and GCC compilers
          • Easy to use: download the raw image and run it with virt-manager
              • CPU type: max (not cortex-xx)
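Once such a guest is booted, one quick way to confirm that the emulated CPU actually exposes SVE is to look for the `sve` flag in the `Features` line of `/proc/cpuinfo`, which is where arm64 Linux reports its CPU capabilities. The helper below is a minimal sketch and not part of the original slides; the function names are hypothetical.

```python
from pathlib import Path


def has_sve(cpuinfo_text: str) -> bool:
    """Return True if any 'Features' line in cpuinfo text lists the 'sve' flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Compare whole tokens so that e.g. 'sve2' alone does not count as 'sve'.
        if key.strip() == "Features" and "sve" in value.split():
            return True
    return False


def host_has_sve(path: str = "/proc/cpuinfo") -> bool:
    """Check the running system; returns False where the file does not exist."""
    cpuinfo = Path(path)
    return cpuinfo.is_file() and has_sve(cpuinfo.read_text())


if __name__ == "__main__":
    print("SVE available:", host_has_sve())
```

If the guest is started with a specific Cortex CPU model instead of `max`, the flag would be expected to be absent, which is consistent with the slide's note about the CPU type.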
  • 14. Our goals of SPACK world and testing status
      • In collaboration with LLNL and R-CCS
  • 15. About SPACK package manager
  • 16. Software Installation with SPACK on QEMU/Aarch64 w/ SVE
      • Installing hdf5
  • 17. Our goals of Arm HPC Ecosystem with SPACK
      • Pushing forward the build-up of the Arm HPC w/ SVE ecosystem
          • Building tools and applications for Arm HPC w/ SVE
          • Distributing and sharing execution binaries not only within computation centers but also among Arm HPC users around the world
      • Final target is binary distribution, including OpenHPC distribution
          • Sharing community-built binary packages worldwide
          • Not only private package building environments but also sharing binaries across centers, countries, and the world
          • Not limited to open source packages; commercial binaries are included as well
      • [Diagram: in each computation center, users and the system manager build SPACK custom packages and share SPACK and RPM (Deb) binaries; a binary distribution cloud shares SPACK, RPM (Deb), OpenHPC and commercial HPC binaries]
  • 18. Testing Status
      • Testing environments
          • Platforms: ThunderX2 system and QEMU 3.1 for Aarch64 w/ SVE
          • OS: CentOS 7.x, Fedora 29
          • Networking environment: proxy Internet access w/ user authentication
      • First impressions
          • Works fine, and very easily, in the Aarch64 environment
          • Good as a custom binary building tool for each execution environment
              • RPM packages on the Internet are unified binaries
          • Downloads sometimes time out because of heavy Internet congestion
              • curl gave up and a build error occurred
          • Some applications/tools fail to compile because they are not implemented for Aarch64
              • Overall test results are as follows
  • 19. Result of package installation using Spack
      • Total 3,225 packages are registered (releases/v0.12.1, 6th June 2019)
      • Confirmation conditions
          • Compiling without modifying anything except URL and checksum
          • Using gcc 4.8.5
          • No confirmation of execution
      • # of successes (successful builds / registered packages)
          • X86 (gcc): 2,386/3,225 (73%) now (Jun. 2019); 2,284/2,907 (78%) in Jan. 2019
          • Arm (gcc): 2,199/3,225 (68%) now (Jun. 2019); 1,336/2,907 (45%) in Jan. 2019
      • Evaluation status
          • Around 70% of packages build successfully on Aarch64
          • Several failure patterns exist
              • Some of them can be fixed by modifying configuration files
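The success rates on this slide can be recomputed from the raw counts. The sketch below uses only the numbers quoted above; the data layout and function name are made up for illustration. Note that the slide's percentages appear to be truncated rather than rounded, so integer division is used to reproduce them.

```python
# (successful builds, registered packages) as quoted on the slide.
RESULTS = {
    ("x86 (gcc)", "Jan. 2019"): (2284, 2907),
    ("x86 (gcc)", "Jun. 2019"): (2386, 3225),
    ("Arm (gcc)", "Jan. 2019"): (1336, 2907),
    ("Arm (gcc)", "Jun. 2019"): (2199, 3225),
}


def success_percent(succeeded: int, total: int) -> int:
    """Success rate in whole percent, truncated to match the slide's figures."""
    return succeeded * 100 // total


for (platform, date), (succeeded, total) in RESULTS.items():
    print(f"{platform:9s} {date}: {succeeded}/{total} = {success_percent(succeeded, total)}%")

# Additional Arm packages that built between the two snapshots.
arm_gain = RESULTS[("Arm (gcc)", "Jun. 2019")][0] - RESULTS[("Arm (gcc)", "Jan. 2019")][0]
print("Additional Arm packages building since Jan. 2019:", arm_gain)
```

The Arm success rate moves from 45% to 68% over the five months, which narrows the gap to x86 from 33 to 5 percentage points.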
  • 20. Post-K (now Fugaku) Software Stack
      • Post-K system supports SBSA/SBBR
          • Keeping binary compatibility with other Aarch64-based systems
      • Software stack layers (top to bottom)
          • Post-K applications
          • FUJITSU Technical Computing Suite / RIKEN Advanced System Software
              • Management software: system management for highly available & power saving operation; job management for higher system utilization & power efficiency
              • Hierarchical file I/O software: Lustre-based distributed file system FEFS; application-oriented file I/O middleware
              • Programming environment: MPI (Open MPI, MPICH); OpenMP, COARRAY, Math Libs.; Compilers (C, C++, Fortran); XcalableMP; debugging and tuning tools
          • Linux OS / McKernel (Lightweight Kernel)
          • Post-K system hardware
      • Post-K: under development w/ RIKEN
  • 21. Open MPI: from SC18 BoF Slides
  • 22. Open MPI: from SC18 BoF Slides
      • Half-precision (FP16) datatype development started in cooperation with ANL and Mellanox
  • 23. Open MPI: from SC18 BoF Slides
  • 24. Lustre Community: OpenSFS and EOFS
      • OpenSFS: US-based non-profit organization
          • President: Stephen Simms (Indiana University)
      • EOFS: EU-based non-profit organization
          • President: Frank Baetke (HPE)
      • Lustre for Arm
          • Fujitsu is a member of OpenSFS and will support Lustre-based products
      • Two major events
          • Lustre User Group (LUG)
              • LUG 19 @ Houston, 2019/5/15-17
          • Lustre Admins and Devs workshop (LAD)
              • LAD 19 @ Paris, 2019/9/24-25
      • Slide archives are on each site
  • 25. 2018/11: Whamcloud has started Lustre client support on Arm-based platforms
  • 26. Arm HPC Software Topics: OSS Application Porting Updates
  • 27. OSS apps porting at Arm HPC Users Group
      • Twelve primary OSS applications are listed and being tested in the Users Group for each compiler, collaboratively w/ Arm
      • * Registered by Fujitsu
      • Porting status per compiler
          Application | Lang. | GCC | LLVM | Arm | Fujitsu
          LAMMPS | C++ | Modified | Modified | Modified | Modified
          GROMACS | C | Modified | Modified | Modified | Modified
          GAMESS* | Fortran | Modified | Modified | Modified | Modified
          OpenFOAM | C++ | Modified | Modified | Modified | Modified
          Siesta* | Fortran | OK as is | Issues found | Issues found | Modified
          NAMD | C++ | Modified | Modified | Modified | Modified
          WRF | Fortran | Modified | Modified | Modified | Modified
          Quantum ESPRESSO | Fortran | OK as is | OK as is | OK as is | Modified
          NWChem | Fortran | OK as is | Modified | Modified | Modified
          ABINIT | Fortran | Modified | Modified | Modified | Modified
          CP2K | Fortran | OK as is | Issues found | Issues found | Modified
          NEST* | C++ | OK as is | Modified | Modified | Modified
          USQCD (MILC) | C | OK as is | Modified | Modified | Modified
          BLAST* | C++ | OK as is | Modified | Modified | Modified