Inspur Wins Contract for NVLink™ V100 Based Petascale AI Supercomputer from CCNU

DENVER, Nov. 21, 2017 /PRNewswire/ -- On November 16 (US Mountain Time), Inspur announced at SC17 that it has been awarded a contract to design and build a petascale AI supercomputer based on "NVLink + Volta" for Central China Normal University (CCNU), as part of the university's ongoing research efforts in frontier physics and autonomous driving AI.

The supercomputer will comprise 18 Inspur AGX-2 servers as compute nodes, with a total of 144 NVIDIA Tesla V100 GPUs based on the latest Volta architecture with NVLink 2.0 support, together with the latest Intel Xeon SP (Skylake) processors. It will run Inspur ClusterEngine, AIStation and other cluster management suites, with high-speed interconnection via Mellanox EDR InfiniBand. The peak performance of the system will reach 1 petaflops. With NVLink 2.0 and Tesla V100 GPUs, the system will be able to support HPC and AI computing simultaneously.
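As a rough cross-check of these figures, the short Python sketch below multiplies out the node and GPU counts. The per-GPU value of 7.8 TFLOPS is an assumption taken from NVIDIA's published double-precision peak for the NVLink (SXM2) version of the Tesla V100; it is not stated in this release, and the contribution of the Xeon processors is ignored.

```python
# Back-of-the-envelope check of the announced system size.
# Node and GPU counts come from the announcement; the per-GPU figure is an
# assumption (NVIDIA's published FP64 peak for the V100 SXM2), not a value
# given in this release. CPU FLOPS are left out of the estimate.
NODES = 18
GPUS_PER_NODE = 8
V100_FP64_TFLOPS = 7.8  # assumed per-GPU double-precision peak


def system_estimate(nodes, gpus_per_node, tflops_per_gpu):
    """Return (total GPUs, TFLOPS per node, PFLOPS for the whole system)."""
    total_gpus = nodes * gpus_per_node
    node_tflops = gpus_per_node * tflops_per_gpu
    system_pflops = total_gpus * tflops_per_gpu / 1000.0
    return total_gpus, node_tflops, system_pflops


total_gpus, node_tflops, system_pflops = system_estimate(
    NODES, GPUS_PER_NODE, V100_FP64_TFLOPS
)
print(f"GPUs in system:       {total_gpus}")
print(f"FP64 peak per node:   {node_tflops:.1f} TFLOPS")
print(f"FP64 peak, GPUs only: {system_pflops:.2f} PFLOPS")
```

Under that assumption, 18 nodes of 8 GPUs give the 144 GPUs quoted, and the GPU-only double-precision peak comes out slightly above 1 petaflops, in line with the stated system peak.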

Inspur AGX-2 is the world's first high-density AI supercomputer, supporting 8 NVIDIA® Tesla® V100 GPUs with NVLink 2.0 enabled in a 2U form factor. NVLink 2.0 provides faster interconnection between the GPUs, with a bi-section bandwidth of 300 GB/s. AGX-2 also features strong I/O expansion capabilities, supporting 8 NVMe/SAS/SATA hot-swap drives and up to 4 EDR InfiniBand HCAs. AGX-2 supports both air cooling and on-chip liquid cooling to improve power efficiency and performance.

AGX-2 can significantly improve the computing efficiency of HPC workloads, delivering 60 TFLOPS of double-precision performance per server. For VASP, software used extensively in physics and materials science, the performance of AGX-2 with one P100 GPU equals that of an 8-node cluster of mainstream 2-socket CPU servers. The NVLink interconnect in AGX-2 also delivers excellent parallel efficiency across multiple GPU cards: four P100 GPUs running in parallel reach the performance of a cluster of nearly 20 mainstream 2-socket CPU nodes.

For AI computing, the Tesla V100 employed by AGX-2 is equipped with Tensor Cores for deep learning, which deliver 120 TFLOPS and, with NVLink 2.0 enabled, greatly improve the training performance of deep learning frameworks. In deep learning training on the ImageNet dataset, the AGX-2 shows excellent scalability. Configured with 8 V100 GPUs, the AGX-2 delivers 1898 images/s when training the GoogLeNet model with TensorFlow, which is 7 times the throughput of a single card and 1.87 times that of P100 in a similar configuration.
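The scaling claim can be made concrete with a small calculation. The Python sketch below derives the multi-GPU scaling efficiency and the throughputs implied by the quoted ratios. It assumes that "7 times" means the 8-GPU throughput divided by the single-GPU throughput; the derived single-card and P100 numbers are implied values, not measurements reported in this release.

```python
# Scaling figures implied by the quoted GoogLeNet/TensorFlow result.
# Inputs are the numbers from the announcement; everything derived from them
# is an implied estimate, not a reported measurement.
IMAGES_PER_SEC_8X_V100 = 1898.0
SPEEDUP_VS_SINGLE_V100 = 7.0   # read as 8-GPU throughput / 1-GPU throughput
SPEEDUP_VS_P100_CONFIG = 1.87  # ratio against the comparable P100 setup
NUM_GPUS = 8


def scaling_efficiency(speedup, num_gpus):
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / num_gpus


efficiency = scaling_efficiency(SPEEDUP_VS_SINGLE_V100, NUM_GPUS)
implied_single_v100 = IMAGES_PER_SEC_8X_V100 / SPEEDUP_VS_SINGLE_V100
implied_p100_config = IMAGES_PER_SEC_8X_V100 / SPEEDUP_VS_P100_CONFIG

print(f"Scaling efficiency on 8 GPUs:   {efficiency:.1%}")
print(f"Implied single-V100 throughput: {implied_single_v100:.0f} images/s")
print(f"Implied P100-config throughput: {implied_p100_config:.0f} images/s")
```

Read this way, a 7x speedup on 8 GPUs corresponds to 87.5% of ideal linear scaling.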

Central China Normal University plans to further upgrade the AI supercomputer to a multi-petaflops system.

