Boost Performance for Real-World Workloads with Built-in Accelerators

Executive Summary

Workload performance directly affects the results companies get from their computing infrastructure. Many companies assume they need additional discrete hardware acceleration for their most critical workloads, without realizing that their existing infrastructure might already be able to run even top-tier workloads effectively.

Instead of building customized discrete solutions or investing further in cloud instances, companies can rely on the acceleration technologies already built into CPUs or vCPUs. These built-in accelerators are designed to provide a range of benefits, including increased application performance, reduced costs, and improved power efficiency.
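As a rough illustration of how to check which built-in acceleration features a given CPU or vCPU exposes, the following Python sketch parses the `flags` line that Linux reports in `/proc/cpuinfo`. The flag list and the workload descriptions in `ACCEL_FLAGS` are assumptions chosen for illustration and are not taken from the study; the function names are hypothetical.

```python
# Illustrative only: a hand-picked set of Linux /proc/cpuinfo flag names and
# the kind of workload each tends to help. This mapping is an assumption,
# not a list published in the study.
ACCEL_FLAGS = {
    "avx512f": "wide vector math (HPC)",
    "avx512_vnni": "int8 inference (AI/ML)",
    "aes": "AES encryption",
    "vaes": "vectorized AES encryption",
    "sha_ni": "SHA hashing",
}


def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def detect_accelerators(cpuinfo_text):
    """Return the subset of ACCEL_FLAGS that appears in the cpuinfo text."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: desc for name, desc in ACCEL_FLAGS.items() if name in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not a Linux system, or the file is unreadable: report nothing.
        text = ""
    for name, desc in detect_accelerators(text).items():
        print(f"{name}: {desc}")
```

Running the script on a Linux cloud instance prints one line per detected flag. A missing flag only means the operating system does not report that feature; it says nothing about how well a particular application uses it.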

Prowess Consulting engineers tested how these built-in CPU accelerators can help organizations meet their current computing needs, whether on premises or in the cloud. For this study, Intel sponsored Prowess Consulting to conduct testing and apply benchmarking best practices. Prowess chose Amazon Web Services® (AWS®) cloud instances as the testing platform. We compared the performance of the following Intel® and AMD® processors, which offer different built-in accelerator technologies:

• 3rd Gen Intel® Xeon® Scalable processor (AWS m6i.4xlarge, 16 vCPUs)
• 3rd Generation AMD EPYC™ processor (AWS m6a.4xlarge [16 vCPUs] and AWS m6a.8xlarge [32 vCPUs])

We used three test workloads that were optimized to take advantage of the accelerators: NAMD, ResNet-50, and OpenSSL. These applications were chosen to represent three common compute-intensive workload categories: high-performance computing (HPC), artificial intelligence/machine learning (AI/ML), and encryption, respectively.

The comparison of the 16-vCPU Intel and AMD instances leaned heavily in favor of the m6i.4xlarge instances with Intel Xeon processors, which provide a broader set of built-in accelerators than AMD EPYC processors. Given this strong performance by the Intel-based instances, Prowess then tested the 16-vCPU Intel processor–based instance against a larger AMD-based instance with twice as many vCPUs. This 16-vCPU versus 32-vCPU comparison again largely showed that Intel processor–based m6i.4xlarge instances with built-in acceleration performed significantly better:¹

• Up to 7.13x faster encryption with OpenSSL ecdsa-sign than the 32-vCPU AMD processor–based instance at the same cost
• Up to 2.12x more simulation capacity per day than the 32-vCPU AMD processor–based instance at the same cost
• Up to 3.93x more image classification throughput than the 32-vCPU AMD processor–based instance at the same cost
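Multipliers such as those in the list above are ratios of a measured throughput on one instance to the same measurement on another. The Python sketch below shows that arithmetic, plus a cost-normalized variant. All input values are hypothetical placeholders rather than measurements or prices from the study, and the function names are illustrative.

```python
def speedup(candidate, baseline):
    """Ratio of candidate throughput to baseline throughput (higher is better)."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return candidate / baseline


def perf_per_dollar(throughput, hourly_cost):
    """Throughput delivered per dollar of hourly instance cost."""
    if hourly_cost <= 0:
        raise ValueError("hourly cost must be positive")
    return throughput / hourly_cost


if __name__ == "__main__":
    # Hypothetical throughputs (for example, images per second) and hourly
    # prices; these are placeholders, not results or prices from the study.
    candidate_throughput, baseline_throughput = 300.0, 100.0
    candidate_cost, baseline_cost = 1.0, 1.0

    raw = speedup(candidate_throughput, baseline_throughput)
    normalized = speedup(
        perf_per_dollar(candidate_throughput, candidate_cost),
        perf_per_dollar(baseline_throughput, baseline_cost),
    )
    print(f"raw speedup: {raw:.2f}x")
    print(f"cost-normalized speedup: {normalized:.2f}x")
```

When both instances cost the same per hour, the raw and cost-normalized ratios are identical, which is why a "same cost" comparison can be reported as a single multiplier.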