Saturday, November 23, 2024

Benchmarks Of Google’s Axion Arm-based CPU: Competitive Performance & Compelling Value

Earlier this year Google announced Axion as their first Arm-based CPU for the Google Cloud. Today they are already taking Axion to general availability with the new C4A instances. These new C4A instances are advertised as offering up to 50% better performance and up to 60% better energy efficiency than their current-generation x86 instance types. This article presents some of the first public independent performance benchmarks of the Google Axion CPU, along with comparisons to existing GCE Arm and x86_64 instance types.

Google was kind enough to allow me gratis access to the C4A instances over the past several weeks in the Google Cloud / Google Compute Engine for running some benchmarks of their first in-house Arm data center processor. The first-generation Google Axion processors make use of Arm Neoverse-V2 cores with up to 72 cores per processor.

GCE C4A instance sizes

The Armv9-based Axion processors enable SVE2, BTI, BF16, I8MM, PAC, and PMU as some of their prominent additional features. Google is promoting their Axion C4A instances as being great for general purpose workloads, containerized micro-services, open-source databases / in-memory stores, data analytics, CPU-based AI inferencing, and similar workloads.
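For those wanting to confirm these features from inside their own instance, below is a minimal sketch that reads the "Features" line of /proc/cpuinfo and reports any expected flags that are absent. The helper names are hypothetical and the flag list is my assumption of how the features map to the lowercase names the arm64 Linux kernel prints (PAC shows up as paca/pacg, and PMU support is not reported as a cpuinfo flag, so it is left out).

```python
# Sketch only: helper names are hypothetical, and WANTED is an assumed mapping
# of the article's feature list to arm64 /proc/cpuinfo flag names.
WANTED = ("sve2", "bti", "bf16", "i8mm", "paca")


def parse_features(cpuinfo_text):
    """Return the set of flags from the first 'Features' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, wanted=WANTED):
    """List the wanted flags that do not appear in the given cpuinfo text."""
    have = parse_features(cpuinfo_text)
    return [flag for flag in wanted if flag not in have]


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        text = ""  # not on Linux, or the file is unreadable
    print("missing:", missing_features(text))
```

On an x86_64 instance there is no "Features" line (the kernel prints "flags" instead), so every wanted flag will be listed as missing.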

Google Axion CPU /proc/cpuinfo

Besides featuring the new Axion processors, the C4A instances feature local SSD storage, up to 100G networking, Titanium network and storage offloads, and a variety of sizes from 1 to 72 vCPUs. There are both standard and highmem instances available with the latter providing 8GB of RAM per vCPU rather than 4GB per vCPU as standard.

Google Axion CPU on Linux in Google Cloud

The Google C4A instances are supported by all the major ARM64 enterprise Linux distributions from RHEL to SUSE, Ubuntu, Rocky Linux, and others. For my testing I used Ubuntu 24.04 LTS across all tested instance types.

Google Axion lscpu

Google’s Axion follows Amazon Graviton and Microsoft Azure Cobalt as the public clouds and hyperscalers come out with their own in-house Arm processor designs. Google Compute Engine has offered Arm-based instances with Google Tau VMs powered by Ampere Altra, but now with Axion they have taken their Arm processor needs in-house.

Given prior Neoverse-V2 testing at Phoronix with the likes of Graviton4 and NVIDIA GH200 Grace, it was a given that it would be quite a performant experience. For this launch-day testing I compared the Google C4A standard memory 48 vCPU instance against other GCE 48 vCPU instances, including the C4 using Intel Xeon Platinum Emerald Rapids (Xeon Platinum 8581C) and Tau T2A Ampere Altra 48 vCPU instances. Each 48 vCPU instance tested had 180 to 192GB of memory and ran Ubuntu 24.04 LTS with the Linux 6.8 kernel. No AMD EPYC instances were tested in this comparison since Google Compute Engine currently doesn’t offer any current-gen “C4” AMD instance type. In addition, with Google only covering the gratis access for the C4A instance types, the number of instances tested outside of C4A was limited to keep costs low given today’s very challenging environment for web publishers.

As for how the Axion pricing is stacking up, for an Intel Xeon C4 48 vCPU instance with 180GB of RAM, the CPU/memory pricing is $1,731.59 per month or about $2.37 per hour. The T2A Ampere Altra 48 vCPU size with 192GB of memory is $1,349.04 per month or about $1.85 hourly. And then the new C4A instance type with 48 vCPUs and 192GB of memory is $1,573.30 per month or about $2.16 hourly. So Axion costs about 17% more than the aging Ampere Altra instances but about 9% less than the C4 Intel type. Pricing is based on the US-Central1 data as of writing. Within the benchmark results shown in this article are also performance-per-dollar metrics.
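As a rough sketch of the arithmetic behind those figures, the snippet below converts the quoted monthly prices to hourly rates and shows one plausible way a performance-per-dollar value could be derived. The 730-hour month is my assumption (it reproduces the hourly numbers quoted above), and the function names and the score-divided-by-hourly-cost formula are illustrative rather than the exact method used for the charts in this article.

```python
# Assumption: a 730-hour month, which reproduces the hourly rates quoted above.
HOURS_PER_MONTH = 730

# Monthly CPU/memory prices quoted in the article (US-Central1, 48 vCPUs).
MONTHLY_USD = {
    "C4 Xeon (180GB)": 1731.59,
    "T2A Ampere Altra (192GB)": 1349.04,
    "C4A Axion (192GB)": 1573.30,
}


def hourly(monthly_usd):
    """Convert a monthly price to an hourly rate."""
    return monthly_usd / HOURS_PER_MONTH


def perf_per_dollar(score, monthly_usd):
    """Illustrative metric: a higher-is-better score divided by hourly cost."""
    return score / hourly(monthly_usd)


for name, monthly in MONTHLY_USD.items():
    print(f"{name}: ${hourly(monthly):.2f}/hour")
```

With this formula, an instance needs a higher benchmark score in proportion to its hourly price to reach the same performance-per-dollar as a cheaper one.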

Google Axion C4A 48 vCPU vs. C4 Xeon vs. T2A Ampere Altra

Because the Google Axion instances (and the other GCE VMs tested) do not expose any CPU power metrics that can be systematically queried, there isn’t any CPU power consumption / performance-per-Watt data to share in this article. Thus the focus of my launch-day testing for the Google C4A Axion instance family is on raw performance and performance-per-dollar against the C4 Intel and T2A Ampere Altra instances in Google Cloud. For those curious how Google Axion compares to AWS Graviton4, that will be coming up in a separate article on Phoronix in the next few days.
