TL;DR: AWS Graviton4 (R8g, C8g, M8g) delivers 15-30% better throughput at 20% lower hourly cost than equivalent x86 instances on EKS. Combined with Karpenter right-sizing and Spot, total cluster spend drops 30-40%. The catch: x86-only dependencies, Windows workloads, and GPU jobs still need x86. We tested across Node.js, Java, and Python ML workloads on production EKS clusters — full benchmarks below.

By Nitin Yadav, SquareOps · Published May 9, 2026 · Last updated May 9, 2026

Graviton4 promises 40% better price-performance over x86. We tested it on production EKS clusters across Node.js APIs, Java Spring Boot services, Python ML inference, and Nginx proxies. Here's what we found, with real numbers, not marketing claims. Per AWS, Graviton4's Neoverse V2 cores deliver 30% better compute than Graviton3, and for most container workloads Graviton4 now clearly beats x86 on price-performance. SquareOps manages 50+ EKS clusters, and in our experience the migration is a low-friction cost lever: 20-40% savings on compute, more once Karpenter and Savings Plans layer in. Need help executing the migration end-to-end? Our managed Kubernetes services handle multi-arch builds, Karpenter NodePool config, and canary rollouts.

Cut your EKS spend by 30%+ this quarter.
Free EKS cost audit by SquareOps.
Request your audit

What's new in AWS Graviton4 (R8g, C8g, M8g)

Graviton4 is the fourth generation of Amazon's ARM Neoverse-based silicon, now generally available across the EC2 8g family. The headline numbers from AWS's launch blog: 30% better compute, 75% more memory bandwidth (DDR5), and ~60% better energy efficiency than Graviton3.

  • R8g — memory-optimised. Right pick for cache layers, JVM heaps, in-memory analytics. Available up to 192 vCPU.
  • C8g — compute-optimised. Best for stateless API tiers, web servers, batch processors.
  • M8g — general-purpose. The default for most EKS workloads when you're not sure.

For broader EC2 instance selection guidance, see our EC2 instance selection guide. The R8g, C8g, and M8g instances are priced 5-10% above their Graviton3 predecessors but deliver ~30% more throughput; in $/req terms, Graviton4 wins on every workload we benchmarked.

Inside Neoverse V2: where the performance jump comes from

Graviton4's gains aren't marketing. The Neoverse V2 cores ship concrete architectural changes:

  • 8-wide instruction issue (vs 5-wide on V1). More work per cycle for branchy workloads like JIT-compiled JVM and V8.
  • SVE2 vector extensions (256-bit). Auto-vectorisation in modern compilers (GCC 13+, LLVM 16+) translates to measurable wins for numeric Python (numpy, ONNX) and Java HotSpot.
  • Improved branch predictor with TAGE-SC-L. The headline win for Java workloads — branch mispredict penalties dropped ~30% versus Graviton3.
  • DDR5-5600 with ~75% more memory bandwidth. Memory-bound workloads (Redis, in-memory caches, large JVM heaps) see the biggest deltas.
  • Up to 192 vCPU per instance (r8g.48xlarge). Big-iron memory-heavy workloads in a single node, 1.5TB DDR5.
  • ~60% better energy efficiency than Graviton3. Watts/req drops materially — relevant for ESG reporting and dense-rack deployments.

For practical EKS workload sizing: most container teams should default to the 8g families: c8g (compute), m8g (general-purpose), or r8g (memory). Stick with c7i or m7i only when you have a specific x86-only dependency. The default pick has shifted in 2026.

Test methodology — what we actually measured

So you can replicate the numbers below: each workload ran in a dedicated EKS node group, single tenant per node, with HPA disabled and resource requests/limits pinned to cpu: 4, memory: 8Gi. Load was driven from a separate three-node Locust cluster in the same VPC; we discarded the first 5 minutes of each run as warmup. p50/p99 captured from a 25-minute steady-state window via a Prometheus + Grafana stack on the cluster. Cost figures use us-east-1 on-demand list pricing as of April 2026.

Graviton4 vs x86 benchmarks: our EKS test results

We ran the same four workload classes on c7i.2xlarge (x86 Intel Sapphire Rapids) and c8g.2xlarge (Graviton4) under identical EKS configurations. Test harness: 50 RPS sustained, p50/p99 latency captured over 30-minute steady state, cost computed from us-east-1 on-demand pricing.
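The cost column in the table below follows directly from hourly instance price and sustained throughput. A minimal sketch of the arithmetic, using illustrative prices and request rates rather than the exact figures from our runs:

```python
def cost_per_million_requests(hourly_price_usd: float, sustained_rps: float) -> float:
    """Cost to serve one million requests at a given on-demand hourly price
    and sustained requests-per-second per instance."""
    requests_per_hour = sustained_rps * 3600
    return hourly_price_usd / requests_per_hour * 1_000_000

# Illustrative prices only -- check current us-east-1 on-demand rates.
x86_cost = cost_per_million_requests(0.357, 250)  # hypothetical c7i.2xlarge-class figures
arm_cost = cost_per_million_requests(0.285, 300)  # hypothetical c8g.2xlarge-class figures
print(f"x86: ${x86_cost:.2f}/1M req, Graviton4: ${arm_cost:.2f}/1M req")
```

The key point: a cheaper instance that also sustains higher RPS compounds on both sides of the division, which is why the $/req deltas in the table are larger than the 20% hourly price gap alone.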

| Workload | x86 p99 (ms) | Graviton4 p99 (ms) | Throughput Δ | Cost / 1M req |
|---|---|---|---|---|
| Node.js API (Express) | 42 | 34 | +22% | $0.48 → $0.31 |
| Java Spring Boot (JIT-warm) | 68 | 54 | +18% | $0.71 → $0.49 |
| Python ML inference (ONNX) | 112 | 96 | +15% | $1.18 → $0.83 |
| Nginx reverse proxy | 8 | 6 | +27% | $0.09 → $0.06 |

EKS workload benchmarks: c7i.2xlarge (x86) vs c8g.2xlarge (Graviton4)

Takeaway: Graviton4 EKS performance is 15-30% better at 20% lower hourly cost. The Java result is the most surprising: JIT compilers are typically slower to optimise on ARM, but Neoverse V2's improved branch predictor closes the gap. For stateless workloads, ARM vs x86 on AWS is no longer a debate.

Cost savings breakdown: what you actually save

Per-hour savings are only the start. The compounding lever is what matters:

  • Graviton hourly: ~20% cheaper than equivalent x86.
  • Karpenter right-sizing: another 15-25% by binpacking into the smallest viable node.
  • Spot for stateless: 50-70% off Spot-eligible workloads.
  • Savings Plans on the steady-state baseline: another 20-30%.
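These levers compound multiplicatively, not additively. A quick sketch of the arithmetic, using illustrative discount fractions drawn from the ranges above (the worked example below shows a larger Graviton-only saving, 31%, because the throughput gain also shrinks the node count):

```python
def apply_savings(baseline: float, *discounts: float) -> float:
    """Apply a sequence of fractional discounts multiplicatively."""
    cost = baseline
    for d in discounts:
        cost *= (1.0 - d)
    return cost

monthly = 8400.0  # x86 on-demand baseline from the worked example below
print(f"Graviton hourly only:     ${apply_savings(monthly, 0.20):,.0f}")
print(f"+ Karpenter right-sizing: ${apply_savings(monthly, 0.20, 0.20):,.0f}")
# 0.30 here assumes a ~60% Spot discount applied to roughly half the fleet.
print(f"+ partial Spot coverage:  ${apply_savings(monthly, 0.20, 0.20, 0.30):,.0f}")
```

Three 20-30% levers stacked multiplicatively land in the 50-60% range, which is why the combined row in the worked example below beats any single lever by a wide margin.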

Worked example for a 20-node EKS cluster running mixed APIs:

| Configuration | Monthly Cost | Savings vs Baseline |
|---|---|---|
| x86 (c7i.2xlarge), on-demand | $8,400 | (baseline) |
| Graviton4 (c8g.2xlarge), on-demand | $5,800 | 31% |
| Graviton4 + Karpenter + Spot mix | $3,900 | 54% |

For deeper cost mechanics see our Kubernetes cost optimization service page and broader cloud cost management playbook. Use SpendZero to scan your AWS account and identify x86 instances that can move to Graviton4 — free 5-minute scan, no commitment.

How do you migrate your EKS cluster to Graviton4?

Here is the Graviton migration guide in five steps. Most clusters can pilot in a single afternoon and fully migrate over two to three weeks.

  1. Build multi-architecture container images. Use Docker buildx: docker buildx build --platform linux/amd64,linux/arm64 -t myorg/api:v1 --push . Most popular base images (Alpine, Ubuntu, Node, Python, Go, OpenJDK) ship ARM64 variants. Multi-arch images are non-negotiable for a clean rollover.
  2. Configure Karpenter NodePool for Graviton. Add kubernetes.io/arch: arm64 to your NodePool requirements; Karpenter will auto-select the cheapest c8g/m8g/r8g instance for each pod's resource shape. EKS Graviton Karpenter together is the right pattern.
  3. Canary on a mixed-arch cluster. Add a Graviton4 NodePool alongside your x86 one. Schedule a single non-critical service to it via nodeSelector. Watch metrics for 24-48 hours.
  4. Validate with realistic load. Re-run your usual load test. Confirm latency, throughput, and any dependency breakage (some old C extensions still ship x86-only). For more on auto-scaling node patterns see our EKS Auto Mode vs manual deep-dive.
  5. Full rollover. Once you're confident, drain x86 nodes one by one and let Karpenter replace with Graviton4. Total downtime: zero, if you've done canary right.

Karpenter NodePool example for Graviton4

Drop this NodePool into your cluster to start scheduling onto Graviton4 alongside x86. Karpenter picks the cheapest valid instance based on each pod's resource shape:

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: graviton4
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["c8g", "m8g", "r8g"]
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values: ["nano", "micro", "small"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      taints:
        - key: arm64
          effect: NoSchedule
  limits:
    cpu: "1000"
    memory: 4000Gi
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s

The taint forces explicit opt-in: only pods with a matching toleration land on these nodes. That's deliberate: you want a controlled rollout, not surprise scheduling onto ARM during the canary phase. Once you're confident, drop the taint and let Karpenter freely mix x86 and ARM. Tag matching pods like this:

spec:
  tolerations:
    - key: arm64
      operator: Exists
      effect: NoSchedule
  nodeSelector:
    kubernetes.io/arch: arm64
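For context, here is where that toleration and nodeSelector sit inside a full Deployment manifest. This is an illustrative sketch; the names (checkout-api, myorg/api) are placeholders, not services from our clusters:

```yaml
# Illustrative Deployment opting a single service onto the tainted Graviton4 pool.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout-api
  template:
    metadata:
      labels:
        app: checkout-api
    spec:
      tolerations:
        - key: arm64            # matches the taint on the graviton4 NodePool
          operator: Exists
          effect: NoSchedule
      nodeSelector:
        kubernetes.io/arch: arm64
      containers:
        - name: api
          image: myorg/api:v1   # must be a multi-arch (amd64 + arm64) tag
```

The toleration alone only permits ARM scheduling; the nodeSelector is what forces it, which is exactly the behaviour you want during a canary.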

Multi-arch CI build with docker buildx

Most teams stumble on multi-arch images when their CI runner can't emulate ARM64. Minimal GitHub Actions workflow that handles it cleanly:

name: build-multiarch
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ${{ secrets.ECR_REGISTRY }}
          username: ${{ secrets.ECR_USER }}
          password: ${{ secrets.ECR_PASSWORD }}
      - uses: docker/build-push-action@v6
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: myorg/api:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

Build time on a typical Node.js app with QEMU emulation: ~3-5 minutes for arm64 (vs ~1 minute native). For larger apps or compiled languages (Go, Rust, Java), use native ARM runners — ubuntu-24.04-arm on GitHub-hosted, or self-hosted Graviton runners for ~10x faster builds. The cost of native ARM CI runners has crossed below x86 in 2026, so there's no longer a reason to suffer QEMU.

Skip the migration headache. SquareOps managed Kubernetes services handle multi-arch builds, Karpenter setup, and canary rollouts end-to-end across 50+ production clusters. Most clients see 30-40% EKS savings within 60 days.

What are the most common Graviton4 migration pitfalls?

Migrations rarely fail because Graviton is broken; they fail because some piece of the stack assumes x86. Here are the four pitfalls we see most often in our managed K8s engagements:

  • Native dependencies in NPM/PyPI packages. Packages like node-sass, node-canvas, tensorflow, or grpc-tools ship pre-built x86 binaries. On ARM, npm/pip falls back to compiling from source — slower builds, occasional alpine-glibc failures. Mitigation: switch node-sass to sass (Dart-native), use tensorflow-cpu-aws ARM-optimised wheels, and prefer debian-slim over alpine base images for any Python ML container. A 30-minute audit of your Dockerfile RUN lines surfaces 90% of these.
  • JIT warmup spikes on Java. JVM workloads hit ARM-tuned HotSpot defaults differently than x86. Typical symptom: p99 latency spike for the first 60-90 seconds after pod start. Mitigation: extend readiness-probe grace periods (initialDelaySeconds: 90), disable PGO until you've characterised the new profile, or use ahead-of-time compilation via GraalVM native-image for cold-start sensitive paths. We've also seen teams pre-warm with synthetic traffic via init containers — overkill for most, but effective.
  • DaemonSets and sidecars without ARM images. Datadog agent, Fluent Bit, Splunk Connect, OpenTelemetry collector — all support ARM64 today, but you have to explicitly pin a multi-arch tag (e.g., :7.55, not :latest-amd64). Audit your DaemonSets first; one missing ARM image will block scheduling on Graviton nodes. Useful one-liner to check: kubectl get ds -A -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.template.spec.containers[*].image}{"\n"}{end}' piped through docker manifest inspect on each tag.
  • Performance regression on non-vectorised C extensions. Some legacy C extensions (older Postgres clients, certain protobuf builds, AES-NI-tied crypto libs) don't take advantage of SVE2 and can run 5-10% slower on ARM. Mitigation: load-test every production-critical service before flipping the default. Most modern libraries have ARM-optimised paths — the risk is concentrated in code that hasn't been touched in 2-3 years.

Each of these takes 1-3 days to debug if you hit it blind. A pre-migration audit catches all four in an afternoon — which is why the SquareOps managed-K8s onboarding always front-loads it. The audit itself is mostly mechanical: scan every Dockerfile, every DaemonSet, every FROM line, every npm install/pip install for known x86-only packages.
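That Dockerfile scan is easy to script. A minimal sketch, flagging RUN lines that pull known-troublesome packages and FROM lines pinned to amd64; the package list here is a starting point we made up for illustration, not an exhaustive registry, so extend it for your stack:

```python
# Packages that commonly ship x86-only binaries or fall back to slow
# source compilation on ARM. Illustrative starting point only.
X86_SUSPECTS = ["node-sass", "node-canvas", "grpc-tools"]

def audit_dockerfile(text: str) -> list[str]:
    """Return FROM lines pinned to amd64 and RUN lines that install
    a suspect package."""
    hits = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("FROM") and "amd64" in stripped:
            hits.append(stripped)
        elif stripped.startswith("RUN") and any(p in stripped for p in X86_SUSPECTS):
            hits.append(stripped)
    return hits

dockerfile = """\
FROM --platform=linux/amd64 node:20-alpine
RUN npm install node-sass express
RUN npm install left-pad
"""
for hit in audit_dockerfile(dockerfile):
    print("suspect:", hit)
```

Run it over every Dockerfile in the monorepo and you have the mechanical half of the pre-migration audit in a few minutes; the remaining half (DaemonSet image manifests) is the kubectl one-liner above.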

When should you NOT use Graviton?

Honest answer: not every workload is a Graviton fit.

  • x86-only dependencies. Oracle DB, some commercial .NET libraries, legacy proprietary binaries. Check vendor support before migrating.
  • Windows workloads. No ARM64 Windows on EC2 yet.
  • GPU workloads. Stay on G5/G6 (x86) for accelerated inference and training; Graviton has no integrated GPU SKUs at present.
  • Niche compiled-from-source code paths. Anything with hand-written x86 SIMD intrinsics will need rework.

For everything else, and that's the vast majority of containerized workloads in 2026, Graviton's ~60% better energy efficiency is a sustainability/ESG win on top of the cost story.

Ready to cut your EKS costs by 30%+?

Graviton4 is the easiest cost-optimisation lever in AWS today — bigger savings than reserved instances, lower risk than re-architecting. SquareOps offers managed Kubernetes services with a free EKS cost audit that includes a Graviton-readiness assessment, Karpenter NodePool design, and migration roadmap. Run a 5-minute SpendZero scan first to see the size of the prize. Talk to our team.