Deploying AI Apps with GPUs on AWS EKS and Karpenter

As AI and machine learning workloads continue to grow in complexity and size, the need for efficient and scalable infrastructure becomes more important than ever. In this tutorial, I will show you how to deploy AI applications on AWS Elastic Kubernetes Service (EKS) with Karpenter from scratch, leveraging GPU resources for high-performance computing. We'll use Qovery, an Internal Developer Platform that simplifies the deployment and management of applications, ensuring developers can focus on building their applications rather than managing infrastructure.
September 26, 2025
Romaric Philogène
CEO & Co-founder

Why Use AWS EKS with Karpenter

AWS EKS provides a managed Kubernetes service that simplifies running Kubernetes without needing to install, operate, and maintain your own cluster control plane. Combined with Karpenter, an open-source, high-performance Kubernetes cluster autoscaler, you get a flexible and cost-effective solution that can efficiently manage the provisioning and scaling of nodes based on the application's requirements.

Karpenter specifically helps handle variable workloads by provisioning the right resources at the right time, which is ideal for AI applications with sporadic or compute-intensive tasks requiring GPU capabilities. (read this article I wrote to learn more)

Install AWS EKS and Karpenter with Qovery

To begin, you'll need to set up AWS EKS and Karpenter. Qovery integrates seamlessly into your AWS environment, allowing you to set up EKS with Karpenter with just a few clicks:

  1. Create a Qovery account: connect to the Qovery web console.
  2. Create AWS EKS: Add your AWS EKS cluster, choose the region, and configure your cluster specifications.
  3. Enable Karpenter: With the cluster ready, install Karpenter directly from the cluster advanced settings. Qovery automates the integration process, ensuring Karpenter aligns with your EKS settings for optimal performance.

Figure: Enable Karpenter for an AWS EKS cluster managed by Qovery

Install NVIDIA device plugin on AWS EKS

The NVIDIA device plugin for Kubernetes is an implementation of the Kubernetes device plugin framework that advertises GPUs as available resources to the kubelet.

The plugin is required: without it, the cluster has no nvidia.com/gpu resource for pods to request. To install it, we will use the official NVIDIA Helm chart.

Helm Repository: https://nvidia.github.io/k8s-device-plugin
Helm Chart: nvidia-device-plugin
Chart Version: 0.15.0

With Qovery, navigate to Organization Settings > Helm Repositories and click "Add repository".

Figure: Add your NVIDIA Helm repository (1/2)

Then register the NVIDIA repository "https://nvidia.github.io/k8s-device-plugin".

Figure: Add your NVIDIA Helm repository (2/2)

Then, I recommend creating a "Tooling" project with an "NVIDIA" environment. ⚠️ Make sure to select your EKS cluster with Karpenter enabled.

Figure: Create your NVIDIA environment on your AWS EKS with Karpenter cluster

Then create a Helm service named "nvidia device plugin" that uses the repository, chart, and version listed above.

Now, you can deploy the "nvidia device plugin" service to install it on your EKS cluster.
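
Once the plugin is running, it can be useful to check that GPUs are actually schedulable. Here is a minimal smoke-test Pod, offered as a sketch rather than an official step. The Pod name and the CUDA image tag are my assumptions. The node selector is the one used later in this tutorial, and nvidia.com/gpu is the resource name the NVIDIA device plugin advertises.

```yaml
# Hypothetical smoke test, not part of the Qovery setup itself.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test            # hypothetical name
spec:
  restartPolicy: Never            # run once and exit
  nodeSelector:
    karpenter.sh/nodepool: gpu    # same selector used for GPU apps in this tutorial
  containers:
    - name: nvidia-smi
      # Assumed image tag; any CUDA base image should work.
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]     # prints the GPUs visible to the container
      resources:
        limits:
          nvidia.com/gpu: 1       # resource advertised by the NVIDIA device plugin
```

If the Pod stays Pending, the usual suspects are a device plugin that is not running on the GPU node or a node taint the Pod does not tolerate. If your GPU nodes are tainted, add the matching tolerations.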

Deploy an App Using a GPU

Deploying an AI application that uses a GPU can be streamlined using Qovery's Helm chart capabilities:

  1. Prepare your application with a Dockerfile and Helm chart: Make sure your application is containerized and ready for deployment.
  2. Push your code to a Git repository connected to Qovery.
  3. Use Qovery to deploy your application: Through the Qovery dashboard, set up your application deployment using the Helm chart. Its values should include a nodeSelector that schedules the pods onto the GPU node pool:

    nodeSelector:
      karpenter.sh/nodepool: gpu
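
The node selector only decides where the pods land. To get a GPU attached, the container also has to request the nvidia.com/gpu resource. Here is a sketch of what the chart values might look like, assuming a chart scaffolded with helm create (which reads image, nodeSelector, resources, and tolerations). The image repository and tag are hypothetical.

```yaml
# values.yaml (sketch). Key names assume a chart scaffolded with `helm create`;
# adjust them to whatever your own chart templates actually read.
image:
  repository: registry.example.com/my-ai-app   # hypothetical image
  tag: "1.0.0"                                 # hypothetical tag

# Schedule the pods onto nodes provisioned for the GPU node pool.
nodeSelector:
  karpenter.sh/nodepool: gpu

# Request one GPU per pod. GPUs are set under limits and cannot be overcommitted.
resources:
  limits:
    nvidia.com/gpu: 1

# Left empty here; add entries if your GPU nodes carry taints.
tolerations: []
```

When a pod with these values cannot be scheduled, Karpenter sees the pending request and provisions a matching GPU node.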

Bonus: Using Spot Instances

To further optimize costs, use AWS Spot Instances for your GPU workloads. With Qovery, you can enable Spot Instances in the cluster's advanced settings:

  1. Navigate to the cluster advanced settings in Qovery.
  2. Set "aws.karpenter.enable_spot" to "true". Qovery handles the integration seamlessly, providing cost savings while ensuring resource availability for your applications.

Figure: Enable Spot Instances for AWS EKS with Karpenter
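
For context, this is roughly how Karpenter itself expresses capacity type. Qovery manages the Karpenter configuration for you, so you do not need to apply this manifest. I have not confirmed what Qovery generates, so treat it as an illustrative sketch. The NodePool name, the EC2NodeClass name, and the instance categories are assumptions.

```yaml
# Illustrative Karpenter NodePool; not the manifest Qovery generates.
apiVersion: karpenter.sh/v1       # older Karpenter releases use karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: gpu                       # assumed name; nodes get the label karpenter.sh/nodepool=gpu
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default             # hypothetical EC2NodeClass
      requirements:
        # Allow both capacity types. Karpenter prefers Spot when both are allowed.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        # Assumed GPU instance families (G and P series).
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["g", "p"]
```

Keep in mind that Spot capacity can be reclaimed by AWS at short notice, so it fits best with workloads that tolerate interruption, such as batch inference or checkpointed training.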

Conclusion

By combining AWS EKS with Karpenter and utilizing Qovery for deployment automation, you can streamline the deployment and management of AI applications that require GPU resources. This setup enhances performance and optimizes costs, making it an excellent choice for developers seeking to deploy AI applications at scale efficiently.

Begin deploying your AI apps today with Qovery and unlock the full potential of cloud-native technologies.
