Deploying AI Apps with GPUs on AWS EKS and Karpenter

As AI and machine learning workloads continue to grow in complexity and size, the need for efficient and scalable infrastructure becomes more important than ever. In this tutorial, I will show you how to deploy AI applications on AWS Elastic Kubernetes Service (EKS) with Karpenter from scratch, leveraging GPU resources for high-performance computing. We'll use Qovery, an Internal Developer Platform that simplifies the deployment and management of applications, ensuring developers can focus on building their applications rather than managing infrastructure.
Romaric Philogène
CEO & Co-founder

Why Use AWS EKS with Karpenter

AWS EKS provides a managed Kubernetes service that simplifies running Kubernetes without needing to install, operate, and maintain your own cluster control plane. Combined with Karpenter, an open-source, high-performance Kubernetes cluster autoscaler, you get a flexible and cost-effective solution that can efficiently manage the provisioning and scaling of nodes based on the application's requirements.

Karpenter excels at handling variable workloads by provisioning the right resources at the right time, which is ideal for AI applications with sporadic or compute-intensive tasks that require GPU capabilities. (I cover Karpenter in more depth in a separate article.)
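Under the hood, Karpenter groups nodes into NodePools that declare which instances it may provision. Qovery configures these for you, but for context, a GPU-oriented NodePool looks roughly like the sketch below (the name, instance categories, and taint are illustrative, not Qovery's exact configuration):

```yaml
# Illustrative Karpenter NodePool for GPU workloads (names are examples)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu
spec:
  template:
    spec:
      requirements:
        # Restrict to GPU instance families (g = graphics, p = accelerated compute)
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["g", "p"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      # Taint GPU nodes so only pods that tolerate it land there
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
```

Pods that target this pool (via a nodeSelector and a matching toleration) trigger Karpenter to provision a GPU instance on demand; when the pods terminate, the node is reclaimed.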

Install AWS EKS and Karpenter with Qovery

To begin, you'll need to set up AWS EKS and Karpenter. Qovery integrates seamlessly into your AWS environment, allowing you to set up EKS with Karpenter with just a few clicks:

  1. Create a Qovery account: connect to the Qovery web console.
  2. Create an AWS EKS cluster: add your cluster, choose the region, and configure its specifications.
  3. Enable Karpenter: with the cluster ready, enable Karpenter directly from the cluster's advanced settings. Qovery automates the integration, ensuring Karpenter aligns with your EKS settings for optimal performance.
Enable Karpenter for AWS EKS Cluster managed by Qovery

Install NVIDIA device plugin on AWS EKS

The NVIDIA device plugin for Kubernetes is an implementation of the Kubernetes device plugin framework that advertises GPUs as available resources to the kubelet.

This plugin is required to make GPU resources schedulable by Kubernetes pods. To install it, we will use the official NVIDIA Helm chart.

Helm Repository: https://nvidia.github.io/k8s-device-plugin
Helm Chart: nvidia-device-plugin
Helm Version: 0.15.0

With Qovery, navigate to Organization Settings > Helm Repositories and click "Add repository".

Add your NVIDIA Helm Repository 1/2

Then register the NVIDIA repository "https://nvidia.github.io/k8s-device-plugin"

Add your NVIDIA Helm Repository 2/2

Then, I recommend creating a "Tooling" project with an "NVIDIA" environment. ⚠️ Make sure to select your EKS with Karpenter cluster.

Create your NVIDIA environment on your AWS EKS with Karpenter cluster

Then create a Helm service named "nvidia device plugin".

Finally, deploy the "nvidia device plugin" service to install the plugin on your EKS cluster.
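If you want the plugin's DaemonSet to run only on your GPU nodes, a minimal Helm values override could look like the following sketch (this assumes your GPU nodes carry a karpenter.sh/nodepool: gpu label and an nvidia.com/gpu taint; adjust both to your cluster's actual configuration):

```yaml
# Illustrative values for the nvidia-device-plugin Helm chart
nodeSelector:
  karpenter.sh/nodepool: gpu
tolerations:
  # Allow the plugin to schedule onto tainted GPU nodes
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
```

Once the plugin is running, GPU nodes advertise an nvidia.com/gpu resource that pods can request.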

Deploy an App Using a GPU

Deploying an AI application that uses a GPU can be streamlined using Qovery's Helm chart capabilities:

  1. Prepare your application with a Dockerfile and Helm chart: Make sure your application is containerized and ready for deployment.
  2. Push your code to a Git repository connected to Qovery.
  3. Use Qovery to deploy your application: Through the Qovery dashboard, set up your application deployment using the Helm chart, which should specify the necessary GPU resources via nodeSelector.
```yaml
nodeSelector:
  karpenter.sh/nodepool: gpu
```
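Putting the pieces together, a Deployment for a GPU-backed application could look like this sketch (the app name and image are placeholders; the toleration assumes GPU nodes are tainted with nvidia.com/gpu):

```yaml
# Illustrative Deployment requesting one GPU (name and image are placeholders)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ai-app
  template:
    metadata:
      labels:
        app: ai-app
    spec:
      # Target the GPU NodePool managed by Karpenter
      nodeSelector:
        karpenter.sh/nodepool: gpu
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      containers:
        - name: ai-app
          image: my-registry/ai-app:latest  # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 1  # advertised by the NVIDIA device plugin
```

When this Deployment is created and no GPU node exists, Karpenter provisions one to satisfy the pending pod.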

Bonus: Using Spot Instances

To further optimize costs, use AWS Spot Instances for your GPU workloads. With Qovery, you can enable Spot Instances in the cluster's advanced settings:

  1. Navigate to the cluster advanced settings in Qovery.
  2. Set "aws.karpenter.enable_spot" to "true". Qovery handles the integration seamlessly, providing cost savings while ensuring resource availability for your applications.
Enable spot instances for AWS EKS with Karpenter
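Conceptually, enabling this setting allows Karpenter to consider spot capacity when provisioning nodes, which corresponds to a capacity-type requirement like the following fragment (illustrative, not Qovery's exact generated configuration):

```yaml
# Illustrative NodePool requirement once spot is enabled:
# Karpenter may choose spot capacity, falling back to on-demand
- key: karpenter.sh/capacity-type
  operator: In
  values: ["spot", "on-demand"]
```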

Conclusion

By combining AWS EKS with Karpenter and utilizing Qovery for deployment automation, you can streamline the deployment and management of AI applications that require GPU resources. This setup enhances performance and optimizes costs, making it an excellent choice for developers seeking to deploy AI applications at scale efficiently.

Begin deploying your AI apps today with Qovery and unlock the full potential of cloud-native technologies.


