Running a CUDA Docker image on an AWS Ubuntu instance lets you run GPU-accelerated workloads directly inside Docker containers. This guide walks through installing the NVIDIA driver, Docker, and the NVIDIA Container Toolkit, then running a CUDA-enabled Docker container on an AWS EC2 Ubuntu LTS instance.

Prerequisites

  • An AWS EC2 instance with an NVIDIA GPU (e.g., a g4dn or g5 instance type), running Ubuntu LTS (e.g., 22.04 or 24.04).
  • SSH access to the instance.
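
Before installing anything, it can help to confirm that the instance matches these prerequisites. The following is a minimal sketch rather than part of the required setup: the function name ubuntu_release is hypothetical, and it assumes the standard /etc/os-release file is present.

```shell
# Print the Ubuntu release (e.g. 22.04) recorded in an os-release file.
# Prints nothing and returns non-zero if the system is not Ubuntu.
#   $1: path to the os-release file (defaults to the standard location)
ubuntu_release() {
    os_release_file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak into this shell.
    ( . "$os_release_file" && [ "$ID" = "ubuntu" ] && printf '%s\n' "$VERSION_ID" )
}
```

After defining the function, run it and then list the PCI devices to confirm an NVIDIA GPU is attached (this assumes lspci from the pciutils package is installed):

   ubuntu_release
   lspci | grep -i nvidia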

Step 1: Install the NVIDIA Driver

To enable GPU access, first install the NVIDIA driver on the host instance.

  1. Update the Package List:
   sudo apt-get update
  2. Install the NVIDIA Driver:
    Install the nvidia-driver-535-server package (substitute a different driver version if your GPU or CUDA version requires it).
   sudo apt-get install -y nvidia-driver-535-server

For more detailed instructions, refer to the official NVIDIA driver installation guide for AWS Ubuntu.

  3. Reboot the System:
    After installation, reboot the instance to load the new driver. Your SSH session will disconnect; reconnect once the instance is back up.
   sudo reboot
  4. Verify the Driver Installation:
    After reconnecting, check that the NVIDIA driver is installed correctly:
   nvidia-smi

You should see information about your GPU if everything is set up correctly.
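
If you would rather script this check than read the full nvidia-smi table, the sketch below prints one line per GPU. The function name check_driver is hypothetical; the query fields used (name, driver_version, memory.total) are standard nvidia-smi query properties.

```shell
# Report each GPU's name, driver version, and total memory, one GPU per line.
# Returns 1 with a hint on stderr if nvidia-smi is not on the PATH.
check_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver may not be installed, or the instance has not been rebooted" >&2
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader
}
```

Calling check_driver after the reboot should print a comma-separated line for each GPU on the instance. A non-zero exit status means the driver step needs another look before you continue.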

Step 2: Install Docker

Next, install Docker to create and manage containers on your instance.

  1. Remove Any Existing Docker Packages:
   for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove -y "$pkg"; done
  2. Add Docker’s Official GPG Key:
   sudo install -m 0755 -d /etc/apt/keyrings
   sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
   sudo chmod a+r /etc/apt/keyrings/docker.asc
  3. Add the Docker Repository:
   echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
   sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
  4. Update the Package List Again:
   sudo apt-get update
  5. Install Docker Engine:
   sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  6. Verify Docker Installation:
    To confirm Docker is installed and running, execute:
   sudo docker run hello-world

If you see a “Hello from Docker!” message, Docker is installed correctly.

For further details, see the Docker installation guide for Ubuntu.
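
The repository entry in the "Add the Docker Repository" step is assembled from two command substitutions, which can make it hard to see what ends up in docker.list. The sketch below builds the same line from explicit arguments; docker_repo_line is a hypothetical helper name used only for illustration.

```shell
# Build the apt source entry for Docker's repository.
#   $1: CPU architecture, as reported by `dpkg --print-architecture`
#   $2: Ubuntu codename, as reported by `lsb_release -cs`
docker_repo_line() {
    printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu %s stable\n' "$1" "$2"
}

# Example with literal values for Ubuntu 22.04 (jammy) on x86_64:
docker_repo_line amd64 jammy
# prints: deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu jammy stable
```

If apt-get update reports an error for the Docker repository, comparing /etc/apt/sources.list.d/docker.list against this expected form is a quick way to spot an empty architecture or codename.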

Step 3: Install the NVIDIA Container Toolkit

The NVIDIA Container Toolkit is required to enable GPU access within Docker containers.

  1. Add the NVIDIA GPG Key and Repository:
   curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
   curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
   sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
   sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
  2. Update the Package List:
   sudo apt-get update
  3. Install the NVIDIA Container Toolkit:
   sudo apt-get install -y nvidia-container-toolkit
  4. Configure Docker to Use the NVIDIA Runtime:
    This command registers the NVIDIA runtime in Docker’s daemon configuration:
   sudo nvidia-ctk runtime configure --runtime=docker
  5. Restart Docker to Apply the Changes:
   sudo systemctl restart docker

Refer to the NVIDIA Container Toolkit installation guide for more details.
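
To confirm that the runtime configuration step took effect, you can inspect the Docker daemon configuration. The sketch below assumes the default location /etc/docker/daemon.json and does a plain text match rather than parsing the JSON; has_nvidia_runtime is a hypothetical helper name.

```shell
# Succeed if the Docker daemon config mentions the NVIDIA runtime binary.
# This is a plain text match, not a JSON parse.
#   $1: path to the daemon config (defaults to Docker's standard location)
has_nvidia_runtime() {
    config_file="${1:-/etc/docker/daemon.json}"
    [ -r "$config_file" ] && grep -q 'nvidia-container-runtime' "$config_file"
}
```

For example, after restarting Docker:

   has_nvidia_runtime && echo "NVIDIA runtime is configured"
   sudo docker info | grep -i runtimes

The second command asks the running daemon directly; its Runtimes line should include nvidia.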

Step 4: Run a CUDA Docker Image

With Docker and the NVIDIA Container Toolkit configured, you can now run a Docker container with CUDA support.

  1. Run a CUDA-Enabled Docker Container:
    Use the following command to run a container with access to all available GPUs:
   sudo docker run --gpus all -it nvidia/cuda:12.4.1-runtime-ubuntu22.04 /bin/bash

This command pulls the nvidia/cuda:12.4.1-runtime-ubuntu22.04 image (you can adjust the CUDA version if needed) and starts an interactive shell session within the container.

  2. Verify GPU Access Inside the Container:
    Once inside the container, verify GPU access by running:
   nvidia-smi

If GPU information appears, you have successfully set up a CUDA-enabled Docker container on your AWS Ubuntu instance.
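
For scripted setups, the same check can run non-interactively in a throwaway container. The sketch below is one possible wrapper, not the only way to do it: gpu_smoke_test and the DOCKER_CMD variable are hypothetical names, and the default image is the one used in this guide.

```shell
# Command used to invoke Docker; set DOCKER_CMD=docker if sudo is not needed.
DOCKER_CMD="${DOCKER_CMD:-sudo docker}"

# Run nvidia-smi in a container that is removed on exit, and return its status.
#   $1: image to test (defaults to the image used in this guide)
gpu_smoke_test() {
    image="${1:-nvidia/cuda:12.4.1-runtime-ubuntu22.04}"
    # $DOCKER_CMD is intentionally unquoted so "sudo docker" splits into two words.
    $DOCKER_CMD run --rm --gpus all "$image" nvidia-smi
}
```

Calling gpu_smoke_test with no arguments tests the default image; pass another image tag as the first argument to test a different CUDA version. The exit status is zero only if the container started and nvidia-smi succeeded inside it.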

Summary

In this guide, we covered the following steps to set up and run a CUDA Docker image on an AWS EC2 Ubuntu LTS instance:

  1. Installed the NVIDIA driver.
  2. Installed Docker.
  3. Installed the NVIDIA Container Toolkit and configured Docker to use the NVIDIA runtime.
  4. Ran a CUDA-enabled Docker container with GPU support.

With these steps complete, Docker containers on your AWS EC2 instance can use the GPU. This setup suits machine learning, deep learning, and other GPU-intensive workloads in a containerized environment.
