Running a CUDA Docker image on an AWS Ubuntu instance lets you run GPU-accelerated workloads inside Docker containers. This guide walks through installing the NVIDIA driver, Docker, and the NVIDIA Container Toolkit so that a CUDA-enabled container can run on an AWS EC2 Ubuntu LTS instance.
Prerequisites
- An AWS EC2 instance with an NVIDIA GPU (for example, a G4dn, G5, or P3 instance type), running Ubuntu LTS (e.g., 22.04 or 24.04).
- SSH access to the instance.
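Before installing anything, it can help to confirm that the instance actually exposes an NVIDIA device. The sketch below assumes `lspci` (from the `pciutils` package) is available on the instance; the helper name `has_nvidia_gpu` is made up for this guide.

```shell
# Hypothetical helper: reads `lspci` output on stdin and succeeds
# if any line mentions an NVIDIA device (case-insensitive match).
has_nvidia_gpu() {
  grep -qi 'nvidia'
}
```

A possible way to use it is `lspci | has_nvidia_gpu && echo "NVIDIA GPU found"`. If nothing is printed, the instance type probably has no NVIDIA GPU attached.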
Step 1: Install the NVIDIA Driver
The host needs an NVIDIA driver before any container can access the GPU. Install it as follows.
- Update the Package List:
sudo apt-get update
- Install the NVIDIA Driver:
Install the nvidia-driver-535-server package (you can adjust the driver version if needed).
sudo apt-get install -y nvidia-driver-535-server
For more detailed instructions, refer to the official NVIDIA driver installation guide for AWS Ubuntu.
- Reboot the System:
After installation, reboot the instance so that the new driver is loaded. Your SSH session will drop; reconnect once the instance is back up.
sudo reboot
- Verify the Driver Installation:
After rebooting, check if the NVIDIA driver is installed correctly:
nvidia-smi
You should see information about your GPU if everything is set up correctly.
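If the 535 server driver used above is not the branch you want, you can list what the configured apt repositories offer. The sketch below is one way to pick the highest-numbered server driver package from a list of package names; `latest_server_driver` is a hypothetical helper, and the newest branch is not necessarily the right choice for every GPU, so treat the result as a starting point.

```shell
# Hypothetical helper: reads package names on stdin (one per line) and
# prints the highest-numbered nvidia-driver-<N>-server package.
latest_server_driver() {
  grep -E '^nvidia-driver-[0-9]+-server$' |  # keep only server driver packages
    sort -t- -k3,3n |                        # sort numerically on the version field
    tail -n 1                                # the last line has the highest number
}
```

It could be fed from apt with `apt-cache pkgnames nvidia-driver- | latest_server_driver`.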
Step 2: Install Docker
Next, install Docker to create and manage containers on your instance.
- Remove Any Existing Docker Packages:
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove -y $pkg; done
- Add Docker’s Official GPG Key:
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
- Add the Docker Repository:
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
- Update the Package List Again:
sudo apt-get update
- Install Docker Engine:
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
- Verify Docker Installation:
To confirm Docker is installed and running, execute:
sudo docker run hello-world
If you see a “Hello from Docker!” message, Docker is installed correctly.
For further details, see the Docker installation guide for Ubuntu.
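The repository step relies on two command substitutions, which can make it hard to see what ends up in docker.list. As a rough illustration, the hypothetical helper `docker_repo_line` below builds the same line from an explicit architecture and Ubuntu codename, so you can compare it with the file on your instance.

```shell
# Hypothetical helper: prints the apt source line for a given
# architecture and Ubuntu codename, mirroring the echo command
# used in the repository step.
docker_repo_line() {
  arch="$1"
  codename="$2"
  printf 'deb [arch=%s signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu %s stable\n' "$arch" "$codename"
}
```

For example, `docker_repo_line "$(dpkg --print-architecture)" "$(lsb_release -cs)"` should print the same line that `cat /etc/apt/sources.list.d/docker.list` shows. On an x86_64 Ubuntu 22.04 instance the architecture is amd64 and the codename is jammy.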
Step 3: Install the NVIDIA Container Toolkit
The NVIDIA Container Toolkit is required to enable GPU access within Docker containers.
- Add the NVIDIA GPG Key and Repository:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
- Update the Package List:
sudo apt-get update
- Install the NVIDIA Container Toolkit:
sudo apt-get install -y nvidia-container-toolkit
- Configure Docker to Use the NVIDIA Runtime:
Register the NVIDIA runtime in Docker's daemon configuration:
sudo nvidia-ctk runtime configure --runtime=docker
- Restart Docker to Apply the Changes:
sudo systemctl restart docker
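After the restart, you may want to confirm that the runtime was actually registered. The sketch below assumes that `nvidia-ctk` wrote its changes to the default daemon configuration file, /etc/docker/daemon.json; `has_nvidia_runtime` is a hypothetical helper that does a plain text search rather than real JSON parsing.

```shell
# Hypothetical helper: succeeds if the given Docker daemon config
# mentions an "nvidia" entry. Defaults to /etc/docker/daemon.json,
# which is assumed to be the file nvidia-ctk modified.
has_nvidia_runtime() {
  grep -q '"nvidia"' "${1:-/etc/docker/daemon.json}"
}
```

Running `has_nvidia_runtime && echo "NVIDIA runtime configured"` is one way to use it. The runtime should also be listed on the `Runtimes:` line of `sudo docker info`.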
Refer to the NVIDIA Container Toolkit installation guide for more details.
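The sed expression in the repository step is easy to misread, so here is the same rewrite isolated in a small function. It inserts a signed-by option into every line that starts a `deb https://` entry, which tells apt to verify that repository against the NVIDIA keyring. The helper name `add_signed_by` and the sample URL in the usage note are illustrative only.

```shell
# Hypothetical helper: applies the same sed rewrite as the repository
# step to whatever apt source lines arrive on stdin.
add_signed_by() {
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'
}
```

For example, `echo 'deb https://example.invalid/repo /' | add_signed_by` prints `deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://example.invalid/repo /`.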
Step 4: Run a CUDA Docker Image
With Docker and the NVIDIA Container Toolkit configured, you can now run a CUDA container that has access to the GPU.
- Run a CUDA-Enabled Docker Container:
Use the following command to run a container with access to all available GPUs:
sudo docker run --gpus all -it nvidia/cuda:12.4.1-runtime-ubuntu22.04 /bin/bash
This command pulls the nvidia/cuda:12.4.1-runtime-ubuntu22.04 image (you can adjust the CUDA version if needed) and starts an interactive shell session within the container.
- Verify GPU Access Inside the Container:
Once inside the container, verify GPU access by running:
nvidia-smi
If GPU information appears, you have successfully set up a CUDA-enabled Docker container on your AWS Ubuntu instance.
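For scripted checks, the same verification can be run without an interactive shell. The following is a minimal sketch: `gpu_smoke_test` and the `DOCKER_CMD` variable are assumptions of this example, not part of Docker or the NVIDIA tooling. It starts a throwaway container, runs nvidia-smi in it, and removes the container afterwards.

```shell
# Hypothetical helper: runs nvidia-smi in a throwaway CUDA container.
# DOCKER_CMD is an assumption of this sketch; it defaults to
# "sudo docker" to match the commands in this guide.
gpu_smoke_test() {
  image="${1:-nvidia/cuda:12.4.1-runtime-ubuntu22.04}"
  # Left unquoted on purpose so "sudo docker" splits into two words.
  ${DOCKER_CMD:-sudo docker} run --rm --gpus all "$image" nvidia-smi
}
```

Calling `gpu_smoke_test` uses the image from this guide; pass another tag as the first argument to test a different CUDA version. A non-zero exit status suggests that the container could not reach the GPU.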
Summary
In this guide, we covered the following steps to set up and run a CUDA Docker image on an AWS EC2 Ubuntu LTS instance:
- Installed the NVIDIA driver.
- Installed Docker.
- Installed the NVIDIA Container Toolkit and configured Docker to use the NVIDIA runtime.
- Ran a CUDA-enabled Docker container with GPU support.
With this setup, containers on the EC2 instance can use the GPU, which makes it a good fit for machine learning, deep learning, and other GPU-intensive workloads in a containerized environment.