How do you initially set up a Virtual Server Instance with a GPU in IBM Cloud?

Generative AI opens many opportunities to address business use cases, and the transformer models behind it need intensive computing power, so you often have to run a Generative AI model on a Graphics Processing Unit (GPU). (Helpful blog post: Why GPUs Are Great for AI.) IBM Cloud provides an easy way to instantiate a Virtual Server Instance (VSI) in a Virtual Private Cloud on demand. This blog post is a short guide on how to set up a Virtual Server Instance (VSI) with a GPU on IBM Cloud.

With a Virtual Server Instance (VSI), you only pay per usage and have no upfront cost. It is available in minutes (when a GPU is available ;-) and is ideal for testing or short-term usage.

Structure of this blog post:

  1. Create and configure a Virtual Private Cloud
  2. Create a VSI with a GPU
  3. Access the remote machine with SSH from your local machine
  4. Helpful components to install in the VSI
  5. Enable the Ubuntu OS system to access the GPU
  6. Verify that the GPU is accessible from a Python application

Note: In addition, here is a related blog post by Niklas Heidloff: Deploying a Virtual Server with GPU in the IBM Cloud.

1. Create and configure a Virtual Private Cloud

The blog post How to create a single Virtual Server Instance (VSI) in a Virtual Private Cloud (VPC) Infrastructure on IBM Cloud shows how to create a Virtual Private Cloud and a Virtual Server Instance.

  • Example configuration for a Virtual Private Cloud (a CLI sketch follows after the list):
    • Region: eu-de
    • Location: Frankfurt-3
    • Others: Subnet, Access Control List, Security Group, Floating IP, Public Gateway, Routing Table
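
If you prefer the command line over the console, the same setup can be sketched with the IBM Cloud CLI and its VPC infrastructure plugin. This is a minimal sketch with placeholder names (my-vpc, my-subnet); flags can differ between plugin versions, so check each command's --help.

# Log in and target the Frankfurt region from the example
ibmcloud login --sso
ibmcloud target -r eu-de

# Install the VPC infrastructure plugin (once)
ibmcloud plugin install vpc-infrastructure

# Create the VPC and a subnet in zone eu-de-3 (placeholder names)
ibmcloud is vpc-create my-vpc
ibmcloud is subnet-create my-subnet my-vpc --zone eu-de-3 --ipv4-address-count 256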

2. Create a VSI with a GPU

When you define the configuration for a Virtual Server Instance, select Change the image profile. In the following dialog, mark the GPU option to see the GPU profiles available for the VSI. The image below shows the GPU options available at the time of writing.

The image below shows an example configuration.

  • Example Virtual Server Instance (VSI) configuration (a CLI sketch follows after the list):
    • OS: Ubuntu Linux 22.04 LTS Jammy Jellyfish Minimal Install (amd64)
    • Machine: Intel | x86_64 | 8 vCPU | 64 GiB | 16 Gbps
    • GPU: Tesla V100-PCIE-16GB
    • SSH Keys: 3
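
The VSI itself can also be created from the CLI. A minimal sketch, assuming the gx2-8x64x1v100 profile (8 vCPU, 64 GiB, one Tesla V100 16GB) is offered in your zone; the image ID and SSH key are placeholders you need to look up first.

# Look up the Ubuntu 22.04 image and your SSH key
ibmcloud is images | grep ubuntu-22-04
ibmcloud is keys

# Create the GPU instance (placeholder names and IDs)
ibmcloud is instance-create my-gpu-vsi my-vpc eu-de-3 gx2-8x64x1v100 my-subnet \
  --image YOUR_IMAGE_ID --keys YOUR_SSH_KEY_ID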

3. Access the remote machine with SSH from your local machine

# Floating IP of the VSI and your private key file in ~/.ssh (placeholders)
export FLOATING_IP="YOUR_IP"
export RSA_FILE=YOUR_KEY
export REMOTE_USER=root
# Restrict the key permissions; otherwise, ssh rejects the key
chmod 400 ~/.ssh/${RSA_FILE}
# Start an ssh-agent and add the key
eval $(ssh-agent -s)
ssh-add ~/.ssh/${RSA_FILE}
# Connect to the remote machine
ssh -i ~/.ssh/${RSA_FILE} ${REMOTE_USER}@${FLOATING_IP}

4. Helpful components to install in the VSI

The following packages may be helpful to install on the Ubuntu OS.

apt update -y
apt install -y docker-compose
apt install -y git
apt install -y python3-pip
apt install -y net-tools
apt install -y jq
apt install -y python3-venv
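
A quick check that the tools landed where expected (the versions on your machine will differ):

docker-compose --version
git --version
python3 --version
jq --version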

5. Enable the Ubuntu OS system to access the GPU

For NVIDIA, follow the installation instructions. The image below shows an example of a possible selection for an installation.

The following code is an extract of the commands for the selection in the image above (January 2024):

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-3
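
After the toolkit installation, nvcc is usually not on the PATH yet. The following lines are the post-installation step from the NVIDIA documentation, assuming the default installation location /usr/local/cuda-12.3:

export PATH=/usr/local/cuda-12.3/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
nvcc --version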

  • Driver installation
sudo apt-get install -y cuda-drivers

  • Verify the installation
lspci | grep -i nvidia
lshw | grep nvidia
nvidia-smi

Example output for nvidia-smi:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-PCIE-16GB           Off | 00000000:04:01.0 Off |                    0 |
| N/A   37C    P0              40W / 250W |      0MiB / 16384MiB |      3%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

  • (Optional) Remove all downloaded models from the Hugging Face cache
ls -a ~/.cache/huggingface/hub/
rm -rf ~/.cache/huggingface/hub/

  • (Optional) Change the Hugging Face cache location

Verify your existing model location.

huggingface-cli scan-cache

REPO ID                                REPO TYPE SIZE ON DISK NB FILES LAST_ACCESSED LAST_MODIFIED REFS LOCAL PATH                                                                
-------------------------------------- --------- ------------ -------- ------------- ------------- ---- ------------------------------------------------------------------------- 
TheBloke/CodeLlama-13B-instruct-GPTQ   model             7.3G        3 2 days ago    2 days ago    main /text-to-sql/hfmodels/hub/models--TheBloke--CodeLlama-13B-instruct-GPTQ   
Set the new location via an environment variable:

export HF_HOME=/[your_cache_dir]

or create a symbolic link from the default cache location:

ln -s /[your_cache_dir]/hub ~/.cache/huggingface

  • (Optional) Add the location to your bash configuration file
nano ~/.bashrc
# Add custom model location for Hugging Face
export HF_HOME=/[your_cache_dir]
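
To make the new setting effective in the current shell and verify it, something like this should work:

source ~/.bashrc
echo ${HF_HOME}
huggingface-cli scan-cache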

6. Verify that the GPU is accessible from a Python application

python3 -m pip install torch
cd /
mkdir gpu-test
nano /gpu-test/simple-gpu-test.py

Copy and paste the following code into the simple-gpu-test.py file.

import torch

# Check whether CUDA is available and which GPU PyTorch sees
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Device count:   {torch.cuda.device_count()}")
print(f"Current device: {torch.cuda.current_device()}")
print(f"Device object:  {torch.cuda.device(0)}")
print(f"Device name:    {torch.cuda.get_device_name(0)}")

Execute:

python3 /gpu-test/simple-gpu-test.py
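
If the setup succeeded, the first line prints True and the last line prints the GPU name reported by nvidia-smi, Tesla V100-PCIE-16GB in this example. If the first line prints False, recheck the driver installation from step 5.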

I hope this was useful to you. Let’s see what’s next!

Greetings,

Thomas

#python, #gpu, #ai, #ibmcloud, #vsi, #virtualserver, #virtualprivatecloud, #vpc, #aiapplication
