Categories
Ubuntu

Fix docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]

To fix the "docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]" error when running Docker with a GPU, you need to install the NVIDIA Container Toolkit:

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
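The `sed` step above pins the NVIDIA signing key by rewriting each `deb` entry in the repository list. A quick demonstration of what that rewrite produces (the sample line is illustrative, mirroring the toolkit's list file):

```shell
# Show the transformation the sed command applies to a repository line
line='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'
echo "$line" | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'
```

Once the toolkit is installed and Docker restarted, running a container with `--gpus all` and `nvidia-smi` should print your GPU instead of the device-driver error.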

Reference:

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-yum-or-dnf

Categories
LLM

Solve Punica installation and "output must be a CUDA tensor" errors

Punica is a very interesting project that shows how to run multiple LoRA models on a single GPU. A few things need to be done to make this project work locally and to avoid issues like:

  • _kernels.rms_norm(o, x, w, eps) RuntimeError: output must be a CUDA tensor
  • /torch/utils/cpp_extension.py", line 2120, in _run_ninja_build
  • raise RuntimeError(message) from e
  • RuntimeError: Error compiling objects for extension
  • error: subprocess-exited-with-error
  • `rich` module not installed, and so on

Here are the steps:

  1. Change the NVCC version; I downgraded to CUDA 12.1.
  2. Install GCC and G++ (version 10)
MAX_GCC_VERSION=10
sudo apt install gcc-$MAX_GCC_VERSION g++-$MAX_GCC_VERSION
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-$MAX_GCC_VERSION $MAX_GCC_VERSION

sudo apt install g++
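`nvcc` only supports a limited range of host compiler versions, so it is worth confirming which major version the `gcc` alternative now points at. A small parsing sketch (the sample banner is illustrative; on your machine, feed it `gcc --version | head -n1` instead):

```shell
# Extract the major version from a GNU-style gcc version banner
banner='gcc (Ubuntu 10.5.0-1ubuntu1) 10.5.0'
major=$(echo "$banner" | sed -E 's/.* ([0-9]+)\..*$/\1/')
echo "$major"
```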

3. Install the right torch version based on your CUDA version

pip install torch==2.5.1+cu121 --index-url https://download.pytorch.org/whl/cu121
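Both the `+cu121` suffix and the index URL encode the CUDA version, following the naming convention of the official PyTorch wheel index. A small sketch of deriving them from a toolkit version:

```shell
# Derive the PyTorch wheel CUDA tag from a toolkit version, e.g. 12.1 -> cu121
CUDA_VERSION="12.1"
CU_TAG="cu$(echo "$CUDA_VERSION" | tr -d '.')"
echo "pip install torch==2.5.1+${CU_TAG} --index-url https://download.pytorch.org/whl/${CU_TAG}"
```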

4. Build from source!

pip install ninja numpy torch

# Clone punica
git clone https://github.com/punica-ai/punica.git
cd punica
git submodule sync
git submodule update --init

# If you encounter problems during compilation, set TORCH_CUDA_ARCH_LIST
# to your GPU's compute capability.
# I'm using an RTX 4090; Ada Lovelace is 8.9. Check your card's value.
export TORCH_CUDA_ARCH_LIST="8.9"

# Build and install punica
pip install -v --no-build-isolation .
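The `TORCH_CUDA_ARCH_LIST` value is your GPU's compute capability. A tiny lookup sketch for a few common cards (illustrative values only; verify yours against NVIDIA's CUDA GPUs compute capability table):

```shell
# Map a few common GPUs to their CUDA compute capability (illustrative)
arch_for() {
  case "$1" in
    RTX3090) echo "8.6" ;;  # Ampere
    RTX4090) echo "8.9" ;;  # Ada Lovelace
    A100)    echo "8.0" ;;  # Ampere
    H100)    echo "9.0" ;;  # Hopper
    *)       echo "unknown" ;;
  esac
}
export TORCH_CUDA_ARCH_LIST="$(arch_for RTX4090)"
echo "$TORCH_CUDA_ARCH_LIST"
```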

Why does building from source work? Because Punica needs to compile its custom CUDA kernel design, SGMV (Segmented Gather Matrix-Vector multiplication).