Categories: Ubuntu

Run GPT OSS 20B on VLLM with RTX 4090

Here is a quick way to run OpenAI GPT OSS 20B on an RTX 4090 GPU:

docker run --name vllm --gpus all -v /YOUR_PATH_TO_MODEL/models--gpt-oss-20b:/model -e VLLM_ATTENTION_BACKEND='TRITON_ATTN_VLLM_V1' \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:gptoss \
    --model /model --served-model-name model

You can download the model with

hf download openai/gpt-oss-20b --local-dir ./
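Once the container is up, a quick way to sanity-check the OpenAI-compatible endpoint is a curl request (the model name is model, matching --served-model-name above):

curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "model", "messages": [{"role": "user", "content": "Hello"}]}'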
Categories: Ubuntu

Set fan speed Nvidia GPU Ubuntu Server Headless

Here are the quick commands to adjust NVIDIA GPU fan speed on a headless Ubuntu server.

Run this and reboot

sudo nvidia-xconfig --allow-empty-initial-configuration --enable-all-gpus --cool-bits=7

Start an X display, then execute your nvidia-settings fan speed commands:

X :1 &
export DISPLAY=:1

Or, the simpler way is to save the script below as fan.sh in your home directory, then make it executable with chmod a+x ~/fan.sh.

The usage is `~/fan.sh 50 50`, which sets the fan speed to 50% on both of 2x RTX 4090 (each GPU exposes two fans).

❯ cat fan.sh       
#!/bin/bash

# Check if both arguments are provided
if [ -z "$1" ] || [ -z "$2" ]; then
    echo "Usage: $0 <fan_speed_gpu0> <fan_speed_gpu1>"
    echo "Please provide fan speed percentages (0-100)."
    exit 1
fi

# Validate input (must be a number between 0 and 100)
if ! [[ "$1" =~ ^[0-9]+$ ]] || [ "$1" -lt 0 ] || [ "$1" -gt 100 ]; then
    echo "Error: Fan speed for GPU0 must be an integer between 0 and 100."
    exit 1
fi

if ! [[ "$2" =~ ^[0-9]+$ ]] || [ "$2" -lt 0 ] || [ "$2" -gt 100 ]; then
    echo "Error: Fan speed for GPU1 must be an integer between 0 and 100."
    exit 1
fi

FAN_SPEED=$1
FAN_SPEED_TWO=$2

# Ensure X server is running
if ! pgrep -x "Xorg" > /dev/null && ! pgrep -x "X" > /dev/null; then
    echo "X server not running, starting a new one..."
    export XDG_SESSION_TYPE=x11
    export DISPLAY=:0
    startx -- $DISPLAY &
    sleep 5
else
    echo "X server is already running."
    export DISPLAY=:0
fi

# Set fan control state and speed for GPU 0
echo "Setting fan speed to $FAN_SPEED% for GPU 0..."
nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=$FAN_SPEED"
nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=$FAN_SPEED"

# Set fan control state and speed for GPU 1
echo "Setting fan speed to $FAN_SPEED_TWO% for GPU 1..."
nvidia-settings -a "[gpu:1]/GPUFanControlState=1"
nvidia-settings -a "[fan:2]/GPUTargetFanSpeed=$FAN_SPEED_TWO"
nvidia-settings -a "[fan:3]/GPUTargetFanSpeed=$FAN_SPEED_TWO"

echo "Fan speed set to $FAN_SPEED% (GPU 0) and $FAN_SPEED_TWO% (GPU 1)."
Categories: Machine Learning

Fix VLLM LMDeploy /usr/bin/ld: cannot find -lcuda: No such file or directory

When running LMDeploy, I got this error:

2025-06-23 10:43:25,185 - lmdeploy - ERROR - base.py:53 - CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmpsne1hded/main.c', '-O3', '-shared', '-fPIC', '-o', '/tmp/tmpsne1hded/__triton_launcher.cpython-38-x86_64-linux-gnu.so', '-lcuda', '-L/home/dev/miniforge3/envs/lmdeploy/lib/python3.8/site-packages/triton/backends/nvidia/lib', '-L/lib/x86_64-linux-gnu', '-I/home/dev/miniforge3/envs/lmdeploy/lib/python3.8/site-packages/triton/backends/nvidia/include', '-I/tmp/tmpsne1hded', '-I/home/dev/miniforge3/envs/lmdeploy/include/python3.8']' returned non-zero exit status 1.
2025-06-23 10:43:25,185 - lmdeploy - ERROR - base.py:54 - <Triton> check failed!
Please ensure that your device is functioning properly with <Triton>.
You can verify your environment by running `python -m lmdeploy.pytorch.check_env.triton_custom_add`.

Running python -m lmdeploy.pytorch.check_env.triton_custom_add then shows the error:

❯ python -m lmdeploy.pytorch.check_env.triton_custom_add
/usr/bin/ld: cannot find -lcuda: No such file or directory
collect2: error: ld returned 1 exit status
Traceback (most recent call last):

To solve it, create a symbolic link to the CUDA stub library (adjust the cuda-12.2 path to match your installed CUDA version):

sudo ln -s /usr/local/cuda-12.2/targets/x86_64-linux/lib/stubs/libcuda.so /usr/lib64/libcuda.so
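If you prefer not to touch system library paths, an alternative sketch is to point gcc's link-time search path at the stub directory for the current shell session (this assumes the usual /usr/local/cuda symlink exists on your machine):

# Session-only alternative: let gcc find libcuda.so in the CUDA stubs dir
export LIBRARY_PATH=/usr/local/cuda/lib64/stubs:$LIBRARY_PATH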

Then the check passes:

❯ python -m lmdeploy.pytorch.check_env.triton_custom_add                                         
Done.
Categories: Devops

Fix Google COS GPU Docker unable to create new device

If you got this error, congratulations, you have found the solution here. It is quite a complicated problem:

nvidia-container-cli: mount error: failed to add device rules: unable to generate new device filter program from existing programs: unable to create new device filters program: load program: invalid argument: last insn is not an exit or jmp processed 0 insns (limit 1000000)

It turns out the solution is to just run this, either in your metadata startup script or inside the Google Container-Optimized OS VM:

sysctl -w net.core.bpf_jit_harden=1 

If you want to make it permanent:

bash -c "echo net.core.bpf_jit_harden=1 > /etc/sysctl.d/91-nvidia-docker.conf"
sysctl --system
systemctl restart docker
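To confirm the fix, a GPU container should now start cleanly instead of hitting the device-filter error; for example (the image tag here is just an illustration, any CUDA-enabled image works):

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi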
Categories: Ubuntu

Disable ZSH autocomplete expansion

To disable ZSH's annoying expansion of incomplete paths, add this line to your ~/.zshrc:

zstyle ':completion:*' completer _complete _complete:-fuzzy _correct _approximate _ignored _expand
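Then reload the shell configuration for the change to take effect:

source ~/.zshrc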
Categories: Ubuntu

Solve Svelte [dev:svelte] [error] No parser could be inferred for file

When running pnpm run dev on a Svelte + Vite + shadcn project, I received this error:

[dev:svelte] [error] No parser could be inferred for file
[dev:svelte] [error] No parser could be inferred for file
[dev:svelte] [error] No parser could be inferred for file

To solve this, create a .prettierrc file and put this in it.
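A minimal .prettierrc sketch that tells Prettier which parser to use for .svelte files, assuming prettier-plugin-svelte is installed as a dev dependency:

{
  "plugins": ["prettier-plugin-svelte"],
  "overrides": [
    {
      "files": "*.svelte",
      "options": { "parser": "svelte" }
    }
  ]
}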

Categories: Ubuntu

Fix [WARNING] Cannot find base config file “./.svelte-kit/tsconfig.json” [tsconfig.json]

When running shadcn + Svelte + Vite, I got this error:

▲ [WARNING] Cannot find base config file "./.svelte-kit/tsconfig.json" [tsconfig.json]

To solve this, edit package.json and add "prepare": "svelte-kit sync", to the scripts section. For example:

"scripts": {
		"dev": "vite dev",
		"build": "vite build",
		"build:registry": "tsx scripts/build-registry.ts",
		"br": "pnpm build:registry",
		"preview": "vite preview",
		"test": "playwright test",
		"prepare": "svelte-kit sync",
		"sync": "svelte-kit sync",
		"check": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json",
		"check:watch": "svelte-kit sync && svelte-check --tsconfig ./tsconfig.json --watch",
		"test:unit": "vitest"
	},
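With that in place, running the prepare script (package managers such as pnpm also trigger it automatically on install) regenerates the missing ./.svelte-kit/tsconfig.json:

pnpm run prepare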
Categories: Ubuntu

Upgrade and Install NVIDIA Driver 565 Ubuntu 24.04

Here are the quick steps to upgrade to the latest driver (which is needed for running Docker NVIDIA NeMo).

1. Uninstall existing NVIDIA libraries

sudo apt purge "nvidia*" "libnvidia*"

2. Install the latest NVIDIA Driver

Add the PPA and check which driver version you wish to install:

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update 
sudo ubuntu-drivers list

Then, to install:

sudo apt install nvidia-driver-565

If you get the error Failed to initialize NVML: Driver/library version mismatch, the solution is to reboot.

If you are using the NVIDIA Container Toolkit, reinstall and reconfigure it after the driver upgrade:

sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
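After a reboot, you can verify the driver on the host and inside a container; once the toolkit is configured, --gpus all injects the driver utilities, so even a plain base image works (ubuntu here is just an example):

nvidia-smi
docker run --rm --gpus all ubuntu nvidia-smi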
Categories: Ubuntu

Remote Desktop Ubuntu 24.04

There is a quick way to do remote desktop for Ubuntu 24.04: enable its desktop sharing and connect using Remmina from the client. Here are the steps:

1. Enable Desktop Sharing on the remote device/laptop/PC

Go to “System” -> “Desktop Sharing” and toggle on both Desktop Sharing and Remote Control. Under Login Details, fill in the RDP username and password.

2. Connect via Client

Open Remmina and click “+”. Choose RDP and enter the remote machine's OS username and password (not the RDP credentials yet). Once you are connected, fill in the RDP Login Details. Yes, we have two username/password pairs here, and you can set them to the same values.

Categories: Ubuntu

Solve multi-GPU not detected Docker-in-Docker Google Cloud

When I tried to run nvidia-smi inside Docker with multiple GPUs, it gave errors. I was using the Docker API Python module to run the container. Checking on the NVIDIA GPUs showed only a single device rather than multiple:

ls /proc/driver/nvidia/gpus

The solution is to ensure that gpus=all (or gpus=2) is initialized properly. Run the container manually first using:

docker run --name caviar --detach --gpus all -it --privileged ghcr.io/ehfd/nvidia-dind:latest

This step shows that all the GPUs are loaded, so the culprit is the Docker API call. The proper way to do it is shown in the sketch below.
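A minimal sketch with the Docker SDK for Python, mirroring the manual command above; device_requests is the part that actually exposes the GPUs (count=-1 requests all of them):

import docker  # Docker SDK for Python (pip install docker)

client = docker.from_env()

# Equivalent of `docker run --gpus all`: GPUs must be requested explicitly
# via device_requests, otherwise they are not exposed to the container.
container = client.containers.run(
    "ghcr.io/ehfd/nvidia-dind:latest",
    name="caviar",
    detach=True,
    privileged=True,
    device_requests=[
        docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])
    ],
)
print(container.name, "started")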