Linux (Ubuntu 22.04) Setup
Complete GPU server setup including NVIDIA drivers, CUDA toolkit, and Python AI environment.
1 Connect to Your Server via SSH
SSH CONNECTION
# Replace with your credentials from the portal
ssh root@YOUR_SERVER_IP -p 22 -i ~/.ssh/b8n6_key
2 Update System & Install Dependencies
SYSTEM SETUP
apt update && apt upgrade -y
apt install -y build-essential git curl wget htop nvtop \
python3 python3-pip python3-venv software-properties-common
3 Install NVIDIA Drivers & CUDA 12.x
NVIDIA CUDA SETUP
# Add NVIDIA repo
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb && apt update
apt install -y cuda-toolkit-12-6 nvidia-driver-550
nvidia-smi # verify
4 Set Up Python Environment
PYTHON VENV
python3 -m venv ~/ai-env
source ~/ai-env/bin/activate
pip install --upgrade pip torch torchvision \
--index-url https://download.pytorch.org/whl/cu124
python3 -c "import torch; print(torch.cuda.is_available())"
⚡ Add to ~/.bashrc: export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
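To confirm the exported variables actually took effect in a new shell, a small stdlib-only Python check can be handy. This is a sketch, assuming the default toolkit location `/usr/local/cuda`; adjust the substring if your install path differs.

```python
import os

def cuda_env_ok(env=None):
    """Return True if PATH and LD_LIBRARY_PATH both reference a CUDA install."""
    env = os.environ if env is None else env
    in_path = any("cuda" in p for p in env.get("PATH", "").split(":"))
    in_ld = any("cuda" in p for p in env.get("LD_LIBRARY_PATH", "").split(":"))
    return in_path and in_ld

if __name__ == "__main__":
    print("CUDA env configured:", cuda_env_ok())
```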
Windows Server 2022
Setup for Windows Server with WSL2, NVIDIA GPU passthrough, and AI framework support.
1 Connect via RDP or SSH
POWERSHELL
ssh Administrator@YOUR_SERVER_IP
# Or RDP: mstsc /v:YOUR_SERVER_IP
2 Enable WSL2 & Ubuntu
POWERSHELL — AS ADMIN
wsl --install
wsl --set-default-version 2
wsl --install -d Ubuntu-22.04
wsl -l -v # verify
3 Install NVIDIA CUDA for Windows
POWERSHELL
winget install NVIDIA.CUDA
wsl nvidia-smi # GPU accessible from WSL2
4 Install Python & PyTorch
POWERSHELL
winget install Python.Python.3.11
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
python -c "import torch; print(torch.cuda.is_available())"
ComfyUI Setup
Node-based image generation with FLUX 1.1 Pro and Stable Diffusion. Access via browser tunnel.
1 Clone & Install ComfyUI
BASH
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI && pip install -r requirements.txt
# Install ComfyUI Manager
cd custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager.git
2 Download FLUX / SDXL Models
BASH — MODEL DOWNLOAD
cd ~/ComfyUI/models/checkpoints
# FLUX.1-dev is a gated model — accept the license on Hugging Face first and pass your access token
wget --header="Authorization: Bearer YOUR_HF_TOKEN" https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors
cd ~/ComfyUI/models/vae
wget --header="Authorization: Bearer YOUR_HF_TOKEN" https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors
3 Launch & Tunnel
LAUNCH COMFYUI
python3 main.py --listen 0.0.0.0 --port 8188
# SSH tunnel from local machine:
ssh -L 8188:localhost:8188 root@YOUR_SERVER_IP -N
# Open: http://localhost:8188
⚡ Use --lowvram on A100 40GB. H200 runs full-precision FLUX without flags.
Open WebUI + Ollama
ChatGPT-style interface for self-hosted models via Docker. One-command deployment.
1 Install Docker + NVIDIA Container Toolkit
BASH
curl -fsSL https://get.docker.com | sh
systemctl enable --now docker
# The toolkit lives in NVIDIA's own apt repo — add it first
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -sL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
apt update && apt install -y nvidia-container-toolkit
nvidia-ctk runtime configure --runtime=docker
systemctl restart docker
2 Deploy Open WebUI + Ollama (One Command)
DOCKER
docker run -d \
--name open-webui \
--gpus all \
-p 3000:8080 \
-v open-webui:/app/backend/data \
-v ollama:/root/.ollama \
-e OLLAMA_BASE_URL=http://localhost:11434 \
--restart always \
ghcr.io/open-webui/open-webui:ollama
# Access: http://YOUR_SERVER_IP:3000
3 Pull Models via Ollama
BASH
docker exec open-webui ollama pull llama4:scout
docker exec open-webui ollama pull deepseek-r1:70b
docker exec open-webui ollama pull qwen3:32b
⚡ Connect Claude/Gemini/DeepSeek APIs in Settings → Connections for a unified interface.
Ollama Setup
Run LLMs locally on your B8N6 GPU. CLI-based model management with auto GPU acceleration.
BASH
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
BASH — MODEL MANAGEMENT
ollama pull llama4:scout
ollama pull deepseek-r1:70b
ollama pull qwen3:32b
ollama run deepseek-r1:70b
ollama list
API
OLLAMA_HOST=0.0.0.0 ollama serve &
curl http://localhost:11434/api/generate \
-d '{"model":"llama4:scout","prompt":"Hello!"}'
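By default `/api/generate` streams newline-delimited JSON, one token chunk per line, with a final object carrying `"done": true`. A stdlib-only sketch that stitches the stream back into the full completion:

```python
import json

def collect_ollama_stream(lines):
    """Join the 'response' fragments from Ollama's NDJSON stream into one string.

    Each line is a JSON object like {"response": "Hel", "done": false};
    the final object has "done": true.
    """
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)
```

Feed it the response body line by line (e.g. iterating over a `urllib.request.urlopen` response object) to reassemble the completion.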
Anthropic Claude API
Integrate Claude Opus 4.6 and Sonnet 4.6 via the Anthropic Python SDK.
BASH
pip install anthropic
export ANTHROPIC_API_KEY="sk-ant-YOUR_KEY_HERE"
PYTHON
import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
model="claude-opus-4-6",
max_tokens=1024,
messages=[{"role":"user","content":"Hello!"}]
)
print(message.content[0].text)
⚡ Models: claude-opus-4-6 · claude-sonnet-4-6 · claude-haiku-4-5-20251001
Docs: https://docs.anthropic.com
Google Gemini API
Access Gemini 3 Pro and Gemini 2.5 Flash, including 1M-token context and multimodal inputs.
BASH
pip install google-generativeai
export GOOGLE_API_KEY="AIzaSy-YOUR_KEY"
PYTHON
import os
import google.generativeai as genai
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-3.0-pro")
response = model.generate_content("Explain NVLink 4.0")
print(response.text)
⚡ Models: gemini-3.0-pro · gemini-2.5-flash · gemini-2.5-flash-lite
Docs: https://ai.google.dev
DeepSeek API
Use DeepSeek V3.2 and R1 via their OpenAI-compatible API, or self-host via vLLM.
PYTHON — OPENAI COMPATIBLE
from openai import OpenAI
client = OpenAI(
api_key="YOUR_DEEPSEEK_KEY",
base_url="https://api.deepseek.com"
)
response = client.chat.completions.create(
model="deepseek-chat",
messages=[{"role":"user","content":"Hello!"}]
)
print(response.choices[0].message.content)
BASH — VLLM SELF-HOST
pip install vllm
python -m vllm.entrypoints.openai.api_server \
--model deepseek-ai/DeepSeek-V3 \
--tensor-parallel-size 8 \
--host 0.0.0.0 --port 8000 \
--gpu-memory-utilization 0.95
⚡ Models: deepseek-chat (V3.2) · deepseek-reasoner (R1) · Docs: https://api-docs.deepseek.com
OpenAI API
Access GPT-5.2, o4-mini, GPT-4o and GPT-oss open-weight models.
BASH
pip install openai
export OPENAI_API_KEY="sk-YOUR_KEY"
PYTHON
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
model="gpt-5.2",
messages=[{"role":"user","content":"Hello!"}]
)
print(response.choices[0].message.content)
# Reasoning model
response = client.chat.completions.create(
model="o4-mini",
messages=[{"role":"user","content":"Solve step by step..."}],
reasoning_effort="high"
)
⚡ Models: gpt-5.2 · gpt-5 · o4-mini · o3 · gpt-4o · gpt-oss-120b
Docs: https://platform.openai.com/docs
Rocky Linux 9 Setup
Enterprise-grade RHEL-compatible OS. Ideal for GPU servers requiring maximum stability and CUDA support.
1 Initial Server Access & System Update
SSH + UPDATE
ssh root@YOUR_SERVER_IP
dnf update -y
dnf install -y epel-release
dnf groupinstall -y "Development Tools"
2 Install NVIDIA Drivers & CUDA
CUDA ON ROCKY 9
dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
dnf install -y cuda-toolkit-12-6
dnf module install -y nvidia-driver:latest-dkms
reboot
nvidia-smi # verify after reboot
3 Install Python & PyTorch
PYTHON SETUP
dnf install -y python3.11 python3.11-pip
python3.11 -m venv ~/ai-env
source ~/ai-env/bin/activate
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
⚡ Rocky Linux 9 = RHEL9-compatible. Great for enterprise environments requiring SELinux and corporate compliance.
Debian 12 (Bookworm) Setup
Rock-solid Debian 12 with NVIDIA GPU and CUDA support. Preferred by many AI researchers for its stability.
1 System Update & Dependencies
DEBIAN SETUP
apt update && apt upgrade -y
apt install -y build-essential linux-headers-$(uname -r) curl wget git python3 python3-pip python3-venv
2 NVIDIA Drivers via apt
NVIDIA DEBIAN
apt install -y software-properties-common
add-apt-repository -y contrib non-free non-free-firmware
apt update && apt install -y nvidia-driver firmware-misc-nonfree
reboot
nvidia-smi # verify
3 CUDA Toolkit
CUDA ON DEBIAN
wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt update && apt install -y cuda-toolkit-12-6
AlmaLinux 9 Setup
AlmaLinux 9: Community RHEL binary-compatible distro. Excellent for production GPU workloads.
1 Initial Setup
ALMALINUX 9
dnf update -y
dnf install -y epel-release
dnf groupinstall "Development Tools" -y
dnf install -y wget curl git htop
2 CUDA Repository & Drivers
CUDA ALMALINUX
dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
dnf install -y cuda-toolkit-12-6
dnf module install -y nvidia-driver:latest-dkms
reboot && nvidia-smi
⚡ AlmaLinux is 1:1 binary compatible with RHEL9. Ideal choice if you're migrating from CentOS.
Kubernetes (K8s) + GPU
Deploy GPU-accelerated Kubernetes clusters for AI inference workloads. Includes NVIDIA Device Plugin and GPU Operator setup.
1 Install kubeadm, kubelet, kubectl
K8S INSTALL
mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /" | tee /etc/apt/sources.list.d/kubernetes.list
apt update && apt install -y kubelet kubeadm kubectl
kubeadm init --pod-network-cidr=192.168.0.0/16
2 Install NVIDIA GPU Operator
GPU OPERATOR
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gpu-operator
3 Deploy GPU AI Workload
GPU POD
# Minimal GPU test pod (reconstructed heredoc — adjust image/limits to your workload)
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ai-inference
spec:
  restartPolicy: Never
  containers:
  - name: cuda-test
    image: nvidia/cuda:12.6.0-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
kubectl logs ai-inference
Docker Swarm Cluster
Multi-host Docker Swarm for distributed AI service deployment with GPU support.
1 Install Docker & Initialize Swarm
SWARM INIT
curl -fsSL https://get.docker.com | sh
docker swarm init --advertise-addr YOUR_MANAGER_IP
# On worker nodes:
docker swarm join --token SWMTKN-xxx MANAGER_IP:2377
2 Deploy AI Stack
SWARM DEPLOY
docker stack deploy -c docker-compose.yml ai-stack
docker service ls
docker stack ps ai-stack
Nginx Reverse Proxy
Set up Nginx as a reverse proxy for your AI services with SSL termination and load balancing.
1 Install Nginx & Certbot
NGINX + SSL
apt install -y nginx certbot python3-certbot-nginx
certbot --nginx -d your.domain.com
2 Configure Proxy for AI API
NGINX CONFIG
server {
    listen 443 ssl;
    server_name your.domain.com;

    location /api/ {
        proxy_pass http://127.0.0.1:8000/;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 300s;
    }
}
Postfix + Dovecot Mail
Production mail server with Postfix (SMTP) and Dovecot (IMAP/POP3). Supports TLS and modern authentication.
1 Install Postfix & Dovecot
MAIL SERVER
apt install -y postfix dovecot-core dovecot-imapd dovecot-pop3d spamassassin opendkim
# Choose: Internet Site during postfix setup
2 Configure Postfix
/ETC/POSTFIX/MAIN.CF
myhostname = mail.yourdomain.com
mydomain = yourdomain.com
inet_interfaces = all
smtpd_tls_cert_file = /etc/ssl/certs/cert.pem
smtpd_tls_key_file = /etc/ssl/private/key.pem
smtpd_use_tls = yes
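To smoke-test the server end to end, a short smtplib sketch — the hostname, addresses, and port 587 are placeholders/assumptions (submission on 587 requires the submission service enabled in master.cf):

```python
import smtplib
from email.message import EmailMessage

def build_test_message(sender: str, recipient: str) -> EmailMessage:
    """Compose a minimal plain-text test email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Postfix test"
    msg.set_content("If you can read this, SMTP delivery works.")
    return msg

def send_test(host: str = "mail.yourdomain.com", port: int = 587) -> None:
    msg = build_test_message("admin@yourdomain.com", "you@example.com")
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()  # upgrade the connection to TLS before sending
        # smtp.login("admin@yourdomain.com", "password")  # if SASL auth is enabled
        smtp.send_message(msg)
```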
⚡ Set up SPF, DKIM, and DMARC DNS records for email deliverability. Contact support for DNS help.
Mailcow Dockerized
Full-featured mail server suite via Docker. Includes webmail (SOGo), antispam, DKIM, and admin panel.
1 Clone & Configure
MAILCOW SETUP
git clone https://github.com/mailcow/mailcow-dockerized
cd mailcow-dockerized
./generate_config.sh
# Enter: mail.yourdomain.com when prompted
2 Start Mailcow
DOCKER COMPOSE
docker compose pull
docker compose up -d
# Admin UI: https://mail.yourdomain.com/admin
# Default admin password in mailcow.conf
Nextcloud Self-Hosted Cloud
Deploy Nextcloud for private cloud storage, collaboration, and file sync — a DigitalOcean Spaces or Dropbox alternative on your own server.
1 Install Dependencies
NEXTCLOUD STACK
apt install -y apache2 mariadb-server php8.2 php8.2-{curl,gd,mbstring,xml,zip,mysql,intl,bcmath,gmp}
systemctl enable --now apache2 mariadb
2 Deploy via Docker (Recommended)
DOCKER NEXTCLOUD
docker run -d --name nextcloud \
  -p 8080:80 \
  -v nextcloud:/var/www/html \
  -e MYSQL_HOST=db \
  -e NEXTCLOUD_ADMIN_USER=admin \
  --restart always \
  nextcloud:latest
⚡ B8N6 servers are perfect for Nextcloud — NVMe SSD ensures fast file access. Enable GPU transcoding for photos/videos.
GitLab Community Edition
Self-hosted GitLab CE for private code repos, CI/CD pipelines, and team collaboration.
1 Install GitLab CE
GITLAB INSTALL
curl -fsSL https://packages.gitlab.com/install/repositories/gitlab/gitlab-ce/script.deb.sh | bash
EXTERNAL_URL="https://gitlab.yourdomain.com" apt install -y gitlab-ce
2 Configure & Start
GITLAB CONFIGURE
gitlab-ctl reconfigure
gitlab-ctl status
# Get initial root password:
cat /etc/gitlab/initial_root_password
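Once GitLab is up, its REST API (v4) is available with a personal access token. A stdlib-only sketch — the base URL and token are placeholders:

```python
import json
import urllib.request

def gitlab_request(path: str, token: str,
                   base: str = "https://gitlab.yourdomain.com") -> urllib.request.Request:
    """Build an authenticated request against GitLab's REST API (v4)."""
    return urllib.request.Request(
        f"{base}/api/v4/{path}",
        headers={"PRIVATE-TOKEN": token},
    )

def gitlab_version(token: str, base: str = "https://gitlab.yourdomain.com") -> dict:
    """Query /version to confirm the instance is reachable and the token works."""
    with urllib.request.urlopen(gitlab_request("version", token, base)) as resp:
        return json.load(resp)
```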
Grafana + Prometheus
Full observability stack for monitoring GPU utilization, AI model performance, and server health.
1 Install Prometheus
PROMETHEUS
# Pin a release — check https://github.com/prometheus/prometheus/releases for the latest version
PROM_VERSION=2.53.0
wget https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.linux-amd64.tar.gz
tar -xvf prometheus-${PROM_VERSION}.linux-amd64.tar.gz
cd prometheus-${PROM_VERSION}.linux-amd64
./prometheus --config.file=prometheus.yml &
# GPU exporter:
docker run -d --gpus all -p 9835:9835 utkuozdemir/nvidia_gpu_exporter:1.2.0
2 Install Grafana
GRAFANA
# Grafana isn't in Ubuntu's default repos — add Grafana's apt repo first
mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor -o /etc/apt/keyrings/grafana.gpg
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | tee /etc/apt/sources.list.d/grafana.list
apt update && apt install -y grafana
systemctl enable --now grafana-server
# Access: http://YOUR_SERVER_IP:3000
# Import NVIDIA GPU dashboard ID: 14574
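With both services running, GPU metrics can be pulled straight from Prometheus' HTTP API. A stdlib-only sketch — the metric name `nvidia_smi_utilization_gpu_ratio` is what the exporter above typically exposes, but verify against `/metrics` on port 9835:

```python
import json
import urllib.parse
import urllib.request

def prom_query_url(query: str, base: str = "http://localhost:9090") -> str:
    """Build an instant-query URL for Prometheus' HTTP API."""
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": query})

def gpu_utilization(base: str = "http://localhost:9090") -> list[tuple[str, float]]:
    """Return (gpu_uuid, utilization) pairs from the exporter's gauge."""
    url = prom_query_url("nvidia_smi_utilization_gpu_ratio", base)
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return [(r["metric"].get("uuid", "gpu"), float(r["value"][1]))
            for r in data["data"]["result"]]
```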
vLLM Inference Server
High-throughput LLM inference engine. Serves DeepSeek, Llama 4, Mistral and any HuggingFace model with OpenAI-compatible API.
1 Install vLLM
VLLM INSTALL
pip install vllm
# Or with Docker:
docker pull vllm/vllm-openai:latest
2 Serve a Model
SERVE MODEL
python -m vllm.entrypoints.openai.api_server \
  --model deepseek-ai/DeepSeek-V3 \
  --tensor-parallel-size 8 \
  --host 0.0.0.0 --port 8000 \
  --gpu-memory-utilization 0.95 \
  --max-model-len 32768
3 Query the API
API CALL
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")
response = client.chat.completions.create(
model="deepseek-ai/DeepSeek-V3",
messages=[{"role":"user","content":"Hello!"}]
)
print(response.choices[0].message.content)
⚡ Use --tensor-parallel-size equal to your GPU count. H200×8 can serve 70B+ models at full speed.