B8N6 ENTERPRISE SERVER INFRASTRUCTURE  ·  NVIDIA H200 SXM5 — Hopper Architecture Active  ·  NVIDIA B200 Blackwell — Next-Gen GPU Cluster Online  ·  99.98% Uptime SLA — Dedicated Infrastructure  ·  🌍 8 REGIONS: USA · UAE · India · Singapore · Sri Lanka · Australia · Kenya · Greece  ·  InfiniBand NDR 400Gbps — Ultra-Low Latency  ·  40+ AI Models Online — GPU Clusters Ready
05 / SERVER SETUP & GUIDELINES

Setup Guides

Step-by-step setup guides for all supported OS, AI frameworks, and model APIs on your B8N6 dedicated server.

OS & PLATFORMS
AI FRAMEWORKS
AI MODEL APIS
INFRASTRUCTURE
EMAIL SERVERS
OPEN SOURCE SERVICES
Linux (Ubuntu 22.04) Setup
Complete GPU server setup including NVIDIA drivers, CUDA toolkit, and Python AI environment.
1 Connect to Your Server via SSH
SSH CONNECTION
# Replace with your credentials from the portal
ssh root@YOUR_SERVER_IP -p 22 -i ~/.ssh/b8n6_key
2 Update System & Install Dependencies
SYSTEM SETUP
apt update && apt upgrade -y
apt install -y build-essential git curl wget htop nvtop \
  python3 python3-pip python3-venv software-properties-common
3 Install NVIDIA Drivers & CUDA 12.x
NVIDIA CUDA SETUP
# Add the NVIDIA CUDA repository
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb && apt update
apt install -y cuda-toolkit-12-6 nvidia-driver-550
nvidia-smi  # verify
4 Set Up Python Environment
PYTHON VENV
python3 -m venv ~/ai-env
source ~/ai-env/bin/activate
pip install --upgrade pip
pip install torch torchvision \
  --index-url https://download.pytorch.org/whl/cu124
python3 -c "import torch; print(torch.cuda.is_available())"
⚡ Add to ~/.bashrc:
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
Windows Server 2022
Setup for Windows Server with WSL2, NVIDIA GPU passthrough, and AI framework support.
1 Connect via RDP or SSH
POWERSHELL
ssh Administrator@YOUR_SERVER_IP
# Or RDP: mstsc /v:YOUR_SERVER_IP
2 Enable WSL2 & Ubuntu
POWERSHELL — AS ADMIN
wsl --install
wsl --set-default-version 2
wsl --install -d Ubuntu-22.04
wsl -l -v  # verify
3 Install NVIDIA CUDA for Windows
POWERSHELL
winget install NVIDIA.CUDA
wsl nvidia-smi  # GPU accessible from WSL2
4 Install Python & PyTorch
POWERSHELL
winget install Python.Python.3.11
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
python -c "import torch; print(torch.cuda.is_available())"
ComfyUI Setup
Node-based image generation with FLUX 1.1 Pro and Stable Diffusion. Access via browser tunnel.
1 Clone & Install ComfyUI
BASH
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI && pip install -r requirements.txt
# Install ComfyUI Manager
cd custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager.git
2 Download FLUX / SDXL Models
BASH — MODEL DOWNLOAD
cd ~/ComfyUI/models/checkpoints
wget https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/flux1-dev.safetensors
cd ~/ComfyUI/models/vae
wget https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/ae.safetensors
3 Launch & Tunnel
LAUNCH COMFYUI
python3 main.py --listen 0.0.0.0 --port 8188
# SSH tunnel from local machine:
ssh -L 8188:localhost:8188 root@YOUR_SERVER_IP -N
# Open: http://localhost:8188
⚡ Use --lowvram on A100 40GB. H200 runs full-precision FLUX without flags.
Open WebUI + Ollama
ChatGPT-style interface for self-hosted models via Docker. One-command deployment.
1 Install Docker + NVIDIA Container Toolkit
BASH
curl -fsSL https://get.docker.com | sh
systemctl enable --now docker
apt install -y nvidia-container-toolkit
systemctl restart docker
2 Deploy Open WebUI + Ollama (One Command)
DOCKER
docker run -d \
  --name open-webui \
  --gpus all \
  -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -v ollama:/root/.ollama \
  -e OLLAMA_BASE_URL=http://localhost:11434 \
  --restart always \
  ghcr.io/open-webui/open-webui:ollama
# Access: http://YOUR_SERVER_IP:3000
3 Pull Models via Ollama
BASH
docker exec open-webui ollama pull llama4:scout
docker exec open-webui ollama pull deepseek-r1:70b
docker exec open-webui ollama pull qwen3:32b
⚡ Connect Claude/Gemini/DeepSeek APIs in Settings → Connections for a unified interface.
Ollama Setup
Run LLMs locally on your B8N6 GPU. CLI-based model management with auto GPU acceleration.
BASH
curl -fsSL https://ollama.com/install.sh | sh
ollama --version
BASH — MODEL MANAGEMENT
ollama pull llama4:scout
ollama pull deepseek-r1:70b
ollama pull qwen3:32b
ollama run deepseek-r1:70b
ollama list
API
OLLAMA_HOST=0.0.0.0 ollama serve &
curl http://localhost:11434/api/generate \
  -d '{"model":"llama4:scout","prompt":"Hello!"}'
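The same endpoint can be called from Python. A minimal standard-library sketch (assumes Ollama is listening on its default port 11434 and the model has been pulled): Ollama's /api/generate streams newline-delimited JSON objects, each carrying a `response` fragment, so the fragments must be joined.

```python
import json
from urllib import request

def collect_stream(lines):
    """Join the 'response' fragments from Ollama's newline-delimited JSON stream."""
    parts = []
    for line in lines:
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):  # final object carries done=true
            break
    return "".join(parts)

def generate(prompt, model="llama4:scout", base_url="http://localhost:11434"):
    """POST to /api/generate and return the full completion text."""
    body = json.dumps({"model": model, "prompt": prompt}).encode()
    req = request.Request(f"{base_url}/api/generate", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return collect_stream(resp)
```

With the server running, `generate("Hello!")` returns the model's reply as a single string.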
Anthropic Claude API
Integrate Claude Opus 4.6 and Sonnet 4.6 via the Anthropic Python SDK.
BASH
pip install anthropic
export ANTHROPIC_API_KEY="sk-ant-YOUR_KEY_HERE"
PYTHON
import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
⚡ Models: claude-opus-4-6 · claude-sonnet-4-6 · claude-haiku-4-5-20251001
Docs: https://docs.anthropic.com
Google Gemini API
Access Gemini 3 Pro and Gemini 2.5 Flash, with up to 1M-token context windows and multimodal inputs.
BASH
pip install google-generativeai
export GOOGLE_API_KEY="AIzaSy-YOUR_KEY"
PYTHON
import google.generativeai as genai

genai.configure(api_key="YOUR_KEY")
model = genai.GenerativeModel("gemini-3.0-pro")
response = model.generate_content("Explain NVLink 4.0")
print(response.text)
⚡ Models: gemini-3.0-pro · gemini-2.5-flash · gemini-2.5-flash-lite
Docs: https://ai.google.dev
DeepSeek API
Use DeepSeek V3.2 and R1 via their OpenAI-compatible API, or self-host via vLLM.
PYTHON — OPENAI COMPATIBLE
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_KEY",
    base_url="https://api.deepseek.com",
)
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
BASH — VLLM SELF-HOST
pip install vllm
python -m vllm.entrypoints.openai.api_server \
  --model deepseek-ai/DeepSeek-V3 \
  --tensor-parallel-size 8 \
  --host 0.0.0.0 --port 8000 \
  --gpu-memory-utilization 0.95
⚡ Models: deepseek-chat (V3.2) · deepseek-reasoner (R1) · Docs: https://api-docs.deepseek.com
OpenAI API
Access GPT-5.2, o4-mini, GPT-4o and GPT-oss open-weight models.
BASH
pip install openai
export OPENAI_API_KEY="sk-YOUR_KEY"
PYTHON
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Reasoning model
response = client.chat.completions.create(
    model="o4-mini",
    messages=[{"role": "user", "content": "Solve step by step..."}],
    reasoning_effort="high",
)
⚡ Models: gpt-5.2 · gpt-5 · o4-mini · o3 · gpt-4o · gpt-oss-120b
Docs: https://platform.openai.com/docs
Rocky Linux 9 Setup
Enterprise-grade RHEL-compatible OS. Ideal for GPU servers requiring maximum stability and CUDA support.
1 Initial Server Access & System Update
SSH + UPDATE
ssh root@YOUR_SERVER_IP
dnf update -y
dnf install -y epel-release
dnf groupinstall -y "Development Tools"
2 Install NVIDIA Drivers & CUDA
CUDA ON ROCKY 9
dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
dnf install -y cuda-toolkit-12-6 nvidia-driver
reboot
nvidia-smi  # verify after reboot
3 Install Python & PyTorch
PYTHON SETUP
dnf install -y python3.11 python3.11-pip
python3.11 -m venv ~/ai-env
source ~/ai-env/bin/activate
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
⚡ Rocky Linux 9 = RHEL9-compatible. Great for enterprise environments requiring SELinux and corporate compliance.
Debian 12 (Bookworm) Setup
Rock-solid Debian 12 with NVIDIA GPU and CUDA support. Preferred by many AI researchers for its stability.
1 System Update & Dependencies
DEBIAN SETUP
apt update && apt upgrade -y
apt install -y build-essential linux-headers-$(uname -r) \
  curl wget git python3 python3-pip python3-venv
2 NVIDIA Drivers via apt
NVIDIA DEBIAN
apt install -y software-properties-common
add-apt-repository --component contrib non-free-firmware
apt update && apt install -y nvidia-driver firmware-misc-nonfree
reboot
nvidia-smi  # verify after reboot
3 CUDA Toolkit
CUDA ON DEBIAN
wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt update && apt install -y cuda-toolkit-12-6
AlmaLinux 9 Setup
AlmaLinux 9: Community RHEL binary-compatible distro. Excellent for production GPU workloads.
1 Initial Setup
ALMALINUX 9
dnf update -y
dnf install -y epel-release
dnf groupinstall -y "Development Tools"
dnf install -y wget curl git htop
2 CUDA Repository & Drivers
CUDA ALMALINUX
dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
dnf install -y cuda-toolkit-12-6 nvidia-driver
reboot
nvidia-smi  # verify after reboot
⚡ AlmaLinux is 1:1 binary compatible with RHEL9. Ideal choice if you're migrating from CentOS.
Kubernetes (K8s) + GPU
Deploy GPU-accelerated Kubernetes clusters for AI inference workloads. Includes NVIDIA Device Plugin and GPU Operator setup.
1 Install kubeadm, kubelet, kubectl
K8S INSTALL
mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /" | tee /etc/apt/sources.list.d/kubernetes.list
apt update && apt install -y kubelet kubeadm kubectl
kubeadm init --pod-network-cidr=192.168.0.0/16
2 Install NVIDIA GPU Operator
GPU OPERATOR
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install --wait --generate-name \
  -n gpu-operator --create-namespace \
  nvidia/gpu-operator
3 Deploy GPU AI Workload
GPU POD
# Illustrative example: a minimal one-GPU test pod (adjust name/image to your workload)
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ai-inference
spec:
  restartPolicy: Never
  containers:
  - name: cuda-test
    image: nvidia/cuda:12.6.0-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
kubectl logs ai-inference
Docker Swarm Cluster
Multi-host Docker Swarm for distributed AI service deployment with GPU support.
1 Install Docker & Initialize Swarm
SWARM INIT
curl -fsSL https://get.docker.com | sh
docker swarm init --advertise-addr YOUR_MANAGER_IP
# On worker nodes:
docker swarm join --token SWMTKN-xxx MANAGER_IP:2377
2 Deploy AI Stack
SWARM DEPLOY
docker stack deploy -c docker-compose.yml ai-stack
docker service ls
docker stack ps ai-stack
Nginx Reverse Proxy
Set up Nginx as a reverse proxy for your AI services with SSL termination and load balancing.
1 Install Nginx & Certbot
NGINX + SSL
apt install -y nginx certbot python3-certbot-nginx
certbot --nginx -d your.domain.com
2 Configure Proxy for AI API
NGINX CONFIG
server {
    listen 443 ssl;
    server_name your.domain.com;

    location /api/ {
        proxy_pass http://127.0.0.1:8000/;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 300s;
    }
}
Postfix + Dovecot Mail
Production mail server with Postfix (SMTP) and Dovecot (IMAP/POP3). Supports TLS and modern authentication.
1 Install Postfix & Dovecot
MAIL SERVER
apt install -y postfix dovecot-core dovecot-imapd dovecot-pop3d spamassassin opendkim
# Choose "Internet Site" during Postfix setup
2 Configure Postfix
/ETC/POSTFIX/MAIN.CF
myhostname = mail.yourdomain.com
mydomain = yourdomain.com
inet_interfaces = all
smtpd_tls_cert_file = /etc/ssl/certs/cert.pem
smtpd_tls_key_file = /etc/ssl/private/key.pem
smtpd_use_tls = yes
⚡ Set up SPF, DKIM, and DMARC DNS records for email deliverability. Contact support for DNS help.
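The three records above look roughly like this in a zone file. All values are placeholders: `mail` is an example DKIM selector (the key itself comes from `opendkim-genkey`), and the SPF/DMARC policies shown are common defaults, not requirements.

```text
; SPF — authorize your mail server's IP
yourdomain.com.                  IN TXT "v=spf1 mx ip4:YOUR_SERVER_IP -all"

; DKIM — public key published under the selector configured in OpenDKIM
mail._domainkey.yourdomain.com.  IN TXT "v=DKIM1; k=rsa; p=YOUR_PUBLIC_KEY"

; DMARC — quarantine failures, send aggregate reports
_dmarc.yourdomain.com.           IN TXT "v=DMARC1; p=quarantine; rua=mailto:postmaster@yourdomain.com"
```

Verify propagation with `dig TXT yourdomain.com` before sending production mail.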
Mailcow Dockerized
Full-featured mail server suite via Docker. Includes webmail (SOGo), antispam, DKIM, and admin panel.
1 Clone & Configure
MAILCOW SETUP
git clone https://github.com/mailcow/mailcow-dockerized
cd mailcow-dockerized
./generate_config.sh
# Enter: mail.yourdomain.com when prompted
2 Start Mailcow
DOCKER COMPOSE
docker compose pull
docker compose up -d
# Admin UI: https://mail.yourdomain.com/admin
# Default admin password in mailcow.conf
Nextcloud Self-Hosted Cloud
Deploy Nextcloud for private cloud storage, collaboration, and file sync — a DigitalOcean Spaces or Dropbox alternative on your own server.
1 Install Dependencies
NEXTCLOUD STACK
apt install -y apache2 mariadb-server php8.2 \
  php8.2-{curl,gd,mbstring,xml,zip,mysql,intl,bcmath,gmp}
systemctl enable --now apache2 mariadb
2 Deploy via Docker (Recommended)
DOCKER NEXTCLOUD
docker run -d --name nextcloud \
  -p 8080:80 \
  -v nextcloud:/var/www/html \
  -e MYSQL_HOST=db \
  -e NEXTCLOUD_ADMIN_USER=admin \
  --restart always \
  nextcloud:latest
⚡ B8N6 servers are perfect for Nextcloud — NVMe SSD ensures fast file access. Enable GPU transcoding for photos/videos.
GitLab Community Edition
Self-hosted GitLab CE for private code repos, CI/CD pipelines, and team collaboration.
1 Install GitLab CE
GITLAB INSTALL
curl -fsSL https://packages.gitlab.com/install/repositories/gitlab/gitlab-ce/script.deb.sh | bash
EXTERNAL_URL="https://gitlab.yourdomain.com" apt install -y gitlab-ce
2 Configure & Start
GITLAB CONFIGURE
gitlab-ctl reconfigure
gitlab-ctl status
# Get initial root password:
cat /etc/gitlab/initial_root_password
Grafana + Prometheus
Full observability stack for monitoring GPU utilization, AI model performance, and server health.
1 Install Prometheus
PROMETHEUS
# Download the latest release (asset names include the version; see https://prometheus.io/download/)
wget https://github.com/prometheus/prometheus/releases/download/v2.53.0/prometheus-2.53.0.linux-amd64.tar.gz
tar -xvf prometheus-2.53.0.linux-amd64.tar.gz
cd prometheus-2.53.0.linux-amd64
./prometheus --config.file=prometheus.yml &
# GPU exporter:
docker run -d --gpus all -p 9835:9835 utkuozdemir/nvidia_gpu_exporter:1.2.0
2 Install Grafana
GRAFANA
# Requires the Grafana APT repo (https://apt.grafana.com) on Ubuntu/Debian
apt install -y grafana
systemctl enable --now grafana-server
# Access: http://YOUR_SERVER_IP:3000
# Import NVIDIA GPU dashboard ID: 14574
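Once Prometheus is scraping the GPU exporter, you can query it programmatically via its HTTP API. A minimal sketch: `nvidia_smi_utilization_gpu_ratio` is the series this exporter typically exposes, but that name is an assumption — check `/metrics` on port 9835 for the exact series on your server.

```python
from urllib.parse import urlencode

PROMETHEUS = "http://localhost:9090"  # default Prometheus port

def instant_query_url(promql, base_url=PROMETHEUS):
    """Build a URL for Prometheus's instant-query endpoint (/api/v1/query)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"

# Average GPU utilization as a percentage across all GPUs:
url = instant_query_url("avg(nvidia_smi_utilization_gpu_ratio) * 100")
```

Fetch the URL with any HTTP client; the JSON response carries the samples under `data.result`.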
vLLM Inference Server
High-throughput LLM inference engine. Serves DeepSeek, Llama 4, Mistral and any HuggingFace model with OpenAI-compatible API.
1 Install vLLM
VLLM INSTALL
pip install vllm
# Or with Docker:
docker pull vllm/vllm-openai:latest
2 Serve a Model
SERVE MODEL
python -m vllm.entrypoints.openai.api_server \
  --model deepseek-ai/DeepSeek-V3 \
  --tensor-parallel-size 8 \
  --host 0.0.0.0 --port 8000 \
  --gpu-memory-utilization 0.95 \
  --max-model-len 32768
3 Query the API
API CALL
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
⚡ Use --tensor-parallel-size equal to your GPU count. H200×8 can serve 70B+ models at full speed.
AI MODELS
OpenAI GPT-5.2 NEW
OpenAI o4-mini NEW
Anthropic Claude Opus 4.6 NEW
Anthropic Claude Sonnet 4.6 NEW
Google Gemini 3 Pro NEW
Google Gemini 2.5 Flash
Meta Llama 4 Maverick NEW
Meta Llama 4 Scout
DeepSeek V3.2 NEW
DeepSeek R1
Mistral Large 3 NEW
Alibaba Qwen3-235B NEW
xAI Grok 4 NEW
FLUX 1.1 Pro
NVIDIA NeMo · TensorRT
Secure Access — Server Login
Access your dedicated server control panel. Issues? support@b8n6.com · 24/7 NOC.
Sales Enquiry — Contact Sales
Response within 2 business hours, or email sales@b8n6.com.