Exciting news: Latitude.sh is being acquired by Megaport

./instances/rtx-pro-6000

NVIDIA’s Universal AI Chip

Blackwell architecture meets multi-purpose workloads
RTX PRO 6000

Unleash Next-Gen GPU Power. Bare Metal, No Abstraction

Get ready to run any workload that demands real GPU muscle. With 96 GB of VRAM per card, check out what the g4.rtx6kpro, powered by NVIDIA’s Blackwell architecture, can handle:

LLM training & fine-tuning

Full-card access and 96 GB of VRAM per GPU give you the headroom for fine-tuning and training runs on multi-billion-parameter models.

Inference at scale

Deliver real-time inference throughput with consistent tail latencies, even at high query volumes.

Rendering & VFX

Power GPU render farms with deterministic performance per node — ideal for CGI, animation, and simulation workflows.

Gen-AI & Diffusion Models

Optimize large model pipelines involving tensor ops, vision, and multimodal inference.

Scientific computing & HPC

Utilize full double & mixed-precision performance across GPU cores for physics, molecular, and engineering workloads.

THE GPU 'SWEET SPOT'

AI infrastructure teams often face a trade-off: wait months and overpay for top-tier GPUs, or settle for older chips that can’t handle today’s compute demands.

The RTX PRO 6000 resolves that tension: with Blackwell architecture, next-gen tensor cores, and generous VRAM, it delivers the performance needed for LLM fine-tuning and inference at scale.

LLM Inference Maximum Throughput (tokens/s)

Model | RTX PRO 6000 Server Edition | NVIDIA L40S
Llama v4 Scout | 17,857 | 1,105
Llama v3.3 70B | 4,776 | 1,694
Llama v3.1 8B | 22,757 | 8,471

GPU Comparison Table

GPU model | Architecture | Memory (GB) | Max model size* | Tensor cores | Tensor cores gen
NVIDIA B200 | Blackwell | 192 | 98B | 1056 | 5th
NVIDIA RTX PRO 6000 | Blackwell | 96 | 48B | 752 | 5th
NVIDIA H100 PCIE | Hopper | 80 | 40B | 456 | 4th
NVIDIA L40S | Ada Lovelace | 48 | 24B | 568 | 4th
*FP16
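The "Max model size" column follows a simple rule of thumb: at FP16, each parameter occupies 2 bytes, so VRAM capacity caps the number of weights that fit (ignoring activations, KV cache, and runtime overhead). A minimal sketch of that arithmetic:

```python
def max_params_fp16(vram_gb: float) -> float:
    """Approximate model size (in billions of parameters) fitting in VRAM at FP16.

    vram_gb * 1e9 bytes / 2 bytes-per-param = parameters, so dividing GB by 2
    yields billions of parameters. Activations, KV cache, and framework
    overhead are ignored, so practical limits are somewhat lower.
    """
    bytes_per_param = 2  # FP16 = 16 bits = 2 bytes
    return vram_gb / bytes_per_param

# Roughly matches the table above:
print(max_params_fp16(96))  # RTX PRO 6000 -> 48.0 (~48B)
print(max_params_fp16(48))  # L40S         -> 24.0 (~24B)
```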

An Instance for Every Need

From LLM training to Generative AI inference, check out the different variations of our instances powered by NVIDIA’s RTX PRO 6000 Server Edition and deploy the one that best fits your requirements.
For maximum performance and zero overhead, choose RTX PRO 6000 bare metal.

g4.rtx6kpro.large

Starting at $11,529.58/mo
Metal GPU
GPU: 8 x NVIDIA RTX PRO 6000 Server Edition
VRAM: 768 GB total
CPU: Dual AMD 9355, 64 cores @ 3.55 GHz
RAM: 1.5 TB of DDR5
Storage: 4 x 3.8 TB NVMe
NIC: 2 x 100 Gbps
For flexible usage and the ultimate scalability, go with the RTX PRO 6000 VM instead.

vm.rtx6kpro.small

GPU VM
GPU: 1 x NVIDIA RTX PRO 6000 Server Edition
VRAM: 96 GB
vCPUs: 16
RAM: 128 GB of DDR5
Storage: 500 GB local

Get the most out of your GPUs on bare metal

Deploy fully dedicated NVIDIA RTX PRO 6000 Blackwell instances with zero virtualization overhead, ultra-low latency, and total resource control.
RTX PRO 6000

Dedicated & Deterministic Performance

No hypervisor overhead, no noisy neighbors — every cycle and memory channel is yours.

Blackwell Architecture, Built for Scalability

Next-gen GPUs with improved throughput, AI performance, and compute density over prior generations.

Ultra-Low Latency Networking

Leveraging NVIDIA’s Spectrum-X800 network, our platform enables sub-microsecond RDMA and GPU peer-to-peer transfers across nodes.

Seamless Integration & Control

Full CUDA, cuDNN, TensorRT support with root-level access, driver customization, and GPU direct RDMA.

Accelerated AI Networking powered by Spectrum-X *

We’ve integrated NVIDIA’s Spectrum-X800 switches across our data centers, enabling your GPU nodes to interconnect with minimal latency, full RDMA support, and high bisection bandwidth.

Low-latency RDMA & GPUDirect

Enable zero-copy data paths directly between GPU memory across nodes or within the same rack, eliminating host CPU involvement in inter-GPU transfers.

High-throughput, non-blocking fabric

Massive internal bandwidth ensures your GPUs don’t bottleneck on the network, even during all-to-all communication or large parameter exchanges.

Isolation & QoS built-in

Dedicated lanes and traffic shaping guarantee that your data flows remain predictable and secure, even at scale.

Scale clusters seamlessly across racks and zones

Build multi-node training or inference clusters with full high-speed connectivity across the fabric.

* Spectrum-X is only available in selected instances. You can request access to Spectrum-X powered instances through this form.

Purpose-built for AI workloads

Enjoy complete control over hardware configurations and optimized software stacks tailored to your AI workloads. Integrate seamlessly with your favorite AI tools and libraries, and never look back.

./built-for-developers

Start with Terraform

Create GPU instances programmatically with Latitude.sh's powerful, friendly API. Get started quickly with integrations like Terraform and client libraries for your preferred programming language. 
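As an illustration, provisioning a g4.rtx6kpro instance through Terraform might look like the sketch below. This is a hypothetical configuration, not copied from the provider docs: the provider source, the `latitudesh_server` resource, and its argument names (`plan`, `operating_system`, `project`, `site`) are assumptions to verify against the published provider schema.

```hcl
terraform {
  required_providers {
    # Provider source is an assumption; confirm it in the Terraform Registry.
    latitudesh = {
      source = "latitudesh/latitudesh"
    }
  }
}

# Hypothetical server resource; exact field names may differ in the real provider.
resource "latitudesh_server" "gpu_node" {
  hostname         = "rtx6kpro-node-1"
  plan             = "g4.rtx6kpro.large" # plan name from this page
  operating_system = "ubuntu_24_04_x64"  # assumed OS slug
  project          = var.project_id
  site             = var.site            # assumed region slug variable
}
```

A `terraform apply` against a configuration like this would deploy the bare metal instance with the same lifecycle management Terraform provides for any other cloud resource.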
View provider

Power your AI models