The HP Z6 G5 A is the smallest Threadripper Pro OEM workstation on the market and the rational mid-tier pick under HP's flagship Z8 Fury G5. Reviewers at PCMag, AnandTech, StorageReview, Phoronix, and DEVELOP3D consistently praised its build quality, toolless serviceability, and 96-core CPU ceiling; StorageReview gave it their 'highest recommendation for a high-end tower workstation.' For local-LLM use, configurations with one to three RTX 6000 Ada GPUs (48 GB VRAM each at ~960 GB/s) are estimated to deliver roughly 25–40 tokens/sec on Llama-3-70B Q4 with a single GPU, and substantially more with multi-GPU tensor parallelism. None of the published professional reviews ran formal Llama-3-70B Q4 benchmarks, so the LLM figures here are extrapolated from typical single-GPU RTX 6000 Ada results rather than measured on the HP Z6 G5 A itself.

Full review
Design and Form Factor
The HP Z6 G5 A is built around AMD's Ryzen Threadripper Pro 7000 WX-Series, scaling from 12 cores up to the 96-core 7995WX in the same 4U chassis. The unit measures 169 x 465 x 445 mm and ships with a built-in carrying handle that DEVELOP3D's review specifically called out as a thoughtful detail — high-core-count workstations are heavy, and an integrated handle matters more than its triviality suggests. The interior is toolless, with modular drive cages and PCIe support trays designed for servicing without screws. StorageReview praised the Z6 G5 A as the 'smallest Threadripper Pro OEM workstation' on the market, positioning it as a credible desk-side alternative to a full tower for buyers who don't need the Z8 Fury G5's 4-GPU footprint.
AI and Multi-GPU Configuration
The Z6 G5 A supports up to three dual-height professional GPUs: typically RTX A6000 or RTX 6000 Ada Generation cards for serious AI work, or smaller A4000/A5000 cards for compute-light visualization. With three RTX 6000 Ada cards, the system pools 144 GB of VRAM at roughly 960 GB/s per card, which is substantially more memory bandwidth per dollar than the unified-memory boxes in this guide and enough to hold Llama-3-70B Q4 (~40 GB) entirely in VRAM with headroom for KV cache and a second concurrent model. Phoronix's testing focused on traditional rendering and CFD workloads, where the 7995WX outpaced everything in its class. StorageReview ran UL Procyon AI Inference benchmarks and showed Real-ESRGAN completing in 83.8 seconds on an RTX 6000 Ada via TensorRT versus 2,891 seconds on the CPU. That is roughly a 35x speedup, the kind of gap that justifies a multi-GPU box for any team doing AI development at scale.
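The VRAM claim above can be checked with simple arithmetic. The sketch below is a back-of-envelope budget, not a measurement: the 48 GB per card and ~40 GB weight figures come from this review, while the Llama-3-70B shape (80 layers, 8 KV heads, head dimension 128), the fp16 KV cache, and the 8K context are assumptions added for illustration. The function names are hypothetical.

```python
# Back-of-envelope VRAM budget for RTX 6000 Ada configurations.
# From the review: 48 GB per card, ~40 GB for Llama-3-70B at Q4.
# Assumed (not from the review): Llama-3-70B shape of 80 layers,
# 8 KV heads, head dim 128, and an fp16 (2-byte) KV cache.

GB = 1e9


def kv_cache_bytes(tokens, layers=80, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Bytes of KV cache for `tokens` cached positions (keys plus values)."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


def vram_headroom_gb(num_gpus, vram_per_gpu_gb=48, weights_gb=40,
                     context_tokens=8192, sequences=1):
    """Pooled VRAM left after weights and KV cache, ignoring runtime overhead."""
    pooled_gb = num_gpus * vram_per_gpu_gb
    kv_gb = kv_cache_bytes(context_tokens * sequences) / GB
    return pooled_gb - weights_gb - kv_gb


if __name__ == "__main__":
    for gpus in (1, 2, 3):
        free = vram_headroom_gb(gpus)
        print(f"{gpus} GPU(s): {free:.1f} GB free after weights and 8K KV cache")
```

Under these assumptions a single card has only a few gigabytes to spare, while three cards leave roughly 100 GB, which is consistent with room for a second ~40 GB model. The calculation treats the cards as one pool, so it only applies when the inference runtime shards the model across GPUs, and it leaves out activation buffers and framework overhead.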
Where It Falls Short
The Z6 G5 A's biggest practical issue is thermals on the highest core counts. StorageReview measured 95°C all-core temperatures on the 7995WX under sustained load — within spec, but at the high end of comfort. Buyers picking the 96-core variant should plan for adequate room ventilation. The other constraint is that none of the published professional reviews ran formal large-language-model benchmarks; reviewers focused on rendering, simulation, and traditional workstation workloads. LLM-specific performance numbers in this guide are extrapolated from single-GPU RTX 6000 Ada norms rather than measured on the Z6 G5 A directly. Finally, configuration pricing scales steeply: a 12-core entry config starts near $3,100, a typical 1-GPU AI build lands around $5,500, and a 96-core 3-GPU monster pushes $18,000+.
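Because the LLM figures in this guide are extrapolated, a rough sanity check is useful. The sketch below uses a common rule of thumb, assumed here and not taken from any of the reviews: single-stream token generation is limited by memory bandwidth, because each new token requires reading all of the model weights once. The 960 GB/s and 40 GB inputs come from this review; the `efficiency` factor, the ideal multi-GPU scaling, and the function name are illustrative assumptions.

```python
# Rough decode-speed ceiling, assuming generation is memory-bandwidth-bound:
# every new token requires streaming the full set of weights from VRAM once.
# The 960 GB/s and 40 GB figures come from the review; the efficiency factor
# and the ideal multi-GPU scaling are illustrative assumptions.

def decode_ceiling_tokens_per_sec(bandwidth_gb_s=960.0, weights_gb=40.0,
                                  num_gpus=1, efficiency=1.0):
    """Upper bound on single-stream tokens/sec for a bandwidth-bound decoder.

    With tensor parallelism each GPU reads only its shard of the weights, so
    the ideal ceiling scales with num_gpus (interconnect cost is ignored).
    """
    if weights_gb <= 0 or num_gpus < 1:
        raise ValueError("weights_gb must be positive and num_gpus at least 1")
    return efficiency * num_gpus * bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    print(f"1 GPU, ideal:   {decode_ceiling_tokens_per_sec():.1f} tok/s")
    print(f"1 GPU, 70% eff: {decode_ceiling_tokens_per_sec(efficiency=0.7):.1f} tok/s")
    print(f"3 GPUs, ideal:  {decode_ceiling_tokens_per_sec(num_gpus=3):.1f} tok/s")
```

Under these assumptions the single-GPU, single-stream ceiling works out to 24 tokens/sec, slightly below the 25–40 range quoted in this guide, so that range is best read as optimistic for one card serving one request. Batching several requests, speculative decoding, or sharding across cards changes the arithmetic, which is one more reason to treat the extrapolated numbers as estimates until someone benchmarks the Z6 G5 A directly.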
Who It's Best For
The HP Z6 G5 A is the right pick for AI teams that need multi-GPU capability and traditional workstation workloads on the same machine — mixed AI/CFD/rendering environments, research labs, and product teams whose engineers compile, simulate, and train on the same desk-side hardware. It's the rational tier under the Z8 Fury G5 for buyers who don't need 4 GPUs or the Z8's larger chassis. Versus the Puget Genesis II it trades the bespoke configurator for HP's enterprise service network. Versus the DGX Spark it's three to four times the price but offers a fundamentally more flexible machine — one that can run Windows, drive multiple displays, and serve as a primary workstation rather than a dedicated AI dev box.
Strengths
- Smallest Threadripper Pro OEM tower on the market, with a compact 4U chassis and built-in handle
- AMD Ryzen Threadripper Pro 7000 WX-Series scales from 12 to 96 cores in the same chassis
- Toolless serviceability, modular interior, and ECC DDR5: enterprise pedigree at mid-tier pricing
- Supports up to 3 dual-height pro GPUs (RTX A6000, RTX 6000 Ada) for serious multi-GPU AI work
Watch-outs
- 95°C all-core CPU thermals reported under sustained load (StorageReview)
- Pricing scales steeply: 96-core configs push $18,000+
- No published Llama-3-70B Q4 tokens/sec figures in mainstream reviews; LLM-specific benchmarking is thin
How it compares
The HP Z6 G5 A is the mid-tier sweet spot in this lineup. Versus the HP Z8 Fury G5 (its flagship sibling), it's a smaller chassis with the same Threadripper Pro CPU family at a noticeably lower entry price — trading the Z8's 4-GPU ceiling for a 3-GPU ceiling and a more desk-friendly footprint. Versus the Puget Genesis II, it offers similar build pedigree without Puget's bespoke configurator and handpicked components, at a meaningfully lower starting price. Versus the DGX Spark, it's a different class of machine — the HP Z6 G5 A is a multi-GPU general workstation, the Spark is a single-purpose 128 GB unified-memory dev box. Pick the HP Z6 G5 A when you need both AI horsepower and traditional workstation workloads (rendering, simulation, multi-app productivity) on the same machine.
Who this is for
At a glance: best mid-tier pick, offering Threadripper Pro and multi-GPU support under Z8 pricing.
Rating sources
“Well worth your consideration.”
“96-core Threadripper Pro 7995WX impresses.”
“Highest recommendation for a high-end tower workstation. Best of 2023.”
“Incredibly powerful AMD workstation for creators and developers.”
“Remarkably powerful, smallest Threadripper Pro OEM workstation.”
Our 4.5 score is the average of these published ratings. Ratings marked * were derived from the reviewer’s written analysis or video transcript: the publisher didn’t print an explicit numeric score, so we inferred one from their own words.



