The problem

Cloud GPU rental looks like the obvious answer until you put a calculator on it. An H200 instance at $3-4/hour is fine for a weekend prototype. Run it at a 30% duty cycle for a year (roughly 2,600 billable hours) and you are at $7,800-$10,400 in pure compute, before egress, before the storage tier, before the orchestration overhead. Push to 50% utilisation and a $58K on-prem build sized for the same workload pays for itself inside 18-24 months, with the hardware still on the depreciation schedule for years after.
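The duty-cycle arithmetic can be checked in a few lines. This is a minimal sketch, not the pricing calculator itself: the function name `annual_cloud_cost` is hypothetical, a year is taken as 8,760 hours, and the result covers compute only, with egress, storage, and orchestration left out.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored


def annual_cloud_cost(hourly_rate: float, duty_cycle: float) -> float:
    """Compute-only cloud spend for one year at a duty cycle between 0 and 1."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return hourly_rate * duty_cycle * HOURS_PER_YEAR


if __name__ == "__main__":
    # The $3-4/hour range quoted above, at a 30% duty cycle.
    for rate in (3.00, 4.00):
        cost = annual_cloud_cost(rate, 0.30)
        print(f"${rate:.2f}/hr at 30% duty cycle: ${cost:,.0f} per year")
```

At 30% the exact figure is 2,628 hours, which the text rounds to 2,600, so the computed totals land slightly above the rounded $7,800-$10,400 range.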

The second problem is data residency. The records ATCS handles (vendor W-9s, 1099 candidate detail, payroll registers, bank reconciliations, the full general ledger) are exactly the records your CPA, your bonded clients, and a future IRS examiner will want a documented chain of custody for. A third-party cloud tenancy is fine for marketing assets. It is a harder conversation when the data is the books themselves.

The third problem is backup. The honest version of SMB backup is that the IT generalist took a snapshot when they remembered, the off-site copy was last verified eight months ago, and nobody has actually restored from it. We treat backup as a tested system, not a checkbox.

How it works

Right-sized hardware bundles

Four tiers, each a real bill of materials we will quote line by line.

Starter: single RTX 5090 (32GB GDDR7), 128GB DDR5 ECC, custom workstation chassis. Typical delivered cost around $10.5K. Sized for one to three concurrent users running classification, document extraction, and reconciliation against books up to roughly 50K transactions a year.

Growth: dual RTX 5090 (32GB each), 256GB DDR5 ECC RDIMM, Asus Pro WS W790-SAGE SE board, Xeon W-2400 platform. Typical delivered cost around $15.4K. The low-end anchor for firms running parallel inference and embedding workloads on real client volume.

Scaling: single H200 PCIe (141GB HBM3e), 512GB DDR5, in a 4U Supermicro AS-4125GS-TNRT or a 2U Dell PowerEdge R760xa. Typical delivered cost around $58K. The 141GB of HBM3e on a single card eliminates the model-sharding overhead that an H100 build would force on you.

Enterprise: dual H200 PCIe with NVLink bridging, 768GB DDR5, 4U server in a 42U rack, 6kVA online UPS, switched metered PDU, redundant 25GbE networking. Typical delivered cost around $108K. The high-end anchor for multi-entity firms running continuous inference plus historical re-indexing.
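The four tiers can be written down as data, which makes the single-card sizing argument easy to test. The sketch below is illustrative only: the `Tier` class and `smallest_tier_without_sharding` are hypothetical names, the figures are copied from the tier descriptions above, and the selection rule looks only at whether a model's working set fits on one card. It ignores user count, throughput, and runtime headroom.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    gpu: str
    gpu_count: int
    vram_per_card_gb: int
    system_ram_gb: int
    typical_cost_usd: int  # "typical delivered cost" from the tier text


# Cheapest first, so the first match is the smallest tier that fits.
TIERS = [
    Tier("Starter", "RTX 5090", 1, 32, 128, 10_500),
    Tier("Growth", "RTX 5090", 2, 32, 256, 15_400),
    Tier("Scaling", "H200 PCIe", 1, 141, 512, 58_000),
    Tier("Enterprise", "H200 PCIe", 2, 141, 768, 108_000),
]


def smallest_tier_without_sharding(working_set_gb: float) -> Optional[Tier]:
    """Return the cheapest tier whose single card holds the working set.

    Hypothetical rule for illustration; returns None when no single card fits.
    """
    for tier in TIERS:
        if tier.vram_per_card_gb >= working_set_gb:
            return tier
    return None


if __name__ == "__main__":
    for size_gb in (24, 100, 200):
        tier = smallest_tier_without_sharding(size_gb)
        label = tier.name if tier else "no single-card fit, sharding required"
        print(f"{size_gb}GB working set: {label}")
```

A real sizing pass would weigh concurrency and transaction volume as well; this only captures the point that a working set larger than 32GB moves off the RTX 5090 tiers if sharding is to be avoided.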

Encrypted daily backups

Every tier ships with a Synology NAS as the first-line backup target — DS923+ for Starter, DS1823xs+ for Growth and Scaling, RS1221+ rackmount for Enterprise. Daily snapshots run encrypted at rest with AES-256, replicated off-site to Backblaze B2 or Wasabi at $6-7/TB/month. Retention runs 7 days at the Starter tier and extends to 30 days with weekly archival at Enterprise. Restores are dry-run quarterly against a clean target so the runbook is exercised, not theoretical.
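A retention policy of this shape is straightforward to express. The following is a sketch under stated assumptions, not the Synology configuration itself: `snapshots_to_keep` is a hypothetical helper, the weekly archive is assumed to be the Sunday snapshot, and the number of weeks to keep archives (`weekly_weeks`) is a placeholder because the text does not specify it.

```python
from datetime import date, timedelta
from typing import Iterable, Set


def snapshots_to_keep(
    snapshots: Iterable[date],
    today: date,
    daily_days: int,
    weekly_weeks: int = 0,
) -> Set[date]:
    """Select snapshots to retain.

    Keeps every snapshot from the last `daily_days` days. If `weekly_weeks`
    is set, also keeps Sunday snapshots from the last `weekly_weeks` weeks
    (assumed archival day; adjust to the real schedule).
    """
    daily_cutoff = today - timedelta(days=daily_days)
    weekly_cutoff = today - timedelta(weeks=weekly_weeks)
    keep: Set[date] = set()
    for snap in snapshots:
        if daily_cutoff < snap <= today:
            keep.add(snap)
        elif weekly_weeks and weekly_cutoff < snap <= today and snap.weekday() == 6:
            keep.add(snap)  # weekday() == 6 is Sunday
    return keep


if __name__ == "__main__":
    today = date(2025, 1, 31)
    history = [today - timedelta(days=i) for i in range(60)]
    starter = snapshots_to_keep(history, today, daily_days=7)
    enterprise = snapshots_to_keep(history, today, daily_days=30, weekly_weeks=8)
    print(len(starter), "snapshots kept at Starter")
    print(len(enterprise), "snapshots kept at Enterprise (hypothetical 8-week archive)")
```

Anything not in the returned set is a candidate for pruning. The same function can drive an alert when the expected snapshot for last night is missing from the set.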

Power and rack engineering

The bundles are not just GPUs in a box. Starter and Growth ship with a 1500VA line-interactive UPS and a labelled cable plan. Scaling moves to a 3kVA online double-conversion UPS and a dedicated 208V/30A circuit to feed the H200, which alone draws up to 600W under load. Enterprise lands on a 6kVA online UPS, switched PDU with per-outlet metering, and a thermal plan accounting for roughly 14,000 BTU/hr of heat rejection — the kind of detail your facilities contractor will want before you sign a closet conversion.
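The power and heat figures are linked by simple conversions, sketched below. The helper names are hypothetical. The 0.8 derating factor reflects the common 80% continuous-load practice and is an assumption here; the electrician and local code have the final word. VA and watts are treated as interchangeable, which ignores power factor.

```python
BTU_PER_HR_PER_WATT = 3.412  # 1 W of continuous load is about 3.412 BTU/hr


def watts_to_btu_per_hr(watts: float) -> float:
    """Heat rejected by a continuous electrical load."""
    return watts * BTU_PER_HR_PER_WATT


def btu_per_hr_to_watts(btu_per_hr: float) -> float:
    """Electrical load implied by a heat-rejection figure."""
    return btu_per_hr / BTU_PER_HR_PER_WATT


def circuit_continuous_va(volts: float, amps: float, derate: float = 0.8) -> float:
    """Usable continuous capacity of a single-phase circuit (assumed derating)."""
    return volts * amps * derate


if __name__ == "__main__":
    print(f"600 W GPU: {watts_to_btu_per_hr(600):,.0f} BTU/hr")
    print(f"14,000 BTU/hr: {btu_per_hr_to_watts(14_000):,.0f} W")
    print(f"208V/30A at 80%: {circuit_continuous_va(208, 30):,.0f} VA")
```

On these conversions, 14,000 BTU/hr corresponds to roughly 4.1 kW of continuous load, and a 208V/30A circuit offers about 6.2 kVA nameplate or about 5.0 kVA once derated.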

Cloud comparison and payback math

The pricing calculator exposes the side-by-side: your projected duty cycle, current RunPod and Lambda Labs H200 hourly rates, three-year total cost of ownership against the equivalent on-prem build, and the month at which the on-prem path crosses into cheaper territory. No hand-waving — the inputs are editable and the outputs update live.
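The crossover logic behind a comparison like this can be sketched as a month-by-month accumulation. This is not the calculator's actual code: `crossover_month` and its parameters are hypothetical, on-prem operating cost and non-compute cloud overhead are inputs you supply, and financing, depreciation, and resale value are left out.

```python
from typing import Optional

HOURS_PER_MONTH = 24 * 365 / 12  # 730 hours in an average month


def crossover_month(
    onprem_capex: float,
    onprem_monthly_opex: float,
    cloud_hourly_rate: float,
    duty_cycle: float,
    cloud_monthly_overhead: float = 0.0,
    horizon_months: int = 36,
) -> Optional[int]:
    """First month in which cumulative on-prem cost is at or below cloud cost.

    Returns None if the paths do not cross within the horizon
    (36 months matches the three-year TCO window).
    """
    cloud_monthly = (
        cloud_hourly_rate * duty_cycle * HOURS_PER_MONTH + cloud_monthly_overhead
    )
    cloud_total = 0.0
    onprem_total = onprem_capex  # hardware is paid for up front
    for month in range(1, horizon_months + 1):
        cloud_total += cloud_monthly
        onprem_total += onprem_monthly_opex
        if onprem_total <= cloud_total:
            return month
    return None


if __name__ == "__main__":
    # Placeholder inputs; substitute quoted figures before drawing conclusions.
    month = crossover_month(
        onprem_capex=12_000,
        onprem_monthly_opex=0,
        cloud_hourly_rate=0.0,
        duty_cycle=0.0,
        cloud_monthly_overhead=1_000,
    )
    print("Crossover month:", month)
```

The crossover month moves earlier as the duty cycle, the hourly rate, or the non-compute cloud costs rise, and later as on-prem power and support costs rise, so the result is only as good as the inputs.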

What you get

Dual RTX 5090 workstation with 256GB ECC RDIMM, or H200 PCIe rack server with 141GB HBM3e per card
Synology NAS sized to tier with encrypted daily snapshots
Off-site cloud replication to Backblaze B2 or Wasabi with retention by tier
UPS-backed power with tested failover, sized from 1500VA tower to 6kVA online rack
208V/30A circuit specification and thermal plan for H200 deployments
42U rack, switched PDU, and 25GbE networking on the Enterprise tier
Quarterly DR drills with documented restore times and signed runbook
Spare-parts SLA for GPU and PSU failure with cross-shipped replacements
Three-year hardware warranty pass-through plus on-site break-fix coverage
Asset tagging, serial inventory, and a documented bill of materials for your CPA's fixed-asset schedule
Network segmentation so the inference VLAN never touches the general office network
Onboarding handoff with a written runbook your IT lead can actually follow

FAQ

Why on-prem instead of cloud?

Two reasons. Math: at 30%+ duty cycle on H200-class workloads, on-prem pays back inside 18-24 months and depreciates for years after. Data: your vendor TINs, payroll, and P&L stay on equipment you own, in a building you control.

Why H200 instead of H100?

The H200 PCIe ships with 141GB of HBM3e versus 80GB on the H100. For the model sizes ATCS runs in production, that means a single card holds the working set without sharding, which simplifies the deployment and removes a class of latency spikes. The price delta is small enough that recommending H100 in 2026 is hard to defend.

Does Acer make a workstation that fits the Growth bundle?

Honestly, no. Acer does not currently stock a SKU that accepts dual RTX 5090s with the W790 platform and 256GB of ECC RDIMM. We deliver Growth as a custom build on the Asus Pro WS W790-SAGE SE board with a vetted chassis and PSU pairing.

What happens if a GPU fails?

The spare-parts SLA covers cross-shipped GPU and PSU replacements within one business day on Scaling and Enterprise tiers, two business days on Starter and Growth. Workstation tiers ship with a documented swap procedure your IT lead can execute; rack tiers include an on-site visit if you prefer.

How is backup tested?

Quarterly. We restore a representative dataset to a clean target, time the operation, verify the cryptographic integrity of the restored volume, and sign the runbook. You receive the report. If the restore fails, that is the finding — and we fix it before the next quarter.
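One way to verify the integrity of a restored volume is to compare per-file SHA-256 digests against a manifest taken from the source. The sketch below assumes that approach; the text does not say which method the drill uses, and `build_manifest` and `verify_restore` are hypothetical names. It uses only the Python standard library.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large volumes do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path relative to root to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: dict, restored_root: Path) -> list:
    """Return relative paths that are missing or differ in the restored copy."""
    restored = build_manifest(restored_root)
    return sorted(
        name for name, digest in manifest.items() if restored.get(name) != digest
    )
```

An empty list from `verify_restore` means every file in the manifest came back byte-for-byte. A non-empty list is the finding that goes into the report.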

What automation handles that an IT consultant shouldn't have to

IT consultants charge $200-300/hour, and they earn it on the work that actually requires hands and judgment. But most of what shows up on a hardware quote isn't engineering — it's sizing arithmetic, vendor part-number lookups, and a spreadsheet that gets emailed back and forth for two weeks. ATCS automates that layer so you stop paying senior-rate hours for clerical work. The consultant still gets called for what only a person on-site can do.

AI / automation does this better

Size the RAM uplift from 64GB to 256GB across three vendors and return the price delta in seconds. A consultant takes two days, plus a follow-up email asking which DIMM rank you wanted.
Re-price the bundle live as supply shifts. Your consultant's quote from three weeks ago is already stale; the calculator on this page isn't.
Compute the 18-month payback against current RunPod and Lambda H200 hourly rates at a 30% duty cycle, without a spreadsheet exchange or a "let me get back to you on cloud pricing."
Map workload inputs to the right bundle tier (Starter: 1× RTX 5090, 128GB; Growth: 2× RTX 5090, 256GB; Scaling: 1× H200 PCIe, 512GB; Enterprise: 2× H200 PCIe, 768GB, 42U rack) instead of debating it on a discovery call.
Generate the asset BOM with serials at install time and keep it in sync with the inventory record. Consultants typically deliver this as a PDF that goes stale the first time a stick of RAM is swapped.
Schedule and log the quarterly DR drill with the audit-log evidence attached, rather than waiting for someone to remember it's been a year.
Verify encrypted Synology snapshots replicated to Backblaze B2 or Wasabi overnight, every night, without a human checking a dashboard.
Quote three configuration variants side-by-side at 2 a.m. when you're trying to make a budget deadline.
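As one illustration of the list above, keeping the asset BOM in sync with the inventory record comes down to a set comparison on serial numbers. This is a minimal sketch: `bom_drift` is a hypothetical helper, each side is assumed to be a mapping of serial number to part description, and the serials in the example are made up.

```python
def bom_drift(recorded: dict, installed: dict) -> dict:
    """Compare serial-to-part maps and report where the record has gone stale."""
    shared = recorded.keys() & installed.keys()
    return {
        # Parts physically present but absent from the BOM (for example a swapped DIMM).
        "missing_from_record": sorted(installed.keys() - recorded.keys()),
        # Parts still on the BOM but no longer in the machine.
        "no_longer_installed": sorted(recorded.keys() - installed.keys()),
        # Same serial on both sides with a different part description.
        "part_mismatch": sorted(s for s in shared if recorded[s] != installed[s]),
    }


if __name__ == "__main__":
    recorded = {"SN-001": "RTX 5090", "SN-002": "64GB RDIMM", "SN-003": "PSU 1600W"}
    installed = {"SN-001": "RTX 5090", "SN-004": "64GB RDIMM", "SN-003": "PSU 2000W"}
    for kind, serials in bom_drift(recorded, installed).items():
        print(kind, serials)
```

Run after each install or swap, a check like this flags the stale entries that a static PDF would hide.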

What only a human IT consultant should still do

Walk the server closet, measure clearance, and confirm a 4U chassis actually fits with proper cable management and airflow in front and behind.
Specify the 208V/30A circuit run with a licensed electrician and pull the permit. No automation signs off on power.
Negotiate spare-parts SLA terms and next-business-day replacement language with the vendor account rep — that's a relationship, not a form field.
Rack and cable the hardware, label the runs, and validate that the redundant PSU is actually on a different circuit than the primary.
Sign off on the physical install and the first power-on. Liability lives with a person, not a calculator.

The split is honest: the consultant focuses on the physical install, the electrical work, and the vendor relationship. The platform handles the sizing math, the BOM sourcing, the payback comparison, and the ongoing backup and DR evidence. You still pay your consultant for the hours that need a human in the room — you stop paying $300/hour for arithmetic.

Where to next

For exact numbers for your firm, the pricing calculator lets you set duty cycle, user count, and storage footprint and shows the three-year TCO against current cloud GPU rates. Pair that with the bookkeeping solutions page to see what runs on top, or the security & data residency page for the encryption and access posture.

On-prem private AI hardware, sized for your books — not someone else's cloud