Cirrascale announced this week that it will offer Google Gemini inference on-premises via Google Distributed Cloud, running on Dell appliances with Intel and Nvidia hardware. General availability is planned for late June or early July.

The setup runs without TPUs, and Cirrascale acknowledges that this means lower raw performance than Google's own infrastructure. But the point isn't performance. It's data residency.

The model itself never touches a hard drive: it runs entirely in memory, and if the hardware detects an intrusion, the system shuts down and the model is gone. That's a meaningful security posture for the government, defense, healthcare, and finance customers this is aimed at.

The market context makes the timing clear. Gartner forecasts worldwide sovereign cloud IaaS spending will hit $80 billion in 2026, a 35.6% increase year-over-year, driven primarily by governments and regulated industries. Europe is growing faster than North America and is on track to overtake it in sovereign cloud spending by 2027.

The pressure isn't coming from compliance teams alone. Google Cloud's own leadership has noted that enterprise customers have shifted from a cloud-first to a sovereign-first strategy, with AI and sovereignty now the two dominant topics in customer conversations. Cirrascale is positioning at exactly that intersection.

The service layer on top is worth noting. Cirrascale isn't just shipping hardware with Gemini installed. It's adding token-rate management, user queuing, load balancing across regions, and ongoing operational support.

That's an inference-as-a-service wrapper around a model the customer doesn't own, running on infrastructure they do own. Whether that tradeoff lands will depend on how tightly regulated the target customer's environment actually is, and how much performance they're willing to give up for the air gap.

The previews starting now will answer that question before GA.
