What is included in hosted.ai update v2.2.1?

This update introduces major improvements across compute provisioning, infrastructure access, automation, and operational visibility. Highlights include:
- unified provisioning across GPU services and virtual machines
- bare metal GPU instance support
- browser-based SSH access
- stronger deployment automation
- real-time Kubernetes service health monitoring
- performance and scalability improvements across core platform services

What does unified provisioning mean in practice?

It means GPU services and virtual machines are now provisioned and managed through a more consistent experience on one platform. This reduces operational complexity and standardizes how different workload types are handled.

Why is bare metal support important?

Bare metal is valuable for workloads that need direct hardware control, predictable performance, or specialized infrastructure requirements. It also gives customers more flexibility as they scale beyond standard virtualized environments.

What is browser-based SSH access?

Users can now open an SSH session directly from the hosted.ai interface using an embedded browser-based console. This removes the need for separate SSH clients in many common workflows.

Which compute types support browser-based SSH access?

Browser-based SSH access is available across GPU services, virtual machines, and bare metal instances.

How does this help users?

It reduces friction, speeds up access, and makes it easier to work in environments where installing local SSH tools or setting up additional networking is inconvenient.

What automation improvements are included?

hosted.ai now supports stronger deployment-time automation for GPU services, making it easier to apply software stacks and operational configuration consistently during workload rollout.

Can hosted.ai support more advanced containerized and Kubernetes-style workloads now?

Yes. This update expands support for workloads that benefit from more advanced system and service behavior, including scenarios involving nested container and Kubernetes-style environments.

What is the Kubernetes health monitoring dashboard?

It is a real-time service integrity view in the admin experience that provides health visibility, live status awareness, warnings, and audit tracking for Kubernetes services.

Why does the Kubernetes health monitoring dashboard matter?

It improves operational visibility, helps teams diagnose issues faster, and reduces dependence on manual checks.

Were there also stability or usability improvements in v2.2.1?

Yes. This update includes a number of platform improvements across service creation, editing flows, billing accuracy, scheduler stability, node handling, and email branding behavior.

Will this update change billing behavior?

The update improves billing accuracy and visibility in several areas. For example, it ensures workspace billing reflects usage more accurately and avoids billing certain GPU services before they are active and running.
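
As an illustration of the second point, the sketch below counts billable time only while a service is in a running state. It is not the hosted.ai billing engine: the state names, the event format, and the billable_seconds function are assumptions made for this example.

```python
from typing import Iterable, Tuple

def billable_seconds(events: Iterable[Tuple[int, str]], end: int) -> int:
    """Sum the seconds a service spent in the (hypothetical) "running" state.

    events: (timestamp_in_seconds, state) pairs, in any order.
    end:    timestamp at which the billing period closes.
    """
    total = 0
    running_since = None
    for ts, state in sorted(events):
        if state == "running" and running_since is None:
            # Billing starts only when the service is actually running.
            running_since = ts
        elif state != "running" and running_since is not None:
            # Any other state closes the current billable interval.
            total += ts - running_since
            running_since = None
    if running_since is not None:
        # Service is still running when the billing period closes.
        total += end - running_since
    return total

# Two minutes of provisioning are not billed; ten minutes of runtime are.
history = [(0, "provisioning"), (120, "running"), (720, "stopped")]
print(billable_seconds(history, end=1000))
```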

Who should care most about this release?

This update is especially relevant for:
- teams deploying GPU-backed AI services
- operators managing mixed compute environments
- infrastructure providers offering GPU services
- organizations that need more flexibility across virtualized and bare metal infrastructure

How can customers learn more or get a walkthrough?

Customers can contact hosted.ai to schedule a walkthrough of the new capabilities and discuss how these updates fit their infrastructure needs.

News

hosted.ai 2.4: new GPU monetization options, smoother neocloud operations

Version 2.4 of the hosted.ai GPUaaS platform is available now, bringing new ways to sell GPUaaS, and enhanced usability and stability for neoclouds and their customers. Let’s get into it!

#BuildingInPublic – the platform story so far

hosted.ai has rolled out a major platform update that makes it easier for teams to deploy, manage, and scale AI infrastructure across GPU services, virtual machines, and bare metal.

This update brings together several improvements that matter directly to operators, platform teams, and AI builders:

better performance and scalability in core platform orchestration
a more consistent provisioning and management experience across GPU services and VMs
new bare metal GPU instance support
browser-based SSH access across compute types
stronger automation for GPU service deployments
real-time Kubernetes service health visibility

The result is a platform that is easier to operate, more flexible for customers, and better suited for modern AI workloads ranging from inference and application hosting to distributed training and Kubernetes-based deployments.

New to hosted·ai? Learn more about our GPU cloud platform or get in touch for a demo.

What’s new in hosted·ai v2.4:

1. Sell bare metal GPU instances

v2.4 solidifies the bare metal GPU server management and instance provisioning capabilities that we introduced earlier this year.

With hosted·ai, you can onboard bare metal GPU nodes and manage the full node lifecycle (offline → online → in service → not in service).
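
The lifecycle above can be pictured as a small state machine. The sketch below is only a model for illustration: the four state names come from the text, but the transition table (including the backward transitions) and the transition function are assumptions, not the platform's actual rules.

```python
from enum import Enum

class NodeState(Enum):
    OFFLINE = "offline"
    ONLINE = "online"
    IN_SERVICE = "in service"
    NOT_IN_SERVICE = "not in service"

# Assumed transition table. Only the forward path
# offline -> online -> in service -> not in service is stated in the text;
# the return paths are guesses added to make the model usable.
ALLOWED = {
    NodeState.OFFLINE: {NodeState.ONLINE},
    NodeState.ONLINE: {NodeState.IN_SERVICE, NodeState.OFFLINE},
    NodeState.IN_SERVICE: {NodeState.NOT_IN_SERVICE},
    NodeState.NOT_IN_SERVICE: {NodeState.IN_SERVICE, NodeState.OFFLINE},
}

def transition(current: NodeState, target: NodeState) -> NodeState:
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move node from {current.value!r} to {target.value!r}")
    return target

# Walk a freshly onboarded node through the documented lifecycle.
state = NodeState.OFFLINE
for nxt in (NodeState.ONLINE, NodeState.IN_SERVICE, NodeState.NOT_IN_SERVICE):
    state = transition(state, nxt)
```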

You can provision servers to users via auto-assignment, or by manual allocation from the admin panel.

Bare metal node billing is handled with the same hosted·ai billing engine you use for other flavors of GPU cloud (elastic GPUaaS and GPU VMs).

Bare metal nodes include SSH console access, and provisioning is handled via Ansible. DCIM support is coming soon.

What this means:

Through a single portal, you can now sell all flavors of GPU cloud – dedicated GPUaaS, shared GPUaaS, VMs with GPU passthrough, and bare metal GPU servers – with a consistent management UI and unified billing control.

2. Sell GPUaaS with user-selected and VIP GPU scheduling

The hosted.ai platform has the most flexible GPU scheduling engine on the market. In v2.4, the engine gains two new scheduling options that can be configured for your GPUaaS products:

Dynamic / user-selected GPU scheduling: this enables you to create GPUaaS products that give the user the ability to choose a minimum GPU resource percentage they will receive from a shared GPU pool. The hosted·ai scheduler guarantees that percentage of resources for their workloads, and the user is billed accordingly.
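
One way to picture a guaranteed minimum share is as an admission check: a new reservation is accepted only if the sum of all guaranteed percentages still fits in the pool. The sketch below is a simplified model under that assumption; the SharedGpuPool class and its methods are hypothetical and do not describe how the hosted·ai scheduler is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class SharedGpuPool:
    """Toy model of a shared pool that tracks guaranteed minimum shares."""
    capacity_pct: int = 100
    reservations: Dict[str, int] = field(default_factory=dict)

    def reserved(self) -> int:
        return sum(self.reservations.values())

    def reserve(self, tenant: str, min_pct: int) -> bool:
        """Accept the reservation only if every guarantee can still be honoured."""
        if not 0 < min_pct <= self.capacity_pct:
            raise ValueError("min_pct must be between 1 and the pool capacity")
        # Ignore the tenant's previous reservation when it is being resized.
        others = self.reserved() - self.reservations.get(tenant, 0)
        if others + min_pct > self.capacity_pct:
            return False
        self.reservations[tenant] = min_pct
        return True

pool = SharedGpuPool()
pool.reserve("tenant-a", 60)   # accepted: 60% guaranteed
pool.reserve("tenant-b", 50)   # rejected: would oversubscribe the guarantees
pool.reserve("tenant-b", 40)   # accepted: pool is now fully reserved
```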

VIP priority scheduling: this enables you to create GPUaaS products with prioritized workload scheduling: VIP user workloads are prioritized when multiple tenants access a pool simultaneously, and the user is billed accordingly.
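
Priority scheduling of this kind is often modelled as a priority queue in which VIP work is dequeued first and arrival order is preserved within each tier. The sketch below shows that idea with Python's heapq module; the PoolQueue class and the two-tier priority scheme are assumptions for illustration, not the platform's scheduler.

```python
import heapq
import itertools
from typing import List, Optional, Tuple

class PoolQueue:
    """Toy queue: VIP workloads first, first-come-first-served within a tier."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = itertools.count()  # tie-breaker that keeps arrival order

    def submit(self, workload: str, vip: bool = False) -> None:
        priority = 0 if vip else 1  # lower number is served first
        heapq.heappush(self._heap, (priority, next(self._arrival), workload))

    def next_workload(self) -> Optional[str]:
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

queue = PoolQueue()
queue.submit("training-job-a")
queue.submit("inference-vip", vip=True)
queue.submit("training-job-b")
# The VIP workload is served first even though it arrived second.
```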

Why it matters:

These scheduling options help providers offer more predictability for multi-tenant workloads, sell premium or guaranteed tiers, and let latency-sensitive inference workloads coexist with training workloads on multi-tenant GPU pools without degrading the end-user experience.

3. RootFS persistence for GPUaaS pods

The root file system of GPUaaS pods can now be made persistent across reboots. This has been implemented using a host-path storage plugin. There is no separate remote volume, no periodic data sync, and zero performance overhead.

RootFS persistence is enabled by default on new instances:

Ansible service execution runs the first time the system boots, but not on instance reboot
A new ‘factory reset’ with data wipe feature has been implemented to fully erase persistent storage when required
Storage quotas are enforced with Sysbox (ENOSPC at 96% usage) to prevent pod out-of-storage crashes
df and reboot wrappers provide VM-like behaviour inside pods, so that df shows the actual storage usage inside an instance
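
To make the quota behaviour concrete, the sketch below models a write that fails with ENOSPC once it would push usage past 96% of the quota. It only imitates the threshold described above and does not use Sysbox; the check_write function and the reading of the threshold as "would exceed 96%" are assumptions.

```python
import errno
import os

QUOTA_THRESHOLD_PCT = 96  # threshold taken from the text

def check_write(used_bytes: int, write_bytes: int, quota_bytes: int) -> int:
    """Return the new usage, or raise ENOSPC if the write crosses the threshold."""
    new_usage = used_bytes + write_bytes
    # Integer arithmetic avoids floating-point rounding at the boundary.
    if new_usage * 100 > QUOTA_THRESHOLD_PCT * quota_bytes:
        raise OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))
    return new_usage

quota = 100 * 1024**3  # hypothetical 100 GiB quota
used = check_write(90 * 1024**3, 5 * 1024**3, quota)  # 95% used: accepted
try:
    check_write(used, 2 * 1024**3, quota)  # would reach 97%: rejected
except OSError as exc:
    print("write refused:", exc.strerror)
```

Failing early with a normal filesystem error gives applications a chance to handle the condition, rather than having the pod crash when storage is fully exhausted.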

Why this is important:

It’s a big quality-of-life improvement for neoclouds and their developer customers: planned (or unplanned) pod reboots won’t lose installed packages, drivers, or configurations on restart. hosted·ai GPUaaS pods now behave much like VM environments.

4. High Availability for KVM clusters

hosted·ai v2.4 introduces automatic primary/secondary failover for KVM cluster panel nodes, using etcd for distributed leader election. A periodic HA agent aligns controller services and SQL database replication to the elected leader without manual intervention. VMs can be individually set to auto-restart or not.

High Availability is enabled via a toggle in the hosted·ai cluster management panel. It enforces a 3-node minimum, and supports full disable/revert.
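
A likely reason for the 3-node minimum, assuming each panel node hosts an etcd member: etcd needs a majority of members to agree before it can elect a leader, and three is the smallest cluster that still has a majority after one node fails. The arithmetic is sketched below; the function names are illustrative and are not part of hosted·ai or etcd.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1

def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority remains."""
    return members - quorum(members)

def can_enable_ha(members: int, minimum: int = 3) -> bool:
    """Mirror of the 3-node minimum described in the text."""
    return members >= minimum

for n in (1, 2, 3, 5):
    print(n, "nodes: quorum", quorum(n), "tolerates", tolerated_failures(n), "failure(s)")
```

Under this model a 2-node cluster tolerates no failures at all, which is why adding a second node alone does not provide failover.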

What this means:

High Availability improves SLA reliability for hosted VM infrastructure. It eliminates single points of failure in KVM cluster management, reduces unplanned downtime for VM workloads, and provides automated recovery without operator intervention on primary node failure.

5. Prometheus metrics framework for KVM

In hosted·ai v2.4, we have replaced the legacy RRD file-based stats system with a Prometheus + libvirt exporter architecture.

It provides per-VM metrics (vCPU, memory, disk I/O, network I/O) and cluster-wide node exporter metrics, with real-time dashboards, AlertManager webhook integration, and batch 10-minute collection. Prometheus configuration is auto-regenerated with a 10-second target sync, as nodes are added or removed.
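
The text does not say how the Prometheus configuration is regenerated. One common pattern is Prometheus file-based service discovery, where a small JSON file of targets is rewritten whenever cluster membership changes and Prometheus picks up the change on its own. The sketch below renders such a file; the use of file-based discovery, the render_file_sd function, the host names, and the port (9100, the node exporter default) are all assumptions for illustration.

```python
import json
from typing import Iterable

def render_file_sd(nodes: Iterable[str], port: int = 9100, job: str = "node") -> str:
    """Render a Prometheus file_sd document for the given cluster nodes."""
    # De-duplicate and sort so the output is stable between sync runs.
    targets = sorted(f"{host}:{port}" for host in set(nodes))
    return json.dumps([{"targets": targets, "labels": {"job": job}}], indent=2)

# Hypothetical node names; a periodic sync job would write this text to the
# file that the Prometheus scrape configuration points at.
print(render_file_sd(["kvm-node-2", "kvm-node-1"]))
```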

How does this help?

This delivers a scalable, standards-based observability foundation for KVM infrastructure — eliminating filesystem scan overhead, enabling reliable alerting (node down, disk thresholds), and providing historical metric storage across all cluster nodes to support operational visibility at scale.

Next steps

This release also includes many smaller improvements and fixes based on customer feedback.

To upgrade from previous versions to hosted·ai v2.4, please contact your account manager or our customer success team.
If you’re new to hosted·ai, get in touch for a demo and we’ll walk/talk you through the platform. Thanks!