Engineering / Infrastructure

Systems Engineer

Full-Time

Bengaluru, India

Hosted.ai is a turnkey, multi-tenant AI cloud platform that enables service providers to launch and scale GPU-as-a-Service (GPUaaS) efficiently. We are pioneering GPU over-commit and infrastructure optimization technologies that help enterprises and service providers run AI workloads more profitably.

Role Overview

We are seeking an experienced Systems Engineer to design, implement, and test automation for application deployment and infrastructure management. The role requires strong expertise in Ansible automation, Kubernetes operations, and system-level scripting, with a focus on GPU-aware workloads and advanced cluster management.

Key Responsibilities

Develop and test automation workflows using the Ansible App Library.
Write, maintain, and optimize Ansible playbooks for application and infrastructure automation.
Script and automate system tasks using Bash and Ansible.
Debug and test automation processes, ensuring reliability and scalability.
Manage and optimize Kubernetes clusters, including:
- Pod lifecycle management
- Networking and firewall configurations
- GPU device mapping within Kubernetes
- Custom Resource Definition (CRD) creation and configuration
Collaborate with engineering and operations teams to integrate automation into CI/CD pipelines.
Ensure secure, scalable, and efficient cluster automation for AI workloads.

Requirements

Strong hands-on experience with Ansible automation and App Library development.
Strong hands-on experience with Linux system administration and cluster environments.
Proficiency in Bash scripting and Ansible playbook creation.
Deep knowledge of Kubernetes, including:
- Pod management and orchestration
- Networking and firewalling in Kubernetes clusters
- GPU resource scheduling, mapping, and utilization
- Custom Resource Definitions (CRDs) and operator patterns
Experience debugging and testing automation workflows.
Strong problem-solving skills and the ability to work in a fast-paced, product-focused environment.

Nice to Have

Experience with GPU workloads or high-performance computing environments.
Exposure to cloud-native monitoring and observability tools such as Prometheus and Grafana.

Why Join Us?

Work on automation and Kubernetes challenges powering AI infrastructure.
Be part of a high-impact team delivering GPUaaS at scale.
Gain deep exposure to GPU cloud orchestration and system-level engineering.
Competitive compensation, growth opportunities, and a culture of technical excellence.