Engineering / Infrastructure
Systems Engineer
Full-Time
Bengaluru, India
Role Overview
We are seeking an experienced Systems Engineer to design, implement, and test automation for application deployment and infrastructure management. The role requires strong expertise in Ansible automation, Kubernetes operations, and system-level scripting, with a focus on GPU-aware workloads and advanced cluster management.
Key Responsibilities
- Develop and test automation workflows using the Ansible App Library.
- Write, maintain, and optimize Ansible playbooks for application and infrastructure automation.
- Script and automate system tasks using Bash and Ansible.
- Debug and test automation processes, ensuring reliability and scalability.
- Manage and optimize Kubernetes clusters, including:
  - Pod lifecycle management
  - Networking and firewall configurations
  - GPU device mapping within Kubernetes
  - Custom Resource Definition (CRD) creation and configuration
- Collaborate with engineering and operations teams to integrate automation into CI/CD pipelines.
- Ensure secure, scalable, and efficient cluster automation for AI workloads.
Requirements
- Strong hands-on experience with Ansible automation and App Library development.
- Strong hands-on experience with Linux system administration and cluster environments.
- Proficiency in Bash scripting and Ansible playbook creation.
- Deep knowledge of Kubernetes, including:
  - Pod management and orchestration
  - Networking and firewalling in Kubernetes clusters
  - GPU resource scheduling, mapping, and utilization
  - Custom Resource Definitions (CRDs) and operator patterns
- Experience in debugging and testing automation workflows.
- Strong problem-solving skills and ability to work in a fast-paced, product-focused environment.
Nice to Have
- Experience with GPU workloads or high-performance computing environments.
- Exposure to cloud-native monitoring and observability tools (Prometheus, Grafana, etc.).
Why Join Us?
- Work on automation and Kubernetes challenges powering AI infrastructure.
- Be part of a high-impact team delivering GPU-as-a-Service (GPUaaS) at scale.
- Gain deep exposure to GPU cloud orchestration and system-level engineering.
- Competitive compensation, growth opportunities, and a culture of technical excellence.