Cloud Reliability Engineer
A bit about the team:
We are growing the WSA Cloud Center of Excellence team within R&D and are looking for new colleagues to join us. The team is dedicated to ensuring the continuous reliability and performance of our cloud-based infrastructure, and it leverages cutting-edge technology to build and maintain resilient systems that meet the demands of modern, cloud-native applications.
Position Overview:
We are seeking an experienced Site Reliability Engineer to join our team, with a focus on monitoring, alerting, and infrastructure stability. This role primarily involves maintaining the reliability and performance of our systems hosted in Azure Cloud.
As a core member of our Site Reliability Engineering (SRE) team, you will use tools such as Azure Monitor, Azure DevOps, PowerShell, Kubernetes (K8s), Argo, and Grafana to ensure that our systems are available, scalable, and performing optimally.
Key Responsibilities:
- Ensure System Uptime and Reliability: Monitor and maintain cloud-based applications and infrastructure, ensuring minimal downtime and efficient incident response.
- Build and Optimize Monitoring and Alerting Systems: Set up and continuously improve comprehensive monitoring and alerting frameworks to detect and address issues proactively.
- Cloud Infrastructure Management: Manage, optimize, and scale systems on Azure cloud platforms, ensuring high performance and cost-effectiveness.
- Incident Management and Response: Act as the first line of defense in identifying, diagnosing, and resolving technical issues in real time, or escalating them to the appropriate teams.
- Automation and Infrastructure as Code (IaC): Utilize IaC tools to automate infrastructure provisioning and management, promoting reproducibility and reducing manual interventions.
- Tooling and Observability: Leverage technologies such as Grafana for observability and Argo for CI/CD automation, enhancing our ability to respond swiftly and effectively to infrastructure needs.
- Collaboration: Work closely with cross-functional teams to align on SRE best practices, share insights, and support development and operational goals.
Requirements:
- Experience with Cloud Platforms: 5+ years of experience in cloud environments, with a primary focus on Azure.
- Monitoring and Alerting Skills: Strong experience with monitoring tools (e.g., Grafana, Prometheus) and a background in setting up alerts and dashboards.
- Incident Management: Proven track record in diagnosing and troubleshooting complex system issues, with a focus on fast incident response and resolution.
- Collaboration and Communication: Excellent communication skills, with an ability to work collaboratively with various technical teams and stakeholders.
- Kubernetes Expertise: Proficiency with Kubernetes (K8s) for orchestrating and managing containerized applications.
- Automation and IaC: Hands-on experience with a scripting language (e.g., Python, shell script, PowerShell) and Infrastructure as Code tools (e.g., Terraform, Ansible) for automating cloud infrastructure management.
Preferred Qualifications:
- Familiarity with CI/CD tools, particularly Argo for workflow automation.
- Certification in Azure, AWS, or Kubernetes.
- Experience working in an SRE or DevOps capacity in a multi-cloud environment.
Why Join Us?
At the WSA Cloud Center of Excellence, you’ll work in an environment that encourages technical innovation and continuous learning.
Other benefits include:
Competitive compensation
Work-life balance
Collaboration with global team members
Meal vouchers
Reimbursements for internet and phone
Access to skill development resources and certifications
Supportive, collaborative, and friendly work environment
Cultural events
Hybrid work environment
Medical care
Maternity and paternity leave
Discounts on hearing aids for friends and family
Fostering Diversity, Equity, and Inclusion
Join our team to make an impact on the reliability and scalability of cutting-edge cloud infrastructure, collaborate with top talent, and take on complex challenges in cloud reliability.
About WSA
Every day, our 12,000 colleagues in 130 markets help millions of people regain and benefit from the miracle of hearing. Going beyond together, we achieve annual revenues of around EUR 2.3 billion. Our portfolio of technologies spans the full spectrum of hearing care, from distinct hearing brands and digital platforms to managed care, hearing centers, and diagnostics locations.
- Department: Research & Development
- Role: Cross Software Development
- Location: Hyderabad