Job Title: Principal Engineer / Team Lead
Req Id: 14239
Location:
Description:
-
- Oversee daily operations of cloud infrastructure, ensuring high availability, scalability, and security.
- Monitor and optimize cloud resources for cost efficiency and performance.
- Implement and maintain backup, disaster recovery, and business continuity plans.
- Leverage Terraform, CloudFormation, and Ansible for automated provisioning and infrastructure management.
- Implement and maintain security policies, and ensure compliance with industry standards such as PCI-DSS, ISO 27001, and SOC 2.
-
- Ensure system reliability, availability, and scalability using SRE best practices.
- Establish error budgets, SLIs, SLOs, and SLAs for effective incident management and post-mortems.
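
For illustration, the relationship between an availability SLO and its error budget can be sketched as below. This is a minimal sketch, not a prescribed method: the function names, the 30-day window, and the sample figures are assumptions chosen for the example.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a rolling window.

    The 30-day default window is an assumption for this example.
    """
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo_percent / 100)


def budget_remaining_fraction(
    slo_percent: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Fraction of the error budget still unspent (negative if the SLO is breached)."""
    budget = error_budget_minutes(slo_percent, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.99% SLO over 30 days leaves roughly 4.32 minutes of allowed downtime.
    print(round(error_budget_minutes(99.99), 2))
    # With a 99.9% SLO and 10.8 minutes of downtime, about 75% of the budget remains.
    print(round(budget_remaining_fraction(99.9, 10.8), 2))
```

A negative remaining fraction signals that the budget is exhausted, which is the usual trigger for pausing risky changes and prioritising reliability work.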
-
- Develop and execute program plans aligned with organizational goals and customer needs.
- Collaborate with cross-functional teams (Engineering, Product, Security, and Business) for seamless execution.
- Define and track Key Performance Indicators (KPIs) for cloud performance, cost optimization, and incident response.
- Monitor operational metrics:
  - Uptime and availability (e.g., 99.99% SLA compliance)
  - Incident response time and resolution time
  - MTTR (Mean Time to Recovery) and MTBF (Mean Time Between Failures)
  - Cost efficiency and resource utilization
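
As a rough illustration of how the metrics above relate, the sketch below derives MTTR, MTBF, and availability from a list of incident start and end times. The function name, the input shape, and the definition of MTBF as total uptime divided by the number of failures are assumptions for this example; teams often define these metrics slightly differently.

```python
from datetime import datetime, timedelta


def reliability_metrics(incidents, window_start, window_end):
    """Compute MTTR, MTBF, and availability for a reporting window.

    incidents: list of (start, end) datetime pairs, assumed non-overlapping
    and fully inside the window (a simplification for this sketch).
    """
    total_s = (window_end - window_start).total_seconds()
    downtime_s = sum((end - start).total_seconds() for start, end in incidents)
    count = len(incidents)
    if count == 0:
        # No failures: MTTR and MTBF are undefined for this window.
        return {"mttr_minutes": None, "mtbf_hours": None, "availability_percent": 100.0}
    return {
        "mttr_minutes": downtime_s / count / 60,            # mean time to recovery
        "mtbf_hours": (total_s - downtime_s) / count / 3600,  # uptime per failure
        "availability_percent": 100 * (total_s - downtime_s) / total_s,
    }


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    end = start + timedelta(days=30)
    # Two hypothetical incidents: 30 minutes and 90 minutes long.
    sample = [
        (start + timedelta(days=3), start + timedelta(days=3, minutes=30)),
        (start + timedelta(days=20), start + timedelta(days=20, minutes=90)),
    ]
    print(reliability_metrics(sample, start, end))
```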
-
- Ensure adherence to SLAs for uptime, incident response, and resolution times.
- Manage escalations and incidents, ensuring that root cause analysis is completed and preventive measures are put in place.
- Coordinate with support teams to ensure 24x7 operational support and on-call rotation.
- Develop and maintain Disaster Recovery (DR) strategies and runbooks.
- Conduct BCDR drills and simulations to ensure preparedness and minimize downtime.
-
- Act as the primary point of contact for customers, ensuring clear communication of operational status, incidents, and resolutions.
- Conduct regular business reviews to present performance metrics, SLAs, and improvement initiatives.
- Address customer concerns, gather feedback, and ensure high customer satisfaction and retention.
- Collaborate with Solution Architects and Cloud Engineers to ensure seamless customer onboarding.
- Provide guidance and best practices for cloud adoption, cost optimization, and security compliance.
-
- Lead a team of Cloud Engineers, DevOps Engineers, and SREs, driving productivity and engagement.
- Foster a culture of continuous improvement, collaboration, and innovation.
- Conduct performance reviews, goal setting, and professional development for team members.
-
- Identify opportunities for process automation and operational efficiencies.
- Implement AI-driven monitoring and predictive analytics for proactive incident management.
- Drive cost optimization initiatives.
- Stay updated on emerging cloud technologies, DevOps tools, and industry best practices.
- Create and maintain operational dashboards for real-time visibility into cloud performance and incidents.
-
- Prepare weekly/monthly operational reports for leadership reviews.
- Maintain detailed SOPs, runbooks, and incident reports.
-
- Contribute to strategic planning sessions to align operations with organizational goals.
- Participate in budget planning and cost optimization initiatives.
Key Skills Required:
- Cloud Platforms: AWS, Azure, GCP
- DevOps Tools: Jenkins, GitHub, Terraform, Ansible, Kubernetes, Prometheus, Grafana
- Programming/Scripting: Python, Bash, YAML, JSON
- Security & Compliance: IAM, GuardDuty, CloudTrail, Security Hub
- Monitoring & Logging: CloudWatch, ELK Stack, Splunk, Prometheus
- Agile Methodologies: Scrum, Kanban, CI/CD Practices
- Leadership & Communication: Team Leadership, Customer Interaction, Strategic Planning