Work within the team on the various products it supports.
Develop tools to improve our ability to rapidly deploy and effectively monitor services in a large-scale distributed environment.
Work with teams to design, develop, and implement innovative software solutions related to the DevOps and Agile transformation of the enterprise.
Ensure cloud environments comply with security policies.
Identify solutions for building highly available, highly scalable, and reliable systems.
Maintain, support, and enhance the CI/CD environment.
Ensure everything is monitored and measured.
Evaluate infrastructure costs and identify ways to optimize them.
Troubleshoot issues, perform root cause analysis, and implement corrective/preventive actions when needed.
Define and document best practices and operational procedures regarding solution deployment and infrastructure maintenance to ensure a smooth handover to other teams.
JOB REQUIREMENTS
Must have:
4+ years of professional DevOps or Site Reliability Engineering experience in a fast-paced work environment.
Solid understanding of Linux and container technology.
Hands-on knowledge of Docker.
Hands-on experience with Infrastructure as Code and Configuration as Code tools, e.g., Terraform (strongly preferred), Ansible, Packer.
Experience building CI/CD pipeline automation and tooling (GitHub Actions; Jenkins strongly preferred), as well as Compliance as Code.
Experience with cloud services is essential, in particular our core AWS technologies (Organizations, Account Design, VPC, Subnet and Network segmentation, EC2, ASG, Lambda, S3, SQS, SNS, ECS, EKS, RDS, CloudWatch, etc.).
Ability to write scripts in Bash, Python, or Golang. Must be in the habit of writing clean, reusable code and implementing unit tests.
Excellent analytical and problem-solving skills.