Job Description
As a Senior Site Reliability Engineer, you will play a vital role in our organization by ensuring the seamless integration of development and operations processes, driving automation, and enhancing system reliability. You will collaborate with cross-functional teams to establish and improve our DevOps and SRE practices, focusing on continuous integration, delivery, and the proactive management of infrastructure and services. Additionally, you will be responsible for monitoring and troubleshooting system performance, implementing automation tools, and identifying areas for optimization.
Responsibilities:
- Collaborate with development, operations, and QA teams to implement DevOps and SRE best practices
- Design and maintain continuous integration and delivery (CI/CD) pipelines to enable efficient software releases
- Develop and manage infrastructure-as-code (IaC) solutions using tools such as Terraform or Ansible
- Automate operational tasks, deployments, and monitoring
- Develop and improve service level objectives (SLOs) and error budgets for applications
- Monitor system performance and respond to incidents promptly, implementing effective incident response and resolution procedures
- Identify opportunities for system optimization and scalability improvements
- Collaborate with security teams to ensure proper security measures and compliance with ISO 27001 and SOC 2 standards
- Stay up to date with industry trends and emerging technologies, assessing their potential impact on our DevOps and SRE practices
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 8 years of experience in DevOps, SRE, or a related role, demonstrating a strong understanding of their principles and practices
- Proficiency in scripting and automation using languages such as Node.js, Python, or Ruby
- Expertise with CI/CD tools (e.g., Jenkins, GitLab CI/CD, CircleCI)
- Knowledge of infrastructure-as-code (IaC) tools like Terraform, Ansible, or CloudFormation
- Expertise with containerization technologies (Docker, Kubernetes)
- Expertise with cloud platforms (AWS, Azure, GCP)
- Strong troubleshooting and problem-solving skills
- Excellent communication and collaboration abilities