Job description
We are seeking a highly skilled Site Reliability Engineer (SRE) with strong expertise in Microsoft Azure, Docker, containerization, and Kubernetes to join our dynamic team.
Responsibilities:
- Collaborate with product teams to architect, develop, and improve systems that are both highly available and scalable, ensuring applications perform reliably and efficiently.
- Work with cross-functional teams to establish and define service level objectives (SLOs) and service level agreements (SLAs) for essential applications and components.
- Manage and maintain monitoring and observability tools, alerts, and dashboards to ensure visibility into system health and performance, proactively identifying and addressing any performance issues or availability concerns.
- Work alongside the Incident Commander to lead remediation efforts during major incident response.
- Participate actively in post-incident reviews to determine root causes and implement strategies to prevent future incidents.
- Automate routine tasks and processes to enhance efficiency and minimize the need for manual intervention.
- Develop and update documentation for system architecture, configurations, and troubleshooting procedures.
- Conduct capacity planning and resource management to ensure optimal performance and scalability of systems.
- Keep abreast of industry best practices, new technologies, and emerging trends in site reliability engineering, and contribute to the SRE community.
- Advocate for the prioritization of technical debt as part of the squad's delivery planning process.
- Implement self-healing mechanisms as necessary.
Qualifications:
Candidates must demonstrate strong expertise (5+ years) in the following areas:
- Site Reliability Engineering (SRE) practices
- DevSecOps methodologies
- Observability platforms
- Automation tools and platforms
Candidates should have some experience in:
- Cloud-native application development
- Working with relational and document databases
- Service integration patterns
Candidates must also have strong experience with a subset of the technologies in our current stack, including:
Cloud Platforms:
- Microsoft Azure
- Google Cloud Platform
Programming Languages and Frameworks:
- PowerShell
- Bash
- C#
- .NET Core
- Node.js
- Go
Database Systems:
- Microsoft SQL Server
- MongoDB
- Azure Cosmos DB
- PostgreSQL
Continuous Integration (CI) Tools:
- Azure DevOps
- GitHub
- Jenkins
Infrastructure Tools:
- Kubernetes
- Terraform
Observability Tools:
- Application Insights
- Dynatrace
- Azure Log Analytics
- Google Cloud Monitoring
Benefits:
- Supportive and diverse workplace culture
- Professional team environment
- Flexible working environment
Job Reference # 265484
To be considered for the role, click the 'apply' button. For more information about this and other opportunities, please contact Aditi Yadav on 02 9464 5530 or email [email protected], quoting the above job reference number.
Paxus values diversity and welcomes applications from Indigenous Australians, people from diverse cultural and linguistic backgrounds, and people living with a disability. If you require an adjustment to the recruitment process, including the application form in an alternate format, please contact me using the details above.