
Design Studio Inc.
Designing for humans
Job Description
About the Role
We’re seeking a Site Reliability Engineer to join our infrastructure team and help ensure the reliability, scalability, and performance of our systems. In this role, you’ll design, implement, and maintain our monitoring and alerting systems, and collaborate with engineers to improve our infrastructure.
Responsibilities
- Design, implement, and maintain monitoring and alerting systems
- Collaborate with engineers to improve infrastructure reliability and performance
- Troubleshoot and resolve system issues
- Participate in on-call rotations
- Document processes and procedures
Requirements
- BS/MS degree in Computer Science or a related technical field
- 3+ years of experience in SRE or a related role
- Strong knowledge of monitoring and alerting tools (e.g., Prometheus, Grafana, Nagios)
- Proven experience with cloud platforms (AWS, GCP)
- Familiarity with infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation)
- Excellent interpersonal and communication skills
Benefits
- Competitive salary and equity package
- Comprehensive health, dental, and vision insurance
- Unlimited PTO and flexible work hours
- 401(k) matching
- Lunch stipend and game room
About the Company
Our company is a leading provider of cloud-based telecommunications services, empowering businesses to connect and collaborate seamlessly.
Job ID: site-reliability-engineer-JlBDI