Socotra, Inc.
Senior SRE Engineer
Job Description
Our SRE team is responsible for ensuring that the underlying infrastructure runs smoothly and that systems and tools work as expected. Team members use automation tools to monitor and observe software reliability in the production environment, and are experienced in finding problems in software and writing code to fix them.
What you’ll do:
- System reliability and availability: Architect, design, and implement strategies to ensure the high availability, reliability, and fault tolerance of our infrastructure and applications.
- Incident management: Lead incident response efforts, identify root causes, and implement preventive measures to minimize future incidents.
- Performance optimization: Identify performance bottlenecks, conduct performance analysis, and optimize system and application performance.
- Automation and tooling: Drive automation initiatives, develop and maintain tools, scripts, and frameworks to streamline ...