United States Digital Space LLC

SRE/DevOps Engineer - 66765

📍 Location
Toronto, ON
⏰ Job Type
Full-time
📅 Posted
June 17, 2026

Job Description

Function: Cloud & Data Engineering

Job Title: L1 SRE Operations Engineer

Responsibilities

Monitor system health, alerts, dashboards, and logs across cloud and on‑prem infrastructure.

Isolate whether a functional issue originates in the application or in the platform.

Execute standardized runbooks for incident resolution, deployments, and routine tasks.

Perform initial triage of incidents and escalate to L2/L2+ as needed to mitigate the issue.

Document new issues, gaps in runbooks, and automation opportunities.

Communicate clearly and promptly with stakeholders during incidents.

Support onboarding of new applications into the operations framework.

Mandatory Skills

System & Infrastructure Monitoring

– Ability to use monitoring dashboards such as Grafana, Datadog, Splunk, Argos, AIOps to identify anomalies, follow alert workflows, and engage escalation. Example: When a Kubernetes pod crash-loop is flagged in Pro...
