Astra North Infoteck Inc.

Site Reliability Engineer – GenAI Platform

📍 Location
Mirabel, Quebec
⏰ Job Type
Full-time
📅 Posted
March 17, 2026

Job Description

  • Experience: 8+ years as a Site Reliability Engineer or in a similar role, with hands-on experience supporting IaaS platforms and working knowledge of networking and systems engineering.

  • Roles and Responsibilities:

    • Operate, monitor, and maintain the infrastructure supporting GenAI applications (training, inference, feature store, data ingestion, model serving)

    • Design and build automation for core platform capabilities, reducing manual toil

    • Develop and maintain infrastructure-as-code (IaC) for provisioning and managing compute, storage, network, GPU clusters, Kubernetes / container orchestration, etc.

    • Establish, monitor, and enforce SLOs/SLIs/SLAs, error budgets, alerting, and dashboards
