Operations Manager (Infrastructure / Platform / Data Centre)

Singapore, Permanent
This role leads the daily operations of a large‑scale GPU‑as‑a‑Service platform, ensuring high availability, security, and performance across both GPU and data centre infrastructure. It involves managing teams, vendors, and incident response while driving continuous improvements in a rapidly expanding AI operations environment.
  • Lead end‑to‑end operations for one of Asia's most advanced AI compute platforms
  • Lead the operations team across AI infrastructure and data centre environments

About Our Client

The organisation is a major technology provider building cutting‑edge AI infrastructure in Singapore. It is committed to high‑performance compute, operational excellence, and advancing responsible AI capabilities across the region.

Job Description

  • Lead and coordinate end‑to‑end operations for a GPU‑as‑a‑Service platform and supporting data centre environments.
  • Oversee hardware, networking, environmental systems, security, and associated software platforms.
  • Manage day‑to‑day operations teams, vendors, and partners to ensure smooth platform performance.
  • Drive the implementation of upgrades, enhancements, and infrastructure initiatives.
  • Establish, validate, and improve operational frameworks for GPU hardware, software, and data centre systems.
  • Lead incident response, perform root cause analysis, and ensure timely communication to stakeholders.
  • Present operational insights, risks, and improvement plans to senior leadership.
  • Ensure compliance with SLA/SLO targets and operational standards.
  • Build, mentor, and develop a high‑performing operations team.
  • Lead security incident management and enforce cybersecurity best practices.
  • Participate in scheduled and on‑call operational support.

The Successful Applicant

For this role, we are looking for a leader with strong experience in service‑level and platform operations, particularly within environments that operate like cloud infrastructure services. The ideal candidate should have hands‑on and strategic exposure across:

  • Server, networking, and storage infrastructure operations
  • Physical security and cybersecurity frameworks within data centre or platform environments
  • Cloud‑like service delivery models involving high availability, scalability, and operational SLAs
  • End‑to‑end accountability for system uptime, resilience, and customer impact
  • Operational ownership across the full stack, from hardware to platform-level services



What's on Offer

  • Comprehensive health and wellness benefits
  • Professional development and continuous training opportunities
  • Internal mobility and long‑term growth potential

Contact

Lydia Chen (Lic No: R22108104 / EA no: 18C9065)
Quote job ref: JN-012026-6924078
Phone number: +65 6416 9829

Job summary

Function: IT
Specialisation: IT Support
Area of specialisation: Technology & Telecoms
Location: Singapore
Contract Type: Permanent

Diversity & Inclusion at Michael Page

We don't just accept difference - we celebrate it. We encourage applicants from all backgrounds to apply for this role and are committed to building inclusive, diverse workplaces where everyone can thrive. If you require any support or reasonable adjustments during the recruitment process, please let us know.