Operations Manager - Data Center (Physical Infrastructure)

Singapore, Permanent, SG$9,000 - SG$10,000 per month (SG$108,000 - SG$120,000 per year)
The Operations Manager, GPU Operations, leads the day-to-day management of Singtel's GPU-as-a-Service (GPUaaS) platform, ensuring high availability, performance, security, and reliability. The role also serves as the primary operational partner to engineering teams on upgrades, observability, security, and continuous improvement initiatives.
  • Global exposure and opportunities to work on cross-border projects
  • High leadership visibility and impact on business outcomes

About Our Client

A Singapore-headquartered global leader renowned for innovative solutions, robust infrastructure, and driving digital transformation.

Job Description

  • Serve as the overall lead and the point of accountability for end-to-end GPUaaS and data centre operations, including operational reporting.
  • Oversee day-to-day platform and facility operations across GPU hardware, networking, environmental systems, security controls, and supporting software.
  • Lead and coordinate internal operations teams, vendors, and consultants during routine activities as well as critical incidents.
  • Partner with engineering and external stakeholders to deliver platform upgrades and data centre improvement initiatives.
  • Develop, review, and refine operational processes to maintain platform stability across compute, power, cooling, and infrastructure components.
  • Take charge of major incidents, drive root cause analysis, and ensure clear, timely updates to customers and stakeholders.
  • Provide regular updates to management on operational performance, risks, and improvement plans.
  • Ensure incidents are triaged and escalated appropriately based on severity, business impact, and SLA/SLO commitments.
  • Build, lead, and motivate a strong operations team with a focus on accountability and continuous improvement.
  • Set clear performance expectations, coach team members, and support ongoing professional development.
  • Oversee security incident management and uphold security and compliance standards within the GPUaaS environment.
  • Stay current with industry security developments and implement safeguards to protect customer workloads and platform integrity.
  • Support scheduled maintenance activities and participate in on-call duties when required.

The Successful Applicant

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • At least 8 years of experience in data centre operations, with a minimum of 3 years in a leadership capacity.
  • Solid understanding of data centre infrastructure, including servers, networking, storage, and both physical security and cybersecurity controls.
  • Practical experience with electrical and mechanical systems, facilities management, and preventive maintenance practices.
  • Demonstrated ability to lead teams and manage vendors effectively.
  • Strong organisational skills with the ability to adapt to evolving operational demands.
  • Hands-on experience with Linux and hypervisor administration in GPU or GPUaaS environments.
  • Strong analytical and troubleshooting skills, with a proactive approach to performance optimisation and system reliability.
  • Working knowledge of storage technologies, including capacity planning, troubleshooting, and data protection strategies.
  • Experience managing GPU infrastructure, including configuration, monitoring, and performance tuning.
  • Familiarity with liquid cooling technologies used in high-density GPU environments.
  • Understanding of GPU cluster architectures and AI/HPC environments, including collective communications (e.g. NCCL), high-performance networking (e.g. InfiniBand, RDMA), and containerised or orchestrated platforms supporting AI and HPC workloads.

What's on Offer

Our client is a growing firm with a tightly knit team. The successful candidate will have the chance to contribute to a high-performing team while having the autonomy to make certain decisions for the team.

Contact
Winson Low (Lic No: R22106039 / EA No: 18C9065)
Quote job ref: JN-032026-6959635
Phone number: +65 6416 9865

Job summary

Function: IT
Specialisation: Infrastructure
Area of specialisation: Technology & Telecoms
Location: Singapore
Contract Type: Permanent

Diversity & Inclusion at Michael Page

We don't just accept difference - we celebrate it. We encourage applicants from all backgrounds to apply for this role and are committed to building inclusive, diverse workplaces where everyone can thrive. If you require any support or reasonable adjustments during the recruitment process, please let us know.