Project Director - HPC, AI, and GPU Infrastructure Deployment

Larsen & Toubro Ltd

Chennai

Not disclosed

Work from Office

Full Time

Min. 15 years

Job Details

Job Description

Role Overview

We are seeking a seasoned Project Director for HPC / AI infrastructure deployment to lead large-scale, high-density compute programs involving GPU clusters, HPC workloads, and AI infrastructure. The role demands end-to-end ownership of deploying data center environments with 10+ MW of IT load, and of delivering high-performance GPU-based compute platforms with cutting-edge networking and storage architectures.

Roles & Responsibilities

  • Lead and deliver large-scale HPC / AI GPU cluster deployments (e.g., NVIDIA B200 / B300 GPU platforms) within defined timelines and budgets
  • Drive execution of AI stack deployment (e.g., NVIDIA AI Enterprise, NVAIE) across hybrid, cloud, and on-prem environments
  • Manage multi-vendor ecosystems including OEMs, SI partners, and hyperscale technology providers
  • Deploy and scale high-density GPU racks with liquid- and air-cooled thermal strategies
  • Design and oversee InfiniBand (IB) and high-speed Ethernet networks
  • Bring hands-on experience with NVIDIA/Mellanox InfiniBand fabrics
  • Configure and optimize fabrics using UFM (Unified Fabric Manager)
  • Apply a strong understanding of BCM (Broadcom Ethernet switching) platforms
  • Architect and implement leaf-spine network topologies for ultra-low-latency AI workloads
  • Ensure effective integration of storage systems (parallel file systems, NVMe-based storage)
  • Oversee deployment of Kubernetes-based GPU orchestration platforms
  • Bring experience with containerized AI workloads and distributed training clusters
  • Work with NVIDIA AI Enterprise (NVAIE), CUDA, and GPU virtualization frameworks
  • Manage data center design, build, and repurposing for HPC workloads
  • Oversee MEP (Mechanical, Electrical, Plumbing) systems implementation
  • Ensure optimized thermal management (liquid cooling, rear door heat exchangers, immersion cooling where applicable)
  • Ensure optimized power density (kW/rack) planning
  • Ensure optimized energy efficiency (PUE optimization)
  • Establish robust governance frameworks aligned to:
      a. HLD/LLD design validation
      b. SOP adherence
      c. Quality assurance benchmarks
  • Implement risk mitigation strategies for large-scale deployments (supply chain, OEM dependencies, technology integration risks)
  • Monitor program milestones and ensure SLA-based deliveries
  • Drive structured cabling design (fiber-heavy HPC fabric, spine-leaf connectivity)

Qualifications & Experience

  • B.E/B.Tech in Electrical / Electronics / Computer Science Engineering
  • 15–25 years of experience in data center infrastructure deployment, HPC / AI workload environments, and large-scale IT infrastructure programs

Mandatory / Preferred Certifications

  • PMP / PRINCE2 (mandatory for program governance)
  • CDCP / CDCS / CDCPM certifications

Strongly preferred:

  • NVIDIA AI Infrastructure / DGX / AI Factory certifications
  • OEM certifications (Dell, HPE, Lenovo HPC systems)

Job role

Work location

Chennai

Department

Project & Program Management

Role / Category

Technology / IT Project Management

Employment type

Full Time

Shift

Day Shift

Job posted by Larsen & Toubro Ltd
