Director, AI Data Center Operations

100% Remote Full-time Open now

NVIDIA DGX Cloud is an AI supercomputing service that provides enterprises with instant access to NVIDIA's high-performance AI infrastructure and software, including dedicated DGX AI supercomputing clusters, optimized software stacks, and expertise. At NVIDIA, data centers are the engine behind AI. Join us to develop, launch, and operate the facilities that power the most advanced computing in the world.

We're looking for a Director of AI Data Center Operations to lead the evolution of NVIDIA AI data centers. In this role, you will build a team and play a significant part in shaping and guiding the future of AI and GPU operations in the data center. Are you passionate about AI and data center operations? Do you strive for quality? If so, join our team at NVIDIA, where we are dedicated to delivering GPU-powered services around the world!

What You'll Be Doing

  • Lead the commissioning, bring-up, and operational readiness of new data centers.
  • Collaborate with software and hardware teams to define and implement repeatable procedures.
  • Own the operations, maintenance, and reliability of AI data center infrastructure.
  • Develop and enforce operations strategy & processes, ensuring strict adherence to SLAs across critically important infrastructure.
  • Define and implement procedures and quality controls that minimize downtime, with the goal of continuous uptime.
  • Feed requirements to software and hardware teams.
  • Create documentation that the ecosystem can use to run its own AI data centers.

What We Need To See

  • BS or MS degree in Computer Engineering, Computer Science, or a related field (or equivalent experience), with 15+ years of relevant work experience overall and 8+ years of management experience.
  • 8+ years of expertise in managing extensive data center operations or critical infrastructure.
  • Expertise in BMS & Power management.
  • Experience building 24/7 teams from the ground up.
  • Experience working with remote hands.
  • Proven track record of managing infrastructure from deployment through long-term operations.
  • Experience driving reliability with robust processes, rapid field response, and recovery.

With competitive salaries and a generous benefits package, NVIDIA is widely considered to be one of the technology industry's most desirable employers. We have some of the most forward-thinking and versatile people in the world working with us, and our engineering teams are growing fast in some of the most impactful fields of our generation: Deep Learning, Artificial Intelligence, and Autonomous Vehicles. If you're a creative engineer who enjoys autonomy and shares our passion for technology, we want to hear from you.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 284,000 USD - 425,500 USD for Level 5, and 332,000 USD - 500,250 USD for Level 6. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until November 10, 2025.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

JR2005624
