Do you want to be part of the team that brings emerging Artificial Intelligence (AI) technology to the field? We are looking for a hardworking Solution Architect (SA) to guide and enable the successful adoption of DGX Cloud and NVIDIA AI Enterprise software at scale in production.
NVIDIA DGX Cloud is an AI platform for enterprise developers, optimized for the demands of generative AI. The DGX Cloud SA team is dedicated to shaping the future of DGX Cloud by actively gathering and incorporating customer feedback and product requirements. Our team will help optimize the onboarding process for DGX Cloud customers, ensuring fast time to insights and an exceptional experience. Additionally, we will collaborate with internal teams to scale expertise and knowledge through training and the creation of repeatable guides. Our focus on building demos, qualifications, and assets will streamline the pre-sales process, ultimately increasing sales and adoption of DGX Cloud.
What you’ll be doing:
Work closely with DGX Cloud customers, become their trusted technical advisor, advocate for their needs, and ensure they are successful in accomplishing their business goals with the platform.
Accelerate customer onboarding and time to insights with DGX Cloud
Scale knowledge, reach, and opportunities by building and educating vertical teams and communities on DGX Cloud
Provide technical education and facilitate field product feedback to improve DGX Cloud
Enable successful first-time integration and deployment of NVIDIA AI Enterprise (NVAIE) emerging software products with DGX Cloud
What we need to see:
Strong foundational expertise, from a BS, MS, or Ph.D. degree in Engineering, Mathematics, Physics, Computer Science, Data Science, or similar (or equivalent experience)
5+ years of proven experience with one or more Tier-1 clouds (AWS, Azure, GCP, or OCI) and with cloud-native architectures and software
Proven experience in technical leadership, strong understanding of NVIDIA technologies, and success in working with customers
Expertise with container orchestration platforms such as Kubernetes, Docker Swarm, or OpenShift
Expertise with parallel filesystems (e.g., Lustre, GPFS, BeeGFS, WekaIO) and high-speed interconnects (InfiniBand, Omni-Path, and Gig-E)
Strong coding and debugging skills, and demonstrated expertise in one or more of the following areas: Machine Learning, Deep Learning, Slurm, Kubernetes, MPI, MLOps, LLMOps, Ansible, Terraform, and other high-performance AI cluster solutions
Experience with high-performance or large-scale computing environments
Excellent verbal and written communication and technical presentation skills in Japanese, along with business-level English communication skills
Ways to stand out from the crowd:
Hands-on experience with DGX Cloud, NVIDIA AI Enterprise software, Base Command Manager, NeMo, NVIDIA Inference Microservices, and Run:ai
Experience with the integration and deployment of software products in production enterprise environments, and with microservices software architectures
Contributions to technical forums, publications, or presentations at industry conferences (e.g., SC, KubeCon, or GTC)