Setup, configuration, general maintenance and troubleshooting of HPC Cluster for CAE Dept.
Manage a large and diverse HPC environment, including design, build, and capacity planning.
Knowledge of High Performance Computing (HPC), such as managing CAE software and troubleshooting failed HPC jobs; knowledge of PBS/SLURM/LSF/SGE or any other scheduler is an added advantage.
Integrate new CAE applications into the existing HPC cluster.
Knowledge of CAE applications such as STARCCM, Abaqus, Numeca, LS-DYNA, Preonlab, Converge, Console.
Working experience with Altair applications such as ANSA, Hypermesh, Hyperworks, Medina.
Knowledge of Altair PBS and license server management.
Evaluate and recommend CAE software and hardware for enterprise systems.
Work with core production support personnel in IT and Engineering to automate deployment and operation of the infrastructure
LDAP configuration and integration.
Manage and maintain monitoring to ensure uptime and SLA levels.
Primary Skills
Minimum 6 years of HPC experience (required).
Hands-on experience with HPC infrastructure.
Working knowledge of HPC schedulers such as PBS and SLURM.
Providing application support for CAE applications like STARCCM, Abaqus, Numeca, LS-DYNA.
Experience troubleshooting HPC jobs.
Work closely with the CAE Dept. to gather requirements and provide the best solutions to end users.
Must be able to work with and provide support for cross functional groups and technical areas (compute, storage, network, applications)
Secondary Skills
Must have a firm understanding of Linux internals and experience automating system builds, patching, and configuration management.
Knowledge in systems management automation using industry-standard and open-source tools such as Python, Bash, Puppet, Ansible.
Good understanding of the server technologies available for data center (DC) deployments, and of vendor management.
Excellent communication, team coordination, and interpersonal skills.
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong over 55-year heritage, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.
Job Classification
Industry: IT Services & Consulting
Functional Area / Department: Engineering - Hardware & Networks
Role Category: IT Network
Role: System Administrator / Engineer
Employment Type: Full time