Job Description
Job Title: High Performance Computing (HPC) Architect / Senior Consultant
Experience Level: Minimum 12 years

The candidate should have experience from a standard datacentre infrastructure point of view, and must also have the following experience and knowledge:

Extensive experience managing HPC networking at an expert or near-expert level, involving technologies and fabrics based on InfiniBand, Omni-Path, etc. This includes, but is not limited to, the common tools, monitoring, sampling of telemetry data, and best practices for these types of fabrics.
Specific knowledge of technologies used in HPC clusters, for example RDMA and NVMe-oF.
Knowledge of HPC network topologies and why they are used for different types and sizes of clusters (fat tree, 3D torus, hypercube, etc.).
Knowledge of and experience with clustered/distributed filesystems and scale-out storage technologies (for example Lustre, Ceph, Isilon, etc.).
Previous experience in monitoring HPC infrastructure (the observer problem). Must understand sample rates and how to tap telemetry data that supports our mission without negatively affecting the client workloads.
Extensive experience managing and troubleshooting HPC environments from an infrastructure perspective: everything from local node problems, blade chassis, and intra- and interconnect traffic in the topology to process/job management and tracing backwards to understand which workload is affecting the cluster abnormally.
Previous experience in job scheduling with, for example, IBM Spectrum LSF (used by the client), Slurm, MOAB, or equivalent. Although this is out of scope from a solution perspective, it makes a great difference overall when it comes to managing the HPC infrastructure well. This experience and knowledge is much preferred so that the Capgemini HPC team can engage and work together with the client's HPC application team.
At least a medium level of computer science and computer architecture knowledge.
Bonus if the candidate has previous experience working directly with MPI or with developer teams, or has done actual programming in the HPC area.
Bonus if the candidate has HPE cluster experience, since most of the client's clusters are delivered by HPE.
Extra bonus if the candidate has experience with hybrid HPC clusters or GPU-based/CUDA clusters, given the client's direction (autonomous cars, smart traffic systems, etc.).

Ref: 292482
Posted on: May 25, 2019
Experience level: Experienced (non-manager)
Education level: Bachelor's degree or equivalent
Job Classification
Industry: IT-Software, Software Services
Functional Area: IT Software - Application Programming, Maintenance
Role Category: Programming & Design
Role: Programming & Design
Employment Type: Full time
Education
Under Graduation: B.Tech/B.E. in Computers
Post Graduation: M.Tech in Computers
Doctorate: Not required; any doctorate in any specialization is acceptable
Contact Details:
Company: Capgemini Technology
Location(s): Hyderabad
Keyskills:
Computer science
Investor relations
Networking
Social media
Scheduling
High performance computing
Troubleshooting
Infrastructure services
Monitoring