U3 INFOTECH PTE. LTD.
Linux System Engineer (HPC - High Performance Computing)
Senior Executive, Contract, 3+ years of experience
Skills
Computing, Operation, HPC, Scripting, Storage, Red Hat Linux, Windows, on-call support, Parallel Programming, Linux
Job Description
Key Accountabilities
- HPC Systems Operations: Administer, operate, and maintain Linux-based HPC clusters, including compute, storage, and high-speed networking
- Manage and support:
  - HPC job schedulers (e.g. Slurm, PBS Pro, LSF)
  - Parallel file systems (Lustre, GPFS/Spectrum Scale, BeeGFS)
  - Cluster management and provisioning tools
- Perform system monitoring, patching, upgrades, and capacity planning.
- Troubleshoot and resolve hardware, software, OS, and network issues across HPC environments.
- Participate in on-call or escalation support rotations as needed.
- Work with our software engineer to support our AI/DL applications, and with our desktop engineer to help resolve user problems as required.
- Provide advice and guidance to researchers on HPC application development, debugging, optimization, and parallelization.
Strong hands-on experience with:
- Linux operating systems (RHEL, Rocky, SUSE)
- HPC schedulers and resource managers
- Parallel file systems
- Understanding of HPC performance tuning and optimization techniques.
---------------------------
Please refer to U3’s Privacy Notice for Job Applicants/Seekers at https://u3infotech.com/privacy-notice-job-applicants/. When you apply, you voluntarily consent to the collection, use and disclosure of your personal data for recruitment/employment and related purposes.