This is a Senior System position located in the Vicksburg, MS area.
Responsibilities: Maintain operations for a High Performance Computing (HPC) system, including its various node types (compute, GPU, etc.), storage, and network management. Apply extensive knowledge of and experience with Linux operating systems (RHEL), cluster management systems, workload management systems, parallel file systems, networking, and security. Maintain the hardware infrastructure supporting the HPC system, including InfiniBand data and Ethernet network switches and the different node types. Monitor and maintain system health on the HPC system(s): compute, network, and storage. Install, maintain, and configure computer software, ensure its integrity, and implement operating system enhancements that improve the reliability and performance of applications and systems. Provide systems administration for systems supporting various GOTS and COTS applications. Coordinate hardware and software upgrades and installations with network and security teams. Ensure compliance with established ERDC cybersecurity standards for HPC and Linux computing systems. Troubleshoot technical issues affecting the infrastructure and apply field-proven methodologies to determine root cause and return services to expected levels. Implement and ensure operations for DoD security controls such as Security Technical Implementation Guides (STIGs).
Qualifications: Bachelor's degree from an accredited university. 8 years of professional systems/Linux administration experience. This position requires a U.S. Secret Clearance, with the ability to obtain a U.S. Top Secret/SCI Clearance, for which the U.S. Government requires U.S. citizenship. Experience with MPI software and compilers (Intel, OpenMPI, and/or MPICH). Experience using job schedulers such as PBS Pro. Proficiency with RHEL operating system installation, maintenance, and support. Ability to troubleshoot and resolve complex problems relating to HPC system configuration and operations. Working knowledge of Linux server operating systems and proficiency with one or more shell environments. Proficiency in managing the following network technologies: DNS, CephFS, and InfiniBand. Experience with system backup and recovery methodologies. Ability to use logic and reasoning to identify the strengths and weaknesses of alternative solutions. Strong analytical skills and ability to deliver with minimal supervision. Strong written and verbal communication skills.
About i3
Headquartered in Huntsville, AL, i3 is a national leader in providing innovative technical and engineering solutions to a broad customer base across the U.S. DoD. We specialize in missile and aviation engineering and logistics services, electronic warfare and electromagnetic effects analysis, UAS system integration and flight operations, full lifecycle C5ISR engineering services, engineering analysis, cybersecurity and IT/IA innovative solutions, and virtual training, simulation, and serious game development and implementation. We were founded in 2007 with the intent to do business differently. Our focus is to leave our team members and customers better than we found them. Our ultimate goal is to strengthen our Nation and our warfighter.
Perks and Benefits at i3: 100% team-member owned. Outstanding insurance coverage. 401(k) match. Health and wellness incentives. Tuition and certification reimbursement. Generous PTO. Fun culture with company activities. Countless opportunities to give back to the community through our charitable organization, i3 Cares.
We work hard. We compete hard. We play hard.