Hire.Monster

Data Center Facility Operations Reliability Engineer - Full-time

Company
Montpelier, Vermont, US
Office, DevOps, USA

Responsibilities

  • This person will work at the leading edge of Facility Operations to identify and manage asset reliability risks and the various stages of the end-to-end asset lifecycle for Data Center Operations
  • Managing stakeholders spread across time zones is a significant challenge and key to the success of our individual projects and overall asset management, quality and reliability program
  • Prevent operational gaps in reliability engineering expertise across all asset management activities
  • Proactively review, identify, and mitigate risks of equipment failures, unscheduled downtime, and reactive maintenance
  • Ensure all new assets are methodically and consistently onboarded into Company’s asset management ecosystem
  • Maintain rigorous asset onboarding processes to enable accurate tracking and seamless integration into maintenance programs
  • Establish and maintain a robust asset criticality framework to prioritize resources and mitigate risk
  • Lead Failure Mode and Effects Analysis (FMEA) to predict failure modes, prioritize risks, and develop preventive actions
  • Develop and execute Reliability Centered Maintenance (RCM) programs to balance cost, risk, and performance
  • Assess operational risks associated with asset failures, maintenance strategies, and process deviations
  • Develop, maintain, and update the Global Maintenance Library of plans, procedures, and best practices
  • Govern the review and implementation of changes to maintenance strategies and procedures
  • Ensure all maintenance changes are data-driven, risk assessed, and systematically implemented
  • Support accurate accounting of asset depreciation and amortization through timely asset tracking
  • Serve as a subject matter expert and technical lead for Enterprise Asset Management (EAM) implementation and optimization
  • Create and maintain asset useful life models to forecast replacement needs and optimize total cost of ownership
  • Provide technical leadership for condition-based, time-based, and specialized reliability maintenance initiatives
  • Analyze asset health metrics and KPIs to identify risks, predict failures, and measure reliability improvements
  • Collaborate with Operations and Maintenance to optimize scheduling and execution of maintenance activities
  • Mentor staff in reliability methodologies and foster an environment of proactive asset management
  • Sustain continuous improvement of asset management workstreams and processes
  • 25% to 50% travel domestically and internationally
  • Proficient in using EAM solutions to extract data and develop meaningful insights

Requirements

  • Bachelor’s degree in Mechanical, Electrical, or Reliability Engineering, or a similar technical discipline
  • 10+ years of experience in reliability engineering (related to electrical or mechanical cooling equipment)
  • Knowledge of relevant ISO standards (ISO 14224, ISO 17359, ISO 55000)
  • Experience with Program/Project management and cross-functional team management
  • Certifications in Maintenance & Reliability such as CMRP, CRL, CRE
  • Company participates in the E-Verify program in certain locations, as required by law
  • Experienced in Reliability Centered Maintenance (RCM) and Failure Mode and Effects Analysis (FMEA) activities for maintenance/process/equipment design optimization to meet reliability requirements
  • Experience with data center equipment such as critical cooling systems, generators, main switchboards, network gear
  • Proficient in developing and executing test plans for assets

Skills

  • Proficient in data analysis techniques, which can include process control, reliability modeling and prediction, Fault Tree Analysis, Weibull analysis, and Six Sigma (6σ) methodology

Conditions

*Public Compensation:* $143,000/year to $198,000/year + bonus + equity + benefits

Published: 11.01.2026