- In this role, you will apply your knowledge of hyperscale data center structured cabling, fiber connectivity solutions, and IP networking to support the delivery of field repair services
- You will work closely with internal business teams, external vendors, network design/engineering, and other cross-functional teams to develop strategies for integrating new technologies and better supporting existing technologies across our operational fleet
- Your objective will be to understand technical drivers for the business and organize strategies to ensure operational supportability meets those needs
- Additionally, you will represent ENS across various stakeholder groups to help deliver critical projects
- Our team is dedicated to improving the operational efficiency and reliability of one of the world's most dynamic and fast-paced networks
- Incident Response: Collaborate with cross-functional teams, managed service providers, and third-party vendor partners to investigate complex technical and process issues during major incidents/site events (SEVs) on edge, caching, and network infrastructure
- Work closely with team members to understand the root cause of incidents and contribute to their resolution
- Change Management: Identify security & business continuity issues affecting the network and infrastructure at Meta and work with the team to implement effective changes
- Provide feedback on existing tools, processes, and policies to help scale with the rapid expansion of the Meta platform and customer base
- Operations: Drive operations by identifying improvement opportunities across policies, processes, and procedures to improve efficiency and quality of activities
- Ensure standards compliance across the network and optimize service delivery
- Collaboration and Partnership: Work closely with internal customers to address their needs and issues, and influence future iterations of data center and network designs for seamless integration of new infrastructure and scalability
- Team Leadership: Work with cross-functional teams within and outside the organization to deliver business outcomes predictably across global sites
- Analyze operational datasets to detect and prioritize problems and create aligned team projects, programs, and roadmaps
- Risk Management: Collaborate with partner teams to design and implement aligned processes that identify and manage data and asset protection risks, as well as operations continuity issues across the network
- Information and Data Assurance: Ensure relevant operational process, procedure, and policy documentation is effectively managed, and the data required to support operations is complete and accurate in systems
- Automation: Collaborate with the team to analyze operational events, identify new automation opportunities, and achieve our goal of having all Tier-1 faults in the network fully remediated by software
- Help others understand our requirements and drive their roadmaps, and in some cases, directly implement lightweight solutions in code
- Data Measurement: Drive quality into the metrics we report, measuring and analyzing escalation issues, fault/event trends, infrastructure capacity, and vendor performance failures
- Formulate appropriate metrics and definitions of success to drive quality, efficiency, cost, and timeliness, evolving these over time to match changes to the infrastructure and business requirements
- Communication: Provide clear and effective communication around personal and team goals, progress, outcomes, and lessons learned across assigned scope
- Travel: International and domestic travel may be required up to 25% of the time, depending on the needs of the business
- Ability to provide technical guidance to external vendors to direct repair actions