Network Reliability Engineer
2 years ago

Richmond, Virginia; Richardson, Texas; Charlotte, North Carolina



**Job Description:**



The SRE will partner directly with Software Engineering, CTI Engineering, and Production Services teams to improve reliability and observability for the services they support by planning and implementing the instrumentation, tooling, ticketing, alerting, and on-call routines defined in observability designs. They typically support services with less stringent reliability requirements as they learn SRE standards and practices. SREs will engage in production triage efforts and Problem Management routines, using those experiences to grow their SRE knowledge and to start identifying potential gaps in the observability design or implementation. The SRE will also focus heavily on software development, delivering automated solutions that eliminate operational toil and suggesting code enhancements to software engineering teams to improve the reliability or observability of the service.



The Network Reliability Engineer (also known as SRE) will partner directly with Software Engineering, Core Technology Infrastructure (CTI) Engineering, and Technology Services roles to define objective reliability goals for the services they support and to gain operational visibility into meeting those goals through instrumentation, tooling, dashboards, and automation.



**Position Summary:**



The Network Reliability Engineer role focuses on increasing service stability through automation, tools, and processes. This individual will be engaged in major production triage efforts and will work with problem management to identify the root cause of highly impactful or complex issues as required. This individual will use the knowledge gained in those efforts to partner closely with software developers, production services, architects, and infrastructure teams to drive delivery of automated solutions that eliminate operational inefficiencies and improve stability. This position will interface directly with internal stakeholders and external suppliers/providers, architecture, product engineering, product management, and senior and business management. Strong communication and problem-solving skills are a must. The Network Reliability Engineer (SRE) will be considered a subject matter expert in their field and is expected to stay current with various technologies, organizational goals, and industry trends to drive end-to-end value.



**Key Responsibilities:**



+ Collaborate with Engineering and Production Services teams to understand technical solutions and define strategies for network automation to reduce operational inefficiencies

+ Develop and maintain a catalog of reliability scripts, tools and libraries that can be leveraged for common instrumentation, automation, and operational needs to identify and remediate network events

+ Provide next level escalation support for production triage efforts

+ Manage a continuous integration / continuous delivery (CI/CD) pipeline for network development and testing

+ Participate in the documentation of application/network flows for various support needs

+ Provide technical guidance and mentorship to junior members of the team



**Required Skills**



+ 5+ years of experience in networking principles and protocols

+ 5+ years of experience in software development supporting production networks

+ 5+ years of experience with automation tools such as Python, Ansible, YAML or Django, API calls (to ticketing systems and network devices), and frontend web development

+ Experience with Linux/Unix and system management

+ Experience with observability data platforms (Cribl)

+ Experience with message buses (e.g. Apache Kafka and NiFi)

+ Understanding of Git workflows and continuous integration / continuous delivery (CI/CD) concepts, and how they can be applied to network automation and a testing framework

+ Experience with JIRA and Confluence

+ Understanding of configuration management with tools such as Forward Networks and HPNA

+ Knowledge of and experience with advanced tooling, both proactive and reactive, including but not limited to NetScout, Wireshark, Splunk, SevOne, HPNA, NNMI, OBM, and IBM Watson

+ Experience with Agile and Lean philosophies

+ Experience operating with colleagues across different time zones with a flexible approach to working hours to successfully interact and communicate on a global level



**Desired Skills**



+ Experience in Networking-related disciplines within a design, implementation, or operations role

+ Relevant Industry certifications in Network Technologies

+ Experience working within Financial services (Insurance, Banking, Investment banking)

+ Experience with network vendors such as Cisco, Arista, F5, VMware, McAfee, Bluecat, Aruba



**Job Band:**



H5



**Shift:**



1st shift (United States of America)



**Hours Per Week:**



40



**Weekly Schedule:**



**Referral Bonus Amount:**



0



Learn more about this role



Full time



JR-22084005



Band: H5



Manages People: No



Travel: No



Manager:



Talent Acquisition Contact:



Stefany Lax [C]



Referral Bonus:



0



Bank of America and its affiliates consider for employment and hire qualified candidates without regard to race, religious creed, religion, color, sex, sexual orientation, genetic information, gender, gender identity, gender expression, age, national origin, ancestry, citizenship, protected veteran or disability status or any factor prohibited by law, and as such affirms in policy and practice to support and promote the concept of equal employment opportunity and affirmative action, in accordance with all applicable federal, state, provincial and municipal laws. The company also prohibits discrimination on other bases such as medical condition, marital status or any other factor that is irrelevant to the performance of our teammates.




To view the "EEO is the Law" poster, CLICK HERE (https://www.dol.gov/ofccp/regs/compliance/posters/pdf/eeopost.pdf).


To view the "EEO is the Law" Supplement, CLICK HERE (https://www.dol.gov/ofccp/regs/compliance/posters/pdf/OFCCP\_EEO\_Supplement\_Final\_JRF\_QA\_508c.pdf).



Bank of America aims to create a workplace free from the dangers and resulting consequences of illegal and illicit drug use and alcohol abuse. Our Drug-Free Workplace and Alcohol Policy (Policy) establishes requirements to prevent the presence or use of illegal or illicit drugs or unauthorized alcohol on Bank of America premises and to provide a safe work environment.




To view Bank of America's Drug-Free Workplace and Alcohol Policy, CLICK HERE.
