Honeywell is charging into the Industrial IoT revolution with the establishment of Honeywell Connected Enterprise (HCE), building on our heritage of invention and deep, on-the-ground industry expertise. HCE is the leading industrial disruptor, building and connecting software solutions to streamline and centralize the assets, people and processes that help our customers make smarter, more accurate business decisions. Moving at the speed of software, we are creating, innovating and delivering solutions fast, challenging the way things have always been done, piloting new ways for all of us to work, and expecting our successes to set new standards for our customers and for Honeywell.
Be part of a team that designs, develops and integrates highly complex software functions within Honeywell HCE. You will use your experience and judgment to plan and accomplish goals. You will also generate innovative solutions in work situations, trying different and novel ways to deal with problems and opportunities.
KEY RESPONSIBILITIES
Lead cross-functional teams and projects to improve the agility, reliability and quality of our mission-critical software products
Lead software engineers focused on building, managing and maintaining global cloud infrastructure for a highly distributed SaaS platform
Hands-on design, analysis, development and troubleshooting of highly distributed, large-scale production systems and event-driven, cloud-based services
Primarily Linux administration, managing a fleet of Linux and Windows VMs as part of the application solutions
Participate in pull requests for site reliability goals
B2B SaaS operations
Advocate IaC (Infrastructure as Code) and CaC (Configuration as Code) practices within Honeywell HCE
Own reliability, uptime, system security, cost, operations, capacity and performance analysis
Monitor and report on service level objectives for a given application's services. Work with the business, technology teams and product owners to establish key service level indicators.
Ensure the repeatability, traceability and transparency of our infrastructure automation
Support on-call rotations for operational duties that have not been addressed with automation
Support healthy software development practices, including complying with the chosen software development methodology (Agile or alternatives) and building standards for code reviews, work packaging, etc.
Create and maintain monitoring technologies and processes that improve visibility into our applications' performance and business metrics and keep operational workload in check
Partner with security engineers to develop plans and automation that respond aggressively and safely to new risks and vulnerabilities
Develop, communicate, collaborate on and monitor standard processes to promote the long-term health and sustainability of operational development tasks
Participate in technical training events, game day scenarios and professional conferences
YOU MUST HAVE
7+ years of experience in system administration, application development, infrastructure development or related areas
5+ years of experience in SaaS Ops, managing software/cloud operations teams
5+ years of experience programming in languages like JavaScript, Python, PHP, Go, Java or Ruby
3+ years of experience with, and mastery of, infrastructure automation technologies (like Terraform, CodeDeploy, Puppet, Ansible, Chef)
3+ years of expertise in container and container-fleet orchestration technologies (like Kubernetes, OpenShift, AKS, EKS, Docker, Vagrant, etcd, ZooKeeper)
3+ years of experience with cloud- and container-native Linux administration, build and management, along with demonstrated expertise building and managing highly scaled production infrastructure in the cloud (Azure required; GCP, AWS, OpenStack a plus)
WE VALUE
Versatility in troubleshooting diverse hosting technologies is strongly desired, including web server platforms, application platforms, operating systems, network components, virtualization technologies, storage and database platforms
Expertise with cloud-based, continuous-deployment software development lifecycles (e.g. CI/CD)
Cloud database operations and deployment experience (RDS MySQL/Postgres/Aurora); caching operations and deployment experience (Memcached, Redis)
Expertise with Lean/Agile deployment processes (blue/green, ZDT, canary, load balancer/DNS strategies, A/B testing, feature flagging methodologies)
Familiarity with site and infrastructure monitoring systems (like ELK, Datadog, AppDynamics, New Relic, Splunk, Sumo Logic, Grafana)
Strong problem-solving, root cause analysis and systems engineering skills
Excellent presentation and communication skills
Ability to design and manage escalation response plans, from monitoring through react, respond, remediate and retrospect, in culturally aligned (proactive, customer-focused, collaborative, data-driven) ways
Expertise with SDLC branching, SCM and code deployment systems (Bitbucket, git/gitflow, Jenkins, CircleCI, Travis CI, etc.)
JOB ID: req357222
Category: Engineering
Location: 715 Peachtree Street, N.E., Atlanta, Georgia, 30308, United States
Exempt
Engineering (GLOBAL)
Honeywell is an equal opportunity employer. Qualified applicants will be considered without regard to age, race, creed, color, national origin, ancestry, marital status, affectional or sexual orientation, gender identity or expression, disability, nationality, sex, religion, or veteran status.
Title: Lead Architect, Site Reliability Engineering