Draper, UT

Full Time

4y ago

Company Description

FireEye is the intelligence-led security company. Working as a seamless, scalable extension of customer security operations, FireEye offers a single platform that blends innovative security technologies, nation-state grade threat intelligence, and world-renowned Mandiant® consulting. With this approach, FireEye eliminates the complexity and burden of cyber security for organizations struggling to prepare for, prevent, and respond to cyber attacks. Learn more about FireEye's world-class solutions and global footprint at https://www.fireeye.com/company.html.

Job Description

The SRE role at FireEye is critical to our success: it helps ensure service availability, identifies and automates manual processes, and bridges the gaps between product development teams and operations. Operational improvements large and small in availability, latency, performance, efficiency, change management, monitoring, incident response, and capacity planning are all within scope for this role. Whether through code, the introduction of modern tools, or better processes, continuous improvement and efficiency are the goal.

Solve problems relating to mission critical services and build automation to prevent problem recurrence, with the goal of automating response to all non-exceptional service conditions. SREs will be focused on maximum availability, reliability, security, and performance for FireEye cloud services. You will be an integral part of our Global Site Reliability Engineering team in Draper.

What You Will Do:

· Collaborate with product developers to ensure non-functional requirements for availability, performance, security, and maintainability are met.

· Work with release engineers to ensure the software delivery pipeline is as efficient as possible.

· Develop tools that improve monitoring, telemetry, visualization, alerting, and reporting.

· Design, write and deliver software to improve the availability, scalability, latency, and efficiency of FireEye’s cloud services

· Influence and create new designs, architectures, standards, and methods for large-scale distributed systems

· Collaborate with a world-class engineering team to propose features that solve recurring patterns of customer complaints

· Engage in service capacity planning and demand forecasting, software performance analysis and system tuning

· Participate in the on-call rotation. Participate, collaborate, and provide guidance in retrospectives.

· Find scalability bottlenecks and areas for performance improvements

· Strong team player with a high degree of flexibility.

· Systematic problem-solving approach, coupled with a strong sense of ownership and drive

Qualifications

· 8 years’ experience and industry cloud certifications

· Expert in Azure Architecture and Security management and automation. AWS Certifications preferred.

· Experience in Cloud Software Engineering, Cloud Site Reliability Engineering, & Cloud Operations

· Must be a US Citizen (Contractual Requirements)

· Production experience with Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) technology stacks.

· Blend of both Development and SRE mindset (i.e., software and infrastructure)

· Experience with Go/Python (strongly preferred), Perl or Ruby, or Java/C++ (one of the OOP languages), specifically for systems automation

· Experience in cloud provisioning code development and tools (Terraform, CloudFormation, etc.)

· Networking: knowledge and understanding of network theory, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing

· Firm grasp of at least one modern programming language (Java/Go/Python/Ruby), beyond basic scripting (Shell, Perl, Bash)

· Solid experience using configuration management frameworks (e.g., Ansible/Chef/Puppet)

· Experience releasing software through tooling (git, Jenkins, custom scripts, Docker)

· Experience with algorithms, data structures, complexity analysis and software design.

· Systematic problem-solving skills coupled with a strong sense of ownership and drive.

· Procedural and troubleshooting documentation skills

Additional Qualifications:

· Expertise in designing, analyzing and troubleshooting large-scale distributed systems

· Familiarity with running web services at scale; understanding of Unix systems internals and networking, AWS, Azure.

· Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols along the way

· Good knowledge of virtualization technologies and container technologies

· Experience with containers and HA clusters; experience with Docker and Amazon ECS/Kubernetes/Mesosphere/Docker Swarm a plus

· Experience with APM solutions

· Basic understanding of Jira.

· Cloud certification or equivalent experience is preferred

Additional Information

At FireEye we are committed to our #OneTeam approach combining diversity, collaboration, and excellence. All qualified applicants will receive consideration for employment without regard to race, sex, color, religion, sexual orientation, gender identity, national origin, protected veteran status, or on the basis of disability.

Minimum Salary: $102,800. Final salary will be determined commensurately with cost of living, experience level, and/or any other legally permissible considerations.

Incentive Compensation: Eligibility for annual bonus subject to individual and company performance; eligibility for award of Restricted Stock Units subject to eligibility requirements, approval from FireEye’s Compensation Committee, and vesting terms.

Benefits: Employer subsidized benefits include Medical, Dental, Vision, Life, and Disability Insurance. Subject to eligibility requirements, FireEye also offers the ability to participate in 401(k), Flexible Spending Accounts, Health Savings Accounts, Dependent Care Spending Accounts, and Employee Stock Purchase Program. FireEye also provides Paid Time Off, Flexible Paid Sick Time, and Paid Holidays.

Azure Site Reliability Engineer

Mandiant
