Lead DevOps Engineer
Atlanta, GA | Direct Hire
Atlanta, Georgia, United States · Engineering
You will be responsible for the build, support, operation, and maintenance of our production cloud-based messaging services. You will ensure our historical 99.99%+ availability doesn’t falter as we grow beyond 10,000 transactions per second and billions of messages per month. Alongside our development and QA teams, you will analyze, design, and automate our continuous delivery pipeline for peak efficiency.
The Lead DevOps Engineer is responsible for defining and implementing tools and processes for monitoring, alerting, automation, scalability, security, and high availability for hybrid (on-premises and cloud-based) SaaS solutions.
- Learn in detail the existing infrastructure and application architecture for all components
- Quickly become an expert in the products and features we offer
- Actively contribute to requirements specifications for new and enhanced product features to ensure Operations requirements are captured and satisfied
- Work closely with Java Developers to ensure implemented solutions are robust and scalable
- Seek out and implement solutions for automation of application build, test, deployment, and configuration management
- Document operations support procedures and train staff
- Assist Sales and Support organizations as required
- Lead cross-functional teams in triage and resolution of production incidents and anomalies
- Provide tools, dashboards, and reports that clearly show current and historical operational status and metrics
- Implement, refine, and evolve tools and processes to maintain five-nines availability for 24×7 cloud-based SaaS solutions
- Minimum 7 years of experience operationally supporting enterprise or commercial applications
- Minimum 4 years of experience in a lead operations role
- Bachelor’s Degree in Computer Science or Engineering, Master’s preferred
- Outstanding Linux system administration and tuning skills
- Familiarity with multiple virtualization tools and techniques
- Can quickly diagnose and resolve the most insidious infrastructure issues, then take appropriate actions to prevent a recurrence
- Versatile scripter for administration, operations, and monitoring tasks
- Experience leading product selection and deployment for operations monitoring, alerting, system dashboard, and reporting tools
- Experienced in holistic design concepts including SOA, loose coupling, message queuing, scalable redundancy, geographically distributed systems, etc.
- Very comfortable and productive with Agile processes
- Competent and active proponent of DevOps principles
- Strong advocate of build, test, deployment, and configuration automation
- Some familiarity with deploying and operating applications in AWS, including EC2, VPC, IAM, Route 53, ELB, CloudWatch, CloudFormation, S3, Kinesis, RDS/MySQL and Aurora, Redshift, and Data Pipeline
- ITIL v3 trained and certified