SRE (.NET)

St. Louis, MO 63146

Posted: 12/17/2020
Employment Type: Contract To Hire
Category: Site Reliability Engineer
Job Number: 54789

Job Description


Site Reliability Engineering (SRE) is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations while adhering to company engineering principles.


Responsibilities
  • Engage in and improve the software development lifecycle, from inception and design through development, deployment, operation and refinement, for greater reliability
  • Influence and design infrastructure, architecture, standards and methods for large-scale systems
  • Support services prior to production via infrastructure design, software platform development, load testing, capacity planning and launch reviews
  • Maintain services during deployment and in production by measuring and monitoring key performance and service level indicators including availability, latency, and overall system health
  • Automate system scalability and continually work to improve system resiliency, performance and efficiency
  • Practice sustainable incident response as part of an on-call rotation and through blameless postmortems
  • Remediate tasks within corrective action plans through sustainable, preventive, and automated measures whenever possible

Qualifications
  • BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience
  • Experience developing and/or administering software in cloud infrastructure
  • Experience monitoring infrastructure and application uptime and availability to ensure functional and performance objectives are met
  • 5-7 years of experience in languages such as Python, Ruby, Bash, PHP, Perl, JavaScript and/or Node.js
  • Demonstrable cross-functional knowledge with systems, storage, networking, security and databases
  • System administration skills, including automation and orchestration of Linux/Windows using Chef, Puppet, Ansible, SaltStack and/or containers (Docker, Kubernetes, etc.)
  • Proficiency with continuous integration and continuous delivery tooling and practices
  • Strong analytical and troubleshooting skills

Preferred Qualifications
  • Expertise designing, analyzing and troubleshooting large-scale distributed systems
  • Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
  • Experience managing infrastructure as code via tools such as Terraform or CloudFormation
  • A passion for automation with a desire to eliminate toil whenever possible
  • Experience building software or maintaining systems in a highly secure, regulated or compliant industry
  • Experience with and passion for working within a DevOps culture and as part of a team