Site Reliability Engineer

Minsk

Description:

Our customer is the world’s first Open Source Online Video Platform, providing video solutions, software and services for video creation, publishing, management, engagement, monetization and analysis.

We need an experienced SRE to join our growing OTT-SRE team, which manages the company's CloudTV product, a platform that is revolutionizing the TV industry.

The ideal candidate will have hands-on experience in managing large-scale production environments, with a focus on automation, and a passion for continuous improvement.

 

Responsibilities:

  • As part of the OTT-SRE team, you will own all aspects of a large-scale, AWS-based production environment that must meet a demanding SLA.
  • You will achieve this by leveraging technological innovation, both from external sources (such as AWS) and from internal development.
  • You will need excellent troubleshooting and investigation skills to identify the root cause of issues and resolve them at the source, working closely with our R&D developers.

Requirements:

  • Experience working in production cloud environments (AWS, GCP).
  • Knowledge of Docker and Kubernetes (K8s) management.
  • Web / application servers – Apache, Nginx.
  • Advanced networking knowledge – load balancers, firewalls, VPNs, TCP/IP.
  • Experience with provisioning tools: Terraform, CloudFormation, Chef.
  • Experience with scripting languages: Python, Bash, PowerShell.
  • Experience with monitoring tools such as Grafana, Prometheus, and Kibana – turning data into insights about system behavior.
  • Excellent written and spoken English.