Cloud Operations Engineer

You are a Technical Support Engineer who is passionate about customers and their experience with our services. You have breadth (rather than depth) of technical knowledge, and you are responsible for managing incidents and technical operations on our production platform. You understand our overall systems architecture and can drive swift resolution of incidents by coordinating with various technical teams (DevOps, Infrastructure, Engineering, Suppliers …). You have experience in automation and will drive improvements that streamline monitoring, alerting and incident resolution processes, in collaboration with our DevOps and Engineering teams.

Technical skills

  • Strong experience in Incident & Event Management (NOC, App Support...)
  • Experience supporting and troubleshooting 24×7 high-volume transactional web applications
  • Knowledge of Windows and Linux systems
  • Experience with cloud infrastructure and platform services (we run on AWS and Azure)
  • At least one of the following: inter-system, microservices and message bus technologies
  • APM systems such as Dynatrace, AppDynamics and/or New Relic
  • Alerting tools such as PagerDuty
  • Experience with scripting languages such as Python, Bash and PowerShell

You will

  • Monitor our Production systems and react to alerts swiftly
  • Work with the Tech teams to ensure 24×7 availability of our product platform
  • Participate in the development of monitoring & alerting strategies with the DevOps team across multiple cloud environments, in particular AWS, using advanced monitoring tools like AppDynamics
  • Manage incidents: categorization, triage, resolution and escalation
  • Communicate appropriately with our business stakeholders on incidents (Customer Service...)
  • Participate in an on-call/shift rota
  • Implement automation with our DevOps team in order to work more effectively and efficiently
  • Perform various Technical Operations in collaboration with the DevOps and Infrastructure teams (patching, log management, space management ...)
  • Develop various technical runbooks in collaboration with other tech teams
  • Participate in the continuous improvement of our operational processes (Incident, Problem, Change ...)
  • Provide input to Post-Incident Reviews / Post-Mortems and take the initiative to prevent and reduce incidents

We offer

  • Friendly environment and team
  • High compensation
  • Flexible schedule to support work-life balance
  • Fully remote work or a mix of remote and office work
  • Additional benefits such as health insurance, sports and equipment
  • English-speaking club with a native speaker
