Senior SRE – Sofia, BG

Location: Sofia, Bulgaria

About Xtravirt

Xtravirt is a UK leader in private cloud consulting and managed services, specialising in VMware Cloud Foundation and VMware-based digital infrastructure. For over 18 years, Xtravirt has supported enterprises across multiple sectors with the design, delivery and ongoing optimisation of secure, scalable and efficient private cloud environments. Recognised as a Broadcom Pinnacle Partner, Xtravirt is committed to helping organisations accelerate their digital transformation and realise the full value of their cloud investments. 

Senior Site Reliability Engineer (SRE) - the role, purpose and mission

We have an exciting opportunity for an experienced Senior Site Reliability Engineer to join our SRE team. Site Reliability Engineering (SRE) is a key function within Xtravirt, ensuring our customers’ cloud platforms are secure, resilient, automated, and optimised for performance at scale. Working closely with our Cloud Automation, Architecture, Advisory and Managed Services teams, SRE sits at the heart of operational excellence – driving reliability, availability, and continuous improvement across enterprise environments.

As a Senior SRE, you will play a critical role in designing, implementing, automating and maintaining highly available private cloud platforms, primarily built on VMware technologies. You will combine deep VMware and infrastructure expertise with strong consultancy skills, and be responsible for ensuring the stability, scalability, security and automation of enterprise cloud platforms.

Your mission is to improve system reliability, reduce operational toil through automation, and act as a trusted advisor to customers – ensuring their cloud environments are future-ready, resilient and aligned to best practices.

By joining our team, you will receive a competitive remuneration package, benefits, and a personalised development plan to help you achieve your professional goals.

Key Responsibilities

  • Act as a trusted SRE consultant for enterprise customers, understanding operational challenges and delivering reliable, scalable solutions.
  • Maintain high availability, stability and performance of VMware-based private cloud platforms.
  • Plan, design and implement seamless upgrades and lifecycle management activities for VMware Cloud Foundation environments.
  • Design and deploy automated solutions to enhance operational efficiency, reduce manual intervention and improve service reliability.
  • Define, gather and analyse key performance metrics, logs and alerts. Automate responses to incidents and performance anomalies.
  • Ensure infrastructure and services meet enterprise-grade security standards, including access control, auditing and reporting.
  • Support and validate third-party integrations within VMware Cloud Foundation ecosystems, identifying and mitigating interoperability risks.
  • Work closely with architects, project managers, service managers and customer stakeholders to deliver successful engagements.
  • Lead troubleshooting across compute, storage, networking, hypervisor and OS layers to resolve complex issues.
  • Support junior engineers, contribute to internal knowledge bases and promote continuous learning across the SRE function.
  • Identify opportunities to enhance processes, tooling and delivery methodologies.

Key Skills & Experience

The position requires strong technical depth, excellent communication skills and a proactive, ownership-driven mindset. Candidates should demonstrate:

  • Expertise in at least one major cloud platform (VMware, AWS, Azure or GCP)
  • Knowledge of networking, protocols, the OSI model and DNS
  • Knowledge of security and cryptography, CA and AD
  • Administration and troubleshooting experience with Linux-based systems
  • Excellent problem-solving, communication and collaboration skills in a multinational setting
  • An independent, self-driven approach, with the ability to adapt and learn new products and technologies quickly
  • Knowledge of containerisation and orchestration (e.g. Kubernetes)
  • Experience and/or certifications in the VMware Cloud Foundation (VCF) platform and the VMware product portfolio

Candidates should demonstrate a strong operational mindset, the ability to balance technical depth with business needs, and a passion for reliability engineering.

Benefits of working with Xtravirt

  • 25 days’ holiday plus Bank Holidays
  • Hybrid working
  • Healthcare, pension and life assurance
  • Employee benefits platform
  • Long service awards
  • Regular evaluation and salary reviews
  • Refer-a-friend bonus scheme
  • Staff recognition awards
  • Regular company and social events

Apply today

Become part of the team