Company Description
Building the bank of tomorrow takes more than skills.
It means combining our differences to imagine, discuss, code, develop, test, learn… and celebrate every step together. Share our vibes? Join Swissquote to unleash your potential.
We are the Swiss leader in online banking, and we provide trading, investing and banking services to more than 500’000 clients through our high-performance, secure digital platforms.
Our 1,000+ employees work flexibly, without a dress code, in multicultural teams.
By having a huge impact on the industry, they grow their skills portfolio and boost their careers in a fast-paced environment.
We are all in at Swissquote. As an equal opportunity employer, we welcome candidates from all backgrounds, experiences and perspectives to join our team and contribute to our shared success.
Are you all in? Don’t be shy, apply!
Job Description
You will join the IT Observability & Performance team at Swissquote, whose mission is to deliver situational awareness via telemetry, detection and forecasting.
This entails collaboration with cross-functional teams, as well as IT management, to collect actionable telemetry data, drive cost optimization through FinOps practices, and empower metric-driven decision-making. Your expertise will shape a proactive, agile, and high-performing IT environment, ensuring the reliability and efficiency of our financial systems.
As a Platform Engineer, you will design, implement and manage advanced telemetry solutions. Your expertise will help build towards our vision of a self-service platform.
You will also play a pivotal role in analyzing system performance, enabling root-cause analysis, and fostering continuous improvement in our IT infrastructure, all while aligning with Site Reliability Engineering (SRE) principles as outlined in the SRE Handbook.
- Develop and deploy telemetry frameworks using tools like ELK Stack, Grafana, and Prometheus to monitor system performance, availability, and reliability.
- Design and implement alerting mechanisms with tools like PagerDuty to enable rapid anomaly detection and response.
- Analyze telemetry data to identify trends, performance bottlenecks, and potential issues, providing actionable insights.
- Enable teams to perform root-cause analysis and proactively detect performance issues through dashboards that non-specialists can read, enhancing system resilience.
- Support IT management in automating and tracking Service Level Objectives (SLOs), Key Performance Indicators (KPIs), and error budgets in alignment with SRE principles.
- Drive FinOps initiatives by optimizing observability-related costs for our internal cloud and implementing self-service metrics, logs, and traces.
- Generate comprehensive reports for IT management on system health, incident trends, compliance requirements and regulatory needs.
- Contribute to continuous improvement by recommending and implementing telemetry-driven enhancements to IT infrastructure.
Qualifications
Minimum Qualifications
BS/MS in Computer Science, Engineering, or a related technical field involving programming (e.g., Physics, Mathematics), or equivalent experience.
Knowledge and hands-on experience with:
- Infrastructure as Code and GitOps principles, with tools like GitHub Actions, Ansible or Terraform
- Observability, with tools like ELK, Grafana, Prometheus or OpenTelemetry
- Alerting and on-call, with tools like Nagios, PagerDuty or incident.io
Strong knowledge of development, operations, networking, storage, or security.
Proficiency in at least one programming language such as Python, Go, Rust, Java, or Bash.
Systematic approach to problem-solving and a strong sense of ownership, accountability, and communication.
Preferred Qualifications
Experience deploying and managing observability solutions in Kubernetes, containerized environments, or standalone VMs.
Understanding of modern IT infrastructure (Kubernetes, containers, service mesh, standalone VMs).
Expertise in defining and implementing SLOs, KPIs, and error budgets following SRE principles.
Familiarity with FinOps practices and tools like OpenCost for cost optimization.
Proficiency with Infrastructure as Code (IaC) tools like Terraform or Ansible for maintaining observability infrastructure.
Ability to quickly learn and adopt emerging technologies, methodologies, and solutions.
Knowledge of distributed tracing tools (e.g., APM, OpenTelemetry, Jaeger, Zipkin) and their application in complex architectures.
Additional Information
Why join us?
At Swissquote, you’ll work in a dynamic, innovative environment where your contributions directly impact the reliability and performance of mission-critical financial systems.
By applying SRE principles and leveraging cutting-edge observability tools, you’ll help drive our mission to deliver secure, scalable, and cost-efficient IT solutions.
If you are passionate about empowering teams through data, building robust systems that illuminate performance and risks, and driving IT forward with resilience and transparency, we’d love to hear from you.