Site Reliability Engineer (SRE)

About Finubit:

Finubit is a fast-moving startup creating the bank’s next-generation cloud platform: a modern, Kubernetes-native, AI-driven foundation that powers engineering for over a thousand developers. We’re rethinking how banks build, deploy, and operate systems at scale, combining GitOps, ChatOps, and AI automation to enable self-service, reliability, and observability across every environment. At Finubit, you’ll join a small, expert team building the backbone of a modern engineering organization, from platform automation to AI-based infrastructure orchestration.

About the Role:

As an SRE, you’ll help ensure the reliability, scalability, and performance of a multi-cluster Kubernetes ecosystem that powers the bank’s engineering platform. You’ll combine software engineering, observability, and automation to build systems that detect and prevent failures and heal themselves, powered by Temporal and AI ChatOps.

Responsibilities (What You’ll Do):

* Design reliability systems for multi-cluster Kubernetes environments.
* Build self-healing, failover, and incident-response automation using Argo Workflows + Temporal.
* Define and measure SLOs, SLIs, and reliability metrics.
* Operate observability tools: Prometheus, Grafana, Loki, Tempo.
* Implement incident playbooks and automation within ChatOps.
* Collaborate with developers to build resilience and performance into applications.

Requirements (What We’re Looking For):

* Understanding of Kubernetes, automation, and container orchestration.
* Familiar with Terraform/Terragrunt and GitOps.
* Comfortable with observability stacks (Prometheus, Grafana, Loki, Tempo).
* Proficient in Python or Go for tooling.
* Excited to apply AI and automation to reliability engineering.

Why You’ll Love Working Here:

* Define what reliability means for AI-driven cloud systems.
* Build automation that transforms operations into intelligent workflows.
* Join a collaborative team focused on learning, scale, and impact.




Full-time

IL (Israel, nationwide)

Posted 6 hours ago

Finubit
