Production Support & Reliability Engineer

Cyberhaven

Hybrid · DevOps · USA · $130k-$150k

Responsibilities

This Production Support & Reliability Engineer role is focused on customers who deploy Cyberhaven’s Content Inspection (CI) stack in their own cloud or datacenter environments (AWS, GCP, Azure, Kubernetes)
You will be the primary L3 owner for on‑prem CI deployments and incidents, especially as DSPM on‑prem connectors become the norm
You must be deeply technical, comfortable with Linux, containers, Kubernetes, and virtual appliances, and enjoy operating at the intersection of support, SRE, and product engineering
This role reports to the Director of Technical Support
Act as the primary L3 owner for all support cases involving on-prem Content Inspection (CI), including deployments, upgrades, performance, correctness, and stability
Own customer issues end-to-end, from initial triage and reproduction in lab or test clusters through root-cause analysis, remediation, or escalation
Serve as the internal and external point of contact for high-severity CI incidents, joining customer bridges and coordinating closely with SRE, R&D, and Product until resolution
Lead or co-lead on-prem CI deployments and upgrades alongside Professional Services and SRE, validating prerequisites, reviewing Kubernetes and Helm configurations, and coordinating maintenance windows and rollback plans
Monitor and support on-prem CI environments, understand CI metrics and logs, and drive first-line response to capacity, health, latency, and stability issues
Act as the L3 specialist for DSPM on-prem connectors that depend on CI, validating new capabilities and supporting large-scale and design-partner deployments
Design, build, and maintain a Support-owned on-prem CI lab across AWS, GCP, and Azure to reproduce customer issues, validate fixes, and test upgrades and mitigations
Create and maintain internal runbooks and knowledge base documentation for CI deployments, upgrades, troubleshooting, and escalation best practices
Enable internal teams by training Support, Professional Services, CX, SRE, and R&D on CI workflows, escalation patterns, and customer-facing considerations
Drive supportability improvements by filing and owning engineering feedback related to diagnostics, logging, health checks, and safer upgrade and rollback behavior

Partner cross-functionally with SRE, R&D, and Product Management to improve CI reliability and scalability, align releases with customer commitments, and participate in design and readiness reviews

Requirements

5+ years of experience in technical support, SRE, or production operations for a SaaS or security product
Experience deploying and supporting virtual appliances or on‑prem products in enterprise environments
Comfortable reading and interpreting logs and metrics from distributed systems; able to form and test hypotheses quickly
Able to manage high‑priority incidents calmly, communicate clearly with enterprise customers, and coordinate multiple internal teams
Curious, self‑directed, and comfortable taking ownership of ambiguous problems until they are fully resolved
Excellent written and verbal communication skills; able to turn repeated patterns into clear runbooks and training material

Joining Cyberhaven is a chance to revolutionize data security

Skills

Deep, hands‑on experience with Linux, containers, and Kubernetes (EKS, GKE, AKS, or self‑managed clusters)

Strong understanding of networking (load balancers, TLS, DNS, proxies, firewalls) in hybrid/on‑prem settings

Salary

$130,000-$150,000

Published: January 11, 2026