Site Reliability Engineer
Point72 seeks a Senior Site Reliability Engineer to join its Technology team, focusing on automating operational workflows, building observability with Datadog, and partnering with development teams to enhance reliability. This role offers the chance to work with cutting-edge open source solutions in a leading global investment firm.
Responsibilities
- Design and implement automated operational workflows to improve system reliability and reduce manual intervention
- Build and maintain observability solutions using tools such as Datadog to deliver metrics, monitoring, alerting, and dashboards
- Partner with development teams to improve application reliability, deployment safety, and performance through SRE best practices
- Develop and maintain CI/CD pipelines and deployment automation using Bitbucket/Jenkins, GitHub Actions, and related tooling
- Engineer scalable solutions for production environments across Linux and Windows systems
About the role
JOB TITLE: Site Reliability Engineer
A Career with Point72’s Technology Team
As Point72 reimagines the future of investing, our Technology group is constantly improving our company’s IT infrastructure, positioning us at the forefront of a rapidly evolving technology landscape.
We’re a team of experts experimenting, discovering new ways to harness the power of open source solutions, and embracing enterprise agile methodology. We encourage professional development to ensure you bring innovative ideas to our products while satisfying your own intellectual curiosity.
- Operational and performance knowledge of SQL Server and MongoDB
- Familiarity with cloud platforms (AWS or similar) and hybrid architectures
- Solid understanding of networking concepts such as DNS, load balancing, and TCP/IP
- Experience working closely with application development teams in an SRE or DevOps role
- Experience with Kubernetes, OpenShift, and containerized workloads
- Knowledge of infrastructure‑as‑code tools (Terraform, CloudFormation, ARM)
- Experience implementing automated scaling and performance tuning
- Background in reliability engineering or DevOps in an enterprise environment
- Familiarity with security and compliance considerations in production systems
- Strong bias toward automation over manual processes
- Focus on improving long‑term reliability rather than reactive firefighting
- Comfortable owning systems end‑to‑end and driving improvements
- Clear communication skills with the ability to work effectively across engineering, platform, and operations teams
- Commitment to the highest ethical standards
About Point72
Point72 is a leading global alternative investment firm led by Steven A. Cohen. Building on more than 30 years of investing experience, Point72 seeks to deliver superior returns for its investors through fundamental and systematic investing strategies across asset classes and geographies. We aim to attract and retain the industry’s brightest talent by cultivating an investor-led culture and committing to our people’s long-term growth.