- Cluster Design: Architect scalable GPU cluster topologies including compute nodes, interconnect (InfiniBand, Ethernet), storage, and control planes
- Performance Modeling: Analyze AI/ML workloads (e.g., LLM training, inference) to inform design tradeoffs across latency, bandwidth, and GPU density
- Network Architecture: Align designs with network architects and validate low-latency, high-throughput interconnects (e.g., InfiniBand HDR/NDR, RoCEv2) at pod and data-center scale
- Storage Integration: Work with storage teams to optimize performance for training datasets, checkpointing, and other I/O-intensive workloads
- Reliability & Monitoring: Analyze signals from monitoring systems to detect flaws in the design
- Collaboration: Partner with site reliability, networking, storage, and DC engineering teams to operationalize and scale your architecture