Baselayer

Data Engineer

Hybrid · San Francisco · Data · Posted 15d ago · Mid

Baselayer seeks a Data Engineer to build and scale data infrastructure for its intelligent business identity platform. This hands-on role involves owning pipelines and data systems that power analytics, reporting, and ML, with a focus on reliability and data quality in a regulated environment. The role is hybrid in San Francisco and offers equity and a competitive salary.

Annual salary: $122,000 – $167,000 + equity · health · 401k match

Responsibilities

Design, build, and maintain scalable data pipelines that ingest, clean, validate, and transform data from internal systems and external sources
Own data reliability and quality through monitoring, alerting, lineage, and validation frameworks
Build and maintain data models and curated datasets supporting analytics, dashboards, customer reporting, and ML use cases
Partner with Engineering to define best practices for data architecture, storage, access controls, and performance
Implement orchestration and scheduling for batch and near-real-time workflows

About the role

About Baselayer

Trusted by 2,200+ financial institutions, Baselayer is the intelligent business identity platform that helps verify any business, automate KYB, and monitor real-time risk. Baselayer’s B2B risk solutions and identity graph network leverage state and federal government filings and proprietary data sources to prevent fraud, accelerate onboarding, and lower credit losses.

About the Role

We are looking for a Data Engineer to build and scale Baselayer’s data infrastructure. You will own the pipelines and data systems that power analytics, reporting, and machine learning across the company, with a focus on reliability, performance, and data quality. This role is hands-on and highly cross-functional.

You will work closely with Product and Engineering to ensure data is accessible, trusted, and delivered in a way that supports product capabilities in a regulated environment.

What You’ll Do

Design, build, and maintain scalable data pipelines that ingest, clean, validate, and transform data from internal systems and external sources
Own data reliability and quality through monitoring, alerting, lineage, and validation frameworks
Build and maintain data models and curated datasets that support analytics, dashboards, customer reporting, and downstream ML use cases
Partner with Engineering to define best practices for data architecture, storage, access controls, and performance
Implement orchestration and scheduling for batch and near-real-time workflows as needed
Optimize pipeline performance, cost, and scalability as data volumes grow
Develop and maintain documentation and runbooks for pipelines, datasets, and operational procedures
Identify data gaps and instrumentation needs, and work with engineering teams to improve event capture and logging

About You

You want to learn fast, take ownership, and do work that matters.

You are not just doing this for the win. You are doing it because you have something to prove and want to be great. You thrive in the details, care about correctness, and take pride in building robust systems that other teams can rely on. You operate with urgency, handle ambiguity well, and consistently raise the bar on data quality and reliability.

About Baselayer

Visit job-boards.greenhouse.io for more.
