- As a Software Engineer on the Warehouse Pipeline team in the Data Stack group, you’ll build and iterate on our data import system
- Our import workers are built in Python: we pull data from APIs and databases in batches, process it in memory using Apache Arrow, and write it to object storage in open table formats
- You’ll build and maintain our source library and find creative ways to keep it manageable at scale
- You’ll revamp our schema management strategy and build resilient systems (e.g., logging, observability, testing)
- You’ll debug stateful data workflows by digging into Kubernetes (k8s) pod metrics and schedule jobs using Temporal.io
You will have the chance to push the boundaries of what our Data Stack team can do while ensuring we remain stable and production-ready