Senior Data Engineer

San Francisco, CA

Data Acquisition & Ingestion Team Overview

The Data Acquisition & Ingestion team is responsible for collecting, ingesting, and normalizing the data that powers Helio. We are building an automated ingestion system that scales to reliably onboard, and extract value from, hundreds of disparate data sources. We also champion high-quality data practices across all of Helio’s data pipelines.

Some of the technologies we leverage are Python 3, PySpark, AWS, Docker, Kubernetes, Postgres, Airflow, Jenkins and Git.

We are looking for a Senior Data Engineer to help us design and develop data pipelines that ingest, validate, extract, and normalize data across new and existing sources. The ideal candidate will be self-driven and comfortable balancing progress on a longer-term roadmap with maintaining context and stability across a dynamic set of existing data sources. Although this is an individual contributor role, we are looking for someone who is excited to mentor junior engineers.

CircleUp has been named one of the Top 5 Most Disruptive Companies in Finance by CNBC, one of the 50 Best FinTech Innovators by KPMG, one of the Top 3 Most Innovative Companies within Data Science by Fast Company, and one of America's Most Promising Companies by Forbes. We are backed by top-tier investors including Google Ventures, Union Square Ventures (backers of Etsy/Kickstarter), and the former CEOs/Presidents of Goldman Sachs, Morgan Stanley, Thomson Reuters, the Stanford Endowment, and Capital One.

Responsibilities

Provide senior-level contribution to the design, implementation and maintenance of complex data pipelines

Build reliable services for gathering & ingesting data from a wide variety of sources

Build performant and reliable data pipelines to validate, extract and normalize data from a wide variety of sources

Develop strategy, tools, and workflow for integrations and ingestion of data

Collaborate with cross-functional teams and stakeholders to understand data needs

Write quality, maintainable code with extensive test coverage in a fast-paced, agile software engineering environment

Mentor junior teammates and lead by example in demonstrating software engineering best practices

Requirements

B.S. or M.S. in Computer Science, or an equivalent degree

5-7+ years of proven working experience as a data engineer

Excellent software engineering skills and strong fundamentals in algorithms, data structures, predictive modeling and big data concepts

Strong programming fundamentals and proficiency in an object-oriented language such as Python or Scala

Excellent communication skills to collaborate with stakeholders in engineering, data science, and product

Nice to Have

Experience with our stack (Python, PySpark, Airflow, AWS ecosystem) is preferred but not required

Experience building large-scale and complex data processing pipelines

A successful history of manipulating, processing and extracting value from large disconnected datasets

Strong analytical skills related to working with unstructured datasets

Useful Traits for this Role

Communication, both technical and business-level, especially with external contractors

Detail-oriented, with business sense and the ability to manage ambiguity; able to synthesize detailed schema specifications from a newly identified source

Ability to understand, maintain, and document a large variety of data sources; able to handle a certain level of reactive context-switching

Proactive and driven; identifies gaps in our data model and works to improve it
