Open Position: Senior Machine Learning Engineer

San Francisco

CircleUp harnesses the power of machine learning and predictive analytics to discover the fastest-growing companies in the consumer & retail sector. We are building a predictive data system called "Helio" to bring the data-driven revolution that has occurred in the public markets to the private markets, starting with consumer & retail.

We are working on challenging problems in information retrieval, entity resolution, and the construction of an in-depth knowledge graph of all private companies. We are mining vast amounts of data to rewrite the rules on how private companies are evaluated.

With a background in both software development and machine learning, you will collaborate with software engineers, data scientists, PMs, and domain experts in consumer investing to develop & ship predictive data products in support of CircleUp's mission: helping entrepreneurs thrive by connecting them with the capital & resources they need.

You are a passionate software engineer who is comfortable working across the full stack & lifecycle of predictive data products: prototyping, feature engineering, validation, and productionalization. You are comfortable working with data at scale using tools such as Spark, Dask, or Hadoop.


  • Collect and refine structured and unstructured data on private companies
  • Build out our entity resolution platform, which identifies references to companies, brands, and products in highly unstructured digital documents and links them back to real-world entities
  • Build scalable production systems for data collection, data transformation, feature extraction, model training, and scoring, using distributed software tools
  • Build end-to-end algorithms for objectively measuring the quality of private consumer companies
  • Contribute to all phases of algorithm development, including ideation, prototyping, design, and production

We're looking for teammates who have:

  • Bachelor's degree or higher in Computer Science, Information Retrieval, Natural Language Processing, Math, Statistics, or a related technical field
  • An 8+ year track record of building and delivering production-quality software systems
  • 2+ years of experience in machine learning, NLP and/or information retrieval, and broad knowledge of machine learning APIs, tools, and open source libraries
  • Excellent coding skills and strong fundamentals in algorithms, data structures, predictive modeling, and big data concepts
  • Experience with distributed data processing frameworks such as Dask, Spark, or Hadoop
  • Experience with entity resolution or knowledge graph problem spaces (a huge plus)

