Data Engineering Lead
Daydream
Location
NYC
Employment Type
Full time
Department
Engineering
About Daydream
Daydream is the first chat-based shopping agent built exclusively for fashion. Designed to redefine how people search for and discover fashion, Daydream offers a personalized, conversational experience powered by advanced AI and natural language understanding.
Backed by top-tier investors including Forerunner Ventures, Index Ventures, Google Ventures, and True Ventures, our team is committed to shaping the future of shopping.
About the role
Are you passionate about the intersection of high fashion and cutting-edge artificial intelligence, and about building the data foundations that power truly intelligent systems? As a Data Engineering Lead at Daydream, you will be a foundational member of the team, responsible for designing and building the entire data ecosystem that fuels our AI Personal Stylist. This is a unique opportunity to solve complex technical challenges while directly shaping a product that will revolutionize how people shop online.
What you’ll do
Design, build, and optimize scalable, parallel data processing pipelines on Google Cloud to handle massive volumes of offline data.
Implement and manage large-scale LLM batch inference jobs, processing millions of data points to enrich our product catalog with sophisticated, AI-generated attributes.
Architect and own the data infrastructure for our Fashion Knowledge Graph, leveraging BigQuery and parallel data processing frameworks.
Develop and maintain robust feature generation pipelines to craft high-quality signals for both the training and inference of our machine learning models.
Orchestrate complex workflows of data processing jobs, implementing robust monitoring, alerting, and data quality validation systems to ensure reliability and trust in our data.
Collaborate closely with data science and machine learning teams to understand data requirements and deliver production-grade data solutions.
Champion engineering best practices, including writing clean, maintainable Python and SQL, and drive a culture of high-quality data and operational excellence.
Who you are
You have extensive experience building and deploying data solutions on a major cloud platform (preferably Google Cloud Platform).
You are highly proficient with distributed data processing frameworks such as Apache Spark, Flink, or Polars.
You possess exceptional Python coding skills, with a deep understanding of writing efficient, testable, and maintainable code for data applications.
You have expert-level SQL skills and deep experience with modern cloud data warehouses like BigQuery, Snowflake, or Redshift.
You have hands-on experience with workflow orchestration tools like Airflow, Argo, or Kubeflow.
You are a pragmatic and proactive builder who thrives in a fast-paced, autonomous startup environment, capable of driving projects from concept to production.
You are an empathetic and collaborative teammate, skilled at communicating complex technical ideas and passionate about building the reliable infrastructure that empowers your colleagues.
You are a natural leader who enjoys mentoring and developing teammates, matching work to growth opportunities while keeping priorities aligned with broader company goals.
What we offer
Competitive salary, equity and benefits (medical, dental, vision, 401k, etc.)
Flexible vacation and remote working
The opportunity to be part of a groundbreaking, AI-focused company
Collaborative work environment with a team of talented, fun-loving individuals
Opportunity to learn and grow in your career while shaping the future of fashion, shopping and technology
Commitment to Diversity
Daydream is proud to be an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees, regardless of race, color, religion, gender, sexual orientation, gender identity, age, national origin, or disability status.