Data Engineer

wynd-labs · Engineering
Remote · Full-time

About this role

Who We Are:

We build infrastructure that delivers massive amounts of web data to the companies training the world’s most powerful AI models.

We're the team that helps to power and support Grass, a bandwidth-sharing network that lets us operate a massive distributed crawler, giving us unique access to high-quality public web data at global scale. On top of that, we’ve built pipelines for ingesting, segmenting, and annotating billions of videos, transcripts, and audio files, powering dataset creation for frontier labs.

We’re lean, technical, and move fast. No red tape, no slow decision-making; just a team of builders pushing to expand what’s possible for open web data and AI.

The Role.

We are seeking a Data Engineer with expertise in building and maintaining robust data pipelines and integrating scalable infrastructure. You will join a small, talented team and play a critical role in designing and optimizing our data systems, ensuring seamless data flow and accessibility. Your contributions will directly support our mission to position Grass as a key player in the evolution of data-driven innovation on the internet.

Who You Are.

  • Bachelor’s degree in Computer Science, Information Systems, Data Engineering, or a related technical field.

  • Extensive experience with database systems such as Redshift, Snowflake, or similar cloud-based solutions.

  • Advanced proficiency in SQL and experience with optimizing complex queries for performance.

  • Hands-on experience with building and managing data pipelines using tools such as Apache Airflow, AWS Glue, or similar technologies.

  • Solid understanding of ETL (Extract, Transform, Load) processes and best practices for data integration.

  • Experience with infrastructure automation tools (e.g., Terraform, CloudFormation) for managing data ecosystems.

  • Knowledge of programming languages such as Python, Scala, or Java for pipeline orchestration and data manipulation.

  • Strong analytical and problem-solving skills, with an ability to troubleshoot and resolve data flow issues.

  • Familiarity with containerization (e.g., Docker) and orchestration (e.g., Kubernetes) technologies for data infrastructure deployment.

  • Collaborative team player with strong communication skills to work with cross-functional teams.

What You'll Be Doing.

  • Designing, building, and optimizing scalable data pipelines to process and integrate data from various sources in real-time or batch modes.

  • Developing and managing ETL/ELT workflows to transform raw data into structured formats for analysis and reporting.

  • Integrating and configuring database infrastructure, ensuring performance, scalability, and data security.

  • Automating data workflows and infrastructure setup using tools like Apache Airflow, Terraform, or similar.

  • Collaborating with data scientists, analysts, and other stakeholders to ensure efficient data accessibility and usability.

  • Monitoring, troubleshooting, and improving the performance of data pipelines and infrastructure to ensure data quality and flow consistency.

  • Working with cloud infrastructure (AWS, GCP, Azure) to manage databases, storage, and compute resources efficiently.

  • Implementing best practices for data governance, data security, and disaster recovery in all infrastructure designs.

  • Staying current with the latest trends and technologies in data engineering, pipeline automation, and infrastructure as code.

Why Work With Us:

  • Opportunity. We are at the forefront of developing a web-scale crawler and knowledge graph that improves access to public web data and extends the value of AI to the people.

  • Culture. We're a lean team with a high bar. We come to work not to be comfortable, but to find out what we're capable of and to do work that matters. We're not calling for people who keep things moving. We're calling for people who make everyone around them better.
    We prioritize low ego and high output. This is a fully remote team.

  • Compensation. You’ll receive a competitive salary, benefits, and an equity package.

Frequently Asked Questions

Is the salary disclosed for the Data Engineer position at wynd-labs?
The salary for this Data Engineer role at wynd-labs is not publicly listed. Click "Apply Now" to learn more about the compensation package on their official careers page.
Is the Data Engineer job at wynd-labs remote?
Yes, this Data Engineer position at wynd-labs is remote. You can work from home or anywhere in the supported regions.
Is the Data Engineer role at wynd-labs full-time or part-time?
This is listed as a full-time position. It is posted as a Data Engineer role in the Engineering department at wynd-labs.
Which team or department does the Data Engineer at wynd-labs belong to?
This Data Engineer position is part of the Engineering department at wynd-labs. See the full job description for more information about the team structure and responsibilities.
How do I apply for the Data Engineer position at wynd-labs?
Click the "Apply Now" button on this page. You will be redirected to wynd-labs's official application portal, hosted on Ashby, where you can submit your application directly.
When was the Data Engineer job at wynd-labs posted?
This Data Engineer position at wynd-labs was posted on May 19, 2025. Apply as soon as possible — early applications are often reviewed first.