Data Engineer

Core team
$80K - $150K/yr compensation

Required Skills

Python
Pandas
SQL
AI/ML

About micro1

micro1 connects domain experts to the development of frontier AI models. Real-world expertise is turned into training data, evaluations, and feedback loops that improve how models perform. AI labs and enterprises use micro1 to train models and build reliable AI agents through advanced evaluations and reinforcement learning environments. Experts contribute directly to how AI systems learn, reason, and perform across domains like finance, healthcare, engineering, and more. Our platform identifies and vets top talent through an AI recruiter, enabling high-quality contributions at scale.
Our goal is to enable 1 billion people to do meaningful work by applying their expertise to AI. We’ve raised $40M+ in funding, and our AI recruiter has powered over 1 million AI-led interviews as our global network of experts grows into the human intelligence layer for AI.

Job Description

Job Title: Data Engineer
Job Type: Full-time
Location: Remote

The Role

We are looking for a Data Engineer to support data infrastructure and experimentation in an AI research environment. In this role, you will build reliable data pipelines, explore datasets, and help transform raw data into structured formats that enable research and model development.

Key Responsibilities

  1. Design, build, and maintain scalable data pipelines to ingest, process, and transform data from multiple sources.
  2. Collaborate with AI researchers and data scientists to structure and prepare datasets for experimentation and model training.
  3. Develop and maintain data models, schemas, and storage systems optimized for large-scale datasets.
  4. Write efficient SQL queries and Python scripts to extract, transform, and analyze data.
  5. Ensure data quality, integrity, and reliability across data pipelines and storage layers.
  6. Implement data validation, monitoring, and automation workflows that support iterative research cycles.
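As a rough illustration of the pipeline work described above (ingest, transform, and validate data before it reaches researchers), here is a minimal Pandas sketch. The record fields, value ranges, and function name are hypothetical examples, not part of the role:

```python
import pandas as pd

def ingest_and_validate(records):
    """Build a DataFrame from raw records, normalize types, and
    drop rows that fail basic quality checks."""
    df = pd.DataFrame(records)
    # Normalize: parse timestamps and numbers, coercing bad values to NaT/NaN.
    df["created_at"] = pd.to_datetime(df["created_at"], errors="coerce")
    df["score"] = pd.to_numeric(df["score"], errors="coerce")
    # Validate: keep rows with a parseable timestamp and a score in [0, 1].
    valid = df["created_at"].notna() & df["score"].between(0, 1)
    return df[valid].reset_index(drop=True)

raw = [
    {"created_at": "2024-05-01", "score": "0.9"},
    {"created_at": "not-a-date", "score": "0.5"},  # dropped: bad timestamp
    {"created_at": "2024-05-02", "score": "1.7"},  # dropped: score out of range
]
clean = ingest_and_validate(raw)
print(len(clean))  # 1
```

In practice this kind of validation step would sit between ingestion and storage, with monitoring on how many rows each check drops.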

Required Skills and Qualifications

  1. Strong proficiency in Python and SQL.
  2. Experience designing and maintaining ETL/ELT pipelines.
  3. Solid experience with data manipulation libraries such as Pandas and NumPy.
  4. Experience working with structured and semi-structured datasets.
  5. Familiarity with relational databases such as PostgreSQL or MySQL.
  6. Strong analytical thinking and the ability to work in collaborative, research-driven environments.
  7. Excellent written and verbal communication skills.

Nice to Have

  1. Exposure to AI/ML workflows or research environments.
  2. Experience with data visualization tools such as Matplotlib, Seaborn, or Plotly.
  3. Familiarity with LLM-related data workflows (datasets for training, evaluation, or prompt experimentation).

Apply now

Please note that after completing the interview process, you’ll be added to our talent pool and considered for this and other roles that match your skills.

