Data QA Engineer
Required Skills
automated data testing
ETL/ELT pipeline testing
SQL
Python
data validation
data quality monitoring
BI/reporting tools
data modeling
cloud data platforms (e.g., Snowflake, BigQuery, Redshift)
Airflow
dbt
data governance
data privacy and compliance
documentation
root cause analysis
attention to detail
cross-team collaboration
remote work effectiveness
Job Description
Job Summary:
Join our team as a Data QA Engineer and play a crucial role in ensuring the quality and reliability of our data products. You will be responsible for designing and executing automated and targeted manual tests to validate data pipelines, transformations, models, and business intelligence outputs. This role empowers the business to make confident, data-driven decisions by safeguarding the accuracy and trustworthiness of our data engineering deliverables.
Key Responsibilities:
- Design, develop, and maintain automated data quality tests for ETL/ELT pipelines and analytical data models.
- Perform in-depth manual verification on high-impact data flows, reports, and dashboard outputs.
- Implement robust data quality monitoring, alerting, and validation frameworks to catch anomalies early in the data lifecycle.
- Collaborate with data engineers and analysts to understand business requirements and translate them into effective QA processes.
- Thoroughly validate data transformations, aggregations, and outputs to uphold integrity and consistency.
- Create clear, detailed documentation and actionable reports on data quality issues and improvements.
- Continuously refine QA strategies and tools to keep pace with the evolving data landscape and current best practices.
Required Skills and Qualifications:
- Expertise in designing and implementing automated data testing solutions for data warehouses, data lakes, or large-scale ETL/ELT pipelines.
- Strong experience with SQL and scripting languages such as Python for data validation and testing automation.
- Solid understanding of data modeling concepts and BI/reporting tools.
- Familiarity with data quality monitoring frameworks and alerting mechanisms.
- Exceptional written and verbal communication skills, with a passion for clear documentation and cross-team collaboration.
- Meticulous attention to detail and a methodical approach to identifying root causes of data discrepancies.
- Track record of working effectively in a remote, distributed team environment.
Preferred Qualifications:
- Experience with cloud-based data platforms (e.g., Snowflake, BigQuery, Redshift) and modern orchestration and transformation tools (e.g., Airflow, dbt).
- Knowledge of data governance, privacy standards, and compliance best practices.
- Background in validating data for machine learning, analytics, or complex business reporting environments.
About micro1
micro1 is a data engine that helps AI labs train foundational models and enterprises build AI agents. We provide frontier evaluations and reinforcement learning environments used to improve LLM capabilities, as well as contextual evaluations used to monitor and improve AI agents in enterprise settings. Our data engine includes an AI recruiter agent that sources and vets domain experts, a data platform that enables rapid production of high-quality training data, and a pipeline performance system that ensures both quality and velocity.
Our goal is to have 1 billion people doing meaningful work by contributing their expertise to the development of frontier AI models. We’ve raised $40M+ in funding, and our AI recruiter has powered more than 1 million AI-led interviews as our global network of experts expands to form the human intelligence layer for AGI.