
Data Engineer - AI trainer
Required Skills
data engineering
big data
hadoop
spark
kafka
cloud platforms
data pipelines
ai/ml applications
prompt engineering
generative ai
data curation
data quality monitoring
technical documentation
written communication
verbal communication
remote collaboration
problem-solving
troubleshooting
Job Description
Job Title: Data Engineer - AI trainer
Job Type: Part-time, Paid per task
Location: Remote
Job Summary:
Join our customer’s team as a Data Engineer - AI trainer and work at the intersection of data engineering and advanced AI training. You will play a pivotal role in designing and implementing data engineering solutions tailored to AI model development, drawing on your experience as a subject matter expert (SME) in big data.
Key Responsibilities:
- Collaborate with AI and engineering teams to create and refine high-quality prompts for training generative AI models focused on data engineering topics.
- Develop, curate, and optimize datasets to enhance AI learning and performance.
- Establish best practices for data engineering workflows that support scalable and robust AI solutions.
- Serve as the SME for big data technologies, providing guidance on architectural decisions and technical challenges.
- Design and document full-cycle prompt creation processes, ensuring clarity and reproducibility.
- Communicate complex technical ideas clearly, in writing and verbally, to technical and non-technical stakeholders.
- Continuously monitor, evaluate, and improve data pipelines and training data quality.
Required Skills and Qualifications:
- Proven experience in data engineering and big data environments (e.g., Hadoop, Spark, Kafka, cloud platforms).
- Expertise in constructing, maintaining, and optimizing data pipelines for AI/ML applications.
- Demonstrated ability to create technical documentation and clear instructional materials.
- Exceptional written and verbal communication skills, with strong attention to detail.
- Proficiency in prompt engineering for AI training, including data generation and validation.
- Ability to work independently and collaboratively within a remote, cross-functional team.
- Experience troubleshooting and resolving data-related issues efficiently.
Preferred Qualifications:
- Prior experience as an AI trainer or in AI/ML prompt engineering projects.
- Advanced degree in Computer Science, Data Engineering, or a related field.
- Familiarity with generative AI models and their data requirements.