
Data Center Operations Manager
Required Skills
Facilities Operations
Electrical Systems (UPS, Generators, ATS, PDUs)
Cooling Systems
DCIM tools
About micro1
micro1 connects domain experts to the development of frontier AI models. Real-world expertise is turned into training data, evaluations, and feedback loops that improve how models perform. AI labs and enterprises use micro1 to train models and build reliable AI agents through advanced evaluations and reinforcement learning environments. Experts contribute directly to how AI systems learn, reason, and perform across domains like finance, healthcare, engineering, and more. Our platform identifies and vets top talent through an AI recruiter, enabling high-quality contributions at scale.
Our goal is to enable 1 billion people to do meaningful work by applying their expertise to AI. We’ve raised $40M+ in funding, and our AI recruiter has powered over 1 million AI-led interviews as our global network of experts grows into the human intelligence layer for AI.
Job Description
Job Title: Data Center Operations Manager
Job Summary
- Tier 3 (L3) Data Center Engineer – Physical Infrastructure
- Responsible for expert-level support, troubleshooting, and management of data center facility systems
- Covers power, cooling, racks, and environmental controls
- Acts as the final escalation point for complex facility-related issues
- Ensures high availability, reliability, and safety of the physical data center environment
Key Responsibilities
1. Advanced Facilities Support & Escalation
- Act as L3 escalation point for critical issues related to:
  - Power systems (UPS, generators, ATS, PDUs)
  - Cooling systems (CRAC/CRAH, chillers)
  - Environmental systems (temperature, humidity, sensors)
- Lead troubleshooting and restoration during major outages or failures
- Coordinate with facility vendors and contractors for resolution
2. Power & Electrical Systems Management
- Oversee operation and health of:
  - UPS systems and battery banks
  - Generators and fuel systems
  - ATS panels and electrical distribution
  - Rack PDUs and power circuits
- Review load balancing and confirm there is no single point of failure
- Validate redundancy (N+1, 2N configurations)
3. Cooling & Environmental Management
- Ensure optimal operation of cooling distribution
- Monitor temperature and humidity levels
- Optimize airflow management (hot/cold aisle containment)
4. Preventive Maintenance & Compliance
- Review and validate PM/PPM activities carried out by vendors
- Ensure compliance with maintenance schedules and OEM standards
- Maintain records of maintenance logs and reports
- Ensure adherence to safety and regulatory standards
5. Incident & Problem Management
- Lead root cause analysis (RCA) for facility-related incidents
- Ensure quick recovery from:
  - Power failures
  - Cooling issues
  - Environmental alarms
- Maintain detailed incident reports and corrective action plans
6. Change & Risk Management
- Review and approve facility-related changes (power, cooling, racks)
- Conduct risk assessments before any critical activity
- Ensure changes are executed during approved maintenance windows
7. Capacity Planning & Optimization
- Monitor and plan for:
  - Power capacity utilization
  - Cooling capacity and efficiency
  - Rack space and layout
- Support data center expansion and future growth planning
8. Vendor & Stakeholder Coordination
- Manage and coordinate with:
  - Electrical and mechanical vendors
  - Facility management teams
- Ensure SLA compliance and quality of service delivery
9. Documentation & DCIM Management
- Maintain:
  - SOPs, MOPs, EOPs
  - Single-line diagrams (SLD)
  - Rack layouts and power mapping
- Update and manage DCIM system data
10. Disaster Recovery & Emergency Handling
- Support emergency response procedures (EOPs)
- Participate in power-failure and disaster recovery (DR) drills
- Ensure readiness for failover and backup systems
Qualifications & Experience
- Bachelor’s degree in electrical or mechanical engineering, or a related field
- 6–10 years of experience in data center facilities operations
- Strong experience in critical infrastructure environments
Technical Skills
- Electrical Systems (UPS, Generators, ATS, PDUs)
- Cooling Systems
- DCIM tools and environmental monitoring
- Load calculations and capacity planning
- Knowledge of redundancy models (N+1, 2N)
Key Competencies
- Advanced troubleshooting and crisis handling
- Strong understanding of critical facility operations
- Risk assessment and decision-making
- Vendor management and coordination
- Attention to detail and operational discipline
Preferred Certifications
- Certified Data Centre Professional (CDCP) or Certified Data Centre Specialist (CDCS)
- Electrical or Mechanical certifications