Junior Technical Program Manager — Infrastructure Operations
Together AI is a research-driven artificial intelligence company focused on creating innovative and transparent AI systems. The company is seeking a Junior Technical Program Manager to oversee operations for its GPU fleet, ensuring efficient node lifecycle management and cross-functional collaboration to resolve issues swiftly.
Responsibilities
- Own the end-to-end node lifecycle, from failure through repair, return, and re-integration, across provider ticketing, internal tooling, and the state machine that governs each stage
- Drive node remediation to resolution with urgency, eliminating gaps in ownership at every handoff
- Manage project timelines for new datacenter bring-ups, coordinating across internal teams and external providers to keep milestones on track
- Identify and diagnose GPU utilization loss across the fleet, working with engineering leads to drive resolution
- Build dashboards and tracking processes that make efficiency gaps visible and ensure they get closed
- Continuously improve operational workflows through process changes and lightweight automation
- Develop and maintain relationships with external datacenter providers
Skills
- Some prior experience in a TPM role - we're open to candidates who came into TPM from engineering, ops, or another technical function, but you should have done the work in practice: owning programs end-to-end, driving cross-functional resolution, managing external dependencies
- A technical background or demonstrated experience in a highly technical environment - you don't need to know GPUs on day one, but you need to be able to engage meaningfully with technical problems and earn credibility with infrastructure engineers
- A genuine bias toward action - you see a problem and start moving, even when the path forward isn't fully clear
- Resilience in a fast-paced, sometimes chaotic environment - you adapt quickly, stay effective under pressure, and don't wait for perfect conditions to make progress
- Strong organizational instincts - you can manage multiple workstreams, track dependencies, and keep things moving without losing the thread
- Ability to zoom out - you can be deep in the weeds on an operational problem while keeping the bigger picture in view
Benefits
- Startup equity
- Health insurance
- Other competitive benefits