Robotics Data Pipeline Intern
About this role
Robotics Data Pipeline Intern - Multimodal Data
About Us
At Persona, we're building the next generation of humanoid robots, and that requires an unprecedented volume of high-quality, multimodal data. We're moving beyond basic teleoperation to leverage massive datasets of in-the-wild egocentric video combined with dense sensor streams (IMU, haptics, kinematics, and high-fidelity force profiles). We're looking for a curious, technically sharp intern to roll up their sleeves and help us turn raw, unstructured multimodal data into high-fidelity training assets for our robots.
The Role
As a Data Pipeline Intern, you'll work directly alongside our data and robotics engineering teams to support the infrastructure that feeds our foundation models. You'll get hands-on experience with real multimodal data challenges, from sensor stream processing and video pipeline optimization to force analysis and kinematic retargeting. This is not a "fetch coffee and shadow engineers" internship. You'll own real work and ship real code.
What You'll Work On
Rebuilding and extending pipelines that ingest and synchronously process egocentric video alongside rich sensor streams (IMU, force-torque, tactile, proprioception)
Owning post-processing algorithms for force analysis and hidden state inference, including contact force estimation, occlusion handling, and inverse kinematics gap-filling
Bridging kinematic retargeting work that translates human hand tracking into humanoid end-effector coordinates
Optimizing and testing data augmentation strategies (spatial, temporal, synthetic viewpoints, sensor noise injection)
Working with our Hardware Teleoperation Team to help align human-robot play data across modalities
What We're Looking For
Currently pursuing a B.S., M.S., or Ph.D. in Computer Science, Data Engineering, Machine Learning, Robotics, or a related field
Solid Python skills and exposure to PyTorch, particularly around data loading or multimodal datasets
Coursework or project experience with computer vision, time-series data, or sensor processing
Familiarity with video processing tools (OpenCV, FFmpeg) or pose estimation frameworks (MediaPipe) is a plus
Awareness of imitation learning, VLA architectures, or human-to-robot transfer concepts is a plus, but genuine curiosity counts for a lot here
Bonus Points
Experience with NVIDIA's robotics stack (Isaac, Cosmos, GR00T)
Exposure to distributed computing (Ray, Spark) or simulation environments (Omniverse, MuJoCo)
Any project work involving synthetic data generation or tactile/spatial data representations