In this series, I examine what is missing from today's dominant approaches to robot learning. My focus is pragmatic: what is the next meaningful step toward scalable, reliable physical AI systems that actually work in the real world.
Learning a physical skill like snowboarding requires more than observation—it requires embodiment.
The three components of robot learning map to different aspects of skill acquisition.
From Scott McCloud's "Understanding Comics" [11]: our awareness of self flows outward to include objects of our extended identity.
The NVIDIA ecosystem spans three axes: Physics Simulation (Newton/Warp, Isaac Sim), Skill Training (Isaac Lab), and Asset Generation (Replicator).
LucidSim [14] generating simulated scenes. Both approaches can generate effectively unbounded data for robot learning.
A world model training workflow: Build scenes from real-world videos, add robots for interaction, and train in parallel with various textures, scenarios, and lighting.
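The "train in parallel with various textures, scenarios, and lighting" step can be pictured as sampling one randomized scene configuration per parallel environment. The following is a minimal sketch of that idea only, not the workflow's actual implementation: the option lists, the `SceneVariant` class, the `sample_variants` helper, and the lighting range are all hypothetical names and values chosen for illustration.

```python
import random
from dataclasses import dataclass

# Hypothetical option lists: the workflow names the axes of variation
# (textures, scenarios, lighting) but not their concrete values.
TEXTURES = ["wood", "concrete", "carpet"]
SCENARIOS = ["pick", "place", "push"]


@dataclass(frozen=True)
class SceneVariant:
    """One randomized configuration for a single parallel environment."""
    texture: str
    scenario: str
    light_intensity: float  # arbitrary units; the range below is assumed


def sample_variants(n, seed=0):
    """Sample n scene variants reproducibly from a seeded generator."""
    rng = random.Random(seed)
    return [
        SceneVariant(
            texture=rng.choice(TEXTURES),
            scenario=rng.choice(SCENARIOS),
            light_intensity=rng.uniform(0.2, 1.0),
        )
        for _ in range(n)
    ]


if __name__ == "__main__":
    # One variant per parallel environment.
    for env_id, variant in enumerate(sample_variants(4)):
        print(env_id, variant)
```

In a real pipeline each variant would be handed to the simulator when an environment is reset. Seeding the generator keeps a training run reproducible while still covering many visual and task conditions.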