Android Dreams

“The danger is never that robots disobey, but that they obey perfectly.”

Takeaways

  1. Vertical robotics companies can win billion-dollar task domains without Embodied General Intelligence (EGI). They also will not be outcompeted in those domains by EGI. This is because cost, speed, and quality determine adoption, and humanoid EGI justifies its cost only in complex domains.
  2. Robotics research will shift from learning from human video to post-training from massive video models. Learning from human video is the best way to collect task-specific data scalably. Post-training from massive video models is the paradigm that brings EGI-scale information content into the training set. Video models clearly already have implicit action models.
  3. Adaptive long-term memory is true online learning and is the last bottleneck to virtual AGI. Long-term memory in humans is not only the process of storing and retrieving information, but also the meta-process of adapting the mechanisms that form notions of importance and connections between concepts. Together, these produce online learning in humans.
  4. There is a self-reinforcing exponential curve for robots that automate robot manufacturing. America's chance to win is to automate and subsidize actuator manufacturing, then proceed to automate every other production layer: mining, mineral processing, machining, and assembly. The self-reinforcing curve is bottlenecked by real-world resources like processed metals and equipment, which become less of a bottleneck as robot labor automates their production as well.
  5. China is currently further ahead in low-cost hardware manufacturing than people think. It has 10x the actuator production, 5-10x lower input costs, 90% of rare earth metal processing, 2.2x the energy production, 10x the steel production, and 10x the manufacturing robot installations, among other advantages.
  6. Winning robotics is a national security issue because robots have backdoors. As the TikTok acquisition precedent established, technology that gives its owner access to the American people and resources at scale is a threat.
  7. AI labs with the scale to train massive video models will be the first to create Embodied General Intelligence (EGI), each paired with a humanoid company for hardware. The base of EGI is cross-embodiment-compatible, but the final version is hardware-specific.
  8. There will be a diversity of robot form factors in the future. They will range from inexpensive, wheeled robots with grippers doing repetitive tasks, to high-quality human-looking androids for service, to ultra-engineered elite F1 humanoids for entertainment. Humanoids are necessary for EGI because they can learn from video data more easily.