
OpenAI’s Secret Robotics Push: The “Physical GPT” Era Begins

To be honest, I thought they were done with hardware. Back in 2021, when OpenAI disbanded its original robotics team, the narrative was clear: “Hardware is hard, software is scalable. Let’s focus on language models.”

But as I’ve been tracking the latest developments, it turns out they didn’t leave the arena—they just went into stealth mode.

According to a recent report by Business Insider, OpenAI is quietly running a dedicated robotics laboratory in San Francisco. This isn’t just a side project; it feels like the beginning of the “ChatGPT moment” for the physical world. As someone who lives and breathes this tech, I want to dive deep into what they are actually doing, why they are using cheap robotic arms instead of fancy humanoids, and what this means for the future of AI.


The Secret Lab in San Francisco

It seems OpenAI is taking a page out of the “brute force” playbook that made their LLMs so successful. They aren’t just coding algorithms; they are farming data.

Here is what’s happening behind closed doors:

  • The Setup: A facility in San Francisco that has quadrupled in size recently.
  • The Workforce: Approximately 100 data collectors and over a dozen robotics engineers working in shifts.
  • The Gear: They aren’t obsessing over walking robots (yet). The focus is on robotic arms and a teleoperation device called “GELLO.”

The lab is reportedly operational around the clock. This tells me one thing: They are hungry for data.

Why “GELLO” and Teleoperation?

This is the part that fascinates me. Instead of letting a robot try to pick up an apple 10,000 times until it figures it out (Reinforcement Learning), OpenAI is hiring humans to control the robot arms remotely to perform the task correctly.

They are using GELLO, a low-cost, 3D-printed leader arm: as the operator moves the controller, its joint movements are mirrored by the robot arm in real time.

Why does this matter? Because AI models like GPT-4 are trained on the internet—text, images, video. But there is no “internet” for muscle memory. You cannot scrape the web to learn exactly how much pressure to apply when holding a strawberry versus a rock. That data has to be created from scratch. OpenAI is essentially building the “textbook of movement” by having humans write it, one movement at a time.
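
To picture what those data collectors are doing all day, here is a minimal sketch of a GELLO-style teleoperation logging loop. The classes and methods below are hypothetical stand-ins for real hardware drivers, not OpenAI’s or GELLO’s actual API; the point is simply that every human movement becomes a recorded (observation, action) pair.

```python
import time
import json
import random

class LeaderArm:
    """Stand-in for a GELLO-style 3D-printed controller (hypothetical API)."""
    def read_joint_angles(self):
        # In reality: read encoder values from the controller's joints.
        return [random.uniform(-1.0, 1.0) for _ in range(7)]

class FollowerArm:
    """Stand-in for the real robot arm being puppeteered."""
    def command_joint_angles(self, angles):
        pass  # In reality: send target angles to the robot's motor drivers.
    def read_joint_angles(self):
        return [0.0] * 7

class Camera:
    """Stand-in for the lab camera recording the scene."""
    def capture(self):
        return [0] * 16  # In reality: an RGB frame, not a flat list.

def collect_demonstration(task_name, hz=30, duration_s=2):
    """Record one human demonstration as (observation, action) pairs."""
    leader, follower, camera = LeaderArm(), FollowerArm(), Camera()
    episode = []
    for _ in range(hz * duration_s):
        action = leader.read_joint_angles()      # what the human is doing
        follower.command_joint_angles(action)    # mirror it on the robot
        episode.append({
            "image": camera.capture(),           # visual observation
            "robot_state": follower.read_joint_angles(),
            "action": action,                    # the label to imitate later
        })
        time.sleep(1.0 / hz)
    with open(f"{task_name}.json", "w") as f:
        json.dump(episode, f)

# collect_demonstration("fold_shirt")
```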


Mundane Tasks: The Path to AGI?

You might laugh when you hear that these cutting-edge robots are being trained to fold laundry or put bread in a toaster. But from my perspective, this is the Holy Grail of robotics.

In the tech world, we call this Moravec’s Paradox:

High-level reasoning (playing chess, writing poetry) requires very little computation, but low-level sensorimotor skills (walking, folding a shirt) require enormous computational resources.

ChatGPT can pass the Bar Exam, but it has no idea how to navigate a messy kitchen. By focusing on these “boring” tasks, OpenAI is trying to solve the hardest part of intelligence: understanding the physical world.

If they can train a model to understand physics, gravity, and object permanence through these robotic arms, they can plug that “brain” into any body—whether it’s a Figure 01 humanoid or a factory arm.
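
As a rough illustration of that “one brain, many bodies” idea, here is a sketch of what a hardware-agnostic policy interface might look like. The class and method names are invented for this example; in practice, cross-embodiment models also need per-body adapters to reconcile different sensor and joint layouts.

```python
from abc import ABC, abstractmethod

class Body(ABC):
    """Any embodiment: a factory arm, a humanoid, a kitchen robot..."""
    @abstractmethod
    def observe(self) -> list[float]: ...
    @abstractmethod
    def act(self, action: list[float]) -> None: ...

class FactoryArm(Body):
    def observe(self):
        return [0.0] * 7            # e.g. seven joint angles (placeholder)
    def act(self, action):
        pass                        # send joint targets to the arm

class Humanoid(Body):
    def observe(self):
        return [0.0] * 30           # far more joints (placeholder)
    def act(self, action):
        pass

def run(policy, body: Body, steps: int = 100):
    """The same 'brain' (policy) drives whichever body it is plugged into."""
    for _ in range(steps):
        obs = body.observe()
        body.act(policy(obs))

# run(lambda obs: [0.0] * len(obs), FactoryArm())  # trivial placeholder policy
```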


The Strategy: Imitation Learning vs. Reinforcement Learning

I’ve been analyzing OpenAI’s history here. Their previous robotics attempt (the one that solved the Rubik’s Cube) relied heavily on Reinforcement Learning (RL)—basically trial and error.

The new strategy appears to be Imitation Learning (Behavioral Cloning).

  1. Human demonstrates: A human uses the GELLO controller to guide the robot arm through folding a shirt.
  2. Robot records: The AI records the visual data and the motor joint data.
  3. Model learns: After thousands of examples, the AI learns the concept of folding, not just the specific motion.

This is exactly how they built ChatGPT. They took massive amounts of human text and trained a model to predict the next word. Now, they are taking massive amounts of human movement to train a model to predict the next action.
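
To make the training side concrete, here is a minimal behavioral-cloning sketch in PyTorch. Everything in it is illustrative: the feature sizes, the tiny MLP policy, and the randomly generated “demonstrations” are stand-ins, not OpenAI’s data or architecture. The point is the shape of the problem: supervised learning that maps an observation to the action a human took.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins for recorded demonstrations: 10k timesteps of
# (visual features + joint state) -> human action, all invented shapes.
obs = torch.randn(10_000, 512 + 7)   # 512-dim visual features + 7 joint angles
actions = torch.randn(10_000, 7)     # target joint commands from the human

policy = nn.Sequential(
    nn.Linear(512 + 7, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 7),                # predict the next action
)

optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loader = DataLoader(TensorDataset(obs, actions), batch_size=256, shuffle=True)

for epoch in range(5):
    for batch_obs, batch_act in loader:
        pred = policy(batch_obs)
        loss = nn.functional.mse_loss(pred, batch_act)  # "copy the human"
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```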


The Ecosystem: Why Build When You Can Buy? (Or Do Both?)

I found it interesting that OpenAI is doing this in-house despite investing in major players like Figure, 1X, and Physical Intelligence.

In fact, the partnership with Figure (aimed at putting OpenAI’s brain into Figure’s body) hit a snag recently, with Figure’s CEO implying they are pulling back from the deal.

My take? OpenAI realized that Embodied Data is the new oil. If they rely on Figure or Tesla for the hardware data, they lose their competitive edge. By building their own “Data Factory” (and planning a second one in Richmond, CA), they ensure they own the foundational model for robotics, just like they own the foundational model for text.


What This Means for Us


We are witnessing the birth of VLA Models (Vision-Language-Action).

Imagine asking ChatGPT not just to write a recipe, but to: “Look at the ingredients on my counter, tell me what I can cook, and then guide my kitchen robot to chop the onions.”
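
Here is a toy sketch of what a single vision-language-action control step could look like from a developer’s perspective. The model methods and the DummyVLAModel below are invented for illustration; no such public API exists today.

```python
from dataclasses import dataclass

@dataclass
class RobotCommand:
    joint_targets: list[float]   # where each joint should move next
    gripper_open: bool

class DummyVLAModel:
    """Placeholder standing in for a real vision-language-action model."""
    def encode_image(self, image): return image
    def encode_text(self, text): return text
    def decode_action(self, visual_tokens, text_tokens):
        return [0.0] * 7 + [1.0]   # seven joint targets + a gripper flag

def vla_step(model, camera_image, instruction: str) -> RobotCommand:
    """One control step of a hypothetical vision-language-action model."""
    # 1. Encode what the robot sees and what the user asked for.
    visual_tokens = model.encode_image(camera_image)
    text_tokens = model.encode_text(instruction)

    # 2. Decode the next action, the way an LLM decodes the next word.
    action = model.decode_action(visual_tokens, text_tokens)

    # 3. Translate the model output into low-level motor commands.
    return RobotCommand(joint_targets=action[:-1], gripper_open=action[-1] > 0.5)

# Usage sketch:
# cmd = vla_step(DummyVLAModel(), [0] * 16, "chop the onions on the counter")
```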

That reality is closer than we think. While the “I, Robot”-style humanoid in their lobby sits mostly inactive for now, the real magic is happening with those robotic arms. They are building the software brain that will eventually power the humanoid bodies of the future.

Summary of Key Insights

  • Shift in Strategy: Moving from simulation-heavy Reinforcement Learning to teleoperation-driven Imitation Learning on real hardware.
  • Data is King: The massive hiring of data collectors proves that high-quality physical data is currently the bottleneck in AI.
  • Low Cost, High Scale: 3D-printed controllers (GELLO) let them scale data collection far faster than expensive proprietary teleoperation hardware.
  • Independence: OpenAI is reducing reliance on hardware partners to own the “physical intelligence” stack.

Final Thoughts

I’m genuinely excited, but also a bit wary. The speed at which this is moving suggests that the “brain” problem for robots is being solved faster than the “body” problem. We might soon have AI that is clumsy but incredibly smart at understanding physical tasks.

I’d love to hear your thoughts on this: If OpenAI releases a “Brain” for robots that can do your laundry, but it requires a camera feed of your entire home to work, would you let it in?

Let’s discuss in the comments!
