The 20th century saw the introduction of various integrated machines into our homes to simplify household chores. Washers, dryers, and dishwashers were early entries, followed more recently by stand mixers, food processors, electric juicers, even robot vacuums. While extremely helpful at speeding up manual tasks, each of these machines performs only a single task well. As we look toward the middle of the 21st century, we're ready to consider mechanized household help that performs multiple tasks: domestic robots that can adapt and learn from our needs, all while remaining cost-effective.
Maybe you grew up as I did in the 1960s watching the cartoon The Jetsons, in which flying vehicles transported humans and Rosey the robot helped with the household chores. It didn't actually seem that far-fetched then, and now companies say they are close to production of robots that can perceive their surroundings and adapt to spontaneous circumstances.
As chronicled by Wired, a San Francisco startup has demonstrated that the fantasy of household robots might just become reality. Physical Intelligence has created a single artificial intelligence model that has learned to do a wide range of useful home chores. The breakthrough came from training on an unprecedented amount of data. “We have a recipe that is very general, that can take advantage of data from many different embodiments, from many different robot types, and which is similar to how people train language models,” the company's CEO, Karol Hausman, explained.
Physical Intelligence, also known as PI or π, was founded earlier this year by several prominent robotics researchers to pursue a new robotics approach inspired by breakthroughs in AI's language abilities.
The advent of large language models (LLMs) has expanded what robots can do. LLMs interpret natural language and complex commands from users, enabling robots to establish and execute suitable plans in various situations. Moreover, LLMs adapt flexibly to new situations through a zero-shot approach and utilize past data for learning. These capabilities indicate that robots can play a vital role in autonomously navigating changing environments and resolving unexpected issues.
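To make the idea concrete, here is a minimal sketch of how an LLM might be used for zero-shot task planning. Everything in it is illustrative: the `query_llm` function is a stand-in for whatever model API a real system would use, and the action vocabulary is invented for the example.

```python
import json

# Hypothetical stand-in for a real LLM API call; any chat-completion
# endpoint that returns text could be substituted here.
def query_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your LLM provider of choice")

# A small, invented vocabulary of motor primitives the robot exposes.
PRIMITIVES = ["navigate_to", "pick_up", "place", "open", "close"]

def plan_task(instruction: str) -> list[dict]:
    """Ask the LLM to decompose a chore into primitive actions (zero-shot)."""
    prompt = (
        "You control a home robot with these primitive actions: "
        f"{', '.join(PRIMITIVES)}.\n"
        "Decompose the task below into a JSON list of steps, each of the form "
        '{"action": <primitive>, "target": <object or location>}.\n'
        f"Task: {instruction}\nJSON:"
    )
    steps = json.loads(query_llm(prompt))
    # Reject any step outside the robot's known capabilities.
    return [s for s in steps if s.get("action") in PRIMITIVES]

# Example: plan_task("put the mug from the table into the dishwasher")
# might yield [{"action": "navigate_to", "target": "table"},
#              {"action": "pick_up", "target": "mug"}, ...]
```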
A blog post from Physical Intelligence reveals the research and development that went into their breakthrough.
“Over the past eight months, we’ve developed a general-purpose robot foundation model that we call π0 (pi-zero). We believe this is a first step toward our long-term goal of developing artificial physical intelligence, so that users can simply ask robots to perform any task they want, just like they can ask large language models (LLMs) and chatbot assistants.”
Like LLMs, the Physical Intelligence model is trained on broad and diverse data and can follow various text instructions. Unlike LLMs, it spans images, text, and actions and acquires physical intelligence by training on embodied experience from robots, learning to directly output low-level motor commands via a novel architecture. It can control a variety of different robots and can either be prompted to carry out the desired task, or fine-tuned to specialize it to challenging application scenarios. The company often has humans teleoperate the robots to provide the necessary teaching.
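Physical Intelligence has not published π0's code, so the sketch below only illustrates the kind of control loop the description implies: images and a text instruction go in, low-level motor commands come out at a fixed rate. The `VLAModel` class and every `robot` method here are hypothetical names invented for this example, not the company's actual API.

```python
import numpy as np

# Hypothetical vision-language-action (VLA) model interface, assumed for
# illustration: camera frames, a text instruction, and joint state in,
# low-level motor commands out.
class VLAModel:
    def predict_action(self, image: np.ndarray, instruction: str,
                       joint_state: np.ndarray) -> np.ndarray:
        """Return target joint velocities for the next control step."""
        ...

def control_loop(model: VLAModel, robot, instruction: str, hz: float = 50.0):
    """Run the model at a fixed control rate until the task is done."""
    while not robot.task_done():
        image = robot.get_camera_frame()          # RGB observation
        state = robot.get_joint_state()           # proprioception
        action = model.predict_action(image, instruction, state)
        robot.apply_joint_velocities(action)      # low-level motor command
        robot.sleep(1.0 / hz)                     # hold the control rate

# Usage (all names hypothetical):
# control_loop(model, robot, "hang the coat on the hanger")
```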
“The amount of data we’re training on is larger than any robotics model ever made, by a very significant margin, to our knowledge,” says Sergey Levine, a cofounder of Physical Intelligence and an associate professor at UC Berkeley. “It’s no ChatGPT by any means, but maybe it’s close to GPT-1,” he adds, in reference to the first large language model developed by OpenAI in 2018.
You can see videos from Physical Intelligence here that show a variety of robot models doing a range of household chores with fairly precise skill. Manipulating a coat hanger. Placing a spice container back on the shelf. Organizing a child's playroom full of toys. Opening a drawer. Closing a door. Replacing kitchenware.
Folding clothes? Not so much. That task requires more general intelligence about the physical world, Hausman says, because it involves dealing with a wide range of flexible items that deform and crumple unpredictably.
The algorithm behind these feats doesn't always perform to expectations, and Hausman added that the robots sometimes fail in surprising and amusing ways. When asked to load eggs into a carton, a robot once chose to overfill the box and force it shut. Another time, a robot suddenly flung a box off a table instead of filling it with things.
Because Physical Intelligence generates its own data, its techniques must wring more learning out of a more limited dataset. To develop π0, the company combined so-called vision-language models, which are trained on images as well as text, with diffusion modeling, a technique borrowed from AI image generation, to enable a more general kind of learning.
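The article doesn't detail π0's internals, but the general flavor of diffusion-style action generation can be sketched as follows: start from random noise and iteratively refine it into a short sequence of actions, conditioned on the vision-language embedding. The denoiser network, update rule, and step counts below are toy placeholders, not the company's design.

```python
import numpy as np

def sample_actions(denoiser, context: np.ndarray,
                   horizon: int = 16, action_dim: int = 7,
                   num_steps: int = 50) -> np.ndarray:
    """Toy diffusion-style sampler: refine noise into an action chunk.

    `denoiser(noisy_actions, t, context)` is a placeholder for a trained
    network that predicts the noise present at diffusion step t, conditioned
    on a vision-language embedding `context`.
    """
    actions = np.random.randn(horizon, action_dim)  # start from pure noise
    for t in reversed(range(num_steps)):
        predicted_noise = denoiser(actions, t, context)
        # Simplified update: remove a fraction of the predicted noise each
        # step (real samplers like DDPM/DDIM use carefully derived schedules).
        actions = actions - predicted_noise / num_steps
    return actions  # a horizon of low-level commands, e.g. joint targets
```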
Robots around the house are still years away, but progress is being made toward machines that can carry out the chores a person asks of them. Scaling will need to take place; Physical Intelligence considers this kind of learning part of a scaffolding process.
What Does It Take to Train Robots to Do Household Tasks?
For household robots to perform everyday tasks, they must be able to carry out an object search, and that is more difficult than it might seem.
Homes are relatively complex and dynamic environments, as explained in a 2024 article in IEEE Xplore. Some target objects can hardly be observed by a robot in the first place, which reduces the efficiency of the object search. As human beings, we make associations among objects, taking into account related but more easily observed objects, or room categories, in our search.
Humans can guide robots toward making use of this kind of information so they can locate target objects more quickly and accurately. Doing so means modeling room categories, environmental objects, and dynamic objects, along with the natural-language relationships among them that matter for home services, and defining rules for how and when to deploy this knowledge in practice. A heuristic object search strategy grounded in that knowledge then guides the robot, taking into account the room layout and the distance between the robot and each candidate location.
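As a concrete illustration, such a heuristic could look something like the scoring sketch below, which ranks candidate search locations by combining object co-occurrence knowledge, room-category priors, and travel distance. The knowledge tables and weights are invented for the example, not taken from the IEEE paper.

```python
import math

# Invented knowledge tables for illustration: how strongly a target object
# is associated with landmark objects and with room categories.
CO_OCCURRENCE = {("mug", "coffee_maker"): 0.9, ("mug", "sink"): 0.6}
ROOM_PRIOR = {("mug", "kitchen"): 0.8, ("mug", "living_room"): 0.3}

def score_candidate(target, landmark, room, distance_m, weight_dist=0.1):
    """Higher scores mean 'search here first'; distance discounts the score."""
    assoc = CO_OCCURRENCE.get((target, landmark), 0.05)
    prior = ROOM_PRIOR.get((target, room), 0.1)
    return assoc * prior * math.exp(-weight_dist * distance_m)

candidates = [
    ("coffee_maker", "kitchen", 4.0),
    ("sink", "kitchen", 6.0),
    ("sofa", "living_room", 2.0),
]
# Visit the most promising location first.
best = max(candidates, key=lambda c: score_candidate("mug", *c))
print(best)  # -> ('coffee_maker', 'kitchen', 4.0)
```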
Testing of this process takes place in both simulated and real environments, and the results are promising: the robots locate target objects with less time cost and shorter path lengths.