Multiple AI models help robots execute complex plans

The HiP framework develops detailed plans for robots.

While basic chores come naturally to humans, they require intricate planning for robots. Now, MIT scientists have developed a new AI system called "HiP" that allows robots to make detailed plans to accomplish complex goals.

HiP works by combining three different foundation models - large AI models trained on massive datasets of images, text or video. Each model specializes in a different capability: language reasoning, visual perception or action planning. By dividing planning tasks among specialized models rather than relying on a single monolithic model, HiP generates more nuanced step-by-step plans for robots to follow.
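To make the division of labor concrete, here is a minimal Python sketch of a three-stage hierarchy, assuming each stage can be treated as a plain function. Every name in it (HierarchicalPlanner, the toy_* functions, the string-based subgoals and observations) is a hypothetical stand-in for illustration, not HiP's actual code or interface.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type aliases: real systems would pass text, images and
# robot commands here, not plain strings.
Subgoal = str
Observation = str
Action = str


@dataclass
class HierarchicalPlanner:
    """Chains three specialized models, one per level of the plan."""
    language_model: Callable[[str], List[Subgoal]]         # goal -> subgoals
    visual_model: Callable[[Subgoal], List[Observation]]   # subgoal -> expected scenes
    action_model: Callable[[Observation], Action]          # scene -> robot action

    def plan(self, goal: str) -> List[Action]:
        actions: List[Action] = []
        for subgoal in self.language_model(goal):
            for observation in self.visual_model(subgoal):
                actions.append(self.action_model(observation))
        return actions


# Toy stand-ins so the sketch runs without any trained model.
def toy_language_model(goal: str) -> List[Subgoal]:
    return [f"{goal}: step {i}" for i in (1, 2)]


def toy_visual_model(subgoal: Subgoal) -> List[Observation]:
    return [f"scene after '{subgoal}'"]


def toy_action_model(observation: Observation) -> Action:
    return f"move to reach {observation}"


planner = HierarchicalPlanner(toy_language_model, toy_visual_model, toy_action_model)
plan = planner.plan("stack blocks")
```

Because each level is an independent callable in this sketch, any one of them could be swapped for a different model without touching the other two, which is the appeal of splitting the work rather than training a single monolithic model.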

In tests, HiP directed a robot through multi-phase tasks like stacking blocks, arranging objects, and simulated meal preparation. Unlike other systems, HiP could dynamically adjust its plans to account for changes in the environment and tasks. For example, when instructed to stack certain colored blocks that weren't available, HiP planned for the robot to first paint white blocks the needed colors before stacking them.
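The block-painting behavior can be illustrated with a small hand-written rule. This is only a sketch of the kind of plan described, assuming a simple inventory of blocks by color; HiP arrives at such plans through its foundation models, not through a hard-coded rule like this one, and the function name plan_stack is hypothetical.

```python
from typing import Dict, List


def plan_stack(wanted_colors: List[str], available: Dict[str, int]) -> List[str]:
    """Return steps to stack blocks in the requested color order.

    If a requested color is missing, fall back to painting a white block.
    """
    stock = dict(available)  # copy so the caller's inventory is not modified
    steps: List[str] = []
    for color in wanted_colors:
        if stock.get(color, 0) > 0:
            stock[color] -= 1
        elif stock.get("white", 0) > 0:
            # Requested color is unavailable: insert a painting step first.
            stock["white"] -= 1
            steps.append(f"paint white block {color}")
        else:
            raise ValueError(f"no block available for {color}")
        steps.append(f"stack {color} block")
    return steps


steps = plan_stack(["red", "blue"], {"red": 1, "white": 1})
```

With one red and one white block on hand, the sketch stacks the red block directly and inserts a painting step before stacking the blue one.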

Researchers say HiP represents an evolution in robotic planning toward more adaptable systems. While today's robots require meticulous coding of each sub-task, HiP leverages the power of AI for autonomous decision making. Its hierarchical approach also makes the reasoning process more transparent than end-to-end learning models.

"Instead of pushing for one model to do everything, we combine multiple ones that leverage different modalities of internet data," says PhD student Anurag Ajay. "When used in tandem, they help with robotic decision-making and can potentially aid with tasks in homes, factories, and construction sites."

Looking ahead, improved video and multisensory foundation models could further enhance HiP's contextual understanding and lead to more seamless human-robot collaboration. MIT researchers plan to test the system on additional real-world tasks like manufacturing and construction projects requiring long-horizon planning.

As foundation models continue to advance in capabilities, HiP represents a framework for imbuing robots with increasing intelligence and foresight to handle open-ended goals. That could one day provide a helpful hand with chores around the house or allow adaptive automation across many industries.
