are very limited in what they can do. Their inability to understand the nuances of human language makes them mostly useless for more complicated requests.
For example, if you put a specific tool in a toolbox and ask a robot to “pick it up,” it would be completely lost.
Picking it up requires the robot to see and identify objects, understand commands, recognise that the “it” in question is the tool you put down, recall the moment when you put it down, and distinguish that tool from others of similar shape and size.
Researchers from MIT have moved a step closer to letting robots handle this type of request.
They have developed an Alexa-like system called “ComText” (for “commands in context”) that allows robots to understand a wide range of commands that require contextual knowledge about objects and their environments.
“Where humans understand the world as a collection of objects and people and abstract concepts, machines view it as pixels, point-clouds, and 3D maps generated from sensors,” said Rohan Paul, one of the lead authors of the paper. “This semantic gap means that, for robots to understand what we want them to do, they need a much richer representation of what we do and say.”
The team tested ComText on Baxter, a two-armed humanoid robot.
ComText can observe a range of visuals and natural language to learn about an object’s size, shape, position and type, and even whether it belongs to somebody. From this knowledge base, it can then reason, infer meaning and respond to commands.
“The main contribution is this idea that robots should have different kinds of memory, just like people,” said Andrei Barbu, the project’s co-lead.
With ComText, Baxter was successful in executing the right command about 90 per cent of the time.
In the future, the team hopes to enable robots to understand more complicated information, such as multi-step commands and the intent of actions, and to use the properties of objects to interact with them more naturally.
By making interactions much less constrained, this line of research could enable better communication with a range of robotic systems, from self-driving cars to household helpers.
“This work is a nice step towards building robots that can interact much more naturally with people,” said Luke Zettlemoyer, an associate professor at the University of Washington in the US, who was not involved in the research.
“In particular, it will help robots better understand the names that are used to identify objects in the world, and interpret instructions that use those names to better do what users ask,” Zettlemoyer said.