Robots are getting closer to being able to see and feel the physical world.
A team of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) developed AI software that’s capable of predicting what an object will look like or feel like by using ‘sight’ and ‘touch.’
The study could help humans and machines work together more seamlessly in the workplace, researchers say.
Their findings also bring robots closer to emulating a common function of the human brain: When humans look at an object, they can often anticipate what it will feel like, e.g. hard, soft or flexible.
Similarly, when humans touch a particular object with their eyes closed, they often form a picture in their head of what they think it looks like.
The team took a KUKA robot arm, a machine often used in warehouses, then outfitted it with GelSight, a type of tactile sensor that’s made of rubber and was also developed by a group at MIT.
From there, researchers recorded 12,000 videos of 200 objects being touched, including tools, household products and fabrics.
The video clips were broken down into static photos and compiled into a dataset of over 3 million images, called ‘VisGel.’
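The pairing step behind a dataset like this can be sketched roughly as follows. This is a toy illustration only; the function names, frame representation and sampling step are invented here, not the actual VisGel pipeline:

```python
# Toy sketch of building a paired visual/tactile dataset from touch recordings.
# Frames are stand-in strings; in reality they would be camera and GelSight images.

def frames_from_video(video, step=1):
    """Split a 'video' (here just a list of frames) into static frames."""
    return video[::step]

def build_pairs(visual_videos, tactile_videos):
    """Pair each visual frame with the tactile frame captured at the same moment."""
    pairs = []
    for vis, tac in zip(visual_videos, tactile_videos):
        for v_frame, t_frame in zip(frames_from_video(vis), frames_from_video(tac)):
            pairs.append((v_frame, t_frame))
    return pairs

# Two tiny fake recordings of three frames each, one stream per sensor.
visual = [["v0", "v1", "v2"], ["v3", "v4", "v5"]]
tactile = [["t0", "t1", "t2"], ["t3", "t4", "t5"]]
dataset = build_pairs(visual, tactile)
print(len(dataset))  # 6 paired (visual, tactile) examples
```

Each pair gives the model one example of what a particular sight corresponds to as a touch, and vice versa.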
After studying this dataset, the robot arm was able to begin learning the relationship between tactile and visual information, with the help of GANs, or generative adversarial networks.
GANs essentially operate using two algorithms, one called the generator and the other called the discriminator.
The generator makes judgments based on data, then the discriminator determines whether or not those judgments are correct and the two continuously feed off of one another, making the AI system smarter over time.
In this case, the GAN was trained on the ‘VisGel’ dataset to guess how an item would feel based on visual data, and how it would look based on tactile data.
‘By looking at the scene, our model can imagine the feeling of touching a flat surface or a sharp edge,’ Yunzhu Li, lead author of the study, said in a statement.
‘By blindly touching around, our model can predict the interaction with the environment purely from tactile feelings.
‘Bringing these two senses together could empower the robot and reduce the data we might need for tasks involving manipulating and grasping objects,’ he added.
While the researchers were impressed with their findings, they cautioned that there were some limitations.
The robot only operated in a controlled environment and didn’t quite master some capabilities, like judging the color or softness of objects.
Still, it could have wide-ranging implications for how robots are used in the workplace.
‘This is the first method that can convincingly translate between visual and touch signals,’ Andrew Owens, a postdoc at the University of California Berkeley, said in a statement.
‘Methods like this have the potential to be very useful for robotics, where you need to answer questions like “is this object hard or soft?”, or “if I lift this mug by its handle, how good will my grip be?”
‘This is a very challenging problem, since the signals are so different, and this model has demonstrated great capability,’ he added.
HOW DOES ARTIFICIAL INTELLIGENCE LEARN?
AI systems rely on artificial neural networks (ANNs), which try to simulate the way the brain works in order to learn.
ANNs can be trained to recognise patterns in information – including speech, text data, or visual images – and are the basis for a large number of the developments in AI over recent years.
Conventional AI uses input to ‘teach’ an algorithm about a particular subject by feeding it massive amounts of information.
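As a rough sketch of that conventional approach, the example below trains a single artificial neuron on labelled data. The task and numbers are invented for illustration; real systems use far larger networks and datasets:

```python
import math
import random

rng = random.Random(1)
sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))

# Labelled examples: inputs above 0.5 belong to class 1, the rest to class 0.
data = [(x, 1 if x > 0.5 else 0) for x in [rng.random() for _ in range(200)]]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(300):               # repeatedly "feed" the data to the neuron
    for x, label in data:
        pred = sigmoid(w * x + b)
        # The prediction error nudges the weights toward the correct label.
        w -= lr * (pred - label) * x
        b -= lr * (pred - label)

correct = sum((sigmoid(w * x + b) > 0.5) == (label == 1) for x, label in data)
print(correct / len(data))         # fraction of examples classified correctly
```

After enough passes over the data, the neuron's decision boundary settles near 0.5 and it classifies the training examples correctly, which is the "massive amounts of information" style of teaching described above.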
Practical applications include Google’s language translation services, Facebook’s facial recognition software and Snapchat’s image altering live filters.
The process of inputting this data can be extremely time consuming, and is limited to one type of knowledge.
A newer approach, generative adversarial networks (GANs), pits two AI systems against each other, which allows them to learn from one another.
This approach is designed to speed up the process of learning, as well as refining the output created by AI systems.