Artificial intelligence can now “imagine” objects and scenarios, an ability that could help scientists develop new medicines and make self-driving vehicles safer.
Picture a grey cat. Now, picture the same cat, but with orange fur. Now, picture the cat walking on two legs along a tightrope in a night circus. You can conjure all these images thanks to a quick series of neuron activations that draw on your previous knowledge of the world to come up with multiple variations of an object, in this case a cat. For many, the ability to imagine, construct scenarios, and envision objects with different attributes is a defining trait of humankind. While this has held true for centuries, recent developments in artificial intelligence (AI) suggest that the skill of “imagination” is no longer exclusive to humans.
Researchers from the University of Southern California have developed a new AI system that can imagine a never-before-seen object with different attributes. By putting together features that were previously analysed by the system, the machine works much like how humans take in and combine previously learnt concepts to generate new ideas.
“We were inspired by human visual generalisation capabilities to try to simulate human imagination in machines,” said the study’s lead author Yunhao Ge, a computer science doctoral student working under the supervision of Laurent Itti, a computer science professor. “Humans can separate their learned knowledge by attributes (for instance, shape, pose, position, colour) and then recombine them to imagine a new object. Our paper attempts to simulate this process using neural networks.”
In an ideal world, developers would be able to create AI systems with the ability to extrapolate. For example, if they wished to build an AI that generates images of cars, feeding the algorithm a handful of car images would be sufficient to train it to generate a variety of cars, from Porsches to Pontiacs to pick-up trucks, in any colour and from multiple angles. In other words, given a few examples, the system should be able to extract the underlying rules and apply them to a vast range of novel examples it has not seen before.
However, such an elegant solution has yet to take shape in most AI systems today. Because most models are trained on low-level sample features such as pixels, they fail to account for an object’s attributes. Consequently, many systems produce inaccurate images or outcomes when working with new, previously unseen data; in technical terms, they suffer from high generalisation error.
To overcome this limitation, the researchers borrowed a concept called disentanglement, which is often used to generate deepfakes by disentangling human face movements and identity. “People can synthesise new images and videos that substitute the original person’s identity with another person, but keep the original movement,” explained Ge.
Applying the principles of disentanglement, the study’s new AI takes a group of sample images and mines the similarities between them to achieve “controllable disentangled representation learning.” Next, the system recombines this knowledge to achieve “controllable novel image synthesis,” which closely mimics how humans extrapolate information or, put simply, imagine.
“For instance, take the Transformers movie as an example,” said Ge. “It can take the shape of a Megatron car, the colour and pose of a yellow Bumblebee car, and the background of New York’s Times Square. The result will be a Bumblebee-coloured Megatron car driving in Times Square, even if this sample was not witnessed during the training session.”
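The recombination Ge describes can be illustrated with a toy sketch. It is not the authors’ implementation: it assumes, purely for illustration, that each image has already been encoded into a short latent vector in which every attribute occupies its own fixed slice. The slice layout, the `recombine` helper, and the numeric values are all hypothetical; in a real system a trained encoder would produce the vectors and a decoder would turn the combined vector back into an image.

```python
from typing import Dict, List

# Hypothetical layout: each attribute owns a fixed slice of the latent vector.
ATTRIBUTE_SLICES: Dict[str, slice] = {
    "shape": slice(0, 2),
    "colour": slice(2, 4),
    "pose": slice(4, 6),
    "background": slice(6, 8),
}


def recombine(latents: Dict[str, List[float]], recipe: Dict[str, str]) -> List[float]:
    """Build a new latent vector by copying each attribute's slice
    from the source image named for that attribute in `recipe`."""
    size = max(part.stop for part in ATTRIBUTE_SLICES.values())
    combined = [0.0] * size
    for attribute, part in ATTRIBUTE_SLICES.items():
        source = recipe[attribute]
        combined[part] = latents[source][part]
    return combined


# Toy stand-ins for encoder outputs (made-up values, one constant per image
# so the origin of each slice is easy to see in the result).
latents = {
    "megatron": [1.0] * 8,
    "bumblebee": [2.0] * 8,
    "times_square": [3.0] * 8,
}

# Shape from one image, colour and pose from another, background from a third.
recipe = {
    "shape": "megatron",
    "colour": "bumblebee",
    "pose": "bumblebee",
    "background": "times_square",
}

novel = recombine(latents, recipe)
print(novel)  # [1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0]
```

The combined vector describes an object that appears in none of the three inputs, which is the sense in which the result is “imagined”. The hard part, which this sketch skips entirely, is training the network so that each slice really does capture only its own attribute.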
By unlocking this new ability, the framework is expected to enable a wide range of applications. In the study, the team demonstrated one use of their AI by generating a new dataset containing 1.56 million images, which could become a valuable resource for future studies in the field.
The AI system could also bring benefits to other fields, such as medicine and autonomous vehicles. Since the framework is compatible with nearly any type of data or knowledge, the team believes that it could help doctors and biologists discover new drugs by disentangling the medicinal function of a drug from its other properties, then recombining the mined attributes to propose new medicines. Furthermore, imbuing machines with imagination could improve the safety of AI used in autonomous vehicles by allowing it to anticipate and avoid dangerous scenarios, even those never seen during training.
“Deep learning has already demonstrated unsurpassed performance and promise in many domains, but all too often this has happened through shallow mimicry, and without a deeper understanding of the separate attributes that make each object unique,” said Itti. “This new disentanglement approach, for the first time, truly unleashes a new sense of imagination in AI systems, bringing them closer to humans’ understanding of the world.”
Source: Ge et al. (2021). Zero-Shot Synthesis with Group-Supervised Learning. International Conference on Learning Representations 2021