Researchers from Brown and Johns Hopkins Universities have designed a new method for computers to gather information from images. The new evaluation system enables a computer to decipher details such as movement, relationships, and implied intent from images by asking a specific series of questions.
The researchers are calling the new system a “visual Turing test,” after the Turing Test proposed by the famous computer scientist Alan Turing. The Turing Test is an evaluation of how much human-like intelligence a computer system can possess. Even with the extraordinary advances in technology over the past 100 years, computers still cannot truly think the way humans do. People analyze and compare data very differently from the way a computer does.
One example of the differences can be seen in the way a computer analyzes a picture. A computer reads detailed information called metadata in the image and uses this as its starting point for determining specifics of the picture. Humans, on the other hand, take into account a number of different circumstances and details in relation to the image itself, their past experiences, and any related memories.
The new method builds on existing computer vision techniques and works by generating a string of yes-or-no questions about an image. Each question is progressively more in-depth and is based on the responses to the questions that came before. This series of questions is geared toward building a storyline from what’s happening in an image, rather than just cataloging common details like the number of people or whether it is day or night.
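The idea of questions that depend on earlier answers can be illustrated with a small Python sketch. This is not the researchers' actual query generator, which this article does not describe in detail. It is a toy example under stated assumptions: the scene annotations, the question list, and the function name `ask_questions` are all hypothetical, and the "answers" are simply looked up in a hand-written dictionary rather than produced by a vision system.

```python
# Hypothetical hand-annotated scene. In a real evaluation the answers would
# come from the vision system being tested (or from a human operator).
SCENE = {
    "person_present": True,
    "person_carrying_object": True,
    "object_handed_over": False,
}

# Each entry: (key, question text, earlier answers that must hold before asking).
# The list and its wording are illustrative assumptions, not taken from the study.
QUESTIONS = [
    ("person_present", "Is there a person in the image?", {}),
    ("person_carrying_object", "Is that person carrying something?",
     {"person_present": True}),
    ("object_handed_over", "Is that person handing the object to someone else?",
     {"person_carrying_object": True}),
    ("object_received", "Does the other person take the object?",
     {"object_handed_over": True}),
]


def ask_questions(scene, questions):
    """Ask yes/no questions in order, skipping any whose prerequisites
    are not satisfied by the answers collected so far."""
    answers = {}   # key -> bool, the answers gathered so far
    history = []   # (question text, answer) in the order asked
    for key, text, requires in questions:
        if all(answers.get(k) == v for k, v in requires.items()):
            answer = bool(scene.get(key, False))
            answers[key] = answer
            history.append((text, answer))
    return history


for text, answer in ask_questions(SCENE, QUESTIONS):
    print(f"{text} {'yes' if answer else 'no'}")
```

In this sketch the last question is never asked, because the answer to the third question ("no") rules it out. That is the sense in which each question builds on the responses before it and moves from simple facts toward a storyline.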
To learn more, see the team’s research published in the Proceedings of the National Academy of Sciences.
Image Courtesy: Public Domain Images