A little-known Google Labs widget has enabled researchers from UC San Diego
and UCLA to add "common sense" to computers.
The computer scientists have given an automated image labelling system the
ability to use context to help identify objects in photographs.
For example, if a conventional automated object identifier has scanned an
image and identified 'person', 'tennis racket', 'tennis court' and 'lemon', the
new post-processing context check will re-label 'lemon' as 'tennis ball'.
"We think that our paper is the first to bring external semantic context to
the problem of object recognition," said computer science professor Serge
Belongie from UC San Diego.
The researchers showed that Google Sets can be used to provide external
contextual information to automated object identifiers.
Google Sets generates lists of related items or objects from just a few
examples. If a user types in 'John', 'Paul' and 'George', it will return the
words 'Ringo', 'Beatles' and 'John Lennon'.
Similarly, entering 'neon' and 'argon' will generate a list of other noble
gases, i.e. 'helium', 'krypton', 'xenon' and 'radon'.
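The idea of expanding a few examples into a longer list of related items can be illustrated with a toy sketch. How Google Sets works internally is not described here, so the approach below (counting how often items share a list with the examples) is only an assumption for illustration, and the `corpus` data and the `expand` function are hypothetical.

```python
from collections import Counter

# Hypothetical corpus of lists; a stand-in for whatever data a real
# set-expansion service draws on.
corpus = [
    ["John", "Paul", "George", "Ringo"],
    ["John", "Paul", "George", "Ringo", "Beatles"],
    ["helium", "neon", "argon", "krypton", "xenon", "radon"],
    ["neon", "sign", "light"],
]


def expand(seeds, corpus):
    """Rank items by how strongly they co-occur with the seed examples."""
    seeds = set(seeds)
    scores = Counter()
    for items in corpus:
        overlap = len(seeds & set(items))
        if overlap == 0:
            continue  # this list shares nothing with the examples
        for item in set(items) - seeds:
            # Lists containing more of the examples count for more.
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked]


noble = expand(["neon", "argon"], corpus)
beatles = expand(["John", "Paul", "George"], corpus)
```

With this toy corpus, the four other noble gases rank ahead of 'sign' and 'light', because they appear in a list that contains both examples rather than just one.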
"In some ways, Google Sets is a proxy for common sense," explained Professor
Belongie.
"In our paper we showed that you can use this common sense to provide
contextual information that improves the accuracy of automated image labelling
systems."
The image labelling system is a three-step process. Firstly, an automated
system splits the image into different regions using image segmentation.
In the tennis example, image segmentation separates the person, the court,
the racket and the yellow sphere.
Next, an automated system provides a ranked list of probable labels for each
of these image regions.
Finally, the system adds a dose of context by processing all the different
possible combinations of labels within the image, and maximising the contextual
agreement among the labelled objects within each picture.
It is during this step that Google Sets can be used as a source of context
that helps the system turn a 'lemon' into a 'tennis ball'.
In this case, these 'semantic context constraints' helped the system to
distinguish between visually similar objects.
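The final, context-driven step can be sketched roughly as follows. This is a minimal illustration of the idea as described in this article, not the researchers' actual method: the candidate labels, the recogniser scores, the `related` table (standing in for an external source of context such as Google Sets) and the function names are all hypothetical.

```python
from itertools import product

# Hypothetical ranked candidates for each image region: (label, score).
candidates = [
    [("person", 0.9), ("statue", 0.4)],
    [("tennis racket", 0.8), ("frying pan", 0.5)],
    [("tennis court", 0.85), ("parking lot", 0.3)],
    [("lemon", 0.6), ("tennis ball", 0.55)],
]

# Hypothetical relatedness scores between pairs of labels.
related = {
    frozenset(("person", "tennis racket")): 0.5,
    frozenset(("person", "tennis court")): 0.5,
    frozenset(("person", "tennis ball")): 0.5,
    frozenset(("tennis racket", "tennis court")): 0.9,
    frozenset(("tennis racket", "tennis ball")): 0.9,
    frozenset(("tennis court", "tennis ball")): 0.9,
    frozenset(("frying pan", "lemon")): 0.6,
}


def pair_context(a, b):
    """Relatedness of two labels; unknown pairs count as unrelated."""
    return related.get(frozenset((a, b)), 0.0)


def label_image(candidates, context_weight=1.0):
    """Try every combination of labels and keep the one whose labels
    agree best with each other and with the recogniser scores."""
    best_labels, best_score = None, float("-inf")
    for combo in product(*candidates):
        labels = [label for label, _ in combo]
        appearance = sum(score for _, score in combo)
        context = sum(
            pair_context(labels[i], labels[j])
            for i in range(len(labels))
            for j in range(i + 1, len(labels))
        )
        total = appearance + context_weight * context
        if total > best_score:
            best_labels, best_score = labels, total
    return best_labels


with_context = label_image(candidates)
without_context = label_image(candidates, context_weight=0.0)
```

With these made-up numbers, ignoring context leaves the fourth region labelled 'lemon', while adding the context term re-labels it 'tennis ball'. Trying every combination is only practical for a handful of regions and candidates; a real system would need a more efficient search.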
The 'Objects in Context' paper will be presented today at the 11th IEEE
International Conference on Computer Vision in Rio de Janeiro.