Over the past few years, Facebook has been advancing its image recognition capabilities, which could soon provide a whole new consideration for marketers. Today, Facebook has released a new research paper outlining their latest image recognition developments, including advanced search capabilities based on objects.
Here's how Facebook's image search is evolving, and the potential opportunities that will provide.
Computer Vision
Last April, Facebook announced the release of automatic alternative text - or automatic alt text - for images posted to Facebook. Automatic alt text uses object recognition technology to generate a description of a photo, processing each through Facebook's artificial intelligence engine to establish image content.
This system is the culmination of years of work - back in November 2015, Facebook showcased the progress they'd made with their image recognition AI, with their system able to distinguish between objects in a photo 30% faster, and using 10x less training data, than previous industry benchmarks.
The level of detail the system now translates may not seem like much of a return for years of effort, but the amount of work involved in teaching a computer to "see" what's in an image is massive. The system needs to be trained on huge datasets, with relevant descriptions of each image, in order to identify the key elements and determine what each object is.
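To make the output side of this concrete, here's a minimal sketch of how a classifier's per-concept confidence scores might be turned into an alt-text string. This is purely illustrative - Facebook's actual models are deep neural networks trained on billions of photos, and the concept names, scores, and threshold below are invented for the example.

```python
# Hypothetical per-concept confidence scores stand in for a real
# image classifier's output.

CONFIDENCE_THRESHOLD = 0.8  # invented cutoff for including a concept


def generate_alt_text(concept_scores):
    """Turn a {concept: confidence} dict into an alt-text string."""
    detected = [c for c, score in sorted(concept_scores.items(),
                                         key=lambda kv: -kv[1])
                if score >= CONFIDENCE_THRESHOLD]
    if not detected:
        return "Image may contain: no description available"
    return "Image may contain: " + ", ".join(detected)


scores = {"outdoor": 0.97, "tree": 0.91, "dog": 0.88, "cat": 0.12}
print(generate_alt_text(scores))  # Image may contain: outdoor, tree, dog
```

The hard part, of course, is producing those confidence scores in the first place - that's what the years of training effort buy.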
Yet even with all that training, there are still limitations in what image recognition can do - for example, in the image below, you can see how the system still stumbles when objects are 'on top' of each other, recognizing two people as one.
Those finer-grained recognition abilities are much harder to train, which gives you some idea of the challenges Facebook's researchers face in building an image recognition system. Things a human can do easily are not simple for a system based on code and inputs.
But these systems are getting much better, and the applications for such advancements likely go beyond what you might think.
Sight and Insight
In their latest research paper, Facebook has outlined their advances in image recognition.
First, they've added new descriptions of actions within images, as opposed to just objects:
"Until recently, these captions described only the objects in the photo. Today we're announcing that we've added a set of 12 actions, so image descriptions will now include things like "people walking," "people dancing," "people riding horses," "people playing instruments," and more."
Again, it may not seem groundbreaking, and it may not have a huge impact on users who aren't visually impaired, but the development is big. The fact that Facebook's system is now moving onto actions underlines just how smart their systems are getting, and again, the amount of work involved to get to this stage is staggering - though Facebook does have a huge training model to work with, with somewhere in the vicinity of two billion photos being uploaded to Facebook, Instagram, Messenger and WhatsApp every day.
In addition to this, Facebook has also announced a new search capability based on image recognition.
"...we've built a search system that leverages image understanding to sort through this vast amount of information and surface the most relevant photos quickly and easily. In other words, in a search for "black shirt photo," the system can "see" whether there is a black shirt in the photo and search based on that, even if the photo wasn't tagged with that information."
The functionality, similar to what's available in Google Photos, could make it much easier to categorize and locate your images.
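The key idea is that search can match concepts a recognition model has already extracted from each photo, rather than relying on user-supplied tags. Here's a toy sketch of that kind of lookup - the photo IDs and label sets are made up for illustration, standing in for whatever a real recognition pipeline would produce.

```python
# Invented index: photo ID -> concepts a recognition model detected in it.
photo_labels = {
    "photo_001": {"person", "black shirt", "indoor"},
    "photo_002": {"dog", "grass", "outdoor"},
    "photo_003": {"person", "black shirt", "guitar"},
}


def search_photos(query_terms, index):
    """Return photo IDs whose detected concepts include every query term."""
    wanted = set(query_terms)
    return sorted(pid for pid, labels in index.items() if wanted <= labels)


print(search_photos(["black shirt"], photo_labels))
# ['photo_001', 'photo_003']
```

Neither photo needs to have been tagged "black shirt" by a person - the label came from the model "seeing" the shirt, which is exactly the capability the quote above describes.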
Recognizing Opportunity
But as much as these individual applications are interesting, and will no doubt provide benefit, it's the extended application of Facebook's image recognition AI that's far more compelling.
Right now, Facebook is already using their image recognition AI for two purposes:
1. To identify and remove objectionable content. As noted in this post from Ars Technica:
"In the past, users tagged these images as objectionable, and that info was funneled to the Protect and Care team. Images confirmed objectionable were deleted by a team member. Then machine learning models were built to identify and delete these images. In 2015, the ML models examined and eliminated more of these images than people did. Now, the Protect and Care group independently creates new classifiers to identify new types of objectionable material and retrain the models to automatically respond to it."
2. To build the "Memories" montages you regularly see in your News Feed - those image packages are put together by Facebook's AI based on the content you're most likely to engage with.
Those applications are interesting in themselves, but beyond that, it's not hard to imagine Facebook's advanced image recognition AI soon providing additional opportunities for marketers, with a whole new data set to work with.
For example, using this Google Chrome extension, you can see what Facebook's AI can recognize in any photo posted to the network.
Now imagine this - let's say you're a café looking to reach new audiences in a certain region. A handy new search parameter might be the ability to locate Facebook users who regularly post images like this.
As you can see, Facebook's AI has correctly identified both "coffee" and "coffee cup". Such capacity would not only enable you to pinpoint users who are interested in coffee (based on the images they post, not just what they write), but it would also mean you could target people who regularly post pictures of their coffee, boosting your chances of exposure via user-generated content.
Reach out to these people with relevant coupons or offers, and there's a good possibility that they'll not only take you up on the offer, but that they'll also let their friends know by posting images of your business.
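This kind of targeting would amount to a frequency filter over recognized labels. Here's a hedged sketch of what that filter could look like - the user names, posts, and threshold are all invented, and nothing here reflects an actual Facebook ad-targeting API.

```python
from collections import Counter

# Invented (user, detected_labels) pairs, as a recognition pipeline
# might emit them for each uploaded photo.
posts = [
    ("alice", {"coffee", "coffee cup", "table"}),
    ("alice", {"coffee", "latte art"}),
    ("bob",   {"dog", "park"}),
    ("alice", {"coffee", "croissant"}),
    ("bob",   {"coffee"}),
]


def frequent_posters(posts, label, min_posts=2):
    """Users with at least `min_posts` photos containing `label`."""
    counts = Counter(user for user, labels in posts if label in labels)
    return sorted(u for u, n in counts.items() if n >= min_posts)


print(frequent_posters(posts, "coffee"))  # ['alice']
```

A user who posts coffee photos once might just be visiting a café; one who posts them repeatedly is the audience a coffee brand actually wants - hence the minimum-post threshold.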
In this post (below), Facebook's AI has identified an ultrasound image. And while you'd expect most announcements like this would also contain words like "baby" or "expecting" in the text, the additional qualifier may also make it easier to locate potential audiences for baby or maternity products.
Facebook has also highlighted the potential benefits of such data in their own research. Last August, Facebook conducted a study of more than 160,000 people in the United States who've shared photos of cats or dogs (or both) to gain further insight into the differences between each group.
It's light-hearted research, for sure, but it underlines the potential of their image recognition technology to add another qualifier - another way to refine your audiences and target exactly the right people with exactly the right messaging.
The next level of image recognition, of course, is video, which Facebook is also developing.
That capacity could also further enhance Facebook's offerings - as noted by Ars Technica:
"AI inference could rank the Live video streams, personalizing the streams for individual user's newsfeeds and removing the latency of video publishing and distribution. The personalization of real-time reality video could be very compelling, again increasing the time that users spend in the Facebook app."
It may not seem like much - searching for specific images you've taken is not a game-changing addition to your Facebook experience. But don't limit your perspective to what you can see now.
There's far more to Facebook's image recognition tech than what you see on the surface at present.