By Marissa Lang
Updated 9:30 pm, Monday, April 4, 2016
Text-to-speech dictation software that describes the back-and-forth comments and recites friends’ status updates would offer little when users came across an image: “Photo,” the machine would say. Maybe a name, if the photo was tagged with a person. Nothing more.
“It was incredibly frustrating,” said Matt King, a blind electrical engineer and member of Facebook’s Accessibility Team.
That will change Tuesday when the social media giant launches a new feature to help the visually impaired navigate Facebook’s increasingly photo-driven platform: an artificial intelligence that can analyze images and describe them aloud to users who can’t see.
This feature, built by numerous Facebook engineers and tested by King himself, has been in the works for months.
“Worldwide, more than 39 million people are blind, and over 246 million have a severe visual impairment,” Facebook said in a statement. “As Facebook becomes an increasingly visual experience, we hope our new automatic, alternative text technology will help the blind community experience Facebook the same way others enjoy it.”
The idea for the photo reader came from interviewing people with visual impairments, Facebook said. As Facebook evolved from short text updates to a more photo-driven experience, those users felt aggravated and left out.
Roughly 2 billion photos are uploaded every day to Facebook and other apps the company owns — including Messenger, WhatsApp and Instagram.
The photo reader will initially only work on Facebook’s apps for iPhones and iPads. Officials said they are working to build versions for Facebook on other operating systems and, eventually, other Facebook-owned apps.
The software itself relies on “deep learning” algorithms that seek to mimic the way the human brain thinks. It’s the same technology Facebook has used for years to understand who and what are in photos, and it allows the network to suggest tags and friends it finds in images.
Google, Apple, Microsoft and other tech companies use similar networks to recognize speech, translate languages and search images.
Some users have voiced concern over these technologies and suggested that having a machine read and analyze images might be creepy for readers, particularly if a computer can analyze a photo better than a human might.
“Emotional reading tends to make people the most uncomfortable,” said Shaomei Wu, a member of Facebook’s Accessibility Team.
That’s why Facebook is, for now, sticking with literal, one-word descriptions of a scene.
In photos of people, the virtual narrator will focus on physical traits and characteristics — beard, eyeglasses, smiling, and so on. It will also be able to identify different types of vehicles, sports, food, clothing, jewelry and various outdoor settings. The team expects Facebook’s artificial intelligence to evolve and learn more over time.
The nascent photo-reading technology will initially be available in English only, though Facebook said other languages are coming.
Marissa Lang is a San Francisco Chronicle staff writer. Email: firstname.lastname@example.org Twitter: @Marissa_Jae
Read your photos
Apple users interested in trying this technology can enable VoiceOver by asking Siri to “turn on VoiceOver,” or by tapping Settings > General > Accessibility > VoiceOver. Once it is enabled, users can open the Facebook app and swipe to scroll through News Feed or a specific profile. When they swipe past a photo, they will hear a description of some of the items the photo may contain.