Multimodal models can recognize images and describe their content. This includes images of people, which could prove problematic.
The New York Times reports that OpenAI currently masks faces in images and does not allow GPT-4 to process them with image recognition.
This particularly affects blind people, who use GPT-4 with image capabilities in the “Be My Eyes” experiment to get detailed descriptions of their surroundings and of people. Descriptions of the surroundings are still available, but descriptions of people were recently disabled, and faces in the images are now blurred.
OpenAI doesn’t want GPT-4 misused for facial recognition
GPT-4 with image capabilities can recognize prominent people, such as OpenAI CEO Sam Altman, because it has seen many images of them during training. It will not recognize people who do not appear in many images available on the Internet.
As a result, its identification and monitoring capabilities are nowhere near as extensive as those of AI systems such as Clearview AI or PimEyes, which are optimized for this scenario and can identify people in images based on fine details.
Beyond identification, however, face analysis poses further problems for OpenAI: even if the person in an image is unknown, the model can still describe them and attribute the wrong gender or emotional state to them, for example.
With hundreds of millions of users, this could lead to numerous complaints. In addition, the image analysis is said to be strong enough to bypass common captcha systems.
Misidentifications are also possible, says OpenAI policy researcher Sandhini Agarwal: the model might, for example, correctly identify a person’s role as CEO but assign the wrong name to that role.
“We very much want this to be a two-way conversation with the public. If what we hear is like, ‘We actually don’t want any of it,’ that’s something we’re very on board with,” says Agarwal.
Microsoft and Google also block facial recognition
Google’s chatbot Bard also offers image analysis. Currently, Bard refuses to make statements about images of known and unknown people and deletes an uploaded image of a person without further inquiry. Google’s Lens visual search, on the other hand, recognizes an image of Sam Altman and correctly identifies him.
Microsoft is also integrating visual image search into Bing Chat. After a user uploads an image of a person, Bing Chat indicates that the image will be blurred “for privacy reasons.” However, the chatbot still offers recommendations on how to find the person through Google Images, social media, or TinEye.