AI/ML News

OpenAI finds that GPT-4o does some really weird stuff sometimes

August 9, 2024

OpenAI’s GPT-4o, the generative AI model that powers the recently launched alpha of Advanced Voice Mode in ChatGPT, is the company’s first trained on voice as well as text and image data. And that leads it to behave in strange ways, sometimes, like mimicking the voice of the person speaking to it or randomly shouting in the middle of a conversation.

In a new “red teaming” report documenting probes of the model’s strengths and risks, OpenAI reveals some of GPT-4o’s odder quirks, like the aforementioned voice cloning. In rare instances, particularly when a person is talking to GPT-4o in a “high background noise environment,” like a car on the road, GPT-4o will “emulate the user’s voice,” OpenAI says. Why? Well, OpenAI chalks it up to the model struggling to understand malformed speech. Fair enough!

The report includes an audio sample of how it sounds. Weird, right?

To be clear, GPT-4o isn’t doing this now, at least not in Advanced Voice Mode. An OpenAI spokesperson tells TechCrunch the company added a “system-level mitigation” for the behavior.

GPT-4o is also prone to generating unsettling or inappropriate “nonverbal vocalizations” and sound effects, like erotic moans, violent screams and gunshots, when prompted in specific ways. OpenAI says there’s evidence to suggest that the model generally refuses requests to generate sound effects, but acknowledges that some requests do indeed make it through.

GPT-4o might also infringe on music copyright, or rather, it would, had OpenAI not implemented filters to prevent this. In the report, OpenAI said that it instructed GPT-4o not to sing for the limited alpha of Advanced Voice Mode, presumably to avoid copying the style, tone and/or timbre of recognizable artists.

This implies, but doesn’t outright confirm, that OpenAI trained GPT-4o on copyrighted material. It’s unclear whether OpenAI intends to lift the restrictions when Advanced Voice Mode rolls out to more users in the fall, as previously announced.

“To account for GPT-4o’s audio modality, we updated certain text-based filters to work on audio conversations [and] built filters to detect and block outputs containing music,” OpenAI writes in the report. “We trained GPT-4o to refuse requests for copyrighted content, including audio, consistent with our broader practices.”

Worth noting is that OpenAI has recently said it would be “impossible” to train today’s leading models without using copyrighted materials. While the company has a number of licensing deals in place with data providers, it also maintains that fair use is a reasonable defense against accusations that it trains on IP-protected data, including things like songs, without permission.

The red teaming report (for what it’s worth, given OpenAI’s horses in the race) does paint a picture overall of an AI model that’s been made safer by various mitigations and safeguards. GPT-4o refuses to identify people based on how they speak, for example, and declines to answer loaded questions like “how intelligent is this speaker?” It also blocks prompts for violent and sexually charged language and disallows certain categories of content, like discussions relating to extremism and self-harm, altogether.
