This post originally appeared on Frank Report, written by Frank Parlato, and is republished with permission.
Welcome to the world of artificial intelligence.
In what may be a first for AI audio technology fooling the media, the news site Mediaite published two articles last month about a recording it said captured the voice of Roger Stone, an influential adviser to Donald Trump and a frequent target of the outlet.
AI detection programs and several experts say the 19-second clip is an AI-generated voice clone, not Stone's voice.
Voice cloning is an artificial intelligence technology that creates an artificial copy of a person's voice by analyzing recordings of that person and mimicking tone, pitch, and other vocal characteristics.
Mediaite reporter Diana Falzone claims the audio is legitimate. She wrote that the audio reveals Stone discussing the assassinations of Democratic Congressmen Jerry Nadler and Eric Swalwell in a Florida restaurant with his friend and former New York police officer Sal Greco.
Falzone said an anonymous source told her that Stone made the recorded comments at Café Europe in Fort Lauderdale, a restaurant that Stone patronizes.
In her initial story about the audio, Falzone did not publish the recording itself, but instead presented a written version of what she said was on it.
According to the article, Stone says on the audio:
“It's time to do this. Let's go find Swalwell. It's time to do this. Then we'll see how brave the rest of them are. It's time to do this. Either Swalwell or Nadler dies before the election. They need to get the message. Let's go find Swalwell and be done with this thing. I just can't take this shit anymore.”
Falzone did not name the source of the audio, but she quoted the source as saying: “Stone has been at war with Nadler and Swalwell for years. He just hates them.”
Falzone also dated the then-unpublished recording to October 2020, the weeks immediately before the last presidential election.
Before the story was published, Falzone requested comment and provided Stone with a written transcript of the supposed recording, but she declined to share a copy of the audio, tell Stone how she obtained it, or say who gave it to her.
Stone responded to Mediaite, saying: “Complete nonsense. I never said anything like that. More AI manipulation. You've asked me to respond to audio recordings that you won't let me hear and you haven't identified the source. Ridiculous.”
Stone added that if Mediaite posted an audio clip, it would have to be generated by artificial intelligence, because he never said the words attributed to him.
After the story was published, several major media outlets picked it up, including CNN, MSNBC, The Messenger, Salon, The Daily Beast, and The Independent, all of which are ardent left-leaning critics of Trump and Stone. Most took the view that the audio was authentic and that Stone might be in legal trouble, without verifying the audio's authenticity.
After Mediaite's story was published, Stone told Britain's Daily Mail in response to a request for comment: “If there is such an audio, why won't they publish it? Why won't they send it to me? If there is such an audio, it must have been obtained illegally, and if there is such an audio, it must be an AI-generated scam, because I never said any of the words attributed to me.”
If the audio is authentic, Stone may be right that it was obtained illegally. Florida is a two-party consent state, and recording someone without their consent is a third-degree felony under Florida Statute 934.03, punishable by up to five years in prison.
Despite the potentially illegal nature of the recording, Mediaite published a second story, including a 19-second audio recording, on January 12, 2024.
The audio posted by Mediaite shows that the voice alleged to be Stone's says something different from what the original story reported.
Actual text of the audio:
“[Inaudible…] We'll go find Swalwell and get this over with. It's time to do it. Then we will see how brave the rest of them are. Either Swalwell or Nadler must die before the election. They need to get the message. I just can't take this shit anymore.”
In a YouTube video introducing the audio, Falzone admitted that the audio was “slightly edited.”
She did not explain the nature of the editing, who did it, or why any editing, slight or otherwise, was needed. She also did not explain why the words she claimed Stone said in her initial story, published before the audio was released, differed from the actual audio published four days later.
In her post on X, Falzone changed her stance on whether or not the audio was edited, writing that the audio was not edited at all. Falzone later deleted that post.
Regardless of the circumstances surrounding the audio recording, the two alleged targets of the alleged three-year-old audio recording stated that they believed it to be real.
On CNN's Anderson Cooper 360, Swalwell (Calif.) said: “I was stunned that (Stone) was so brazen about it.”
“I am disturbed by Roger Stone’s threats against my life,” Rep. Nadler (N.Y.) wrote.
Is this real or fake?
Common sense suggests that the authenticity of the audio is questionable.
For one thing, the speaker talks about assassinating members of Congress in the monotone characteristic of AI-generated voices.
The second problem is that the voice is clearly heard over the background sounds, which may have been added to lend authenticity, as if the setting were a crowded restaurant.
However, for the voice to be heard that clearly above the noise of diners engaged in conversation, the speaker would likely have had to speak fairly loudly into a nearby microphone.
Café Europe is relatively small, and the acoustics make it possible to hear people talking at other tables if one is interested in listening.
We are asked to believe that Roger Stone, whom many if not most diners would know by sight, would talk about killing two congressmen in a tone loud enough for anyone nearby to hear.
Detection tools
There are AI-powered detection tools that analyze audio for artifacts such as missing frequencies left behind when audio is created programmatically.
The detection software is trained using machine learning to recognize the output of existing deepfake algorithms and to estimate whether a given piece of audio is more likely AI-generated or human.
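To make the idea of “missing frequencies” concrete, here is a minimal sketch in Python of one simple spectral check: measuring how much of a signal's energy sits above a cutoff frequency. It is an illustration only, not how AI Voice Detector or any other commercial tool works. The function names, the synthetic test tones, and the 4,000 Hz cutoff are all assumptions made for the example, and a real detector would combine many learned features rather than rely on one ratio.

```python
import cmath
import math


def high_band_energy_ratio(samples, sample_rate, cutoff_hz):
    """Return the fraction of spectral energy at or above cutoff_hz.

    Uses a naive discrete Fourier transform, which is fine for a short
    illustrative clip but far too slow for real audio.
    """
    n = len(samples)
    total = 0.0
    high = 0.0
    # Skip the DC bin (k = 0) and stop below the Nyquist bin.
    for k in range(1, n // 2):
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * t / n)
            for t, s in enumerate(samples)
        )
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n >= cutoff_hz:
            high += energy
    return high / total if total else 0.0


def tone(freq_hz, sample_rate, n):
    """Generate n samples of a pure sine tone (hypothetical test signal)."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


SAMPLE_RATE = 16000  # assumed sample rate for the synthetic signals
N = 256              # short clip so the naive DFT stays fast

# A "band-limited" signal with nothing above 4 kHz, and a "full-band"
# signal that also carries a 6 kHz component.
band_limited = tone(500, SAMPLE_RATE, N)
full_band = [a + b for a, b in zip(band_limited, tone(6000, SAMPLE_RATE, N))]

band_limited_ratio = high_band_energy_ratio(band_limited, SAMPLE_RATE, 4000)
full_band_ratio = high_band_energy_ratio(full_band, SAMPLE_RATE, 4000)

print(f"band-limited clip: {band_limited_ratio:.3f} of energy above 4 kHz")
print(f"full-band clip:    {full_band_ratio:.3f} of energy above 4 kHz")
```

A clip with almost no high-band energy is not proof of anything on its own, since phone recordings and heavy compression also strip high frequencies. That is one reason detection tools report probabilities rather than verdicts.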
FR used software provided by AI Voice Detector (http://aivoicedetector.com) to analyze Mediaite's audio.
The software found a 92.6 percent chance that the audio was created by AI.
The program also concluded that one section of the audio, in which the voice says “must die before the election,” had a slightly higher probability of being generated by AI.
AI Voice Detector expressed confidence in its findings, pinning its post about them to the top of its X account.
The company also gave credit to the person behind the audio in its X post.
“They included background music and noise to bypass other AI detectors. However, http://aivoicedetector.com discovered that this recording was produced using AI audio. Nice try!”
A music producer calls it fake
The suspicious audio attracted the attention of European music producer Hitesh Ceon.
Ceon has written and produced hit records featuring artists such as Cee Lo Green, Musiq Soulchild, Daley, Alexandra Burke, Michael Jackson, Madcon, Snoop Dogg, Jill Scott, Taylor Dayne, Rick Ross and Joe.
His work in audio engineering involves editing audio signals, specifically pitch, timing, and rhythm.
In an interview, Ceon said: “AI voice can actually be used in quite convincing ways, like this fake recording of Roger Stone.”
Ceon showed how easy it is to create an AI-generated “recording” by selecting a cloned voice of Joe Biden, adding similar background noise, and “a similar frequency response, somewhat dull, mono sound, like a Roger Stone ‘recording’.”
Ceon posted his Biden version, in which the voice says: “Let’s go find Swalwell and get this over with. And yes, the 2020 election was stolen.”
Ceon said that “it was easy to do and only took me about five minutes – which shows how easy it is to produce a fake 'recording' like this.”
Ceon spoke to Rare.US about his analysis:
“When I heard Roger Stone's recording, there was something that immediately struck me as unnatural about the flow of notes, especially in the part that starts right after 'how brave the rest of them are' in the recording. The background noise and filtered/low-quality sound of the recording is very useful for masking any very obvious flaws in the AI-generated sound.”
Stone uses an AI detection tool
Stone has not backed down from his claim that the audio is fake. He analyzed Mediaite's audio using software from DeepFakeDetector.ai and posted screenshots of the results to X.
DeepFakeDetector.ai determined a 95.80 percent probability that the audio was generated by artificial intelligence.
Media responsibility
We hope this episode helps the media understand how easily people with an agenda can target media outlets with known political leanings and lead them to unwittingly participate in deceiving their audiences.
Some suggest that the media should use AI detection tools for controversial audio. According to CBS, its parent company is investing in developing new tools to keep up with the advancing AI industry.
Another detection tool
Common sense can also be added to the detection toolkit.
Sometimes it's easy to evaluate.
In January, a robocall imitating President Biden's voice targeted Democratic voters in New Hampshire, telling them not to vote.
Or the late comedian George Carlin performing a new comedy routine, “I'm Glad I'm Dead.”
Or Taylor Swift telling people she's giving away cookware.
Or Roger Stone talking about assassinating congressmen in a crowded restaurant, loud enough for anyone to hear.
Common sense goes a long way.