Pindrop CEO Vijay Balasubramaniyan warns of the dangers of deepfake audio at RSA Conference 2020.

Feb 26 2020

RSA 2020: Is Voice Fraud the Next Frontier for Scam Artists?

Using readily available deep learning tech, fraudsters impersonate real people’s voices, gaining access to user accounts and corporate cash.

Imagine a phone call occurs between the CEO and the CFO of a midsized corporation, in which the CFO is instructed to wire millions of corporate dollars to an outside account. 

The request is peculiar, and the boss sounds a bit strange. The CFO is suspicious. But a plausible explanation for the transfer is offered — and this is, after all, the CEO. The CFO initiates the transfer.

But it wasn’t the chief executive after all; it was a deep learning-enabled robot delivering an impersonation so perfect that human ears could not detect a difference. This is modern voice fraud, a subject in which attendees of RSA 2020 received a crash course on Feb. 26 in San Francisco. Scenarios like this one are playing out with increasing frequency around the world, said Vijay Balasubramaniyan, CEO of Pindrop, a voice-security company.

“We’ve seen as much as $17 million go out the door this way,” said Balasubramaniyan. “This is a situation where a fraudster is cloning your CEO’s voice or a top executive’s voice and fooling other people in the company into doing something they shouldn’t do. You on the receiving end are not quite able to pick up the difference.”

In August, criminals scammed a U.K. energy company out of $243,000 by persuading the CFO to transfer the funds to an account supposedly belonging to a Hungarian supplier, on the orders of someone the CFO thought was his boss. Millions were stolen in a similar fashion from three separate companies just a month earlier.

Voice Fraud Is Easy to Commit

Voice fraud is growing because it’s relatively easy to commit, Balasubramaniyan said. A sample of a person’s voice just a few minutes long is enough to train a deep learning system to create a convincing synthetic version of that voice saying anything. And the deep learning technology is not hard to come by.

Balasubramaniyan conducted a live demonstration in which he used such a system to create what sounded like a clip of Donald Trump saying he’d just initiated a bombing campaign on North Korea. Several politicians have already been subjected to such fakes, perhaps most famously last year, when Speaker of the House Nancy Pelosi was made to sound drunk by scammers who simply manipulated a sample of her voice to make her speech sound slurred.

That was just a “cheap fake,” Balasubramaniyan said, because the scammers didn’t use deep learning. But think of a situation in which a politician is presented as having said something truly damaging, when they didn’t. “It seems to me pretty easy to imagine a scenario in which that would actually affect an election,” he said. 


Businesses Have Reason to Worry About Voice Fraud

Voice fraud is worrisome for enterprises for several reasons. In addition to the possibility that they might be scammed by fraudsters impersonating their own executives, high-profile business leaders may also find themselves subject to public relations problems or even blackmail by scam artists who create fake audio or video of them saying something embarrassing.

Mark Zuckerberg, for example, was the subject of such a “deepfake” in 2019, when a fake video was posted to Instagram, which Zuckerberg’s Facebook owns. In the video, Zuckerberg is presented as saying: “Imagine this for a second: One man with total control of billions of people’s stolen data, all their secrets, their lives, their futures.”

He never said that, and the video is not especially convincing. But Balasubramaniyan said the technology is improving rapidly, and soon fraudsters will be making much more persuasive deepfakes.

Another concern for businesses: the implications of voice fraud for voiceprint authentication systems, a technology with which many organizations are experimenting. With voiceprint authentication, end users supply a sample of their voices, which is then stored and compared against live callers’ voices to verify identity.
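
To make the enroll-and-compare idea concrete, here is a minimal sketch in Python. It is not Pindrop's method or any vendor's API. It assumes that some speaker model (not shown) has already turned each voice sample into a fixed-length list of numbers, and the names `cosine_similarity` and `voice_matches`, along with the 0.8 threshold, are hypothetical choices made only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector carries no voice information; treat it as no match.
        return 0.0
    return dot / (norm_a * norm_b)


def voice_matches(enrolled, live, threshold=0.8):
    """Compare a stored voiceprint with a live caller's voiceprint.

    Both arguments are assumed to be embeddings from the same
    (hypothetical) speaker model. The threshold is illustrative only.
    """
    return cosine_similarity(enrolled, live) >= threshold


# Toy vectors standing in for real voice embeddings.
stored_print = [1.0, 0.0, 0.0]
same_caller = [0.9, 0.1, 0.0]
other_caller = [0.0, 1.0, 0.0]

print(voice_matches(stored_print, same_caller))
print(voice_matches(stored_print, other_caller))
```

A check like this only measures how close two samples are. A synthetic voice that lands close enough to the stored voiceprint would pass the same comparison, which is the weakness the deepfake demonstrations point to.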

Asked whether voiceprint authentication is now a “dead end,” Balasubramaniyan said the technology still has value.

“Any authentication system on its own is susceptible to fraud,” he said, noting that generations of criminals have stolen passwords. “What you need to make sure of is that you have enough safeguards and that you’re using all of them. If you’re replacing passwords with voiceprint, you’re going to run into a world of hurt. But if you’re using several things together, you’ll be much safer.”


Bob Keaveney/BizTech Magazine
