we spend SO much time on this forum (and everywhere else) debating AI images but honestly the audio side is what keeps me up at night
saw a demo last week where someone cloned a CEO’s voice from a 30 second earnings call clip and then generated a completely fake phone call authorizing a wire transfer. the whole thing took less than 5 minutes to set up. five. minutes.
there are some detection tools coming (Resemble AI Detect, Pindrop) but they're all enterprise-priced. for regular people there's basically nothing.
and the detection challenge is even harder than images because phone-call and voice-message compression literally destroys the artifacts that detectors look for. so even if you had a good detector it probably wouldn't work on most real-world audio.
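the compression point is easy to show on synthetic audio. rough sketch below — the 6 kHz "artifact" is a made-up stand-in for a synthesis artifact, and the channel is an idealized low-pass + decimate, not a real phone codec:

```python
import numpy as np

fs = 16_000                      # typical "studio" sample rate
t = np.arange(fs) / fs           # one second of audio
# stand-in signals: a 1 kHz "voice" tone plus a synthetic 6 kHz
# component playing the role of a synthesis artifact
voice = np.sin(2 * np.pi * 1_000 * t)
artifact = 0.3 * np.sin(2 * np.pi * 6_000 * t)
x = voice + artifact

def band_energy(sig, fs, lo, hi):
    """Total spectral energy between lo and hi Hz."""
    spec = np.abs(np.fft.rfft(sig)) ** 2
    freqs = np.fft.rfftfreq(len(sig), 1 / fs)
    return spec[(freqs >= lo) & (freqs <= hi)].sum()

# simulate the telephone channel: ideal low-pass at 4 kHz,
# then 2:1 decimation down to the 8 kHz narrowband rate
spec = np.fft.rfft(x)
freqs = np.fft.rfftfreq(len(x), 1 / fs)
spec[freqs > 4_000] = 0
phone = np.fft.irfft(spec)[::2]  # now sampled at 8 kHz

print(band_energy(x, 16_000, 5_500, 6_500))     # artifact clearly present
print(band_energy(phone, 8_000, 3_000, 4_000))  # nothing left up there to detect
```

the voice survives the channel fine — everything above 4 kHz, where the artifact lived, simply doesn't exist anymore at 8 kHz.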
voiceover artists, podcasters, musicians — y'all paying attention to this? how are you thinking about protecting your voice?
Voiceover artist here, been in the industry 12 years. Yes we’re paying attention and yes we’re terrified.
I already found my voice on an AI voice marketplace last year. Someone had cloned it from my demo reel on my website. Took me 4 months and a lawyer’s letter to get it taken down, and I’m still not sure they actually deleted the model.
There’s basically no legal framework for voice cloning yet. A few states have laws but enforcement is a joke.
The wire transfer scenario is real and already happening btw — there was a case in early 2024 where a Hong Kong company lost $25 million to a deepfake video call where the “CFO” authorized a transfer. That wasn't even audio-only, it was a full video deepfake on a Zoom call with multiple cloned participants.
And yeah the enterprise detection tools cost $$$. For consumers there’s nothing, and I don’t see that changing anytime soon because the consumer market can't support the R&D costs.
One thing that might help: establishing verbal authentication protocols. Like a family safe word for phone calls. Sounds paranoid now but probably won't in 2 years.
I make music and this terrifies me too. That AI-generated fake-Drake track showed that you can clone a recognizable artist's voice and it'll go viral before anyone does anything about it.
What bugs me is the asymmetry. Creating a voice clone: minutes. Detecting one: expensive specialized tools. Getting one removed: months of legal back and forth. The incentives are completely broken.
From a technical perspective, audio deepfake detection is harder than image detection for a few reasons:
- Audio compression is lossy and aggressive — phone audio is sampled at 8 kHz, while most detectors expect 16 kHz or higher
- Background noise masks artifacts
- Voice is inherently variable (you sound different when tired, sick, emotional) so “normal” has a wide range
- Real-time detection is computationally expensive
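The noise-masking point in particular is easy to see on synthetic data. A toy sketch — the narrow 5 kHz tone here is an assumed stand-in for a synthesis artifact, not a feature any real detector uses:

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 16_000
t = np.arange(fs) / fs
# "voice" at 300 Hz plus a faint 5 kHz tone standing in for a
# synthesis artifact, 40 dB below the voice
voice = np.sin(2 * np.pi * 300 * t)
artifact = 0.01 * np.sin(2 * np.pi * 5_000 * t)

def artifact_snr_db(sig):
    """Energy in the artifact band vs the surrounding bands, in dB."""
    spec = np.abs(np.fft.rfft(sig)) ** 2
    freqs = np.fft.rfftfreq(len(sig), 1 / fs)
    band = spec[(freqs > 4_900) & (freqs < 5_100)].sum()
    rest = (spec[(freqs > 4_000) & (freqs <= 4_900)].sum()
            + spec[(freqs >= 5_100) & (freqs < 6_000)].sum())
    return 10 * np.log10(band / (rest + 1e-12))

clean = voice + artifact
noisy = clean + 0.05 * rng.standard_normal(len(t))  # mild background noise

print(artifact_snr_db(clean))  # strongly positive: tone sticks out
print(artifact_snr_db(noisy))  # negative: tone buried in the noise floor
```

Even fairly mild background noise pushes the artifact below the noise floor, and that's before any compression touches the signal.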
The most promising approach I’ve seen is challenge-response: the detector asks the speaker to say something specific in real-time. Current voice cloning can’t handle truly arbitrary real-time conversation without latency tells. But that window is closing fast.
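A minimal sketch of the challenge-response idea, assuming you already have live transcription of the call — the word list, latency budget, and verification logic here are all made-up illustrations, not any shipping product's protocol:

```python
import secrets
import time

WORDS = ["amber", "falcon", "granite", "willow", "copper", "harbor"]

def make_challenge(n=3):
    """Pick a random phrase the caller must repeat live."""
    return " ".join(secrets.choice(WORDS) for _ in range(n))

def verify(challenge, transcript, elapsed_s, max_latency_s=2.0):
    """Pass only if the exact phrase came back fast enough.
    A cloning pipeline has to synthesize the reply, which (today)
    adds latency that a live human speaker doesn't."""
    return transcript.strip().lower() == challenge and elapsed_s <= max_latency_s

challenge = make_challenge()
start = time.monotonic()
reply = challenge                 # stand-in for a live caller repeating it
elapsed = time.monotonic() - start
print(verify(challenge, reply, elapsed))  # True for an instant, correct reply
```

The randomness matters: a fixed phrase can be pre-generated, but a phrase drawn fresh per call forces the attacker into real-time synthesis, which is where the latency tells live.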