Will AI be the voice of the future?

Why artificial podcasters probably won’t be taking our jobs any time soon

As a recovering tech journalist, I've still got a lingering fascination with cutting-edge technology, and few areas are more vibrant than the burgeoning world of artificial intelligence. In just a few years, AI has gone from a theoretical project to a legitimate field of development, and it’s already revolutionised many areas, from manufacturing to media.

That includes podcasting: AI has become an integral part of many podcasters’ workflows, including my own. Otter, for example, is a fabulous service that uses voice recognition and natural language processing to automatically transcribe audio or video recordings into text, massively speeding up the process of creating episode transcripts.

Headliner is another handy tool, which is used for creating audiograms from podcast episodes, and it also includes AI features to suggest which bits from an episode would make the best clips. Even the process of audio mastering is being made easier by machine learning; there are now multiple plugins for Adobe’s Audition software that use AI for tasks like noise reduction and cleanup, with first-party AI tools from Adobe itself currently in development.

Descript is the most impressive of the lot – it combines Otter-style automatic AI transcription with audio editing tools that let you make alterations by cutting and pasting text directly from the transcript, using AI to match the text to the audio. It even includes features like the ability to automatically remove all filler words from a recording with a single click, automating much of the tedious legwork from the process of editing an episode.

It's not all sunshine and roses, though. AI is a tool and, just like any tool, it can be put to problematic use in the wrong hands. For example, play.ht, a company specialising in AI voice generation, has created an entirely artificial podcast as a promotional vehicle to show off its technology. Podcast.ai features an AI-generated script being discussed by AI-generated recreations of the voices of famous people – Steve Jobs and Joe Rogan for the first episode, and Lex Fridman and Richard Feynman for the second.

Jobs and Feynman have both been dead for many years, and using AI to put them on a podcast like the audio equivalent of Weekend At Bernie’s is both deeply weird and not a little creepy. It’s essentially the same as deepfake videos, which follow the same principle in a visual format, and faces the same ethical concerns that have been raised by numerous sources in relation to that technology – not least whether it’s morally acceptable to use the voice of a dead person as a marketing tool.

Leaving aside this unsettling application of the technology, however, there are interesting use-cases for it in the real world. Busy media spokespeople, for example, could take a set of pre-prepared interview questions, write out their answers to them, and use an AI algorithm that's been trained on their voice to essentially do the interview for them. You could even create an entire solo podcast from the script, without once needing to sit in front of a mic. I know plenty of PRs who would be doing backflips at the prospect of never having to battle for time in an executive's diary – and a good few time-poor execs who’d be similarly thrilled with not needing to do the media circuit.

They’ll be disappointed to know, then, that this is unlikely to happen any time soon. The technology still isn't quite advanced enough to convincingly replicate human speech, and all the little nuances that go into it. While it's impressive listening to Joe Rogan interviewing Steve Jobs, no one’s actually convinced that it's really them. More to the point, the fun of creating podcasts lies in the conversations it allows you to have with interesting people – and few podcasters are going to give that up, regardless of how high-profile the guest is. Even with all the AI in the world, nothing beats the spontaneous ebb and flow of a real-time conversation.

Of course, the process of podcasting involves more than just talking, and there are lots of jobs within podcasting that I'm happy to hand over to the robots – but when it comes to hosting their own shows, they’re not ready for primetime just yet.

