I spent a lot of time in 2024 talking to podcasters about video. Some are angry, some are excited, lots are perplexed.
YouTube’s push into being a home for podcasters and Spotify’s desire for podcasters to upload video versions of their episodes are gravitational forces that seem to be dragging podcasters, willing or not, into switching on the cameras.
On the more refusenik end, there’s a discussion about the brilliance of audio (for listeners and creators) and the desire not to kowtow to the big platforms. There’s often talk of ‘defending’ audio and that it’s a special medium of its own.
For those excited by it, video offers a way to reach a bigger audience, specifically driven by YouTube’s content discovery algorithms. It also offers another stream of cash, either through video ads or Spotify’s as-yet under-explained subscription revenue share. Plus I think there’s probably a bit of ego too - looking good on a screen.
My own podcast background is from doing it for big radio groups, doing it for smaller audio startups, running the British Podcast Awards and now with Podcast Discovery, talking to lots of people who want to grow their show.
When we did the Awards it became very clear that podcasters' objectives are very much not aligned. ‘Podcasters’ were doing it for many different reasons. We often referred to the Awards as a very big, broad tent. It ranged from people making shows for fun in a kitchen to massive media organisations, from small businesses appealing to a niche to high-value talent wanting to own their audience directly. Some weren’t bothered about the cash, others saw it as a way to escape the nine to five, a few were using it as a route to develop a new creative business whilst others saw it as a new platform to monetise their fans.
I think it would be really hard to create a trade body for podcasting because why people are doing it is so varied. It would be difficult to get them to agree on what the body should do! A key part of the initial rise of podcasting was that individuals could create a show with very few barriers to entry. Today, if someone doesn’t want to do video, that’s fine. There are lots of ways to execute a podcast and build an audience. Doing that purely in audio is not going to go away.
At the same time, anyone would be mad not to think that the world, and consumers, do change. A standalone iPod was transformative and a design classic, but the job it did was more important than the device it was on. A thousand songs in your pocket was a more successful proposition than the concept of a click wheel. Yes, there are still some people searching eBay for a replacement, as for them it was the perfect device for listening to music, but most people have moved on. As I describe media use below, and to stop me having to continually point it out: just as with the iPod, not everyone consumes media the same way. I’m trying to describe what the larger groups of people do, so please don’t get offended by your edge case - “well, I listen to shows on my smart fridge I’ll have you know!”
I don’t think video podcast viewers are the natural successor to audio podcast listeners. I haven’t seen much evidence that audio listeners wholesale move their consumption to YouTube once they’ve ‘seen the light’. The success of the audio podcast is often about how it fits into someone’s day. For many it’s an accompaniment that does not require full attention. It leans into what radio’s learned, that being a background medium is a superpower.
Whether the device plays video or not, it’s a screen that’s not often on. Most likely it’s in a pocket, or on a car dashboard. In the UK, RAJAR’s MIDAS report suggests that podcast listening is more likely to come at the expense of people’s owned music than of radio listening. In other words, one non-linear background medium has partly replaced another.
For me the much bigger question is how people get into listening to podcasts. I think the first major interaction with podcasts came from three interlinked events. The standalone podcast app for the iPhone came out in June 2012, the year in which smartphone penetration hit 50% and seven years after Apple put podcasting into the iTunes app. By 2012 there were a decent number of shows - many from people consumers had heard of - if you pressed that new podcast button. Then in 2014, Serial was the first cross-over mainstream hit. Many people could now hit that button and tune into something they had heard a lot about, for free, on a mainstream device connected to the internet.
But you have to remember that the iPhone was an expensive device and Serial was an intelligent product. It also appealed to people well-versed in listening to something. Podcasting started out, post-early adopters, as something a little older (at least 25+) and up-market.
In 2014, at the same time, new creators PewDiePie, Jenna Marbles, Zoella, Vsauce and Stampy were enjoying huge YouTube success with younger and less well-off audiences.
The original vlogging - individuals talking to the camera, alone or with their friends - has many similarities with what was happening in audio-only podcasting. Their separate successes were more about demographics on platforms than about the content they created. Indeed both platforms had loads of material that wasn’t chatting: dramas and documentaries on podcasts and lots of clips of media on YouTube. Many older audiences were following a link to a video on YouTube, whilst younger consumers were subscribing to their fellow content creators. But these younger audiences were basically doing the same thing on YouTube as older audiences were doing on podcasts, hitting subscribe on their favourite shows.
So both have been chugging along quite happily for the last ten years. So what’s changed? Well, in the YouTube space many of those vloggers have grown up and transitioned into more traditional podcast formats. Some people were always there or thereabouts. A bit of Googling finds the first episode of The Joe Rogan Experience on his YouTube channel. It’s a live stream and, as one of the commenters says, it’s like “watching cavemen discover fire”.
Rogan was video-first from the start and whilst he had an audio-only hiatus whilst taking Spotify’s money, it’s his spiritual home. Whilst not to my taste, the show is hugely popular and still generates outsized press coverage. What the show is, where it is and how it looks, leads many consumers discovering ‘podcasting’ through Joe Rogan to link the two. A podcast’s a video thing, right? As Serial taught people what a podcast is in 2014, Rogan does the same in 2024.
It’s not just Joe Rogan of course. In the US, shows like Impaulsive, self-described as “The world's greatest, most thought-provoking, mentally stimulating podcast in the history of mankind…”, do a million views an episode, whilst not troubling the Apple Podcasts charts. Other shows, like The Pat McAfee Show, are something different again. It’s a TV show streamed on ESPN+ and his YouTube channel, and released as an audio podcast. It gets half a million views on YouTube and is the number 10 sports podcast in the US.
Both of these could easily be dismissed as not proper podcasts, but they’ve claimed the word podcast and are teaching new-to-podcasting consumers that a podcast isn’t something that’s driven by a platform, it’s a type of content - people having a discussion - that you can watch. It doesn’t mean you can’t get it elsewhere, but the core visual is, er, visual.
This seeing-is-believing concept is only heightened by social media. The last few years have seen the transition away from text-based social media - Twitter, Facebook etc - to video-based ones - TikTok, Instagram Reels etc. People are spending far more time in these places, and younger audiences are native consumers of them.
As an aside, social clips reinforce the idea of a podcast as video. Conversation, into large visible microphones, instantly makes people think - podcast. There is some talk about trying to wrestle the word podcast back to audio. That battle is lost. The audience does not see podcasting as a particular medium, they see it as a content type. People talking into a microphone.
If you hang out in the podcasting subreddit, you’ll see lots of posts about people starting shows asking the same questions over and over again about what gear they should use. In the last year ‘what microphone should I use’ has been replaced by ‘what camera should I use’. New, youthful creators see podcasting as ‘people talking into a microphone’ rather than making an MP3.
Anyway, back to phones. Fundamentally they’re interactive, always-on devices that give high levels of engagement. The entire world is accessible through them. Ubiquitous connectivity means they can be a lean-forward filler of any available time. Of course, a phone can do lean-back as well, with audio accompanying a run etc, but its sweet spot is lean-forward.
Again, in RAJAR’s MIDAS survey you see time and again that linear radio performs poorly on mobile devices. The phone is an active, choice-based medium. People use it to find what’s relevant to them: web searches, TikTok feeds and even podcasts are all personal. There’s not a shared experience between people’s phones. Radio, in a shared space, is often the ‘least worst option’ for the people consuming it - it’s designed to be mass appeal, consumed by many.
Podcasts, as a personal medium, do well on a phone too, but they compete with a little bit of everything, all of the time - the ease of the internet, YouTube, Netflix, messaging - all often more active engagement than an audio show. But I think it’s less about the device, and more about people’s availability and what their hierarchy of consumption is. Where they are, and what suits them for that moment.
The emergence of podcasting was a confluence of technology (for both making and listening) and general internet connectivity. It built on the fact that people were comfortable with the concept - “it’s a bit like radio programmes delivered to your phone” - and it occupied radio’s great selling point - working as a background medium - and it was free! It was accelerated by the (relative) ease of making audio content. Its awareness was driven by talent jumping on board (again, relatively easily) and using their channels to promote their shows, word-of-mouth from passionate fans and publicity around break-out hits - Gervais, Serial, Rogan, My Dad Wrote A Porno, Shagged Married Annoyed.
Fundamentally though, it was media that could be inserted into an already busy media diet. Walking, travelling and personal time with the headphones in. From 2005 to 2015 radio had failed to occupy this place due to noisy FM reception issues and inadequate streaming. Podcasting’s competition was owned music, saved to a device. It grew fast.
Meanwhile the growth of internet video consumption during this time, the early days of vlogs and YouTube, was driven by its own unique environment. Firstly it was tethered. Computer-based consumption and later broader devices through wifi at home. Similar to podcasts it was personal. YouTube was rarely watched by people together. It may well have been shared, but consumption was usually solitary. Individual laptops in bedrooms whilst others in the household consumed video on the big TV.
The last ten years have seen YouTube remain personal but become untethered. It still occupies key real estate at home, but ubiquitous connectivity means that for many people, particularly the younger ones, it now sits in the place that audio podcasts occupied for older audiences. You only have to walk down a train, or past people on a platform, to see that “travelling and personal time with the headphones in” is now often dominated by video.
Before podcasts and video, the Metro newspaper, launched in March 1999, was an instant hit. It delivered a new product to cater for a need and distributed it to the right places. Its success wasn’t entirely about the content - though it was timely and well designed for the audience - its success came from catering for a human need in a specific environment… and delivering.
So what does this all mean for audio podcasters today? How should you think about what you make and how you distribute it and where does video fit in? Read about that in part 2 here.