It should come as no surprise to anyone who’s read my previous columns that I’m a bit of a nerd. I love data - I’ve got a background in tech, and there’s nothing I enjoy more than getting elbow-deep in the guts of a dashboard or spreadsheet.
It’s been particularly helpful when looking at the podcast industry. I’ve been having a lot of conversations with both podcasters and advertisers over the last few months, looking at their main challenges and pain points - and one thing that comes up again and again is metrics.
Put simply, podcast metrics are bad. The level of data that’s generally on offer isn’t granular enough to be particularly useful. The most complete data we get is analytics from hosting platforms, but that’s still very limited. The most relevant metric is usually the overall number of downloads, which doesn’t tell you whether a user actually listened to the episode or whether it’s just sitting unplayed in their library. This is something that Dan Misener, co-founder of podcast growth agency Bumper, wrote about last month.
You’ll also find information on which platform listeners are using and their rough location. More information, such as completion rates and more detailed demographic data, can be found in Apple and Spotify’s respective analytics tools - but it’s not remotely unified across platforms. Look at your show’s analytics in your podcast host’s back-end, and then compare them against the data found in Apple’s and Spotify’s tools: chances are, you’ll find three wildly different numbers, with no easy way to reconcile them against each other.
“In my ideal world, each podcast app would allow creators to track total Listen Time in an anonymized, aggregated way,” Misener writes. “Inconveniently, many podcast apps simply do not report Listen Time, or equivalent metrics.”
There’s no ‘single source of truth’ - and for both podcasters and advertisers, that presents a major problem. Data is the foundation of reliable decision-making within business; without good data, podcasters can’t get granular insights into how their content performs, and advertisers don’t have the confidence to invest in podcasts with the knowledge they’ll be able to achieve their goals.
Various companies have tried to solve this by layering on additional tracking and analytics, but these attempts to retrofit modern-day measurement are a patchwork solution that still doesn’t fully meet the industry’s needs.
By way of comparison, let’s look at a standard website. By the time you’re reading this article, I’ll be able to go into my Google Analytics dashboard and see how many people have read it, how long they spent on the page, how many people went on to read more articles, and even how they arrived on the page in the first place.
YouTube offers a similar level of analytics, and it allows me to plan my content strategy with a high degree of confidence, knowing which content works better than others, which channels are most effective for promotion and discovery, and which pieces are most engaging.
Without this level of insight, the decisions I can make are based on little more than guesswork and gut feeling. With the podcast, for example, I can tell that our episode with David Law from The Tennis Podcast was wildly successful in terms of downloads, but what I can’t see is where those listeners came from. Did they come from our social media posts about the episode, did they see the episode in David’s newsletter, or did they simply stumble across it while scrolling through their podcast feed? Your guess is as good as mine.
If I could answer those questions, I’d have a better idea whether to book more sports podcasters, or those with highly engaged newsletter subscribers, or simply to do more content around remote podcasting. Without data, though, I’m restricted to educated hunches.
The problem is, these limitations are an inherent part of the way podcast infrastructure is built. RSS is the foundational technology standard that underlies podcasting, but it was designed in a different time, when digital measurement was much more rudimentary. It’s a unidirectional system; when you publish a new episode, your hosting platform updates the RSS feed, and podcast apps and directories pick up the new information (including the episode file, description, et cetera) the next time they check it.
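To make that one-way flow concrete, here is a minimal sketch in Python of what a podcast app does with a feed. The feed contents, show name and URL are made up for illustration; only the `item`, `title`, `description` and `enclosure` elements come from RSS itself. The point to notice is that everything here is read-only: the feed tells the app where the audio file lives, and has no field through which listening data could travel back.

```python
import xml.etree.ElementTree as ET

# A made-up, minimal RSS feed with a single episode (illustrative only).
FEED_XML = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Episode 1</title>
      <description>A hypothetical episode.</description>
      <enclosure url="https://example.com/audio/ep1.mp3"
                 length="12345678" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""


def list_episodes(feed_xml):
    """Return the episode details an app can read out of a feed."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        episodes.append({
            "title": item.findtext("title"),
            "description": item.findtext("description"),
            # The enclosure URL is where the app fetches the audio file from.
            "audio_url": enclosure.get("url") if enclosure is not None else None,
        })
    return episodes


episodes = list_episodes(FEED_XML)
for episode in episodes:
    print(episode["title"], "->", episode["audio_url"])
```

Once the app has the audio URL, it downloads the file, and that download request is the last thing the hosting provider hears about it.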
That’s a surprisingly efficient way of delivering content, and projects like Podcasting 2.0 have made admirable strides towards making RSS feeds more flexible and fully-featured, but it only goes one way. There’s no mechanism built into RSS for allowing podcast platforms to transmit data back to your hosting provider, so all your provider knows is what it can infer from the request to download the episode. Using the requester’s IP address, it can tell roughly where the user is located; from the user agent sent with the request, it can usually tell what device and platform they’re using; and it knows that the episode file was fetched - but that’s about it.
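As a rough sketch of how little a host has to work with, the Python below turns a single download request into an analytics record. Everything in it is an assumption for illustration: the function name, the lookup tables, the reserved documentation IP ranges and the user-agent tokens are stand-ins, and a real host would rely on a proper GeoIP database and a maintained user-agent list instead.

```python
import ipaddress

# Hypothetical lookup tables, standing in for a GeoIP database and a
# maintained list of podcast app user agents.
REGION_BY_NETWORK = {
    ipaddress.ip_network("203.0.113.0/24"): "Example Region A",
    ipaddress.ip_network("198.51.100.0/24"): "Example Region B",
}
APP_BY_UA_TOKEN = {
    "AppleCoreMedia": "Apple Podcasts",
    "Spotify": "Spotify",
}


def describe_download(ip, user_agent, path):
    """Infer what we can from one request for an episode file."""
    address = ipaddress.ip_address(ip)
    # Rough location comes from the IP address.
    region = next(
        (name for network, name in REGION_BY_NETWORK.items() if address in network),
        "unknown",
    )
    # Device and platform come from the user-agent header.
    app = next(
        (name for token, name in APP_BY_UA_TOKEN.items() if token in user_agent),
        "unknown",
    )
    # Whether anyone pressed play is simply not knowable from the request.
    return {"episode": path, "region": region, "app": app, "listened": None}


record = describe_download(
    "203.0.113.7", "AppleCoreMedia/1.0 (iPhone)", "/audio/ep1.mp3"
)
print(record)
```

The `listened` field stays empty by design: nothing in the request says whether the file was ever played, which is exactly the gap a download count papers over.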
The reason Google Analytics is able to collect such frighteningly comprehensive data is that a website using it embeds a tiny piece of JavaScript code that runs in your browser and reports back to Google’s data collection servers about your behaviour. This, incidentally, is why most sites now ask you to agree to the use of ‘cookies’ when you first visit them.
It may sound somewhat Orwellian, but the benefits of this system have driven web-based content and advertising to become the juggernaut that it is today. Put simply, the reason it’s so widespread is that it works absolute gangbusters for providing publishers and advertisers with robust, actionable data - and it’s high time that podcasting adopted something similar.
The most likely hurdle preventing this is the ongoing turf war between major podcast platforms, which traditionally have a relationship akin to starving cats trapped in a sack. Platforms like Apple, Spotify and Google have a vested interest in centralising as much infrastructure and tooling as possible within their own environments, so convincing them to adopt a new open, non-proprietary standard for bi-directional data sharing is probably going to be a tough sell.
The propensity for analytics fraud and inauthentic traffic is also a potential danger - although, as we’ve seen from the Jun Group fiasco, it’s not a problem that podcasts are immune to under the current setup.
Then there’s the question of who manages and maintains that standard, and collects and distributes the data. In web technology, standards like HTTP are updated and maintained by non-profit consortiums like the IETF and W3C, but these were formed in the halcyon days before corporate interests asserted their dominance over the web’s development.
For podcasting, a new standards organisation would need to be formed, with the support of the wider community of podcasters as well as all the industry’s major players, and the remit to develop and safeguard this prospective new technology framework. An organisation like the team behind Podcasting 2.0, for example, could potentially fill this role.
To be clear, this is by no means an argument for replacing open RSS infrastructure with a walled garden owned by Apple, Spotify or anyone else. Any new standard would need to remain open and accessible for smaller players - one of the benefits of RSS is that, theoretically, anyone can set up their own RSS feed to distribute a podcast without needing to go through a hosting provider. All you need is a computer to act as a micro-server. If the technical infrastructure needed to host and distribute podcasts suddenly balloons in cost and complexity, that could limit the openness and diversity of the wider podcast ecosystem.
It’s a risk worth taking, however, and something needs to be done to ensure podcasting’s long-term growth and viability. The current foundation for podcast technology simply isn’t up to modern standards for measurement and data analysis, and it’s kneecapping the ability of both podcasters and advertisers to effectively build sustainable businesses around the medium.
Really Simple Syndication has brought podcasting into the mainstream - but simple can only take you so far, and it’s well past time that the industry started thinking about really sophisticated syndication.