Many websites are built upon content that is by nature relevant for a finite period of time and routinely expires.
- Ecommerce sites discontinue products.
- Job sites fill postings.
- Classifieds sites sell listings.
- Event sites have past gatherings.
Such expired content often accounts for thousands of pages, with more added to the pile daily.
Taking an individual look at each page is unrealistic. You will need to manage expired content with an automated retirement plan. This plan should be both user and SEO friendly.
As with many SEO topics, there’s no absolute answer. But there are six key considerations that help identify the optimal solution for specific circumstances.
Do the expired content pages:
- Offer valuable information from a user perspective?
- Possess page level ranking signals, such as external links?
- Report significant sessions from earned media, such as organic, referral or social sharing?
- Have live content that could be recommended as a replacement?
- Use crawl budget that can be better employed elsewhere?
- Lead to index bloat in search engines?
Answer these questions to find out if you’re an expired content hoarder, recycler, destroyer, cleaner or juggler.
For clarity, expired content does not include:
- Temporarily out of stock items: Show an out of stock notice and a restock email alert form, and use structured data to inform search engines that the item is temporarily unavailable.
- Seasonal content: Leave the page up all year round, but remove any internal links and XML sitemap record when not in season.
- Promotional landing pages: Handling varies based on frequency, indexation, and URL slug.
- Category pages that change from having content to being empty. Implement a thin content handling strategy.
- News stories where interest is time-limited. Either leave it live or implement a content sunsetting strategy.
- Blog posts with outdated information. Implement a republishing strategy.
- Paginated URLs past the current end of the content. Return a 404 status code as part of a pagination strategy.
Here are five options to manage expired content pages.
1. The Hoarder: Leave On-Site
The simplest approach is to leave expired content pages live, adding an explanatory message and replacing the conversion form with related content recommendations.
Out of all expired content handling options, leaving expired pages on-site maintains the highest number of sessions.
Obviously, expired content is less useful to users once it has passed its expiration date. But in some situations, it can still offer value.
For example, seeing the specifications of previous product models, the salary of past job roles, or details of sold property listings.
The main problem with this option is that expired content will rapidly outnumber the site’s active content.
This bloat causes competition for crawl activity, potentially delaying the indexation of fresh content.
Google’s John Mueller commented, “what would happen is we would just index all of these old pages and we’d have trouble focusing on the important stuff on your website.”
And as search engine spiders repeatedly hit expired content pages, which are by their nature stale and thus lower value, crawl budget may be negatively affected.
What’s more, expired content pages split ranking signals and cause keyword cannibalization – challenging live content for rankings. As a result, your site may rank lower overall as search engines aren’t receiving clear signals.
You can combat keyword cannibalization by removing expired content URLs from the search engine index with:
- unavailable_after Tag (Dropped from the index without a recrawl.)
- ValidThrough schema (Presumed to be dropped from the index without a recrawl.)
- noindex robots directive (Dropped from the index after a recrawl.)
Such deindexation methods allow sites to maintain the power of any backlinks for the domain – although naturally, with such commands the page-level rankings are moot.
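As a rough illustration of the three deindexation signals listed above, here is a minimal Python sketch that builds the markup as strings. The function names are hypothetical, the ISO 8601 date format is an assumption about what you would emit, and the JobPosting object is only a fragment (a real posting needs more properties than shown here).

```python
import json
from datetime import date


def unavailable_after_meta(expires_on: date) -> str:
    """Robots meta tag asking search engines to drop the page after a date."""
    return (
        '<meta name="robots" '
        f'content="unavailable_after: {expires_on.isoformat()}">'
    )


def noindex_meta() -> str:
    """Robots meta tag asking search engines not to index the page."""
    return '<meta name="robots" content="noindex">'


def job_posting_jsonld(title: str, valid_through: date) -> str:
    """Fragment of JobPosting structured data carrying a validThrough date."""
    data = {
        "@context": "https://schema.org",
        "@type": "JobPosting",
        "title": title,
        "validThrough": valid_through.isoformat(),
    }
    return json.dumps(data, indent=2)
```

Whichever signal you choose, the template that renders the expired page would print the returned string into the page head.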
If you choose this option, be sure all expired content pages are excluded from CPC campaigns, internal site search, and dynamic XML sitemaps. You don’t want to be actively promoting content you no longer offer.
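Keeping expired pages out of a dynamic XML sitemap could look something like the sketch below. The `Page` record and its `expires_on` field are assumptions made for illustration; your CMS will have its own way of knowing when a page expired.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Page:
    url: str
    expires_on: Optional[date] = None  # None means the page never expires


def build_sitemap(pages: list[Page], today: date) -> str:
    """Return sitemap XML that lists only pages that have not expired."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for page in pages:
        if page.expires_on is not None and page.expires_on < today:
            continue  # expired: leave it out of the sitemap
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page.url
    return ET.tostring(urlset, encoding="unicode")
```

The same filter could be reused for internal site search and campaign feeds so that all three stay in sync.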
- Crawlability: Utilizes significant crawl budget.
- Indexability: Expired content is not dropped from search engine indexes (unless additional directives discourage indexation) and may continue to garner rankings.
- Ranking signals: Maintained by the expired content page, which can result in keyword cannibalization issues.
- User experience: Visitors may continue to be attracted via earned media and land on their expected destination page.
2. The Recycler: 301 Permanent Redirect
Redirecting your expired content to another page is a common solution, but the devil is in the locational details for SEO success.
Firstly, ensure the redirect destination will stay on-site for the foreseeable future.
Otherwise, redirect chains are created, which add latency for users and take search engines time to follow before ranking signals are forwarded. Plus, after five chained redirects Google will reschedule the crawl, delaying indexation of the destination content.
This effectively rules out horizontal redirects (item A to item B) if item B will also shortly expire and need to be redirected (item B to item C).
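If chains have already crept into your redirect rules, one way to clean them up is to point every source straight at its final destination. The sketch below assumes your redirects can be exported as a simple source-to-target mapping; the function name is hypothetical.

```python
def flatten_redirects(redirects: dict[str, str]) -> dict[str, str]:
    """Map every source URL directly to the end of its redirect chain."""
    flattened = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Keep following the chain until the target is not itself redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flattened[source] = target
    return flattened
```

With item A redirecting to item B and item B redirecting to item C, both A and B end up pointing directly at C, so users and crawlers only ever make one hop.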
Secondly, Mueller commented that any redirect should be to a page which can be seen as “a clear replacement”.
Else it is seen as “essentially just a 404 that you are trying to do with a redirect.”
In which case “what will happen is [Google] will essentially pick that up as a soft 404 page. And say oh, the webmaster probably did this wrong, and we’ll just treat it as a 404 on our side.”
This is crucial to understand.
301 redirecting expired content to an irrelevant page, such as the home page or top level categories, is likely to be treated by Google as a soft 404 and so won’t pass ranking signals to the destination page.
This may also rule out high vertical redirects (item A to category A).
A vertical redirect location that will often meet these two critical criteria, both relevant and lasting, is a filtered subcategory page.
How do you know you have the vertical redirect level correct? If you don’t, you will see an increase in “Submitted URL seems to be a Soft 404” errors in Google Search Console.
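To make the filtered subcategory idea concrete, here is a small sketch that builds such a redirect destination from an expired item's attributes. The category path, filter names, and query string URL pattern are all assumptions; use whatever URL structure your faceted navigation already has.

```python
from urllib.parse import urlencode


def redirect_target(category_path: str, filters: dict[str, str]) -> str:
    """Build a filtered subcategory URL to use as a 301 destination."""
    # Sort the filters so the same attributes always produce the same URL.
    query = urlencode(sorted(filters.items()))
    return f"{category_path}?{query}" if query else category_path
```

For example, `redirect_target("/laptops/", {"brand": "acme", "screen": "13-inch"})` yields a destination that is narrower than the top level category yet will outlive any single product.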
As an additional benefit, specific redirects reduce friction for the user by landing them on a page with a set of highly relevant alternatives.
To further smooth the user experience, you can add a dynamically generated message explaining why they have been redirected.
For example, when item A redirects to filtered subcategory A add a message that item A is out of stock but below are other similar items.
Beware of the redirect message copy. Mueller has mentioned that when Googlebot sees a message that says “no longer available”, it may assume that it applies to the page itself. And again, the 301 redirect will result in a soft 404.
- Crawlability: After redirect is processed, utilizes crawl budget only if a link is found.
- Indexability: Expired content is dropped from the index, but this can take some time.
- Ranking signals: Passed on when redirected to a related page.
- User experience: Visitors come by clicking on an old link and land on an unexpected, but ideally relevant, destination page. Explain the redirect with a short message.
3. The Destroyer: 404 Page Not Found
Amassing thousands of 404 errors can be loathsome for SEO pros fearing some form of penalty.
This is understandable because they show up as errors in Google Search Console, which implies they need to be fixed.
But it’s also indisputably wrong.
Whoever came up with the idea that having 404s gives a site any sort of penalty, you’re wrong. Utterly wrong.
— Gary “鯨理” Illyes (@methode) August 6, 2015
Plus, a 404 or 410 status code is one of three valid options to remove content in the Google for Jobs developer guidelines.
What’s important to understand is that, as the page will be dropped from search engine indexes, any ranking signals the content has accumulated are lost to the 404 void.
@SEO_Portal yes, they are. Generally, with a few exceptions, if a page is not indexed, its links are ignored too.
— Gary “鯨理” Illyes (@methode) August 10, 2015
On the plus side, only a small amount of crawl budget will be spent on expired content URLs.
Googlebot will occasionally recrawl 404 pages to see if they should reindex the URL. Google’s behavior is based on the assumption that webmasters often make status code mistakes.
Let’s also take a moment to look at this from the user’s perspective.
You’re looking for a specific product, job opportunity, event, etc. You have found a link on a website or social media. Clicked it annndddd…
Page not found.
A generic 404 page is likely to cause a mix of disappointment and frustration. The majority of users will immediately bounce.
Determined users may look for navigational help such as on-site search. However, as the content no longer exists, they won’t be able to find exactly what they clicked to land on.
That is a lost conversion and a poor brand experience.
To improve user experience, implement a custom 404 page that shows a “that item is no longer available” message and recommends relevant alternatives.
- Crawlability: Infrequent crawls of expired content URLs.
- Indexability: Fast de-indexation by search engines.
- Ranking signals: Lost to the void.
- User experience: Visitors come by clicking on an old link and land on the 404 page, which can be customized to offer relevant alternatives.
- Reporting: Will show up as errors in GSC, but these don’t need to be fixed.
4. The Cleaner: 410 Page Permanently Gone
Similar to a 404 status code, 410ing a page will lead to the loss of ranking signals and the 410 landing page should be customized.
The difference is that a 410 status code tells crawlers that the page is permanently gone. As such, the expired content is removed from the index more swiftly.
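Framework aside, the choice between the two status codes, plus the customized page, can be sketched as a single function. Everything here is hypothetical, including the `permanently_gone` flag and the bare-bones HTML; in practice this logic would live in your web framework's error handler.

```python
from html import escape
from http import HTTPStatus


def expired_response(
    permanently_gone: bool, alternatives: list[str]
) -> tuple[int, str]:
    """Return a status code and a custom error page body for expired content."""
    status = HTTPStatus.GONE if permanently_gone else HTTPStatus.NOT_FOUND
    # Recommend relevant alternatives instead of showing a dead end.
    links = "".join(
        f'<li><a href="{escape(url)}">{escape(url)}</a></li>'
        for url in alternatives
    )
    body = f"<h1>That item is no longer available</h1><ul>{links}</ul>"
    return status.value, body
```

The key detail is that the helpful page body is served together with the 404 or 410 status code, not with a 200.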
- Crawlability: Infrequent crawls of expired content URLs.
- Indexability: Fastest de-indexation by search engines.
- Ranking signals: Lost to the void.
- User experience: Visitors come by clicking on an old link and land on the 410 page, which can be customized to offer relevant alternatives.
- Reporting: Won’t show up as errors in GSC.
5. The Juggler: Combining Status Codes
If none of the above options ticks all boxes, there is no reason you can’t choose to combine the approaches above.
For example, you could leave the page live for a set period of time to capture long-tail sessions from people who may still be interested in the content, and then change to a 410 when the visits no longer provide more value than the negative impact on crawl budget.
Or, for a classifieds website that allows reactivation of listings within 30 days, you could leave the page live for that period and then implement a 301 redirect once reactivation is no longer possible.
For the fastest possible deindexation with minimal crawl, you could send double the directives by combining an unavailable_after tag and a 410 status code when the content expires.
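The classifieds scenario, for instance, boils down to a date check. This is only a sketch: the function name is made up, and the 30-day default simply mirrors the reactivation window in the example above.

```python
from datetime import date


def status_for_listing(
    expired_on: date, today: date, reactivation_days: int = 30
) -> int:
    """Pick the status code for a listing based on how long ago it expired."""
    days_expired = (today - expired_on).days
    if days_expired <= reactivation_days:
        # Not yet expired, or still within the reactivation window:
        # keep the page live (with an explanatory message once expired).
        return 200
    # Reactivation is no longer possible: redirect to a relevant replacement.
    return 301
```

Swapping the final 301 for a 410 would give you the "leave live, then clean up" variant instead.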
Expired Content Handling Decision Tree
Clear judgment and a little bit of analysis are required to choose how to manage expired content. Get your data together and follow the decision tree to your ideal status code.
A Note on User Experience
There are many arguments that one of the above options offers a better user experience than another. Often, these expert opinions directly contradict one another.
Here are the indisputable basics:
- Expired content is less useful than current content and as such will impact bounce rates, no matter what expired content handling option you implement.
- Clearly communicating that the content wanted by the user is no longer offered is essential to a positive user experience.
- Every expired content handling option can be implemented in such a way as to mitigate user frustration and offer an avenue onwards by suggesting alternate content.
- The only reliable way to know the best user experience for your audience is to A/B test.
So, what’s your plan for expired content handling?
Featured & In-Post Images: Created by author, July 2019
Screenshot taken by author, July 2019