Earlier this week President Donald Trump said Google was rigged:
Google search results for "Trump News" shows only the viewing/reporting of Fake News Media. In other words, they have it RIGGED, for me & others, so that almost all stories & news is BAD. Fake CNN is prominent. Republican/Conservative & Fair Media is shut out. Illegal? 96% of….
- Donald J. Trump (@realDonaldTrump) August 28, 2018
Then, Trump's economic adviser said they'd take a look at whether Google search results should be regulated.
That was a big no-go for the SEO community. You can read more background on this at Search Engine Journal, along with a good explanation from SEO veteran Bill Hartzer of why Google's search results are shown the way they are.
Barry Adams put together a solid background article on the neutrality of search results: at the end of the day, humans write the code that drives Google.
In a nutshell, what does it take to rank well in Google?
You need to have:
- Authority and trust
- Great content
- Solid technical foundation
Let's apply that to the Whitehouse.gov website!
Does the Whitehouse.gov website have authority and trust? Yes, they have that locked down.
Does it have great, SEO-optimized content? Nope.
Does it have a solid technical foundation? Absolutely not.
Mr. President, if your SEO sucks you're getting outranked by other websites. And…I hate to say it, but your SEO sucks. In this article we'll explain the most important pain points.
Whitehouse.gov's SEO at first glance
We always like to use the Graphs view to get a feel for a site's SEO, so we'll do that here too.
Break-down by URL type
Here's a break-down by the URL types on whitehouse.gov:
In total we found 64,601 URLs, of which only 1/3 are pages. That's a bad way to spend your crawl budget. The ~39,000 404 pages are especially bad. SAD!
Distribution of relevance
Let's drill down to just pages, and zoom in on the distribution of those relevance scores (relevance score is a PageRank-like metric that indicates the importance of pages within the website):
You can see that the majority of the pages have a very low relevance score, which in this case comes down to a poor internal link structure.
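To make "PageRank-like" a bit more concrete, here's a minimal sketch of a simplified PageRank over an internal link graph. This is not ContentKing's actual relevance formula, which isn't described in this article. The `links` graph, the damping factor of 0.85 and the iteration count are all assumptions for illustration.

```python
def internal_pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over an internal link graph.

    links maps each page URL to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the same small base score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly over the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its score evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical mini-site: every page links to the homepage, but the deep
# article only receives a single internal link.
example_links = {
    "/": ["/news/", "/briefings/"],
    "/news/": ["/", "/news/article-1/"],
    "/briefings/": ["/"],
    "/news/article-1/": ["/"],
}
scores = internal_pagerank(example_links)
```

In this toy graph the homepage ends up with the highest score and the deep article with the lowest, which is the pattern a poor internal link structure produces at scale.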
Indexability of pages
We know that 2/3 of the URLs found are non-pages. Let's zoom in on the indexability status of the remaining 21,911 pages: how well is crawl budget used there?
We see that only ~22% of the pages are indexable. Nearly 80% of the pages are canonicalized to other pages. Again, a bad way to spend your crawl budget.
Orphaned pages
And what about orphaned pages?
We see that there are over 750 orphaned pages (pages that aren't linked to from any other pages).
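Finding orphaned pages comes down to comparing the pages you know exist with the pages that receive at least one internal link. Here's a minimal sketch, assuming you already have a list of known pages (for example from a sitemap) and a crawl of internal links. The function name and the sample URLs are hypothetical.

```python
def find_orphaned_pages(known_pages, links):
    """Return pages that no other page links to.

    known_pages: all page URLs known for the site (for example from a sitemap).
    links: maps each page URL to the list of pages it links to.
    """
    linked_to = set()
    for source, targets in links.items():
        for target in targets:
            if target != source:  # a self-link doesn't count as an incoming link
                linked_to.add(target)
    return sorted(set(known_pages) - linked_to)


# Hypothetical data: "/old-campaign/" exists but nothing links to it.
known = ["/", "/news/", "/old-campaign/"]
links = {"/": ["/news/"], "/news/": ["/"]}
orphans = find_orphaned_pages(known, links)  # ['/old-campaign/']
```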
Meta information
OK, now that we've gotten a feel for the state the platform is in, let's take a look at the meta information (title and meta description).
The issue that stands out like a sore thumb: 21% of the pages don't have a unique title. SAD!
Example:
What page do you think ranks for the query "Daily Press Briefing by Press Secretary Sean Spicer"?
For us this page from May 2017:
The issues below have less of an impact, but they're worth mentioning in this context, as they say a lot about the care that goes into meta information:
- 20% of the pages have a meta description that's either too long or too short.
- 14% of the pages have a title that's either too long or too short.
- 4% of the pages don't have a unique meta description.
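The uniqueness checks above (titles and meta descriptions) boil down to grouping pages by value and keeping the groups that contain more than one page. Here's a rough sketch of that idea. The function name and the two briefing URLs are hypothetical, and the whitespace and case normalization is our own assumption about what should count as "the same" title.

```python
from collections import defaultdict


def find_duplicate_titles(titles_by_url):
    """Group URLs by title and keep only titles used on more than one URL."""
    urls_by_title = defaultdict(list)
    for url, title in titles_by_url.items():
        # Normalize whitespace and case so trivial differences don't hide duplicates.
        key = " ".join(title.split()).lower()
        urls_by_title[key].append(url)
    return {title: urls for title, urls in urls_by_title.items() if len(urls) > 1}


# Hypothetical crawl data: two different briefings sharing one title.
titles = {
    "/briefing-a/": "Daily Press Briefing by Press Secretary Sean Spicer",
    "/briefing-b/": "Daily Press Briefing by  Press Secretary Sean Spicer",
    "/news/": "News",
}
duplicates = find_duplicate_titles(titles)
```

The same function works for meta descriptions or H1 headings: pass in a mapping of URL to whichever value you want to check.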
Headings
After we saw what went wrong regarding the meta information, we already had a bad feeling about what we'd find regarding heading issues.
It turns out that 21% of the pages don't have a unique H1 heading. There's massive overlap between the pages suffering from this issue and the pages that don't have unique titles. They're missing out on two key elements for communicating what their content is about. The way a lot of the body content on these pages is structured is also highly similar.
Broken and redirected links
When it comes to links, there's some housekeeping to do as well:
There are 215 broken links and 77 redirecting links. Some of these redirected links are chained redirects.
Here's an example:
There's a link from https://www.whitehouse.gov/briefings-statements/first-lady-melania-trump-announces-2018-white-house-easter-egg-roll/ to www.whitehouse.gov/easter-egg-roll (note: without protocol and without trailing slash). The page that's linked 301-redirects to https://www.whitehouse.gov/eastereggroll which in turn redirects to https://www.whitehouse.gov/eastereggroll/.
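A chained redirect like the one above is easy to spot once you have a map of which URL redirects where. Here's a minimal sketch that follows such a map without making any network requests. The function name and `max_hops` are hypothetical, and we assume the protocol-less link resolves to its https:// version.

```python
def redirect_chain(start_url, redirects, max_hops=10):
    """Follow redirects from start_url and return every URL visited, in order.

    redirects maps a URL to the URL it redirects to (for example from crawl data).
    """
    chain = [start_url]
    while chain[-1] in redirects and len(chain) <= max_hops:
        next_url = redirects[chain[-1]]
        if next_url in chain:  # redirect loop: record it and stop
            chain.append(next_url)
            break
        chain.append(next_url)
    return chain


# The redirects from the example above (assumption: the link resolves to HTTPS).
redirects = {
    "https://www.whitehouse.gov/easter-egg-roll": "https://www.whitehouse.gov/eastereggroll",
    "https://www.whitehouse.gov/eastereggroll": "https://www.whitehouse.gov/eastereggroll/",
}
chain = redirect_chain("https://www.whitehouse.gov/easter-egg-roll", redirects)
hops = len(chain) - 1  # 2 hops: anything above 1 is a chained redirect
```

The fix is to link straight to the last URL in the chain, so neither visitors nor crawlers have to go through the intermediate hops.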
On a regular basis, links break and aren't fixed. How do we know this? Because of ContentKing's handy alert feature:
URL parameters: filters and pagination
You can use filters to search for news articles and briefings statements around specific issues such as education and healthcare. To filter the content on these issues, the URL parameter ?issue_filter=$issue is used. Example: https://www.whitehouse.gov/news/?issue_filter=education.
These filtered pages are canonicalized to https://www.whitehouse.gov/news/, but a quick search in Google shows that hundreds of these pages are indexed. They may want to add a robots noindex tag to these if they don't carry any value. Or they can set up URL parameters in Google Search Console / Bing Webmaster Tools to communicate how to deal with these types of pages.
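To audit which crawled URLs are filtered variants, and what they should canonicalize to, you could strip the filter parameter and compare. Here's a rough sketch using Python's standard `urllib.parse` module. The function names are hypothetical, and treating `issue_filter` as the only parameter to ignore is an assumption based on the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_parameters(url, ignored=("issue_filter",)):
    """Return url without the query parameters listed in ignored."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def has_filter_parameter(url, ignored=("issue_filter",)):
    """True if the URL carries at least one of the ignored parameters."""
    query = urlsplit(url).query
    return any(key in ignored for key, _ in parse_qsl(query, keep_blank_values=True))


filtered = "https://www.whitehouse.gov/news/?issue_filter=education"
expected_canonical = strip_parameters(filtered)
```

Any indexed URL for which `has_filter_parameter` returns `True` is a candidate for a noindex tag or for parameter handling in the webmaster tools.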
There's something strange going on with pagination as well. There are thousands of pages like https://www.whitehouse.gov/articles/page/10/?page=10. These pages seem to be suffering from an issue that causes pagination to be applied twice.
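Spotting these doubled-up pagination URLs in a crawl export is straightforward: look for a page number in both the path and the query string. Here's a minimal sketch. The function names are hypothetical, and the `/page/N/` path pattern is an assumption based on the example URL.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches a trailing "/page/<number>/" segment in the URL path.
PATH_PAGE = re.compile(r"/page/(\d+)/?$")


def pagination_sources(url):
    """Return the page numbers found in the path and in the ?page= parameter."""
    parts = urlsplit(url)
    match = PATH_PAGE.search(parts.path)
    path_page = int(match.group(1)) if match else None
    query_values = parse_qs(parts.query).get("page", [])
    if query_values and query_values[0].isdigit():
        query_page = int(query_values[0])
    else:
        query_page = None
    return path_page, query_page


def is_double_paginated(url):
    """True if pagination appears in both the path and the query string."""
    path_page, query_page = pagination_sources(url)
    return path_page is not None and query_page is not None


doubled = is_double_paginated("https://www.whitehouse.gov/articles/page/10/?page=10")
```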
What else did we find?
We did some poking around to size up Whitehouse.gov's SEO performance. We also found a bunch of other interesting things:
After publishing pages, they're often updated a few days later with better meta descriptions, Open Graph/Twitter Card descriptions, as well as Open Graph/Twitter Card images:
Around April 20/21 meta descriptions and Open Graph/Twitter Card descriptions across the board were filled in (likely grabbed from the first paragraph), shortened and automatically truncated:
Mid-January, the page title template was changed from $pageName | The White House to just $pageName:
On December 15th, 2017 there was a massive pre-Christmas update to the site. We recorded changes ranging from the Google Analytics ID to titles, headings and Open Graph and Twitter Card data:
Wrapping it up
SEO is about making it easy for search engine crawlers to go through your website and find all the content they need to find. When they find it, they should be able to understand what it's about.
Whitehouse.gov does not do either of these things, so Mr. President, I'd fix this first before doing anything else. Put Whitehouse.gov back in the race so it can compete with other websites. Then you'll have a chance at ranking for queries other than branded queries, or names of former presidents.
At ContentKing, we believe everyone should be able to optimize their sites. We'd be willing to chip in and give you a free ContentKing account for the duration of your term as President of the United States.