Google Algorithm Changes in February - Part 2
Welcome back everyone. In this post I'll be going through changes 11-20. If you haven't caught this series of posts from the beginning, I'm working my way through the 40 changes that Google have made to their algorithms, indexing and search results in February, and you can find the first 10 changes discussed here.
11. Improved detection for SafeSearch in Image Search. [launch codename "Michandro", project codename "SafeSearch"] This change improves our signals for detecting adult content in Image Search, aligning the signals more closely with the signals we use for our other search results.
Another change, like number 9, designed to maintain (or, depending on your view, repair somewhat) Google's "don't be evil" reputation. I wonder whether the image recognition software behind Google "Search by Image" (which lets you drag and drop an image into the search box to find out what it is and to identify similar images) is also used to filter out unsavoury content in Google Images. If so, aligning the signals between normal search and image search could be a matter of broadening the range of images considered "adult" so that they match the list of words Google uses to filter text content.
12. Interval based history tracking for indexing. [project codename "Intervals"] This improvement changes the signals we use in document tracking algorithms.
Google keeps track of how documents change over time for a number of reasons. If the topic of a document changes significantly, it can indicate that the host domain has changed ownership - potentially because an SEO has bought an old or credible domain in order to capitalise on its link profile, something Google is likely to want to put a stop to.
If an old document has a new link added it can be a sign of potential link buying by someone trawling old content for easy contextual link opportunities. Conversely if an old document has a link removed it can be a sign that the page that was previously linked to has lost quality or relevancy.
13. Improvements to foreign language synonyms. [launch codename "floating context synonyms", project codename "Synonyms"] This change applies an improvement we previously launched for English to all other languages. The net impact is that you'll more often find relevant pages that include synonyms for your query terms.
I don't know when the previously launched change happened, but optimising page content in English has long been a case of utilising multiple keyword variants and synonyms. Doing this not only means you optimise your pages for multiple target keywords (as many as 20 keywords could be actively optimised for on a typical webpage, providing there is sufficient content - not to mention the long tail variations you end up ranking for by accident), but also means your rankings for each keyword will be individually stronger than if you dedicated an entire page to it, by virtue of the semantic variation. This change underscores how search engines understand the relationships between words and their synonyms.
14. Disabling two old fresh query classifiers. [launch codename "Mango", project codename "Freshness"] As search evolves and new signals and classifiers are applied to rank search results, sometimes old algorithms get outdated. This improvement disables two old classifiers related to query freshness.
A typical "classifier" is what Google calls an algorithm designed to apply additional metadata to a document in its index. Google's infamous Panda update, for example, came off the back of an updated classifier designed to describe whether a document is poor quality. Using these classifiers allows the Google front end to function faster: documents are pre-categorised in a number of ways, negating the need to run the algorithms over and over again whenever someone does a search. I always explain how Google works by asking people to imagine the index at the back of a book, and in this comparison classifiers function much like the alphabetical categorisation in such an index, allowing you to jump straight to the sub-section of pages you're interested in.
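To make the index-time/query-time split concrete, here's a minimal sketch of the idea in Python. Everything in it is hypothetical (the names, the crude "thin content" heuristic) - the point is only that the classifier runs once when a document is indexed, and searches then filter on the stored label rather than re-running any classification.

```python
def classify_quality(text):
    """Crude stand-in for a quality classifier: flags very thin content."""
    return "low" if len(text.split()) < 5 else "ok"

index = {}   # term -> list of doc ids (the "back of the book" index)
labels = {}  # doc id -> label precomputed at index time

def add_document(doc_id, text):
    # The classifier runs once, here, at indexing time.
    labels[doc_id] = classify_quality(text)
    for term in set(text.lower().split()):
        index.setdefault(term, []).append(doc_id)

def search(term):
    # Query time only looks up stored labels - no classification work here.
    return [d for d in index.get(term.lower(), []) if labels[d] == "ok"]

add_document(1, "a long and genuinely useful article about seo techniques")
add_document(2, "seo spam")
print(search("seo"))  # [1] - the thin page is filtered by its stored label
```

However many searches arrive, the classifier cost is paid once per document, not once per query - which is the speed advantage the paragraph above describes.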
This change seems to be related to the QDF or "query deserves freshness" algorithm, which Google has used since 2006 to determine whether a user's search term indicates they are looking for fresh content. If so, Google will then show fresh content in preference to the older, more authoritative content that generally tends to rank well.
The classifier here applies to a search query instead of a document (let's face it, classifying whether a document is fresh or not isn't terribly complex), but the principles of the classifier are the same - categorising things in advance to speed up the front end.
As well as actual search demand, the QDF algorithm is known to factor in things like how much news coverage a search term has and how often it appears in recent blog posts. I'd guess that over time social media has been added as a more important source, and that the two disabled classifiers are things like blog coverage, which is increasingly irrelevant considering the near total domination of Twitter and co. for trending news.
15. More organized search results for Google Korea. [launch codename "smoothieking", project codename "Sokoban4"] This significant improvement to search in Korea better organizes the search results into sections for news, blogs and homepages.
Although I do find it interesting that "homepages" is its own category, I'm certainly not well placed to comment on this one as my Korean is basic at best, so I'll move swiftly on.
16. Fresher images. [launch codename "tumeric"] We've adjusted our signals for surfacing fresh images. Now we can more often surface fresh images when they appear on the web.
More "query deserves freshness", this time for images. This one is possibly a variation on, or essentially the same thing as, change 14 above. Again, I'm not convinced there is any way other than the obvious one to determine whether an image is fresh, so this probably applies to the search query itself and Google's ability to determine whether a user is looking for newer images or not.
What I've started noticing today is that in image search results for news-related terms Google has started labelling recently discovered images, something that seems likely to be related to this update. For example, in a search for "santorum" (currently ever present in the news, for wholesome and not so wholesome reasons) a number of the images are labelled in the bottom right as "3 days ago", "12 hours ago" and so on.
17. Update to the Google bar. [project codename "Kennedy"] We continue to iterate in our efforts to deliver a beautifully simple experience across Google products, and as part of that this month we made further adjustments to the Google bar. The biggest change is that we've replaced the drop-down Google menu in the November redesign with a consistent and expanded set of links running across the top of the page.
For reasons that sometimes elude marketers, the toolbar is very important to Google, so if these detailed updates continue I'd expect to see it regularly fine-tuned for usability and attractiveness to users. How does Google judge my page speed? The toolbar. How does Google know how much time users spend on my sites? The toolbar. How does Google know which links users click, to and within my site? The toolbar. A lot of this information is available to Google through other sources in theory, such as Google Analytics, but Google disavows any use of that data, something it doesn't do with the toolbar.
18. Adding three new languages to classifier related to error pages. [launch codename "PNI", project codename "Soft404"] We have signals designed to detect crypto 404 pages (also known as "soft 404s"), pages that return valid text to a browser but the text only contain error messages, such as "Page not found." It's rare that a user will be looking for such a page, so it's important we be able to detect them. This change extends a particular classifier to Portuguese, Dutch and Italian.
A "soft 404" is a "page not found" error page that has been set up wrongly so it actually returns a 200 header response instead of a 404 response. While the page may be designed in such a way that users understand the page is no longer there, the 200 response tells browsers and search engines that the page was found just fine. That's why Google needs to be able to interpret the content on the page correctly so that they can discard such pages from the index, or more likely put them in the supplemental index (where they won't have any chance of ranking).
Usually this occurs when developers set up their error handling to redirect requests for non-existent pages to a custom error page. Beyond worrying about the effect this can have on the cleanliness of Google's search results, soft 404s are a fairly common SEO problem for businesses too. You should be monitoring and fixing internal broken links, and reclaiming the link value of external links that point to error pages by redirecting those pages elsewhere. It's much harder to do this when your error pages aren't being reported, because they are returning a 200 response.
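The detection Google describes can be sketched very simply: since the status header says 200, only the page text gives the error away. Here's a hypothetical illustration in Python - the phrase lists are illustrative stand-ins, not Google's actual signals, though the three languages mirror the ones named in the update.

```python
# Illustrative error phrases per language (hypothetical, not Google's list).
ERROR_PHRASES = {
    "en": ["page not found", "does not exist"],
    "pt": ["página não encontrada"],  # Portuguese, added in this update
    "nl": ["pagina niet gevonden"],   # Dutch, added in this update
    "it": ["pagina non trovata"],     # Italian, added in this update
}

def is_soft_404(status_code, body, language="en"):
    """A soft 404: the header claims success, but the text is an error message."""
    if status_code != 200:
        return False  # a real 404 already declares itself in the header
    text = body.lower()
    return any(phrase in text for phrase in ERROR_PHRASES.get(language, []))

print(is_soft_404(200, "Sorry, page not found."))       # True: a soft 404
print(is_soft_404(404, "Sorry, page not found."))       # False: honest 404
print(is_soft_404(200, "Página não encontrada", "pt"))  # True
```

The same check is useful from the site-owner side: request a deliberately non-existent URL on your own domain, and if the response comes back 200 with error text in the body, your error handling needs fixing.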
19. Improvements to travel-related searches. [launch codename "nesehorn"] We've made improvements to triggering for a variety of flight-related search queries. These changes improve the user experience for our Flight Search feature with users getting more accurate flight results.
Bad news for travel sites and flight aggregators everywhere, as Google continues to persevere with and refine the incorporation of flight details directly from the airlines into the SERPs. If you haven't seen this before, it's a "onebox" result that sits directly below the paid search ads.
I think aggregators can take comfort in the fact that so far Google's implementation of this isn't very useful for users. Although I'm not aware of any studies on the attention or click through rate that this result type receives from users, without prices or a call to action I'd guess it was fairly minimal.
Good news, of course, for airlines themselves, who at least gain an alternative route to natural search visibility. Not that you can control or influence it, as the flight details are taken from a central database of flights that Google buys or borrows, but - never look a gift horse in the mouth.
20. Data refresh for related searches signal. [launch codename "Chicago", project codename "Related Search"] One of the many signals we look at to generate the "Searches related to" section is the queries users type in succession. If users very often search for [apple] right after [banana], that's a sign the two might be related. This update refreshes the model we use to generate these refinements, leading to more relevant queries to try.
See also change 1, "more coverage for related searches". Not only have Google added more signals, they've also refined the original (and core) signal - search behaviour. This reminds me of a little trick that can be used to potentially get your brand appearing as a suggestion in the related searches by encouraging users to search for your brand in conjunction with a generic keyword - something like "seo agency greenlight". First, make sure you rank in position 1 for the phrase in question. Next, link to the search results for that phrase from a source that will send lots of traffic (thus achieving the goal of having many people "searching" for the phrase) but where you don't care about squandering the SEO value of said links on a Google SERP. Twitter is perfect for this.
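For the linking step of that trick, you need the URL of the Google results page for your chosen phrase. A minimal sketch, using the standard `q` query parameter (the phrase here is just the article's own example):

```python
from urllib.parse import urlencode

def serp_url(phrase):
    """Build a link to the Google results page for a phrase."""
    return "https://www.google.com/search?" + urlencode({"q": phrase})

url = serp_url("seo agency greenlight")
print(url)  # https://www.google.com/search?q=seo+agency+greenlight
```

That URL is what you'd share from a high-traffic, low-SEO-value source like Twitter, so that each click registers as a search for the phrase.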
Note that for all we know the refinement in question might well be designed to remove SERPs that have been linked to from the equation when using search demand to calculate related searches.