Saturday, December 08, 2012

Things Google Should Invent: don't index blog template sidebars on individual post pages.

Newer blog templates have expandable tree archive links that contain the title of every post.  (You can see an example in the right-hand column of this blog, under the Twitter feed.)

The problem is that Google indexes these archive links like they're just regular words on the page, with the result that if you google within someone's blog for a word or phrase that happens to appear in a post title, Google will return every single page from that blog.

For example, I once wrote a blog post with the title "Denervousization", which isn't a real word at all; I just made it up for the post.  However, if you google the word "denervousization", you get the post entitled "Denervousization", you get this post (if you're reading it after it's been indexed), and you get a number of other posts that turn up simply because they have the link to the "Denervousization" post in their archive links.

Not that many extra posts are turning up for this particular search because I switched to a template with expandable tree archive links very recently, but as time goes by they'll all get reindexed and eventually every single post will turn up if you search for just the one post with a distinctive title.  Not terribly useful, is it?
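To make the effect concrete, here is a minimal sketch in Python.  It is a toy model, not how Google actually indexes pages: the function names are made up for illustration, and the post titles other than "Denervousization" and this post's own are hypothetical.  It just shows what happens when the sidebar text is treated as part of every page.

```python
# Toy model: every page carries the same sidebar listing all post titles.
# "Snow day" is a hypothetical title used only for illustration.
POST_TITLES = ["Denervousization", "Things Google Should Invent", "Snow day"]


def build_pages(titles):
    """Return {title: {"body": ..., "sidebar": ...}} for a made-up blog."""
    sidebar = " ".join(titles)  # archive links repeat every title on every page
    return {t: {"body": f"{t} post text", "sidebar": sidebar} for t in titles}


def search(pages, word, include_sidebar):
    """Return the titles of pages whose indexed text contains the word."""
    word = word.lower()
    hits = []
    for title, page in pages.items():
        text = page["body"]
        if include_sidebar:
            text += " " + page["sidebar"]
        if word in text.lower().split():
            hits.append(title)
    return hits


pages = build_pages(POST_TITLES)
# Sidebar indexed: every page on the blog matches.
print(search(pages, "denervousization", include_sidebar=True))
# Sidebar ignored: only the one post that is actually about the word matches.
print(search(pages, "denervousization", include_sidebar=False))
```

With the sidebar included, the search returns every page on the toy blog; with it left out, only the one post comes back, which is the behaviour I'm asking for.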

Since Blogger and Blogspot belong to Google, they should be able to work something out between the two of them to produce more effective search results.


laura k said...

Hmm, I never thought of that. Recently, instead of going back in my own blog's archives to find individual posts, I've been Googling wmtc + whatever key words I can think of, and I've been impressed with how easy it is to find what I'm looking for.

impudent strumpet said...

It totally does find what you're looking for, it's just that there's a lot of noise besides. It also skews hit counts for people who use Google results as quick and dirty linguistics research. There are currently 9 Google hits for "denervousization", when it has only been used once organically and once for linguistics purposes. Once Google gets my whole blog indexed under the new template, there will be over 4,000 hits, when really the searcher only needs the one, and that number will not reflect how often the word is actually used.

laura k said...

Ah-ha. Linguistics. Ah-ha.