
Must we still make site search query strings non-indexable?



2:31 pm on Dec 11, 2012 (gmt 0)

5+ Year Member

My main website is completely custom designed, by me, and since I started it before CMSs such as WordPress became popular, I never made the transition, partly to avoid the headache of redirects and so on, and partly because I like to have full control.

I have built a basic search facility so people can search through articles and reviews, the usual. Are there any repercussions, in terms of SEO rankings, to having query strings such as /?page=xxx&query=xxx?

I see that a lot of sites, especially WordPress ones, have a search box, so am I just being a bit paranoid? Do I need to make the search result pages non-indexable? Would Googlebot submit and index lots of queries, which might end up having a negative effect? Or is this a myth from ages ago?


5:00 pm on Dec 11, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

You NEED to use robots.txt or noindex to prevent Google from crawling or indexing those pages. Google has been known to delist entire sites that had search result pages indexed.

On the Google support forums, there was a thread that Matt Cutts participated in. The site owner was asking why he had been penalized. The site had a box on the home page that looked like a search box (although it was meant for entering numbers). Matt Cutts gave examples of putting #*$! terms into the box and showed that the site returned indexable pages full of products. It didn't matter that those pages had nothing to do with the #*$! terms; the product list was just a default list, and you got the same page back for any term that wasn't the kind of number the script behind it was expecting.

You don't want to be in the situation where a Google reviewer types "Viagra" into your search box, finds an indexable page, and imposes a penalty on your entire site.


5:59 pm on Dec 11, 2012 (gmt 0)

5+ Year Member

OK, so for the results page it's best to use noindex in the meta tags and Disallow in robots.txt, then?

Otherwise, if the results can't be indexed, there should be no other issues, right?


6:23 pm on Dec 11, 2012 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

Either robots.txt OR noindex should be sufficient.

It doesn't make sense to use both. If you block a page in robots.txt, there is no way for Googlebot to crawl it and discover that it is noindex. There does not appear to be a way to tell Google that you don't want something crawled and also don't want it indexed. If they can't crawl something, they generally don't index it (so robots.txt is sufficient in that case), but if a page blocked by robots.txt gets enough external links, they may index it based on the anchor text and context of those links alone.
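To make the two options concrete, here is a sketch of each, assuming the search URLs follow the /?page=xxx&query=xxx pattern mentioned above (the "query" parameter name is taken from that example; adjust it to match your actual script):

```
# Option 1: robots.txt — block crawling of any URL containing a "query" parameter.
# Googlebot supports the * wildcard in Disallow rules.
User-agent: *
Disallow: /*query=
```

```
<!-- Option 2: meta robots tag — placed in the <head> of each search results page -->
<meta name="robots" content="noindex">
```

Pick one or the other, not both: the noindex tag only works if Googlebot is allowed to crawl the page and see it.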
