Wordpress blog search pages indexed in Google

Forum Moderators: Robert Charlton & goodroi

deathbytabasco

6:51 am on Feb 23, 2008 (gmt 0)

I have noticed that some of my WordPress blogs have loads of pages indexed in Google like the following:

www.example.com/?s=keyword&submit=Search .

Often the keywords in question are just random words that can be found in the text of the site, or even misspelled words. Could this be a competitor somehow trying to get my sites in trouble? The fact that the words are often so obscure leads me to believe this. There seem to be no links pointing to these strange search strings, so I don't understand how they got into the index.

[edited by: Robert_Charlton at 8:12 am (utc) on Feb. 23, 2008]
[edit reason] Used example.com. It can never be owned. [/edit]

tedster

3:12 pm on Feb 23, 2008 (gmt 0)

Here's a similar discussion, started back in October but still active:

Google indexing large volumes of (unlinked?) dynamic pages [webmasterworld.com]

That thread pretty much looks at these two possibilities:

- Someone linking to these pages for whatever reason
- Inexplicable spidering behaviour by Google.

(quoted from Receptional_Andy)

I've seen some examples of this behavior too, although not in large volumes. It certainly can be a competitor, but now I'm beginning to lean toward experimental spidering behavior from Google. Could also be some malicious automated behavior from an unknown person, a competitor or just an experimenter.

Have you considered blocking the search results pages with robots.txt? At some point it doesn't matter HOW Googlebot is getting these URLs. If they resolve, then they can cause problems, so fixing the issue becomes the priority.

FromRocky

4:11 pm on Feb 23, 2008 (gmt 0)

This will give you more trouble if the indexed pages have little or no content other than AdSense ads. If that is the case, use robots.txt to block them as soon as possible.

tedster

4:15 pm on Feb 23, 2008 (gmt 0)

Also, welcome to the forums, Mr. Tabasco.

We've got a section here called the Hot Topics area [webmasterworld.com], which is always pinned to the top of this forum's index page. In there is a post about WordPress that you may also find helpful:

WordPress And Google: Avoiding Duplicate Content Issues [webmasterworld.com]

deathbytabasco

9:42 pm on Feb 23, 2008 (gmt 0)

Thanks for that, Tedster. Yes, I seem to have the same problem mentioned in that thread.

Damn, you try to make your site user friendly with a search box and it comes back to bite you!

As for getting these pages removed and stopping future pages of this nature from finding their way into Google, would

Disallow: /?s=*

be the correct line to add to robots.txt? And how long would I be looking at for those pages to disappear?
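
For what it's worth, here is a rough Python sketch of how wildcard-aware crawlers such as Googlebot are documented to match a Disallow path: the rule is compared against the URL path plus query string, starting from the first character, with * standing for any run of characters and a trailing $ anchoring the end. The helper names rule_to_regex and is_blocked are made up for illustration, and the sketch ignores Allow lines and user-agent groups, so treat it as a way to sanity-check a rule rather than as what any crawler actually runs.

```python
import re
from typing import Iterable


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate one robots.txt path rule into a regex (hypothetical helper)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then rejoin the pieces with ".*" so that
    # every "*" in the rule matches any run of characters.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(path_and_query: str, disallow_rules: Iterable[str]) -> bool:
    """Return True if any Disallow rule matches from the start of the path.

    Assumption: Allow rules and user-agent groups are ignored here.
    An empty Disallow value blocks nothing, so it is skipped.
    """
    return any(
        rule_to_regex(rule).match(path_and_query)  # match() anchors at the start
        for rule in disallow_rules
        if rule
    )


if __name__ == "__main__":
    # Compare the rule from the post with the same rule minus the trailing "*".
    for rule in ("/?s=*", "/?s="):
        print(
            rule,
            is_blocked("/?s=keyword&submit=Search", [rule]),
            is_blocked("/?p=123", [rule]),
        )
```

Under these assumptions both forms of the rule block the search URL and leave an ordinary URL such as /?p=123 alone, which suggests the trailing * is harmless but redundant, since matching is already by prefix. Note that the rule only covers search URLs at the site root; a blog installed in a subdirectory would need the directory in the path.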