Forum Moderators: Robert Charlton & goodroi

How does Google index on-site search result pages?

         

drogbasen

8:33 am on Nov 23, 2010 (gmt 0)

10+ Year Member



Hi to all,

Recently, when I checked my site's Google index results, I found that many on-site search pages (such as www.example.com/search?q=*****) are indexed. I did not build any links to these pages, and there are no internal links to them.
So how does Google crawl and index these on-site search result pages? Does the Google Analytics code help Google's robots crawl these pages? Thanks for your feedback!

almighty monkey

10:35 am on Nov 23, 2010 (gmt 0)

10+ Year Member



Does the Google Analytics code help Google's robots crawl these pages?


Bingo!

scooterdude

10:42 am on Nov 23, 2010 (gmt 0)

10+ Year Member



Actually, my analytics package suggests that Googlebot or some other Google automaton tests search tools by feeding the search box random or related search terms. That automaton also seems capable of pressing the search button.
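For anyone wanting to look for this kind of activity in raw server logs rather than an analytics package, here is a minimal sketch. It assumes the server writes the common "combined" access log format; the sample lines, IP addresses, and the function name `googlebot_hits` are made up for illustration. Note that the user-agent string can be spoofed, so a match only suggests, and does not prove, that the request came from Google.

```python
import re

# Combined Log Format (an assumption about the server's log layout):
# ip - - [time] "METHOD path PROTOCOL" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_hits(lines):
    """Return (time, path) for each log line whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits.append((match.group("time"), match.group("path")))
    return hits


# Hypothetical sample lines, not taken from any real log.
SAMPLE = [
    '66.249.66.1 - - [23/Nov/2010:08:33:00 +0000] "GET /search?q=shoes HTTP/1.1" '
    '200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [23/Nov/2010:08:34:10 +0000] "GET /index.html HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]

for time, path in googlebot_hits(SAMPLE):
    print(time, path)
```

Filtering the results for paths that start with /search would show which search terms the crawler is requesting.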

Sgt_Kickaxe

11:04 am on Nov 23, 2010 (gmt 0)



If you type in an adult search term and press enter, does that adult word end up in the search result page title? In the result page URL? Does your site thoughtfully write "Here are your search results for adult word"? If the answer to any of those is yes, you'll want to add a noindex meta tag to the search result page as well as block it via robots.txt. Alternatively, you'll want to set up a badwords filter to send those searches to a 404 error page. User-generated pages/content must always be closely monitored.
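A rough sketch of the badwords-filter idea, combined with the noindex tag. The blocklist contents, the `q` parameter name, and the function `search_response` are all assumptions for illustration; a real site would wire this into its own search handler.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical blocklist; a real site would maintain its own terms.
BLOCKED_TERMS = {"badword1", "badword2"}

# Standard robots meta tag asking search engines not to index the page.
NOINDEX_META = '<meta name="robots" content="noindex">'


def search_response(url):
    """Return (status_code, extra_head_html) for an on-site search URL.

    Searches containing a blocked term get a 404; every other search
    result page is served with a noindex meta tag in its <head>.
    """
    query = parse_qs(urlsplit(url).query).get("q", [""])[0]
    words = set(query.lower().split())
    if words & BLOCKED_TERMS:
        return 404, ""
    return 200, NOINDEX_META


print(search_response("http://www.example.com/search?q=red+shoes"))
print(search_response("http://www.example.com/search?q=badword1+shoes"))
```

One caveat worth checking before combining both measures: a crawler that is blocked by robots.txt never fetches the page, so it cannot see a noindex tag on it.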

drogbasen

11:40 am on Nov 23, 2010 (gmt 0)

10+ Year Member



@Sgt_Kickaxe Thanks for your advice. I hadn't realized this was an issue. But it's unavoidable that a site shows some adult content for adult word searches. Unless somebody attacks a site, it wouldn't hurt it too much, I think.

drogbasen

11:48 am on Nov 23, 2010 (gmt 0)

10+ Year Member



@scooterdude How can I check whether Googlebot is visiting a website?

FranticFish

12:58 pm on Nov 23, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Google crawls via forms and has done so for some time. Here's their official announcement: [googlewebmastercentral.blogspot.com...]

Note that back then they said
(a) they only crawl GET forms, and
(b) pages indexed in this way are not part of the site's normal indexing allocation
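Point (a) matters because submitting a GET form just produces an ordinary URL with the field values in the query string, which any crawler can request directly. A small sketch of that mapping, using the example URL from the first post (the function name `get_form_url` is made up for illustration):

```python
from urllib.parse import urlencode


def get_form_url(action, fields):
    """Build the URL a browser would request when a GET form is submitted."""
    return f"{action}?{urlencode(fields)}"


# A search box named "q" submitted with the text "red shoes".
print(get_form_url("http://www.example.com/search", {"q": "red shoes"}))
```

A POST form, by contrast, sends its fields in the request body, so there is no equivalent URL to crawl or index.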

s34rch

10:45 pm on Nov 23, 2010 (gmt 0)

10+ Year Member



These pages may also be indexed if anyone else links to them. One of your customers might have done this.

Note the Google guideline about blocking site search pages:
"Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines."
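As a sketch of what that could look like for the /search URLs from the first post, the snippet below holds a two-line robots.txt and checks it with Python's standard `urllib.robotparser`. The Disallow path is an assumption based on the example URL, and `is_crawlable` is a made-up helper name; adjust the path to wherever your search results actually live.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: block every crawler from URLs starting with /search.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(robots_txt, user_agent, url):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawlable(ROBOTS_TXT, "Googlebot", "http://www.example.com/search?q=shoes"))
print(is_crawlable(ROBOTS_TXT, "Googlebot", "http://www.example.com/about"))
```

Running a check like this before deploying a robots.txt change helps confirm the rule blocks the search pages without accidentally blocking the rest of the site.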

Robert Charlton

7:10 am on Nov 24, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Regarding how Google finds the URLs of your search pages to crawl, it's also possible that Googlebot is crawling publicly available server logs....

Why is Google indexing my entire web server?
http://www.webmasterworld.com/google/3396393.htm [webmasterworld.com]

(Note that the Google Scholar article I link to in the discussion is no longer available, but the relevant section is quoted in the thread).

As s34rch notes, Google doesn't like to show search results in its SERPs, whether the search is for an adult word or not. See this discussion on Matt Cutts' blog, where he suggests blocking all search results pages....

Search results in search results
http://www.mattcutts.com/blog/search-results-in-search-results/ [mattcutts.com]