Forum Moderators: Robert Charlton & goodroi

Googlebot going through internal search forms

         

LunaC

4:59 pm on Jan 19, 2008 (gmt 0)

10+ Year Member



I've seen Googlebot using one of my site's internal search forms lately. The queries seem to be words that appear on my site (although occasionally irrelevant ones, e.g. footer, header, etc.) and are always a single word, never a search phrase. The IP is the real Googlebot.

I don't link to the search pages anywhere, although it's possible (but unlikely, since there are so many terms it's looking at) that there are outside links to them.
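
For anyone wanting to repeat the "is it the real Googlebot" check, here is a minimal sketch of the usual reverse-then-forward DNS test. It assumes genuine crawler hostnames end in googlebot.com or google.com; the function name and the two injectable lookup parameters are made up for illustration (they let the logic be tried without touching the network).

```python
import socket

# Assumption: genuine Googlebot hosts resolve to names under these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a Google host that resolves back to ip."""
    # Default to real DNS lookups; callers may pass stand-ins for offline use.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])

    try:
        host = reverse_lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    # The leading dot in each suffix rejects look-alikes such as "fakegooglebot.com".
    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # Forward-confirm: the hostname must map back to the same address.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_real_googlebot("66.249.66.1")` with no extra arguments performs live DNS lookups, so the result depends on the resolver available at the time.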

I have noindex, nocache meta tag on the search landing pages and I haven't seen any show up in Google's SERP. (Should I add nofollow or let any pagerank flow?)

I'm using the exact same search script on another of my sites and don't see Googlebot doing this. Why is it going through search forms and is this a potential problem?

jomaxx

7:03 pm on Jan 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It could be getting the URLs from the toolbar. I highly doubt it's actually trying to submit a form at random.

Personally I have my search scripts blocked in the robots.txt, because I don't want them hammered by bots. But there's no intrinsic reason you can't allow Google to index those pages; just take a good look at the landing page from an SEO perspective -- how it would appear in the SERPs, whether there are any no-no's on the page, etc.

LunaC

4:58 pm on Jan 22, 2008 (gmt 0)

10+ Year Member



That's possible, but the toolbar seems very unlikely. I checked the logs and the data gathered by the search script, and no human visitor searched for many of these terms (footer, header, nav, as well as other odd generic terms like happy, printer, green, etc.); the only reference I could find was Googlebot. The fact that it's only ever a single word seems odd too.

It's very strange. The IP is real and everything points to it being a valid Googlebot, but the behavior on that site is unlike anything I see on my others. If it were the toolbar, I'd expect to see the same thing on my other site that uses the same search script. More, in fact, since that site is far busier than the one I'm seeing this on, and Googlebot is always crawling it, just never the search area. The only difference I can see is that on the site where it isn't going through search, the script is in cgi-bin; on the site where it is going through search (or whatever it is doing), the script is just /search/ at root level.

I blocked it in robots.txt for a bit to see what would happen, and Google Webmaster Tools showed a ton of those URLs as 'URLs restricted by robots.txt'.

A bit of a mystery, but I'm leaving them as noindex, nocache in the meta tags and have removed the robots.txt block for now.
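
The log check described above could be sketched roughly like this. It is only an illustration under assumptions: the access log is taken to be in Apache combined format, the search script is assumed to live under /search/ with the query in a parameter named `q`, and the function name is invented.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumption: Apache "combined" log format (request, status, size, referer, agent).
LOG_LINE = re.compile(
    r'"(?:GET|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def search_terms_by_visitor(lines, search_path="/search/", param="q"):
    """Tally search terms separately for Googlebot and for everyone else."""
    bots, humans = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        url = urlsplit(match.group("path"))
        if not url.path.startswith(search_path):
            continue  # not a request to the search script
        is_bot = "googlebot" in match.group("agent").lower()
        for term in parse_qs(url.query).get(param, []):
            (bots if is_bot else humans)[term.lower()] += 1
    return bots, humans
```

Terms that show up in the first counter but never in the second are the ones only Googlebot asked for. The user-agent string alone can be spoofed, so this is best combined with an IP check.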
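
A temporary block like that can be sanity-checked before uploading it. The sketch below uses Python's standard `urllib.robotparser`; the rule text, the example.com URLs, and the helper name are assumptions for illustration, not the actual file from the site.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the search script is served from /search/ at the site root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Search result URLs should be blocked, ordinary pages should not.
print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/search/?q=footer"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/about.html"))
```

The standard-library parser follows the basic robots.txt rules and may not match Google's own interpretation in every edge case, so treat it as a quick check only.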

jomaxx

9:04 pm on Jan 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No human ever searched for these terms, or just recently?

Do you have Google Analytics on the search results pages? AdSense? External links to Google or any related sites? Internal or external links to any pages containing Google Analytics or AdSense?