Google's official guidelines seem to lag behind its actual capabilities: I saw an article yesterday saying it can now crawl Flash content. So does anyone know whether Google's spiders can crawl JavaScript hrefs? Since Googlebot won't have a valid cookie set, I can't think of how to let it into the site. We intercept anyone without a valid cookie on every page.
Google's guidelines don't give us a lot of info on *how* to actually allow search bots to crawl. Do bots access the site with a different protocol that we can identify?
I'd suggest creating a unique URL for each country and linking to these pages from (say) flags on the front page. Naturally, do not require a cookie to be set for a bot to access these country-specific URLs.
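A minimal sketch of that idea, assuming the cookie check runs in one place for every request. The `/country/` prefix, the `country` cookie name, and both function names are hypothetical and only illustrate exempting the country-specific URLs from the check:

```python
# Hypothetical URL prefix for the country-specific pages (not from this thread).
PUBLIC_PREFIXES = ("/country/",)


def requires_cookie(path):
    """Return True if the cookie check should run for this path."""
    return not path.startswith(PUBLIC_PREFIXES)


def handle_request(path, cookies):
    """Decide whether to serve the page or send the visitor to the front page."""
    # "country" is an assumed cookie name for the visitor's chosen country.
    if requires_cookie(path) and "country" not in cookies:
        return ("redirect", "/")
    return ("serve", path)


# A visitor (or bot) without cookies can still reach the country pages.
print(handle_request("/country/uk/", {}))
print(handle_request("/products/", {}))
```

Because the exemption depends only on the URL, bots and ordinary visitors are treated the same way.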
What kinds of links does Googlebot follow?
Googlebot follows HREF links and SRC links.
end quote.
I use JavaScript redirects, and Google doesn't seem to follow them. I ran a test where I added a new page that only JavaScript pointed to and, at the same time, another page with a regular link. Only the page with the regular link got spidered.
Oren
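To illustrate why the JavaScript-only page would be missed, here is a rough sketch of a crawler that collects only HREF and SRC attribute values. It is a simplified model, not Googlebot's actual behaviour; the sample page and the names `LinkExtractor` and `extract_links` are made up for the example:

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href/src attribute values, skipping javascript: pseudo-URLs."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                if not value.lower().startswith("javascript:"):
                    self.links.append(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Made-up page: one regular link, one script link, one script redirect.
page = """
<a href="/regular.html">regular link</a>
<a href="javascript:go('/js-only.html')">script link</a>
<script>window.location = "/redirected.html";</script>
<img src="/logo.gif">
"""

print(extract_links(page))  # ['/regular.html', '/logo.gif']
```

The targets that appear only inside script code never show up in the result, which matches what the test above suggested.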
I'm now checking whether the user agent string is set to the user agent of the Google robot. This is from:
[robotstxt.org...]
If the user agent is determined to be a valid robot (in this case, only Google), then no redirect occurs. It goes straight into the site. I tested this using the browser from:
To make it as primitive as possible, I went into the preferences, turned off Java and JavaScript, and set it not to accept cookies. I set the user agent string directly in this browser, in the preferences. Set to:
Googlebot/2.X (+http://www.googlebot.com/bot.html)
I could then browse the site as a robot would see it.
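For anyone following along, the check described above might look roughly like this. It is only a sketch: the function names and the substring match are assumptions, and a user agent string can be set by anyone, as the browser test shows.

```python
def is_known_robot(user_agent):
    """Assumed check: treat any user agent containing 'googlebot' as the robot."""
    return "googlebot" in user_agent.lower()


def should_redirect(user_agent, has_valid_cookie):
    """Return True if the visitor should be intercepted and redirected."""
    if has_valid_cookie:
        return False
    # No cookie: only a recognised robot is let straight into the site.
    return not is_known_robot(user_agent)


robot_ua = "Googlebot/2.X (+http://www.googlebot.com/bot.html)"
print(should_redirect(robot_ua, has_valid_cookie=False))       # False
print(should_redirect("Mozilla/4.0", has_valid_cookie=False))  # True
```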
Any further experiences on this front?
If the user agent is determined to be a valid robot (in this case, only Google), then no redirect occurs.
This sounds like cloaking to me. One would assume that a search engine would do a test run over all domains, or over those suspected of cloaking, by requesting pages with a typical browser user agent. This might be innocent, but in the age of automated processes it might be treated with extreme prejudice. I'd rethink this approach.
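A rough sketch of the kind of automated comparison I have in mind: request the same URL under a robot user agent and a browser user agent, then compare what comes back. How a search engine really does this is not public; the `fetch` callable, the browser user agent string, and `fake_site` below are all assumptions for illustration, and no real requests are made.

```python
# Example user agents: the robot string is the one quoted earlier in the thread,
# the browser string is just a typical-looking value.
ROBOT_UA = "Googlebot/2.X (+http://www.googlebot.com/bot.html)"
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def looks_cloaked(fetch, url):
    """fetch(url, user_agent) must return a (status, final_url) tuple."""
    return fetch(url, ROBOT_UA) != fetch(url, BROWSER_UA)


def fake_site(url, user_agent):
    """Stand-in for a site that skips its cookie redirect for the robot only."""
    if "googlebot" in user_agent.lower():
        return (200, url)
    return (302, "/choose-country")


print(looks_cloaked(fake_site, "/products/"))  # True
```

A site that serves the same response to both user agents would come back `False` here, which is the safer position to be in.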