While we strongly recommend against restricting our system's automatic review of your landing page, you can edit your site's robots.txt file to avoid a review. The file must explicitly exclude your page from our system visits as follows:

To prevent AdsBot-Google from accessing your site, add the following to your robots.txt file:

User-agent: AdsBot-Google
Disallow: /

To prevent AdsBot-Google from accessing parts of your site, add the following to your robots.txt file:

User-agent: AdsBot-Google
Disallow: /exclude/

[adwords.google.com...]
Depending on any current robots.txt restrictions, we also might want to be sure we aren't accidentally excluding the AdsBot. What isn't clear to me is whether AdsBot will also be participating in the big cache sharing free-for-all along with all the other googlebots. Also, I hope the user agent includes the string "googlebot" so various stats packages automatically catch it as a Google spider.
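For what it's worth, a lot of stats packages do little more than a substring match on the UA, so if the new bot announces itself only as "AdsBot-Google" (the token from the robots.txt lines above) rather than something containing "googlebot", a naive filter will miss it. A toy version of that kind of check (the AdsBot UA string here is just my guess from the robots.txt token, not anything Google has published):

# Toy version of the substring match many stats packages use to flag Google
# spiders. The AdsBot UA below is only the robots.txt token quoted above;
# the real UA string is a guess on my part.
def looks_like_google_spider(user_agent):
    return "googlebot" in user_agent.lower()

for ua in ("Googlebot/2.1 (+http://www.google.com/bot.html)", "AdsBot-Google"):
    print(ua, "->", looks_like_google_spider(ua))  # True for Googlebot, False for AdsBot-Google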
Many people have entire sections of their site nocrawled since they're just PPC landing pages and don't want the dupe content issues.
It would be useful if there was a section added to the sitemaps console to check if this bot was being allowed into certain sections of a site while keeping the other Google bots out.
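In the meantime you can approximate that check offline with a generic robots.txt parser. A rough sketch using Python's bundled parser (the rules, path, and URL are made up for illustration, and a generic parser won't know about any special-casing Google's own crawlers might do):

# Offline approximation of the per-bot access check, using the stdlib parser.
# The rules and URL below are made up for illustration.
from urllib import robotparser

rules = """\
User-agent: *
Disallow: /landing/

User-agent: AdsBot-Google
Allow: /landing/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

url = "http://www.example.com/landing/offer.html"
for bot in ("Googlebot", "Mediapartners-Google", "AdsBot-Google"):
    print(bot, "allowed" if rp.can_fetch(bot, url) else "blocked")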
I am only speculating here, but since the landing page quality algorithm change that occurred on or around April 5 came out of the blue, I am guessing this is just an update to the TOS to reflect the new activity. In a more perfect world, the changes would be announced, reflected in the TOS, and then implemented, in that order.
How is this different from the landing page quality assessments that AdWords was supposedly already doing previously?
I wondered that too when I read the new T&C that came out this week. Was that just a bunch of smoke to make sure we behaved until they could figure out how they could really check the landing page?
saying that Google now would have one bot from here on in
My memory is that all of Google's bots would share one cache, not that there would only be one bot.
The idea is that an AdSense page, for example, would not need to be crawled by both the Mediabot and Googlebot, one to get info for targeting ads and the other to get placement in the regular index. That practice was wasteful of bandwidth, and the new Big Daddy spiders don't need it.
Which is why I wonder if the landing page bot will also be cache sharing -- and how that will affect dedicated landing pages that site owners are currently excluding from regular crawling. I'd hate to see a cache-sharing bug end up placing these dedicated pages in the regular index -- all kinds of duplicate content filtering might fall out of that scenario.
My understanding of what MC has said, e.g. in his blog, is that whichever bot(s) fetch a page, all bots will obey the relevant bit of robots.txt as if they had fetched the page directly themselves.
So there should, in theory, be no difference in what gets indexed (etc); only less bandwidth to do the job.
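To illustrate the idea as I understand it (purely a conceptual sketch on my part, not anything Google has described in detail): one shared fetch per URL, with each bot still applying its own robots.txt rules to decide whether it may use the cached copy.

# Conceptual sketch only: one shared fetch, per-bot robots.txt decisions.
# The bot names and the /landing/ rule are made up for illustration.
ALLOWED = {
    "Googlebot":     lambda path: not path.startswith("/landing/"),
    "AdsBot-Google": lambda path: True,
}

fetch_cache = {}

def fetch_once(path):
    # One physical fetch per URL, shared by every bot (the bandwidth saving).
    if path not in fetch_cache:
        fetch_cache[path] = "<html>...</html>"  # stand-in for the real HTTP GET
    return fetch_cache[path]

def page_for(bot, path):
    # Each bot honours its own robots.txt rule as if it had fetched the page itself.
    return fetch_once(path) if ALLOWED[bot](path) else None

for bot in ("Googlebot", "AdsBot-Google"):
    print(bot, "skips the page" if page_for(bot, "/landing/offer.html") is None else "uses the cached copy")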
However, it may make some black-hat UA-based cloaking harder I guess.
Rgds
Damon
Well, the problem goes back to the fact that we block a number of IP ranges where we've had problems with hacking and/or abuse. I finally got an AdWords rep to admit (and this is only recently; before, the rep always DENIED that there were external editors/reviewers) that G uses editors in India, and apparently we've blocked whatever IP addresses they come in on. So, naturally, they will never get a valid landing page.
I sure do hope they turn loose their new AdsBot-Google from a reliable US IP range, or at least publish the IP addresses their AdsBot will be coming from, so we can unblock our block(s).
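One workaround I'm considering in the meantime, instead of waiting for a published IP list (my own idea, not anything Google has said about AdsBot): only lift a block for visitors whose IP reverse-resolves to a googlebot.com or google.com hostname that then forward-resolves back to the same IP. A rough sketch:

# Reverse-then-forward DNS check for Google crawlers. This is my own
# workaround idea, not a method Google has published for AdsBot.
import socket

def looks_like_google_crawler(ip):
    try:
        host = socket.gethostbyaddr(ip)[0]                # reverse lookup
    except socket.herror:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        return ip in socket.gethostbyname_ex(host)[2]     # forward lookup must match
    except socket.gaierror:
        return False

# Plug in addresses from your own access logs, e.g.:
print(looks_like_google_crawler("66.249.66.1"))

That would at least cover the bot itself, even if it does nothing for the human reviewers.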
I've emailed AdWords support about this, but no response yet.
Anybody have similar experience / concern(s)?
I think they've been scanning landing pages in some form or another for a while.
They have; it's been this one new Googlebot UA [webmasterworld.com] since April.