Forum Moderators: Robert Charlton & goodroi
Our host fixed the server-side stats program, so we can shock ourselves with even more numbers (or the lack of them). And combining the now-visible list of bots that are rampaging on our site with some threads here about how to, and how not to, get images indexed in G ImageSearch...
...
I was wondering...
If you insert a NOINDEX directive into a page as a META tag (and not in robots.txt), especially for Googlebot, will ImageSearch still index the pics on it?
I vaguely recall the answer I got from G when asking why, when, and how to get pics indexed (properly): it's no wonder the images aren't in the ImageSearch index (as opposed to the pages they were on), because that index is not only updated less regularly but is also... built by a DIFFERENT bot. Which would make sense, since we're seeing some visits (although very few) from a bot named Google-Image 2.01whatever...
And AdSense uses a different bot as well, although this has nothing to do with what I'm asking :P
...
So, again:
If you insert a directive for ImageSearch in a robots META tag (not robots.txt),
will the pages still be indexed all the same?
And on the other hand...
If you insert a NOINDEX tag into a page for Googlebot,
will ImageSearch still index the pics?
Since there is a separate bot for images, as these comments, logs, and the corresponding help pages indicate (all of which I believe :P), would it obey directives in a META tag aimed specifically at it, or those inserted for Googlebot, or those inserted for all bots?
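To make the variants concrete, the META tags in question would look something like this. This is only a sketch: `googlebot` is a meta name Google documents, but targeting the image bot by name in a META tag is an assumption on my part, not something I've confirmed works.

```html
<!-- Block only Googlebot (the web-search crawler) from indexing the page -->
<meta name="googlebot" content="noindex">

<!-- ASSUMED syntax: a directive aimed only at the image crawler -->
<meta name="googlebot-image" content="noindex">

<!-- For comparison: the standard directive every compliant bot should obey -->
<meta name="robots" content="noindex">
```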
...
The theory part [google.com] I know: each of them SHOULD obey, one by one, its own entries in a robots.txt.
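In robots.txt, at least, per-bot sections are straightforward and documented. A minimal sketch (the `/images/` path is hypothetical, and the `Googlebot-Image` user-agent name is assumed from the log entries mentioned above):

```
# Block only the image crawler from the image directory,
# while leaving the regular web crawler unrestricted.
User-agent: Googlebot-Image
Disallow: /images/

User-agent: Googlebot
Disallow:
```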
I'd like to ask whether anyone has had any experience with the implications of using either method in a META tag... was there any feedback, and so on?
We don't need to insert such a tag, but it could come in very handy someday... to know that we can.
So at that point the content goes into the single crawl cache (now shared by all of Google's bots) to be processed on Google's back end. That's where the rubber truly meets the road. I've always assumed that somewhere, Google has every bit of data a server sends, even when that data is not made publicly available.
With a shared crawl cache, I think it is much less likely that one of the bots could gather data that is handled in a non-standard way. Although bugginess can always creep into any computerized process, I have only seen well-behaved actions in this area since the shared crawl cache was introduced earlier this year.