Did anybody see something like that in the logs? I tried a search here at WebmasterWorld and didn't turn up anything. Nor did I see it before in our logs.
This happened a day after a full crawl of our sites, although we have had the respective *.js files in place since the beginning of the year.
I'm curious what 'Googlebot/Test' is up to.
Anyway, I saw "Googlebot/Test" getting some javascript files and a few regular files as well.
However, its main appetite seems to be javascript files.
It does seem to obey the robots.txt file.
So because a rarely-seen spider didn't pull your particular .js files, you assume it obeys robots.txt?
It requests just one page, the same one all 5 times.
The only js in the file is a third-party counter.
This is a very old page that hasn't been changed/updated in over a year except for site wide navigation changes.
Odd...
GG did say he wasn't familiar with this bot and was going to check it out, but AFAIK he hasn't mentioned it again.
Just my thoughts
jazzbo
If you want to find out which is happening, just place a few dummy functions in your js files. In those dummy functions, open a window using a url that does not exist, built by joining two strings so that the full url never appears as plain text in the file.
If your error logs show the non-existent urls, then the bot is running the javascript.
To be certain, it might be necessary to connect the dummy functions to a small link somewhere on the page.
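A minimal sketch of such a dummy function, for what it's worth. The names (PROBE_PREFIX, PROBE_SUFFIX, buildProbeUrl, dummyProbe) and the path itself are made up for illustration; pick any url that you are sure does not exist on your site.

```javascript
// Hypothetical probe: every name and path here is an illustrative assumption.
// The url is split into two strings so it never appears whole in the file.
// A bot that only scans the text for urls cannot find it; a bot that actually
// runs the script will request it and leave a 404 in the error log.
var PROBE_PREFIX = "/bot-probe-";
var PROBE_SUFFIX = "does-not-exist.html";

// Join the two halves at run time.
function buildProbeUrl() {
  return PROBE_PREFIX + PROBE_SUFFIX;
}

// Dummy function: opens a window on the non-existent url when run in a browser.
// Outside a browser (no window object) it just returns the url.
function dummyProbe() {
  var url = buildProbeUrl();
  if (typeof window !== "undefined" && typeof window.open === "function") {
    window.open(url);
  }
  return url;
}
```

To connect it to a small link, something like an anchor whose onclick calls dummyProbe() and then returns false should do. If /bot-probe-does-not-exist.html turns up in the error log with the bot's user agent, the script was run rather than just fetched.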
Hopefully, someone will be able to provide an answer rather than just speculate and ruminate.
Kaled.
[edited by: Powdork at 4:37 pm (utc) on Mar. 20, 2004]
- you might want to simplify your javascript. Google can often extract urls from javascript, but I can believe that doing utterly weird stuff might mess things up.
[webmasterworld.com...]
Forgive me, but I simply don't believe you. I haven't seen this on any of the 50+ sites I manage for myself and my clients - many of these sites use javascript for many things - from popups to document.write. I have never had a single js file crawled by gbot or any other major bot.
Enough reasons for me to say that you're either wrong or simply just trying to blow another bubble.
Regarding GBot/Test ignoring robots.txt: it wouldn't surprise me. Doh! A bot that doesn't identify itself with a user agent that contains correct contact info and/or correct whois info +plus+ GoogleGuy claiming that he doesn't know anything about a Testbot ... sorry, but this stinks ...
There are a number of reasons both of you could experience what you have experienced and neither one would be wrong.
Google is crawling javascript for me - not for a year I don't think but for a few months at least. It is in the index - with the reverse links pointing to the page with the javascript links.
Google has made some MAJOR changes in the last month or so in the links they crawl and index. Probably the biggest change in two years from what I monitor.
As with anything else Google does - they do it in steps. They probably started off crawling simple javascript stuff first - and moved on from there. Whether they crawl you would depend on your PR, JS config, and other factors.