
Google News Archive Forum

This 58 message thread spans 2 pages.
"Googlebot/Test" Pulling External Javascript?
Google seems to be testing something related to javascript
jazzbo




msg:201781
 3:16 pm on Mar 18, 2004 (gmt 0)

64.68.89.190 - - [18/Mar/2004:12:03:53 +0100] "GET /check-inkc.js HTTP/1.1" 200 463 "-" "Googlebot/Test"

Did anybody see something like that in the logs? I tried a search here at WebmasterWorld and didn't turn up anything. Nor did I see it before in our logs.

This happened a day after a full crawl of our sites, although we have had the respective *.js files in place since the beginning of the year.

I'm curious what 'Googlebot/Test' is up to.
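For anyone who wants to check their own logs for the same visitor, here is a minimal sketch in Python. It assumes the access log uses the Apache combined format, as the line quoted above appears to; the function name `find_agent_hits` is made up for this example.

```python
import re

# Combined Log Format (assumed):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def find_agent_hits(lines, agent="Googlebot/Test"):
    """Return (ip, time, path) for each log line whose user agent matches exactly."""
    hits = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and match.group("agent") == agent:
            hits.append((match.group("ip"), match.group("time"), match.group("path")))
    return hits


# The log line quoted in this post, used as sample input.
sample = [
    '64.68.89.190 - - [18/Mar/2004:12:03:53 +0100] '
    '"GET /check-inkc.js HTTP/1.1" 200 463 "-" "Googlebot/Test"',
]
hits = find_agent_hits(sample)
```

To run it against a real log, pass an open file object instead of `sample`; lines that do not fit the assumed format are simply skipped.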

 

mayor




msg:201782
 4:58 pm on Mar 18, 2004 (gmt 0)

Me too. Posted a new thread too, but mods haven't released it. Maybe they won't now, since one is already started.

Anyway, I saw "Googlebot/Test" getting some javascript files and a few regular files as well.

However, its main appetite seems to be javascript files.

It does seem to obey the robots.txt file.

bull




msg:201783
 5:02 pm on Mar 18, 2004 (gmt 0)

It does obey robots.txt as my javascripts are in a disallowed directory and were not spidered.
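A rough way to confirm what a given robots.txt actually disallows is Python's standard `urllib.robotparser`. This is only a sketch: the `/scripts/` directory and the file names are assumptions for illustration, and it shows what a well-behaved bot should do, not what Googlebot/Test really does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the /scripts/ directory name is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /scripts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# May a bot calling itself Googlebot/Test fetch these paths under the rules above?
js_ok = parser.can_fetch("Googlebot/Test", "/scripts/redirect.js")
page_ok = parser.can_fetch("Googlebot/Test", "/index.html")

print("script allowed:", js_ok)
print("page allowed:", page_ok)
```

If a path reported as not allowed still turns up in the access log under that user agent, the bot is not honouring the file.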

Dolemite




msg:201784
 5:05 pm on Mar 18, 2004 (gmt 0)

It does obey robots.txt as my javascripts are in a disallowed directory and were not spidered.

So because a rarely-seen spider didn't pull your particular .js files, you assume it obeys robots.txt?

loganoski




msg:201785
 5:10 pm on Mar 18, 2004 (gmt 0)

Yes, I have seen Googlebot/Test spider my .js files several times.

bull




msg:201786
 5:26 pm on Mar 18, 2004 (gmt 0)

So because a rarely-seen spider didn't pull your particular .js files, you assume it obeys robots.txt?

1. The bot is not at all rare, see several threads in f39, too.
2. Yes, from my observations I assume it obeys robots.txt
3. You -or anyone else- need to prove the contrary. No respect for robots.txt is commonly regarded as something, let's say, unethical for robots. In dubio pro reo.

sabai




msg:201787
 6:08 pm on Mar 18, 2004 (gmt 0)

Are any of you using AdWords? It could be checking that you don't have popups...

bull




msg:201788
 6:13 pm on Mar 18, 2004 (gmt 0)

no.

Powdork




msg:201789
 6:16 pm on Mar 18, 2004 (gmt 0)

I wonder if this will put an end to the js redirected doorway pages that seem to be doing so well now.

mayor




msg:201790
 6:47 pm on Mar 18, 2004 (gmt 0)

>> javascript redirected doorways

Hmmmm, how do you check to see if a site is using javascript redirected doorways?

bcolflesh




msg:201791
 6:53 pm on Mar 18, 2004 (gmt 0)

how do you check to see if a site is using javascript redirected doorways?

Log all your browser requests with a proxy/sniffer.

nancyb




msg:201792
 6:55 pm on Mar 18, 2004 (gmt 0)

this bot has come around 5 times since 3/4, it pulled robots.txt the first time only, last visit was 3/16.

Requests just one page, same one all 5 times.

only js in the file is a third party counter.

This is a very old page that hasn't been changed/updated in over a year except for site wide navigation changes.

Odd...

GG did say he wasn't familiar with this bot and was going to check it out, but AFAIK he hasn't mentioned it again.

cashmere




msg:201793
 6:55 pm on Mar 18, 2004 (gmt 0)

The ones I have seen use onmouseover, so load the page with the cursor outside the window.

bull




msg:201794
 7:05 pm on Mar 18, 2004 (gmt 0)

I fear that my language-based redirect will be interpreted as spam, so I have disallowed my scripts. Hope there is no penalty on disallowed scripts.

mayor




msg:201795
 7:12 pm on Mar 18, 2004 (gmt 0)

Maybe they're looking for those automatic popups used by frauds to set affiliate link cookies, even if the visitor doesn't click on the link.

But why would the search engine care if one affiliate steals commissions from another one? Wouldn't they say, "hahha, serves them right!".

kaled




msg:201796
 7:19 pm on Mar 18, 2004 (gmt 0)

I hate to burst bubbles, but this is not new and has nothing to do with the testbot. Google has been spidering and indexing my javascript files for maybe a year now.

Kaled.

mayor




msg:201797
 7:21 pm on Mar 18, 2004 (gmt 0)

Yeah, I think a witch hunt about spiders visiting .js files would be a red herring. Maybe they're just looking for the deep web.

jazzbo




msg:201798
 9:17 pm on Mar 18, 2004 (gmt 0)

Or they could be trying to improve their ability to check whether a browser gets different content than the bot. This could - like powdork mentioned - very well put an end to js-redirected doorways and would in most cases improve serps - which is the communicated goal if I am not mistaken.

Just my thoughts

jazzbo

mayor




msg:201799
 12:42 pm on Mar 20, 2004 (gmt 0)

Well, this bot keeps coming back and is bloating my log files so I did try to exclude it in robots.txt but it doesn't check robots.txt anymore (it did at first) and just keeps sucking up bandwidth.

Would someone please teach this bot some manners!

pontifex




msg:201800
 1:21 pm on Mar 20, 2004 (gmt 0)

hi folks,
I have the same JavaScript-hungry AI here and it is following the links in functions() as well... I have one page that is loaded as a pop-up and referenced only in one external .js file... Google picked it up yesterday :-/ Strange. Are they executing javascript or just following everything that looks like a URL in any file they find?
P!

kaled




msg:201801
 1:36 pm on Mar 20, 2004 (gmt 0)

Whilst I very much doubt it, they could be running the javascript to see what transpires. It is much more likely that they are simply finding urls.

If you want to find out which is happening, just place a few dummy functions in your js files. In those dummy functions, open a window using a URL that does not exist and that is assembled from two strings.

If your error logs show the non-existent urls, then the bot is running the javascript.

It might be necessary, to be certain, to connect the dummy functions to a small link somewhere on the page.

Hopefully, someone will be able to provide an answer rather than just speculate and ruminate.

Kaled.
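The reasoning behind this test can be sketched in Python. Everything here is an assumption for illustration: the probe function, the file name `/bot-probe-a1b2.html`, and the regular expression, which is only a crude stand-in for "finding anything that looks like a URL", not Google's actual method.

```python
import re

# Hypothetical dummy function for a .js file; the names are made up.
PROBE_JS = """
function probeBot() {
    var part1 = "/bot-probe-";
    var part2 = "a1b2.html";
    window.open(part1 + part2);
}
"""

# The URL that only exists once the two strings are joined at run time.
PROBE_PATH = "/bot-probe-a1b2.html"

# Crude URL spotter: quoted strings that start with / or http
# and end in a common file extension.
URL_LIKE = re.compile(r'["\']((?:https?://|/)[^"\']*\.(?:html?|js|php))["\']')


def urls_found_by_scanning(js_source):
    """URLs a simple text scanner would pull out without running the script."""
    return URL_LIKE.findall(js_source)


def bot_ran_script(requested_paths):
    """True if the assembled probe URL shows up among the requested paths."""
    return PROBE_PATH in requested_paths


print(urls_found_by_scanning(PROBE_JS))
print(urls_found_by_scanning('window.open("/popup.html");'))
```

A scanner like this finds the literal URL in the second call but nothing in the probe, so a request for the probe path in the error log would point towards execution. It is not airtight: a cleverer scanner could join adjacent string literals without running anything.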

Powdork




msg:201802
 4:27 pm on Mar 20, 2004 (gmt 0)

I have reported some sites that would (I would think) warrant a manual penalty because they are polluting the serps with pages that don't exist. My guess is that they haven't been removed by hand because Google wants to catch them with their automated technology. That leads me to believe they may be planning on executing the js.

[edited by: Powdork at 4:37 pm (utc) on Mar. 20, 2004]

irishaff




msg:201803
 4:34 pm on Mar 20, 2004 (gmt 0)

I don't know a lot about javascript, but I have a small question.

I use JS links to conserve PR (i.e. I don't want to leak it to the contact page, for example). The format is javascript:document.write....

Does this mean the pages will now be spidered with this new bot?

micahb37




msg:201804
 4:47 pm on Mar 20, 2004 (gmt 0)

GG has made mention in the past that Googlebot will follow links in js.


- you might want to simplify your javascript. Google can often extract urls from javascript, but I can believe that doing utterly weird stuff might mess things up.

[webmasterworld.com...]

Yidaki




msg:201805
 6:15 pm on Mar 20, 2004 (gmt 0)

>Google has been spidering and indexing my javascript files for maybe a year now.

Forgive me but i simply don't believe you. I haven't seen this on any of the 50+ sites i manage for me and my clients - many of these sites use javascript for many things - from popup to document write. Never ever had any single js crawled by gbot nor any other major bot.

Enough reasons for me to say that you're either wrong or simply just trying to blow another bubble.

Regarding GBot/Test ignoring robots.txt: it wouldn't surprise me. Doh! A bot that doesn't identify itself with a user agent that contains correct contact info and/or correct whois info +plus+ GoogleGuy claiming that he doesn't know anything about a Testbot ... sorry, but this stinks ...

cyberprosper




msg:201806
 7:37 pm on Mar 20, 2004 (gmt 0)

Check the IP address... it is not Google. It is some guy who has named his browser "Googlebot". Ban him from your site and move on.
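One way to check an IP rather than guess is a forward-confirmed reverse DNS lookup. The sketch below is in Python and rests on an assumption not stated in this thread: that genuine Google crawlers reverse-resolve to a host under googlebot.com or google.com. The function name and the injectable resolver parameters are made up for the example.

```python
import socket

# Assumption: genuine Google crawlers reverse-resolve to these domains.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_crawler(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for an IP claiming to be Googlebot."""
    try:
        hostname = reverse(ip)[0]        # IP -> host name
    except OSError:
        return False
    if not hostname.endswith(TRUSTED_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]  # host name -> list of IPs
    except OSError:
        return False
    # The host name must map back to the IP we started with.
    return ip in addresses
```

Calling `is_google_crawler("64.68.89.190")` performs live DNS lookups, so the answer depends on current DNS records and says nothing certain about what they were when the request was logged.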

bull




msg:201807
 8:07 pm on Mar 20, 2004 (gmt 0)

Check the IP address... it is not google

[webmasterworld.com...]

kaled




msg:201808
 12:08 am on Mar 21, 2004 (gmt 0)

Yidaki,

If you've read many of my posts you would know that I very rarely make definitive statements. When I do, you can bet your life that I can back them up.

When you have read and verified the stickymail I am about to send you, I trust you will apologise for calling me a liar.

Kaled.

Chris_R




msg:201809
 2:05 am on Mar 21, 2004 (gmt 0)

Now now, let's keep it NICE.

There are a number of reasons both of you could experience what you have experienced and neither one would be wrong.

Google is crawling javascript for me - not for a year I don't think but for a few months at least. It is in the index - with the reverse links pointing to the page with the javascript links.

Google has made some MAJOR changes in the last month or so in the links they crawl and index. Probably the biggest change in two years from what I monitor.

As with anything else Google does - they do it in steps. They probably started off crawling simple javascript stuff first - and moved on from there. Whether they crawl you would depend on your PR, JS config, and other factors.

borisbaloney




msg:201810
 4:10 am on Mar 21, 2004 (gmt 0)

I've been pondering this for an hour or so but can't work it out - what use are indexed javascript files to searchers?


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved