
Sitemaps, Meta Data, and robots.txt Forum

    
Google, robots, disallow
cyberdyne




msg:4307353
 3:00 pm on May 3, 2011 (gmt 0)

One entry in my robots.txt is shown below, yet today, two Google IP addresses (64.233.172.18, 74.125.75.17) visited only two files in a directory named /jscript/.

The log entry showed no Google-related U-A (it was Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7), but shouldn't Google be adhering to the below rules regardless of the IP or U-A that they use?

User-agent: *
Disallow: /j


Thanks in advance.
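
For reference, here is a minimal sketch of how a compliant crawler would read the rule above, using Python's standard urllib.robotparser. The file names (menu.js, jobs.html, index.html) and the helper is_allowed are made up for illustration. It only shows what the rule asks for; it cannot show whether a given visitor honours it.

```python
from urllib.robotparser import RobotFileParser

# The rule quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /j
"""


def is_allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical paths; only /jscript/ is named in the post.
    for path in ("/jscript/menu.js", "/jobs.html", "/index.html"):
        print(path, is_allowed("Googlebot", path))
```

Note that "Disallow: /j" is a prefix match, so it covers every path starting with /j, not just /jscript/. Because the record is "User-agent: *", the answer is the same for any U-A string a compliant robot sends.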

 

enigma1




msg:4307442
 5:44 pm on May 3, 2011 (gmt 0)

but shouldn't Google be adhering to the below rules regardless of the IP or U-A that they use?

No, actually, and you cannot tell whether it was a human or a bot just because the IP is allocated to Google. robots.txt rules are "guidelines", and there are ways to force even the popular spiders to go through restricted folders and scripts. There are also various Google services that regular visitors could use to retrieve content from your site (e.g. translation tools), and they could even automate them.
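
If you want to separate Google's crawler from other traffic that merely comes from Google-allocated addresses, one commonly documented check is a reverse DNS lookup followed by a forward lookup. Below is a rough sketch, assuming the host name suffixes googlebot.com and google.com and IPv4 only. The function name and the injectable lookup parameters are invented for this example. A failed check does not prove the visitor was human, only that it was not a verifiable crawler.

```python
import socket

# Assumed suffixes for Google crawler host names; adjust if Google's
# published verification guidance lists others.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the host name, then forward-confirm it."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        # No PTR record or lookup failure: cannot be verified.
        return False

    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # The host name must resolve back to the address we started from.
    return ip in addresses
```

The lookups are passed in as parameters only so the function can be tried without network access; called with just an IP, it uses the real resolver.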

One way to get around it, to a certain extent, is to set up a cookie and check it on the server end by having a redirect or something along these lines. If no cookie is present, don't allow access to these scripts. If they're JS files, you may have to wrap them with a server script to check the cookie value.
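
A very rough sketch of the cookie idea as a small Python WSGI app follows. The cookie name visitor_ok, its value, the file name menu.js, and the inline script content are all placeholders, and a real setup would more likely do this in whatever server-side language the site already uses. The home page sets the cookie; the wrapper refuses to serve the script unless the cookie comes back.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_ok"   # hypothetical cookie name
COOKIE_VALUE = "1"           # hypothetical expected value

# Stand-in for the real files under /jscript/.
SCRIPTS = {"/jscript/menu.js": b"console.log('menu loaded');\n"}


def has_valid_cookie(environ):
    """Check the request's Cookie header for the expected value."""
    cookie = SimpleCookie()
    cookie.load(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get(COOKIE_NAME)
    return morsel is not None and morsel.value == COOKIE_VALUE


def app(environ, start_response):
    path = environ.get("PATH_INFO", "")

    if path == "/":
        # A normal page view hands out the cookie.
        start_response("200 OK", [
            ("Content-Type", "text/html"),
            ("Set-Cookie", f"{COOKIE_NAME}={COOKIE_VALUE}; Path=/"),
        ])
        return [b"<script src='/jscript/menu.js'></script>\n"]

    if path in SCRIPTS:
        # The wrapper: no cookie, no script.
        if not has_valid_cookie(environ):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        start_response("200 OK", [("Content-Type", "application/javascript")])
        return [SCRIPTS[path]]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found\n"]
```

To try it locally, wsgiref.simple_server.make_server("", 8000, app).serve_forever() from the standard library is enough. Keep in mind that this only stops clients that don't keep cookies; anything that loads the page first and sends the cookie back will still get the script, and real visitors with cookies disabled will be locked out.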

cyberdyne




msg:4307448
 6:01 pm on May 3, 2011 (gmt 0)

OK, not sure how I would go about all that but I'll look into it. Thanks for your reply.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved