Shim Crawler

From University of Tokyo

         

GaryK

1:58 pm on Oct 9, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Shim Crawler
157.82.246.104

10/03/2005 05:28:37 /robots.txt Shim+Crawler
10/03/2005 05:31:21 /browsers Shim+Crawler
10/03/2005 09:54:08 /robots.txt Shim+Crawler
10/03/2005 09:56:52 /browsers Shim+Crawler

The requests for robots.txt return an HTTP "406 Not Acceptable" result.

Then it goes on and tries to read a folder, which returns an HTTP "301 Moved Permanently" result.

I'm not sure why it's getting the 301.

Does a 301 get generated when you request a folder and get served that folder's default page? If so, the crawler never made it to the default page.

Dijkgraaf

11:04 pm on Oct 10, 2005 (gmt 0)




Yes, when a folder is requested, some web servers return a 301 or a 302 pointing to that folder's default page instead of serving the default page directly.
Maybe it will come back later and spider the URL that the 301 response gave it.

jdMorgan

12:12 am on Oct 11, 2005 (gmt 0)




> I'm not sure why it's getting the 301.

Another possibility: Do you redirect requests for the www domain to the non-www domain, or vice versa?

I get a lot of robot requests for the incorrect domain name, and they are redirected, resulting in log entries like the ones you describe.

Jim
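
For anyone trying to tell the two explanations in this thread apart from their own logs, here is a minimal Python sketch. It compares the requested URL with the Location header of the 301 and guesses which kind of redirect it was. The function name classify_redirect, the labels it returns, and the example.com host names are all hypothetical; nothing here comes from the actual site or server configuration discussed above.

```python
from urllib.parse import urlsplit


def _strip_www(host):
    """Drop a leading 'www.' so www and non-www hosts compare equal."""
    if host and host.startswith("www."):
        return host[4:]
    return host


def classify_redirect(requested_url, location):
    """Guess why a request was redirected (illustrative labels only)."""
    req = urlsplit(requested_url)
    loc = urlsplit(location)

    # Different host: check whether it is only a www / non-www difference.
    if req.hostname != loc.hostname:
        if _strip_www(req.hostname) == _strip_www(loc.hostname):
            return "host-canonicalization"
        return "other"

    # Same host, and the only change is an appended slash on the folder.
    if loc.path == req.path + "/":
        return "trailing-slash"

    # Same host, redirected to a single file directly inside the folder.
    folder = req.path.rstrip("/") + "/"
    if loc.path.startswith(folder):
        remainder = loc.path[len(folder):]
        if remainder and "/" not in remainder:
            return "default-page"

    return "other"
```

A request for http://example.com/browsers that is redirected to http://www.example.com/browsers would be labeled "host-canonicalization", which matches the non-www to www case described in this thread.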

GaryK

6:36 am on Oct 11, 2005 (gmt 0)




Yes, I do redirect non-www to www. I keep forgetting about that. You've explained it to me at least twice. :)

keyplyr

6:58 am on Nov 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It came yesterday and fetched robots.txt. It showed up again today and fetched robots.txt, then the index page 37 times. If this is just another student project, I'd prefer they learn somewhere else. Let's see if a robots.txt disallow works, since it seems to know where the file is.
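
A rough sketch of what such a disallow might look like, checked with Python's standard urllib.robotparser. The user-agent token "Shim Crawler" is an assumption taken from the logged user-agent string; the token the crawler actually matches against is not documented in this thread, and the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rule; the "Shim Crawler" token is an assumption
# based on the user-agent string seen in the logs.
ROBOTS_TXT = """\
User-agent: Shim Crawler
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "Shim Crawler", "http://www.example.com/"))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://www.example.com/"))
```

This only shows how a compliant parser would read the rule. Whether the crawler honors it can only be confirmed from the logs.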

keyplyr

10:19 am on Nov 13, 2005 (gmt 0)




It has returned several times and obeyed robots.txt.

GaryK

2:03 pm on Nov 13, 2005 (gmt 0)




Thanks for the update.