Scooter/1.0

Let's do the timewarp, again...

Josk

2:09 pm on Feb 25, 2002 (gmt 0)

10+ Year Member



Hi,

Just noticed that I had a visit from Scooter/1.0 (64.152.75.6) last Thursday...

Josk

msgraph

6:53 pm on Feb 25, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How was it spidering? Did it grab a single page, or more than that?

Josk

9:36 am on Feb 26, 2002 (gmt 0)

10+ Year Member



On one occasion it grabbed 300+ pages, and elsewhere it only grabbed 22. So it looks like it still works, if a bit dusty...

skirril

6:50 pm on Mar 9, 2002 (gmt 0)

10+ Year Member



However, it looks like it has problems with robots.txt...

In there I have something along the lines of:

UA: *
Allow: /
Disallow: /img
(some other disallows)

(some other UAs)

Today I saw it grab something like: /img/img1.jpg

Ideas?

A question on the side: is the whole robots.txt read and parsed, or is it first match?

And are things like:

UA: 1st robot
UA: 2nd robot
Allow: /foo
Disallow: /bar

legal?

Skirril

Key_Master

7:19 pm on Mar 9, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Skirril, no, that isn't allowed. Also, "Allow" isn't allowed. To ban robots.txt-obeying spiders from visiting your images directory, use:

User-Agent: *
Disallow: /img/
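
If you want to sanity-check a rule set before blaming the spider, here is a minimal sketch using Python's standard-library urllib.robotparser. It says nothing about how Scooter itself parses the file; it only shows what one common parser makes of the rules. The allowed() helper and the SUGGESTED / ORIGINAL names are made up for illustration, and ORIGINAL is the layout from the earlier post written out with "User-agent:" instead of the "UA:" shorthand.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The rule suggested above: block everything under /img/.
SUGGESTED = """\
User-agent: *
Disallow: /img/
"""

# The earlier layout (hypothetical expansion of the "UA:" shorthand),
# with the Allow line placed before the Disallow line.
ORIGINAL = """\
User-agent: *
Allow: /
Disallow: /img
"""

AGENT = "Scooter/1.0"

# Under the suggested rule, the image is blocked and other pages are not.
print(allowed(SUGGESTED, AGENT, "/img/img1.jpg"))
print(allowed(SUGGESTED, AGENT, "/index.html"))

# With Allow listed first, the answer depends on how the parser orders rules.
print(allowed(ORIGINAL, AGENT, "/img/img1.jpg"))
```

The first two checks should come out blocked and allowed respectively. The third is the interesting one: Python's parser has historically applied the first matching rule in file order, so an "Allow: /" placed ahead of "Disallow: /img" can end up permitting everything. Whether a given spider works that way is an assumption you would have to test, but it is one plausible reason for an /img fetch with the original layout.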