CydralSpider/1.8

Still not a robots.txt checker...

         

pendanticist

12:19 am on Jan 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



bull posted here back in October that CydralSpider/1.8 did NOT check robots.txt...

[webmasterworld.com...]

...and nothing seems to have changed.

213.246.63.116 - - [13/Jan/2005:15:13:29 -0800] "GET / HTTP/1.1" 403 480 "-" "CydralSpider/1.8 (Cydral Web Image Search; http*//www.cydral.com)"
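For anyone who wants to check their own logs for the same behavior, here is a rough sketch (not any official tool) that scans combined-format access log lines and lists user-agents that requested something but never asked for /robots.txt. The regex and the function name `agents_skipping_robots` are assumptions made for illustration; adjust the pattern if your server logs in a different format.

```python
import re
from collections import defaultdict

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def agents_skipping_robots(lines):
    """Return user-agents seen in the log that never requested /robots.txt."""
    paths_by_agent = defaultdict(set)
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match:  # silently skip lines that do not fit the assumed format
            paths_by_agent[match["agent"]].add(match["path"])
    return sorted(
        agent
        for agent, paths in paths_by_agent.items()
        if "/robots.txt" not in paths
    )


# The log line quoted in the post above, used as sample input.
SAMPLE = [
    '213.246.63.116 - - [13/Jan/2005:15:13:29 -0800] "GET / HTTP/1.1" 403 480 '
    '"-" "CydralSpider/1.8 (Cydral Web Image Search; http*//www.cydral.com)"',
]

if __name__ == "__main__":
    for agent in agents_skipping_robots(SAMPLE):
        print(agent)
```

A single log file only covers its own time window, so a robot that fetched robots.txt earlier and cached it would still show up here; treat the result as a hint, not proof.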

pendanticist

3:58 am on Feb 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Cydral just came back. Guess what? It's checking robots.txt!

'Bout time they conform...

jdMorgan

4:11 am on Feb 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Frankly, I think some of the authors/owners search for discussion of their robots and end up here. We've seen that several times: a thread is started, and the first day that the thread shows up in any big search engine, we see a representative posting here. It is amazing to me how many robots don't check robots.txt, don't have contact info in the user-agent string, can't handle multiple user-agents per record as described in the Standard, or have other flaws that are fairly obvious. Makes me want to put up a "Best practices for Web robots" site.
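To make the "multiple user-agents per record" point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the bot names (ExampleBot, OtherBot) and the contact URL are all made up for illustration; the idea is just that one record may name several user-agents, and a well-behaved robot should apply that record's rules to each of them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the first record names two user-agents,
# the second is the catch-all record with an empty Disallow (allow everything).
ROBOTS_TXT = """\
User-agent: ExampleBot
User-agent: OtherBot
Disallow: /private/

User-agent: *
Disallow:
"""

# Hypothetical user-agent string that carries contact info for the robot's owner.
USER_AGENT = "ExampleBot/1.0 (+http://example.com/bot.html)"


def allowed(user_agent: str, url: str) -> bool:
    """Check a URL against ROBOTS_TXT for the given user-agent."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in (USER_AGENT, "OtherBot/2.0", "SomeOtherCrawler/0.1"):
        print(agent, allowed(agent, "http://example.com/private/page.html"))
```

In a real crawler you would fetch the site's own robots.txt (for example with `set_url()` and `read()`) before requesting anything else, rather than parsing a hard-coded string.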

Thanks for the update on this one. Now we'll see if they obey it. ;)

Jim

volatilegx

4:54 am on Feb 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Makes me want to put up a "Best practices for Web robots" site

What stops you?