Forum Moderators: open

atraxbot

using new UA

         

keyplyr

10:03 am on Nov 3, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Old UA: atraxbot/0.2
(older thread [webmasterworld.com])

New UA: Atrax Solutions atraxbot/0.3; http://www.atraxsolutions.com/atraxbot

Pfui

5:22 pm on Nov 3, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Two Qs, please, related to the prior thread:

1.) Did it read/heed standard robots.txt format --

User-agent: *
Disallow: /

-- or did it appear to require its own entry?

2.) Did it crawl from .atraxsolutions.com? Or .twtelecom.net? Or --?
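For reference on question 1, here is a minimal sketch of what a standards-following parser concludes from the two robots.txt styles. It uses Python's standard `urllib.robotparser`. The `is_allowed` helper, the `example.com` URL, and the `SomeOtherBot` agent are made up for illustration. This shows only how a compliant client would read the file, not what atraxbot actually did.

```python
from urllib.robotparser import RobotFileParser

# The catch-all block quoted above: every agent is disallowed everywhere.
WILDCARD_BLOCK = """\
User-agent: *
Disallow: /
"""

# Hypothetical alternative: a dedicated entry for the bot, everyone else allowed.
OWN_ENTRY_BLOCK = """\
User-agent: atraxbot
Disallow: /

User-agent: *
Disallow:
"""

OLD_UA = "atraxbot/0.2"
URL = "http://example.com/page.html"  # placeholder URL, an assumption


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant parser would let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(WILDCARD_BLOCK, OLD_UA, URL))
    print(is_allowed(OWN_ENTRY_BLOCK, OLD_UA, URL))
    print(is_allowed(OWN_ENTRY_BLOCK, "SomeOtherBot/1.0", URL))
```

A well-behaved crawler should be stopped by either form, so a bot that only heeds its own entry is not following the standard format.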

keyplyr

7:06 pm on Nov 3, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



On this crawl it read and obeyed robots.txt, with no bad behavior.

On the last crawl it requested disallowed files and also ignored its own disallowed UA entry. It came from twtelecom.net both times.

But what irks me is that I had it blocked by UA because of the bad behavior last time; then they changed the UA, got access, and did a linear crawl of the entire site.
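A rough sketch of why a UA change can slip past a block, assuming the old rule matched the full old string exactly (the actual rule used on this site is not stated here). The function names and the token list are hypothetical. Matching on the lowercase product token catches both UA strings posted above.

```python
OLD_UA = "atraxbot/0.2"
NEW_UA = "Atrax Solutions atraxbot/0.3; http://www.atraxsolutions.com/atraxbot"

# Hypothetical block list: product tokens rather than full UA strings.
BLOCKED_TOKENS = ("atraxbot",)


def blocked_by_exact_match(user_agent: str, blocked_ua: str = OLD_UA) -> bool:
    """Assumed old-style rule: block only the exact UA string seen before."""
    return user_agent == blocked_ua


def blocked_by_token(user_agent: str, tokens=BLOCKED_TOKENS) -> bool:
    """Block any UA containing a listed token, ignoring case and version."""
    ua = user_agent.lower()
    return any(token in ua for token in tokens)


if __name__ == "__main__":
    for ua in (OLD_UA, NEW_UA):
        print(ua, blocked_by_exact_match(ua), blocked_by_token(ua))
```

Token matching only helps while the bot keeps some recognizable name in its UA. Since it came from twtelecom.net both times, a block on the source host or IP range is the other obvious lever.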

Atrax Solutions:

The crawler collects documents from the web and builds the constantly updated index used by our tools.... Atrax Solutions is currently in alpha testing. Additional information will be provided here once the beta-testing phase of our tools becomes available.

What these tools are is yet to be known.