As you may have heard rumoured, FAST is preparing a very significant service scale-up this fall. To do this, we are deploying a totally new, distributed crawler architecture. The new crawler identifies itself as "FAST-WebCrawler/3.2 test" and is currently making some final test runs. If anyone is troubled by this, please let me know and I will immediately forward your report to the testing team.
We hope our tests will bring you a significantly improved service this fall.
Best regards,
- Knut Magne
Knut Magne Risvik - kmr@fast.no
Director of Engineering
Fast Search & Transfer ASA
On the downside, I found that at least between Sept. 17th and 19th, the Disallow lines in my robots.txt file were ignored by the test spider. No trespassing observed since Sept. 20th though, so it may have been a temporary problem.
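For anyone who wants to double-check that their own rules really do exclude this user agent before reporting a problem, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt content, the paths and the example.com URLs are made-up placeholders, not my actual file, and this only shows how a well-behaved parser reads the rules, not how the FAST crawler itself interprets them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; paths are placeholders.
ROBOTS_TXT = """\
User-agent: FAST-WebCrawler
Disallow: /private/

User-agent: *
Disallow: /cgi-bin/
"""

# User-agent string as reported in this thread.
UA = "FAST-WebCrawler/3.2 test"


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://www.example.com/private/page.html",
                "http://www.example.com/index.html"):
        print(url, is_allowed(ROBOTS_TXT, UA, url))
```

If a URL that comes back as not allowed here still shows up in your logs under that user agent, that is the kind of hit worth reporting.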
Good luck getting it to final!
Here is what I have in my spider log:
HTTP_USER_AGENT = FAST-WebCrawler/3.2 test
REMOTE_ADDR = 66.77.74.214
Name: cr011r01-test.sac2.fastsearch.net
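If you keep a similar spider log, a small filter along these lines can pick out these hits. It is only a sketch: the function names are my own, and it assumes the test crawlers all resolve to hosts under fastsearch.net, which is simply what the one entry above shows. A user-agent string is easy to fake, so the host name check is the more telling of the two.

```python
def parse_log_entry(text):
    """Parse 'KEY = value' and 'Key: value' lines into a dict."""
    fields = {}
    for line in text.splitlines():
        for sep in (" = ", ": "):
            if sep in line:
                key, value = line.split(sep, 1)
                fields[key.strip()] = value.strip()
                break
    return fields


def looks_like_fast_crawler(fields):
    """Heuristic check; the host suffix is an assumption taken from one log entry."""
    ua_ok = fields.get("HTTP_USER_AGENT", "").startswith("FAST-WebCrawler/")
    host_ok = fields.get("Name", "").endswith(".fastsearch.net")
    return ua_ok and host_ok


# The entry quoted above.
ENTRY = """\
HTTP_USER_AGENT = FAST-WebCrawler/3.2 test
REMOTE_ADDR = 66.77.74.214
Name: cr011r01-test.sac2.fastsearch.net
"""

if __name__ == "__main__":
    print(looks_like_fast_crawler(parse_log_entry(ENTRY)))
```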
The IP of the crawler is 66.77.74.208 and the UA is "FAST-WebCrawler/3.2 test"
1) For many sites I manage, it just doesn't like removing pages from its index. I have loads of pages that are no longer on the server and have no links pointing to them, and yet they persist.
2) If you change the index page, i.e. change the name of the index page that the server serves up, it will not recognise it - it just keeps trying to find the old one and then disappearing.
I'm really appreciative of FAST sharing their plans with us. I'm also glad that their spider is being upgraded. I just hope that they put a bit more "intelligence" into it!!