This pattern's been repeating for the last 48 hours. Any ideas what the deal could be? Fairly clean robots.txt, and I never had a problem with the old Ink or Google spiders.
No pattern though, didn't visit me for 2 days and now this. It was doing it almost daily before.
Month after month it came back for the old, dead files.
Posts and e-mails pondering why a bot should be so stupid came and went without solutions.
Line after line of convoluted access_log files containing the same redundant requests.
And now?
And now the cycle begins anew........
66.***.**.40 - - [26/Feb/2004:21:21:04 -0800] "GET /Blah-ishblah.html HTTP/1.0" 404 2847 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
Ain't progress a wonderful thing?
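For anyone who wants to put a number on the repeat requests, here is a rough Python sketch that tallies 404s per path for a given bot. It assumes the access_log is in Apache "combined" format, like the line quoted above, and that matching the substring "Slurp" in the user agent is good enough to pick out the crawler. The function name `count_bot_404s` and the regex are just for illustration, not anything Yahoo or Apache provides.

```python
import re
from collections import Counter

# Apache "combined" log format (assumed; matches the sample line above).
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def count_bot_404s(lines, agent_substring="Slurp"):
    """Count 404 responses per requested path for one crawler.

    `agent_substring` is a plain substring test on the user agent,
    which is an assumption: it will also match anything faking that name.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that are not in combined format
        if match.group("status") == "404" and agent_substring in match.group("agent"):
            counts[match.group("path")] += 1
    return counts
```

Something like `count_bot_404s(open("access_log"))` followed by `.most_common(20)` would list the dead files the bot keeps coming back for, which at least tells you which old URLs might be worth a redirect.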
"As of the 1st of March 2004, we will no longer be accepting URLs for inclusion via:
- Inktomi Search Submit
- AltaVista Express Submit
- Fast PartnerSite
Pricing and other details of new and exciting programs to replace the above services will soon be provided. The new products will include all previously supported engines and more."
They gave no indication that they were going to do this (and they haven't commented on the above mailing) so your guess is as good as mine as to what they intend to do.
Sorry if that did not seem clear from my original message above, jdMorgan.
My opinion is that Yahoo intends to replace the crawlers/bots due to a merger of its search technology and that Slurp is still allowed to hit sites, but is not currently indexing them. Tim (Yahooguy in the forum) said that Slurp was being replaced with YahooSlurp, did he not? I take this to mean not just a simple renaming, but a completely new crawler.
Yahoo were very upfront and open about the fact they intended to use Inktomi in the Yahoo Search Engine (which of course got thousands more PFI Inktomi sales) only to turn around and say: "Well, we didn't mean THAT part of Inktomi!"
Why the secrecy? Why the helpfulness on individual matters (which, by the way, is working a treat in the PR department) but no actual information to help us prepare for the new product?
Keeping quiet about the Yahoo crawler's current activities and not even mentioning the scrapping of three PFI programs to mould a new single program makes Yahoo look a little flaky. People paid out shed-loads for Directory Inclusion and Inktomi Inclusion and now curse their mistake. With all these secret changes, do you really think people are going to keep throwing their money Yahoo's way?
.... yeah, you're right, they probably will ...
Yahoo-VerticalCrawler-FormerWebCrawler/3.9 crawler at trd dot overture dot com; [alltheweb.com...]
The site is in Y!'s index, but isn't refreshed and gets very little traffic. The pages present in the index seem to be age-old.
The site is on a sub-domain of a .com domain, like widgets.example.com.
Unfortunately, he left a day earlier than I had expected and I never did get the opportunity to talk to him.
-- Is their engine sophisticated enough to figure out that layers are used in menus, so hidden divs don't automagically mean hidden text?
-- Are affiliate links likely to get you banned? Amazon Buybox? Ebay Feed? CJ Links?
-- Is Y!Slurp just plain old broken (I have indications it is just Slurp with a new referral agent, as I see some *old* lame URLs with session IDs being slurped every few days, despite no inbound links to those IDs)?
But that just leaves another question: how do I get out of that trap?