| 1:51 pm on Mar 4, 2005 (gmt 0)|
You could always ban it in your .htaccess file (or in the Windows equivalent of .htaccess, if that's your environment).
I'm not a fan of Slurp myself, as it can't seem to grasp the meaning of 301 or 404 responses, but since I am happy with where I show up in Yahoo, I put up with what I don't like about Slurp.
| 2:24 pm on Mar 4, 2005 (gmt 0)|
Would you guide me through the process of IP blocking? I am quite a newbie in the matter, and there are a whole bunch of other unidentified bots I would like to block by IP. PM would be fine.
My site is on Linux.
| 2:30 pm on Mar 4, 2005 (gmt 0)|
Slurp obeys robots.txt; however, there can be a delay between your making a change and Slurp picking it up and stopping its crawl.
If it has been a while, or if Slurp has already fetched the new robots.txt, then you may have a syntax or other problem with the file. Have you tried validating it [searchengineworld.com]?
What exact syntax are you using to block Slurp?
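For what it's worth, here is a rough way to sanity-check the rules offline with Python's standard `urllib.robotparser`. The robots.txt content below is only an assumed example of a "block Slurp, allow everyone else" file, not the poster's actual file, and `is_allowed` is a helper name made up for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Assumed example content: block Slurp everywhere, allow all other robots.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the short robot token (e.g. "Slurp"), not the full User-Agent header.
    return parser.can_fetch(user_agent, path)


print("Slurp allowed:", is_allowed(ROBOTS_TXT, "Slurp"))
print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot"))
```

This only shows how one parser reads the file. A real crawler may interpret edge cases differently, and it still has to re-fetch robots.txt before any change takes effect.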
| 2:52 pm on Mar 4, 2005 (gmt 0)|
Yes, I have validated the robots.txt and it's just fine. Slurp fetched it today a few times with a "200" response, but it's still crawling like crazy. It isn't indexing anything, though, so I see no reason to let it eat my bandwidth if it's just going to show my home page :) Either way, it's not obeying the robots.txt.
| 3:10 pm on Mar 5, 2005 (gmt 0)|
Why would you (or anyone else) want to stop Yahoo from spidering your site?
Just curious. - Larry
| 2:16 am on Mar 6, 2005 (gmt 0)|
I have a ban in place, but Slurp still crawls my website at a rate of 50-60 pages/minute...
| 4:25 am on Mar 7, 2005 (gmt 0)|
OK. I don't know how big your site is.
If it's in the thousands of pages, you have bandwidth costs and concerns.
I welcome Slurp because Y gives me fairly decent SERP positioning. - Larry
| 10:35 am on Mar 12, 2005 (gmt 0)|
I completely agree with Larry. Why would you want to stop Slurp? Y is one of the major search engines. It's good to hear that Y is crawling the site frequently. Make some changes in the SE preference area, which will help in the search results.
| 2:16 am on Mar 22, 2005 (gmt 0)|
Interestingly, Slurp caches robots.txt and also caches DNS for at least five days. It was the sole visitor to an IP where I had a webserver running that had "misplaced" ::cough, cough:: its DNS record. :(
| 9:35 pm on Mar 29, 2005 (gmt 0)|
Odd that it caches robots.txt, since it seems to grab that from my site twice a day or more at times! Maybe that's not Slurp. :O
| 9:57 pm on Mar 29, 2005 (gmt 0)|
Maybe someone is downloading your site using Slurp as the user agent? I've never had problems with them not obeying robots.txt.
Sure it's not their shopping bot YahooSeeker?
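One way to test the spoofed-user-agent theory is a reverse DNS lookup on the visiting IP, followed by a forward lookup to confirm the hostname maps back to the same IP. A minimal sketch follows. The hostname suffixes are an assumption about where genuine Slurp crawlers resolve and should be checked against Yahoo's own documentation, and `is_genuine_crawler` is a hypothetical helper name.

```python
import socket

# Assumption: domains that genuine Slurp hosts reverse-resolve to.
# Verify these against Yahoo's documentation before relying on them.
TRUSTED_SUFFIXES = (".inktomisearch.com", ".crawl.yahoo.net")


def is_genuine_crawler(ip, reverse=None, forward=None, suffixes=TRUSTED_SUFFIXES):
    """Reverse-resolve ip, check the domain, then forward-confirm the hostname."""
    # The lookups can be swapped out (e.g. for testing without network access).
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)
        if not host.lower().endswith(suffixes):
            return False
        # The hostname must resolve back to the IP that made the request.
        return ip in forward(host)
    except OSError:
        # No reverse record or failed lookup: treat as unverified.
        return False
```

Calling `is_genuine_crawler("192.0.2.10")` (a placeholder address) performs live DNS lookups. An IP that claims to be Slurp in the logs but fails this check is probably someone else borrowing the user agent.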
| 10:37 pm on Mar 29, 2005 (gmt 0)|
Lemme correct my previous post - Slurp's grabbed my robots.txt 1012 times so far this month, an average of about 33 times a day. Compare this with msnbot (7 times a day) and Googlebot (twice a day) and I'm beginning to wonder what Slurp finds so fascinating about my robots.txt :)
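For anyone wanting to run the same comparison on their own logs, here is a rough sketch that counts robots.txt fetches per bot. It assumes the server writes Apache-style combined log format. The sample lines are made up for illustration (placeholder IPs), and `count_robots_fetches` and the `BOTS` list are names invented for this example.

```python
import re
from collections import Counter

# Combined log format: ... "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Substrings to look for in the User-Agent field (assumed, adjust as needed).
BOTS = ("Slurp", "msnbot", "Googlebot")


def count_robots_fetches(lines):
    """Count requests for /robots.txt per known bot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("path").split("?")[0] != "/robots.txt":
            continue
        agent = match.group("agent").lower()
        for bot in BOTS:
            if bot.lower() in agent:
                counts[bot] += 1
                break
    return counts


# Made-up sample lines, not taken from any real log.
SAMPLE = [
    '192.0.2.10 - - [29/Mar/2005:21:00:01 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '192.0.2.10 - - [29/Mar/2005:21:05:12 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '192.0.2.20 - - [29/Mar/2005:21:07:40 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.11 - - [29/Mar/2005:21:30:00 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
]

for bot, hits in count_robots_fetches(SAMPLE).items():
    print(bot, hits)
```

To use it on a real log, pass an open file object instead of `SAMPLE`. Dividing each count by the number of days covered gives the per-day averages quoted above.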
| 7:32 pm on Apr 26, 2005 (gmt 0)|
OK, you all are going to laugh at me, I know. I am new to the game of webmastering and search engine optimization. Where do I find my robots.txt in my directories? I can't seem to find it. I did a validation test on my site and it came up with 48 warnings and 117 errors, a lot of "capitalize this and that" and some other weird "grammar" stuff. I am going to be reformatting the site and its text, so that isn't a problem. But here is my question: where is my robots.txt file, and how do I fix the warnings and errors when I get them? Also, since I don't write HTML code, what would you suggest as reading material to understand more along the lines of what is discussed here?
| 9:22 pm on Apr 26, 2005 (gmt 0)|
You HAVE no robots.txt file until you yourself create it and upload it to your hosting provider. It goes in the root directory of your site, the same place as your home page.
I would do a lot of spell checking to reduce 'grammar' errors too. -Larry
| 9:47 pm on Apr 26, 2005 (gmt 0)|
That would explain a lot.
By the way, I would have to look in my AWStats to see who/what has been looking at my sites, correct?