The stats look like this:
68.142.250.26 - - [DATE] "GET /example.htm HTTP/1.0" 301 348 "-" "SLURP Info"
68.142.250.92 - - [DATE] "GET /robots.txt HTTP/1.0" 200 64 "-" "Slurp Info"
68.142.250.26 - - [DATE] "GET /example.htm HTTP/1.0" 200 10854 "-" "Slurp Info"
Notice that both /example.htm requests are for exactly the same file: the first gets a 301, the second a 200.
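If anyone wants to check their own logs for the same pattern, here is a rough Python sketch that lists paths logged with both a 301 and a 200. It assumes log lines shaped like the three above; the function name `redirected_then_fetched` and the regex are my own, not from any log tool.

```python
import re
from collections import defaultdict

# Pulls the request path and the status code out of one access-log line.
# Assumes GET requests in the same layout as the sample lines.
LOG_PATTERN = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) ')

def redirected_then_fetched(lines):
    """Return the sorted paths that were logged with both a 301 and a 200."""
    statuses = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match:
            statuses[match.group("path")].add(int(match.group("status")))
    return sorted(path for path, seen in statuses.items() if {301, 200} <= seen)

sample = [
    '68.142.250.26 - - [DATE] "GET /example.htm HTTP/1.0" 301 348 "-" "SLURP Info"',
    '68.142.250.92 - - [DATE] "GET /robots.txt HTTP/1.0" 200 64 "-" "Slurp Info"',
    '68.142.250.26 - - [DATE] "GET /example.htm HTTP/1.0" 200 10854 "-" "Slurp Info"',
]
print(redirected_then_fetched(sample))
```

Note that this only groups by path. The log format shown here does not record the requested hostname, so it cannot tell you by itself which host each request was aimed at.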
P.S., not sure if this forum is the best place for this thread, but my site is on an Apache server.
RewriteCond %{HTTP_HOST} ^example\.(.*)
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
So I guess that explains it. Slurp is requesting the non-www form of the URL and is then being redirected to the www form.
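To spell out the two-step fetch, here is a toy Python model of what I believe the rule does. It is only an illustration under my own assumptions, not how Apache implements it, and `canonical_redirect` is a made-up helper name.

```python
import re

def canonical_redirect(host, path):
    """Toy model of the rule: return the 301 target URL, or None if no redirect."""
    # Mirrors the RewriteCond: only hosts that start with "example." match,
    # so "www.example.com" is left alone.
    if re.match(r"^example\.", host):
        # Mirrors the RewriteRule: keep the requested path, swap in the www host.
        return "http://www.example.com/" + path.lstrip("/")
    return None

# First request (non-www host) is answered with a 301 to the www form.
print(canonical_redirect("example.com", "/example.htm"))
# Second request (www host) matches no condition and is served normally.
print(canonical_redirect("www.example.com", "/example.htm"))
```

So each non-www request costs two log entries: the 301 for the old host, then the 200 for the www host.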
What gets me is that this is a 301 (permanent), yet Yahoo has been trying to pull the non-www form for months. When it encounters a 301, shouldn't it learn to stop crawling the old URL?