Yahoo Slurp sees a 301 then a 200.

No redirects exist. How is this possible?

         

crobb305

4:56 pm on May 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



When I view my logs, I see that Slurp requests a page that exists, gets a 301, and then gets a 200 after the 'redirect'. There are no redirects in my .htaccess, so I don't know why Yahoo is seeing this. Could this also explain why Yahoo has dropped some of my pages from its index?

The stats look like this:
68.142.250.26 - - [DATE] "GET /example.htm HTTP/1.0" 301 348 "-" "SLURP Info"
68.142.250.92 - - [DATE] "GET /robots.txt HTTP/1.0" 200 64 "-" "Slurp Info"
68.142.250.26 - - [DATE] "GET /example.htm HTTP/1.0" 200 10854 "-" "Slurp Info"

Notice that the file it fetches is exactly the same both times: once with a 301, then once with a 200.

P.S. I'm not sure if this forum is the best place for this thread, but my site is on an Apache server.

ChadSEO

5:09 pm on May 26, 2006 (gmt 0)

10+ Year Member



crobb305,

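Do you have a 301 redirect from http://example.com/example.html to http://www.example.com/example.html? If the two hostnames share a log file, which they likely do, that might explain it: the log lines you posted don't record the hostname, so a request to the non-www host and a request to the www host look identical.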

Chad

crobb305

5:14 pm on May 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Chad, you are right! Thanks. I just realized that I have a rewrite rule that redirects non-www to www, as follows:

RewriteCond %{HTTP_HOST} ^example.(.*)
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

So I guess that explains it. Slurp is hitting the non-www form of the URL, then being redirected to the www form.
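One note on that rule: the dot in the condition is unescaped and the pattern is open-ended, so it matches any host that starts with "example" followed by any character. A tighter version might look like the sketch below. It assumes the only bare host is example.com and that www.example.com is the canonical one; adjust both to the real domain.

```apache
# Sketch only: assumes example.com is the bare host and
# www.example.com is the canonical host.
RewriteEngine On

# Match exactly the non-www host, case-insensitively.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]

# Send the same path to the www host with a permanent redirect.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Requesting the non-www URL with something like `curl -I http://example.com/example.htm` should then return the 301 with a Location header pointing at the www host, which is the first of the two log lines above.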

What gets me is that this is a 301 (permanent), yet Yahoo has been trying to pull the non-www form for months. When it encounters a 301, shouldn't it learn to stop crawling the old URL?

ChadSEO

5:21 pm on May 26, 2006 (gmt 0)

10+ Year Member



I have had 301s on dozens of links on my site for about 9 months now, and I still get dozens of hits every day from just about every search engine for the old URLs. I'm not sure if this is because people are still linking to the old ones, or because the crawlers just want to be thorough.

Chad