Paid Inclusion Engines and Topics Forum
Ink - 301 Googlebot - 200
301 Moved Permanently 200 OK
Alternative Future




msg:27652
 1:05 pm on Jun 3, 2003 (gmt 0)

Hi all,

After checking through my webstats I found that my server was returning a 301 to Inktomi's Slurp but a 200 to Googlebot.

Example:
Ink
"GET /dir1/widget-blue-large.html HTTP/1.0" 301 292 "-" "Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; [inktomi.com...]
Googlebot
"GET /dir1/widget-blue-large.html HTTP/1.0" 200 44023 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"

I know that the 301 refers to the resource being assigned a new permanent URI, but am lost here - why does Googlebot find the requested path whilst Ink can't? (The path is valid!)

I use the following mod_rewrite rule:
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} !Navigate\.do
RewriteRule ^(.*)$ /Navigate.do?cta=$1 [L]

Any help is appreciated.

Many thanks,

-gs

 

jdMorgan




msg:27653
 4:42 pm on Jun 3, 2003 (gmt 0)

Alternative_Future,

Three possibilities that I can think of:

  • Ink is asking for your site under a different hostname or tld -- for example "yoursite.com" rather than "www.yoursite.com" -- and is being redirected, either by you or by your host configuration.

  • Google has updated its database to ask for "Navigate.do" directly, while Ink is still looking for "index.html" or similar. Ink appears to be horribly slow about updating - I've seen it ask for pages that went 404-Not Found or even 410-Gone more than six months ago, even though there are NO links to these files anywhere on the web.

  • Another possible reason for this is that your code returns 302-Moved Temporarily status on requests for anything other than Navigate.do, so Slurp (correctly) assumes that these page names are being redirected only temporarily, and will eventually return to their original URLs. In order to correct this situation, you may wish to change your RewriteRule to:

    RewriteRule ^(.*)$ /Navigate.do?cta=$1 [R=301,L]

    HTH,
    Jim

    Alternative Future




    msg:27654
     7:54 pm on Jun 3, 2003 (gmt 0)

    Thanks again Jim,

    After checking my server stats again I noticed that Slurp is requesting www.mysite.info, which I redirect to www.mysite.com with another mod_rewrite rule I have in place in the server root. Your first suggestion looks like the explanation.
    The other two are well worth noting, and I will monitor them over time :)
    In addition, I am going to place the [R=301,L] into the .htaccess file as a backup.
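
The .info to .com rule itself was not posted in this thread. As a rough illustration only, a rule of that kind in a document-root .htaccess file often looks something like the sketch below. The hostnames are the placeholders used above, and the pattern and .htaccess context are assumptions, not the poster's actual configuration.

```apache
RewriteEngine on
# Hypothetical sketch: the real rule in the server root was not shown.
# Only act when the request arrived for the .info hostname
# (with or without the www. prefix; NC = case-insensitive match).
RewriteCond %{HTTP_HOST} ^(www\.)?mysite\.info$ [NC]
# Send an external 301 to the same path on the .com host.
# In .htaccess context the captured path has no leading slash.
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
```

With a rule like this in place, any robot that still requests the .info hostname would log a 301, while a robot requesting the .com hostname would skip the rule and get whatever the page itself returns.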

    [added]After implementing the [R=301,L] I have noticed that the URI changes to the Navigate.do?cta=widget-blue-large.html path rather than staying at www.mysite.com/dir1/widget-blue-large.html. Will this have any effect on search engines, now that the ? and parameters are back in the URL?[/added]

    Thanks.

    -gs

    jdMorgan




    msg:27655
     8:38 pm on Jun 3, 2003 (gmt 0)

    Alternative_Future,

    If that's a problem, you may have to, errrr, cloak for Slurp.

    Actually, I got in a hurry and mis-read your original RewriteRule. As originally posted, it was a server-internal rewrite, not an external redirect. If you're reasonably sure it was the .info -> .com redirect that caused the anomaly, just put your posted rule back the way it was.
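
To make the difference between the two forms of the rule concrete, here is a side-by-side sketch. It assumes Navigate.do itself answers with 200; the comments describe standard mod_rewrite behaviour for the [L] and [R=301,L] flags.

```apache
RewriteEngine on
RewriteBase /

# Internal rewrite (the rule as originally posted): the server maps the
# request onto Navigate.do behind the scenes. The client keeps the
# original URL and never sees Navigate.do or the query string.
RewriteCond %{REQUEST_URI} !Navigate\.do
RewriteRule ^(.*)$ /Navigate.do?cta=$1 [L]

# External redirect (the same rule with R=301 added), shown commented out:
# the server answers 301 with a Location header pointing at
# /Navigate.do?cta=..., so browsers and robots see the ? and parameters.
# RewriteRule ^(.*)$ /Navigate.do?cta=$1 [R=301,L]
```

The internal form keeps the static-looking URLs visible to search engines, which is why reverting to the posted rule removes the query string from the address again.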

    It certainly sounds like this is the case... Google is requesting .com, and Slurp is requesting .info, thus the different server responses. Spoof yourself as Slurp and Googlebot at Wannabrowser if you'd like to see for yourself.

    Jim

    All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
    WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
    © Webmaster World 1996-2014 all rights reserved