Forum Library, Charter, Moderators: phranque & physics

Webmaster General Forum

    
Fetch as Googlebot working strangely
Fetch as Googlebot returns 404 for 301'd pages. Strange?
nikhilrajr




msg:4509309
 12:41 pm on Oct 18, 2012 (gmt 0)

Hi,

I set up a 301 redirect from example.com/about-contact/ to example.com/contact/, and it works fine in a browser, but not for Googlebot. When I fetch the old URL with Fetch as Googlebot in GWT, it returns a 404. Any idea why this happens?

 

g1smd




msg:4509310
 12:45 pm on Oct 18, 2012 (gmt 0)

Change your Browser UA to Googlebot, install the Live HTTP Headers extension for Firefox and check again.

nikhilrajr




msg:4509329
 1:18 pm on Oct 18, 2012 (gmt 0)

Checked it. The response says "Bots get the naked version", the same as Fetch as Googlebot shows in GWT. What might be the problem?

nikhilrajr




msg:4509336
 1:26 pm on Oct 18, 2012 (gmt 0)

SEOmoz also lists the old URL as a 404, so something is causing problems for the bots. Any idea?

lucy24




msg:4509739
 8:21 am on Oct 19, 2012 (gmt 0)

I'm missing something. Is the redirect happening? That was the point of using Live Headers or equivalent. If the redirect takes place as intended, it doesn't matter whether the old URL exists or not. All that matters is the new one.

"naked version" implies that you're redirecting via javascript or some other optional means, instead of an absolute, unconditional redirect from the server side-- php script, config/htaccess etc. How is the redirect coded?

nikhilrajr




msg:4509752
 9:04 am on Oct 19, 2012 (gmt 0)

Yes, the redirect is happening, but if I select Googlebot as the user agent, no redirection takes place and the old URL returns a page that says "Bots get the naked version". The problem affects crawlers only, as SEOmoz also returned a 404 for the old URLs. I think the redirect is done in htaccess, but I'm not sure. Have to ask the web designer...

lucy24




msg:4509760
 9:49 am on Oct 19, 2012 (gmt 0)

Have to ask the web designer...

Uhm, yeah, sounds like a plan. The redirect is clearly not originating from htaccess, or else it would happen equally to everyone. Is the web designer also the ongoing webmaster, or is he/she only responsible for the front end of the site?

If the old page doesn't exist (404), what are the robots getting the naked version of? If you disable javascript do you get redirected yourself?
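
To illustrate the point that a server-side redirect "would happen equally to everyone", here is a minimal sketch of an unconditional 301 written as a Python WSGI app. It is not the poster's WordPress or htaccess setup; `REDIRECTS` and `app` are hypothetical names for this example, and the only thing it is meant to show is that the user agent is never consulted.

```python
# Hypothetical redirect table for this sketch: old path -> new path.
REDIRECTS = {"/about-contact/": "/contact/"}


def app(environ, start_response):
    """Minimal WSGI app: redirect known old paths, 404 everything else.

    The decision depends only on the request path. HTTP_USER_AGENT is
    never read, so browsers and crawlers get the same answer.
    """
    path = environ.get("PATH_INFO", "")
    target = REDIRECTS.get(path)
    if target is not None:
        start_response(
            "301 Moved Permanently",
            [("Location", target), ("Content-Length", "0")],
        )
        return [b""]
    body = b"Not Found"
    start_response(
        "404 Not Found",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

If a redirect behaves like this for a browser but not for a request claiming to be Googlebot, something in front of or alongside that rule is branching on the user agent.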

nikhilrajr




msg:4509761
 10:00 am on Oct 19, 2012 (gmt 0)

Ha, my designer is great :) He is the only person handling the front end of the site. I disabled JavaScript and the redirection still works fine.
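
Besides turning JavaScript off in the browser, another way to rule out a client-side redirect is to scan the page source for the usual hints. The sketch below is a rough heuristic in Python, assuming the HTML has already been saved to a string; `find_client_side_redirects` is a hypothetical helper name, and the regular expression only catches the most common `location` patterns.

```python
import re
from html.parser import HTMLParser

# Heuristic: assignments to location / location.href, or calls to
# location.replace(...) / location.assign(...). Not exhaustive.
_JS_REDIRECT = re.compile(
    r"location(?:\.href)?\s*=|location\.(?:replace|assign)\s*\("
)


class _RedirectHints(HTMLParser):
    """Collect signs of a meta-refresh or script-driven redirect."""

    def __init__(self):
        super().__init__()
        self.hints = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("http-equiv") or "").lower() == "refresh":
            self.hints.append("meta refresh: " + (attributes.get("content") or ""))
        elif tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script and _JS_REDIRECT.search(data):
            self.hints.append("script sets location")


def find_client_side_redirects(html):
    """Return a list of hints that the page redirects on the client side."""
    parser = _RedirectHints()
    parser.feed(html)
    parser.close()
    return parser.hints
```

An empty list is consistent with the redirect being issued by the server, though it does not prove it; external script files are not fetched or inspected here.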

phranque




msg:4510045
 1:31 am on Oct 20, 2012 (gmt 0)

what type of content management system do you use?
have you checked the .htaccess file or the source code on the page served for anything unusual?

nikhilrajr




msg:4510213
 6:58 pm on Oct 20, 2012 (gmt 0)

WordPress. There is nothing on the old url's page source, just "bots get the naked version". Can anyone tell me what that means? What does the naked version have to do with JavaScript? lucy24 said that "naked version" implies "that you're redirecting via javascript or some other optional means, instead of an absolute, unconditional redirect from the server side-- php script, config/htaccess etc." Is there a reference link for that?

lucy24




msg:4510252
 9:24 pm on Oct 20, 2012 (gmt 0)

Oh. Oops. I thought that was google talking. So it's one of those programs that looks at what kind of browser the visitor has got, and builds the page accordingly? If so, I wonder what the plainclothes robots get. Saying outright "bots get the naked version" sounds unnervingly like "this site serves cloaked content" doesn't it?

Where does the redirect happen? Within the page code itself? What does your config file or htaccess say about the original URL?

nikhilrajr




msg:4510670
 6:41 am on Oct 22, 2012 (gmt 0)

It's actually happening at the host level so I've dropped them an email to see that they can ignore it.

phranque




msg:4510676
 7:19 am on Oct 22, 2012 (gmt 0)

There is nothing on the old url's page source

what is in the source of the script that generates your response?

i haven't seen any answers from you about the contents of your .htaccess file.

It's actually happening at the host level

what does that mean?


by the way, welcome to WebmasterWorld, nikhilrajr!

nikhilrajr




msg:4511099
 3:57 am on Oct 23, 2012 (gmt 0)

Thanks phranque..
Identified an issue at the server level where the bots are not properly being redirected. WPEngine is the hosting provider. WPEngine has a redirect management system at the Nginx level and I will add the redirects there so this can be avoided. I'll keep you posted once it's completed. Thanks for the help everyone.

phranque




msg:4511137
 6:15 am on Oct 23, 2012 (gmt 0)

make sure you understand the implications before you implement your solution.

Cloaking - Webmaster Tools Help:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355 [support.google.com]

Official Google Webmaster Central Blog: How Google defines IP delivery, geolocation, and cloaking:
http://googlewebmastercentral.blogspot.com/2008/06/how-google-defines-ip-delivery.html [googlewebmastercentral.blogspot.com]

nikhilrajr




msg:4511197
 9:16 am on Oct 23, 2012 (gmt 0)

If the objective was to display alternate content to entice search that would be cloaking... But not a 404. This is dumb :)

phranque




msg:4511230
 10:18 am on Oct 23, 2012 (gmt 0)

i'm not saying your current or future implementation will be seen as cloaking.

however, you should be aware of the issues whenever googlebot is seeing a different response than a human visitor.

nikhilrajr




msg:4511241
 10:45 am on Oct 23, 2012 (gmt 0)

Yes, I understood what you said. Thanks for the reference links.
This is not directly related, but check out this question connected with cloaking: "Why hasn't Google banned Quora for hiding answers from search engine visitors?" [quora.com...]


© Webmaster World 1996-2014 all rights reserved