Forum Moderators: phranque

Fetch as Googlebot working strangely

Fetch as Googlebot returns 404 for 301'd pages..Strange?

     
12:41 pm on Oct 18, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


Hi,

I set up a 301 redirect from example.com/about-contact/ to example.com/contact/, and it works fine in a browser, but not for Googlebot. When I fetch the old URL with Fetch as Googlebot in GWT, it returns 404. Any idea why this happens?
12:45 pm on Oct 18, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Change your Browser UA to Googlebot, install the Live HTTP Headers extension for Firefox and check again.
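If you would rather script the same check than use a browser extension, something like the following rough Python sketch should do it. It requests a URL with a given User-Agent, refuses to follow redirects, and reports the raw status code and Location header. The names fetch_status, NoRedirect, GOOGLEBOT_UA and BROWSER_UA are made up for this illustration, and the two User-Agent strings are only examples. Note that it only imitates Googlebot's User-Agent string; a server that detects bots by IP address could still respond differently to the real crawler.

```python
import urllib.error
import urllib.request

# Example User-Agent strings (assumptions for illustration; swap in whatever you want to test).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1; rv:16.0) Gecko/20100101 Firefox/16.0"


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following 3xx responses so the original status is visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # returning None makes urllib raise HTTPError for the 3xx response


def fetch_status(url, user_agent):
    """Return (status_code, Location header or None) for a single GET of url."""
    opener = urllib.request.build_opener(NoRedirect)
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener.open(req, timeout=10) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # 3xx (not followed), 4xx and 5xx responses all arrive here.
        status, location = err.code, err.headers.get("Location")
        err.close()
        return status, location


# Usage (hypothetical URL from this thread):
#   fetch_status("http://example.com/about-contact/", BROWSER_UA)
#   fetch_status("http://example.com/about-contact/", GOOGLEBOT_UA)
```

If the two calls return different status codes for the same URL, the server is treating that User-Agent differently, which is exactly what needs tracking down.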
1:18 pm on Oct 18, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


Checked it. It says "Bots get the naked version" same as in Fetch as Googlebot in GWT. What might be the problem?
1:26 pm on Oct 18, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


SEOmoz also lists the old url as 404. So something is causing problems for the bots. Any idea?
8:21 am on Oct 19, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13218
votes: 348


I'm missing something. Is the redirect happening? That was the point of using Live Headers or equivalent. If the redirect takes place as intended, it doesn't matter whether the old URL exists or not. All that matters is the new one.

"naked version" implies that you're redirecting via javascript or some other optional means, instead of an absolute, unconditional redirect from the server side-- php script, config/htaccess etc. How is the redirect coded?
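To make "unconditional" concrete, here is a minimal sketch in Python rather than htaccess, purely as a stand-in for whatever the real server does. The point is that the handler never looks at the User-Agent header, so browsers and robots get the identical 301. REDIRECTS, RedirectHandler and make_server are hypothetical names, and the path mapping is just the example from the first post.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical old-path to new-path mapping, using the example URLs from this thread.
REDIRECTS = {"/about-contact/": "/contact/"}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = REDIRECTS.get(self.path)
        if target is not None:
            # The User-Agent is never consulted: every client gets the same 301.
            self.send_response(301)
            self.send_header("Location", target)
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_error(404)

    do_HEAD = do_GET  # answer HEAD requests the same way

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; drop this override to see request logs


def make_server(port=0):
    """Build a local test server; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), RedirectHandler)
```

Anything that branches on the visitor (User-Agent sniffing, javascript, a plugin deciding who gets which page) sits in front of or instead of a rule like this, and that is where a bot-only 404 can creep in.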
9:04 am on Oct 19, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


Yes, the redirect is happening, but if I select Googlebot as the user agent, no redirection takes place and the old URL returns a page with "Bots get the naked version". The problem is only with crawlers, as SEOmoz also returned 404 for the old URLs. I think the redirect is done in htaccess, but I'm not sure. Have to ask the web designer...
9:49 am on Oct 19, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13218
votes: 348


"Have to ask the web designer..."

Uhm, yeah, sounds like a plan. The redirect is clearly not originating from htaccess, or else it would happen equally to everyone. Is the web designer also the ongoing webmaster, or is he/she only responsible for the front end of the site?

If the old page doesn't exist (404), what are the robots getting the naked version of? If you disable javascript do you get redirected yourself?
10:00 am on Oct 19, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


Ha, my designer is great :) He is the only person handling the front end of the site. I disabled javascript and the redirection still works fine.
1:31 am on Oct 20, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10563
votes: 15


what type of content management system do you use?
have you checked the .htaccess file or the source code on the page served for anything unusual?
6:58 pm on Oct 20, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


WordPress. There is nothing on the old url's page source, just "bots get the naked version". Can anyone tell me what that means? What do the naked version and javascript have to do with each other? lucy24 said: "'naked version' implies that you're redirecting via javascript or some other optional means, instead of an absolute, unconditional redirect from the server side-- php script, config/htaccess etc." Is there any reference link?
9:24 pm on Oct 20, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13218
votes: 348


Oh. Oops. I thought that was google talking. So it's one of those programs that looks at what kind of browser the visitor has got, and builds the page accordingly? If so, I wonder what the plainclothes robots get. Saying outright "bots get the naked version" sounds unnervingly like "this site serves cloaked content" doesn't it?

Where does the redirect happen? Within the page code itself? What does your config file or htaccess say about the original URL?
6:41 am on Oct 22, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


It's actually happening at the host level so I've dropped them an email to see that they can ignore it.
7:19 am on Oct 22, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10563
votes: 15


"There is nothing on the old url's page source"

what is in the source of the script that generates your response?

i haven't seen any answers from you about the contents of your .htaccess file.

"It's actually happening at the host level"

what does that mean?


by the way, welcome to WebmasterWorld, nikhilrajr!
3:57 am on Oct 23, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


Thanks phranque..
Identified an issue at the server level where the bots are not properly being redirected. WPEngine is the hosting provider. WPEngine has a redirect management system at the Nginx level and I will add the redirects there so this can be avoided. I'll keep you posted once it's completed. Thanks for the help everyone.
6:15 am on Oct 23, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10563
votes: 15


make sure you understand the implications before you implement your solution.

Cloaking - Webmaster Tools Help:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355

Official Google Webmaster Central Blog: How Google defines IP delivery, geolocation, and cloaking:
http://googlewebmastercentral.blogspot.com/2008/06/how-google-defines-ip-delivery.html
9:16 am on Oct 23, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


If the objective was to display alternate content to entice search that would be cloaking... But not a 404. This is dumb :)
10:18 am on Oct 23, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10563
votes: 15


i'm not saying your current or future implementation will be seen as cloaking.

however, you should be aware of the issues whenever googlebot is seeing a different response than a human visitor.
10:45 am on Oct 23, 2012 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Sept 26, 2012
posts: 48
votes: 0


Yes, I understood what you said. Thanks for the reference links.
This is not directly related, but check out this question connected with cloaking: "Why hasn't Google banned Quora for hiding answers from search engine visitors?" [quora.com...]