
Forum Moderators: Robert Charlton & aakk9999 & andy langton & goodroi


"Googlebot can't access your site"

     
1:02 am on Sep 21, 2012 (gmt 0)

Senior Member from AU 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 22, 2003
posts: 2040
votes: 110


Ted told me I should be able to search for this topic discussed here recently. Unfortunately I can find nothing at all.

I received an email from Google with the subject "Googlebot can't access your site http://example.com/".

Which of course is rubbish. I rarely look at Webmaster Tools, but for quite some time now Google has had my site listed under two names:

http://www.example.com/ [correct] and;

http://example.com/ [incorrect] and the thrust of the email from them.

Over the last 24 hours, Googlebot encountered 1 errors while attempting to connect to your site http://example.com/. Your site's overall connection failure rate is 50.0%. You can see more details about these errors in Webmaster Tools


Remedy anyone?

Thanks
8:05 am on Sept 24, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10553
votes: 13


The server would have to perform some equivalent of the -d and -f tests on every request, and pore over your .htaccess to make sure the request isn't coming from someone who will end up being blocked (core comes after all mods including rewrite). You're looking at a significant detour into a PHP script for every single request, because a server-level redirect on its own would happen before the request ever reaches your individual site.

someone has to do the work eventually.
the best solution is to host both virtual hostnames on the same server so you can do this work for the first request instead of delaying it until the subsequent request.
5:32 pm on Sept 24, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Apr 8, 2008
posts:107
votes: 0


I just wanted to chime in here and say that I have been seeing the same things. We get constant errors in GWT like the ones described above. Even weirder, we are getting requests for strange robots.txt URLs.

vanityURL -> 301 -> deepurl.html
logfiles request -> deepurl.html/robots.txt

Completely ridiculous. I'm pretty sure this is an issue with GWT, if not Google's crawler.
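
For reference, a crawler is expected to look for robots.txt at the root of the host, not beneath a page path. The Python sketch below contrasts that expected location with the kind of request reported in the logs above. The hostname `www.example.com` and both function names are placeholders chosen for illustration, and the naive concatenation only reproduces the shape of the logged URL; it is not a claim about what Googlebot actually does.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # robots.txt sits at the root of scheme + host, whatever the page path is.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def naive_robots_url(page_url):
    """Illustration only: appending to the page URL gives the odd logged form."""
    return page_url.rstrip("/") + "/robots.txt"


# Placeholder for the 301 target described in the post.
target = "http://www.example.com/deepurl.html"
print(robots_url(target))        # expected location
print(naive_robots_url(target))  # shape of the request seen in the logs
```

Comparing the two strings against the log entries is one way to separate genuine root robots.txt fetches from the stray ones.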
8:42 pm on Sept 24, 2012 (gmt 0)

Senior Member from AU 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 22, 2003
posts:2040
votes: 110


We get constant errors in GWT like the ones described above

For another site, I often get another "Googlebot can't access your site" from GWT, this time for robots.txt at this URL:

http://www.example2.com//robots.txt


Note the double forward slash. I just do a manual fetch and GWT becomes a happy face.
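
To track down where a double slash like that comes from (a link, a sitemap entry, a redirect rule), it can help to compare each URL with a normalised form. A minimal Python sketch follows. The function name `collapse_slashes` is made up for this example, and whether a given server treats the two forms as the same resource is an assumption you would need to verify yourself.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_slashes(url):
    """Collapse runs of '/' in the path only; scheme, host and query are untouched."""
    parts = urlsplit(url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


reported = "http://www.example2.com//robots.txt"
cleaned = collapse_slashes(reported)
# If the two differ, the URL carried a duplicate slash somewhere in its path.
print(reported != cleaned, cleaned)
```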
10:44 pm on Sept 24, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13210
votes: 347


the best solution is to host both virtual hostnames on the same server

Well, I was talking specifically about shared hosting. Different set of choices. Can't remember what OP's situation was, since the lead-off post doesn't say one way or the other.
11:22 pm on Sept 24, 2012 (gmt 0)

Moderator This Forum from GB 

WebmasterWorld Administrator andy_langton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 27, 2003
posts:3331
votes: 135


We get constant errors in GWT like the ones described above


I'm certainly no stranger to crazy crawl behaviour from Google, but I'd want to rule out that this is the result of an unintended response from your own server (e.g. a redirect).

The problem is that Google doesn't provide references for most data in GWT (or complete data!) and analysing it yourself can be a difficult exercise without the right tools.
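
One way to check your own server's side of this is to request each hostname once without following redirects and look at the raw status and Location header. The Python sketch below does that with the standard library. It assumes plain http URLs like those in this thread; the user-agent string, the function names and the `canonical_host` parameter are illustrative choices, and the result shows only what your server returns to this script, which may differ from what it returns to Googlebot.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, user_agent="example-check/0.1"):
    """Send one HEAD request, not following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)  # http only
    try:
        conn.request("HEAD", parts.path or "/", headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status, location, canonical_host):
    """Summarise a response relative to the hostname you consider canonical."""
    if status in (301, 302, 303, 307, 308):
        if urlsplit(location or "").netloc == canonical_host:
            return "redirects to canonical host"
        return "redirects elsewhere: %s" % location
    if 200 <= status < 300:
        return "served directly"
    return "error response: %d" % status


# Example use (performs a real network request, so it is left commented out):
# status, location = fetch_status("http://example.com/")
# print(describe(status, location, "www.example.com"))
```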