
Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Googlebot can't access your site. But why?
pkKumar




msg:4564021
 5:57 am on Apr 12, 2013 (gmt 0)

I received an email today and checked WMT as well. The messages are:
http://example.com/: Googlebot can't access your site
Over the last 24 hours, Googlebot encountered 5 errors while attempting to retrieve DNS information for your site. The overall error rate for DNS queries for your site is 71.4%

Also this one:
http://example.com/: Googlebot can't access your site
Over the last 24 hours, Googlebot encountered 1 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 50.0%.

Now, the first error suggests some DNS issue on our end. But how do we verify that? They have not mentioned the dates and times, so what are we supposed to ask the hosting provider? Also, the site is accessible through Fetch as Googlebot now. I have both domains, i.e. www and without www, in WMT, with the www domain set as the preferred domain. At site level I am also 301 redirecting everything from without www to www.

The second message seems even stranger. I don't have a robots.txt file and I don't need one, so what's the point of blocking or unblocking?

 

lucy24




msg:4564049
 7:51 am on Apr 12, 2013 (gmt 0)

Notice that the second error didn't say "can't find your robots.txt"; it said "can't access your site".

Now, you might think that if it can't reach your site at all, then it's a pretty academic question whether it can read robots.txt or not. But in the mind of the google computer they are separate things. If there's no recent robots.txt on file, then everything else also grinds to a halt.

A request for robots.txt has to be answered with either
200 >> they happily walk off with the file (it is not a good idea to visibly 301 redirect robots.txt)
or
404 >> they looked and couldn't find one (I guess a 410 would work as well "I used to have one but decided not to bother")

Anything else, and the googlebot will retreat until it figures out what the situation is.
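
The rule of thumb above can be written down as a tiny lookup. This is only a sketch of the rule as described in this post, not Google's documented behavior: the function name and the three outcome labels are made up for illustration, and lumping 301 in with "anything else" is an assumption taken from the post.

```python
def robots_fetch_outcome(status: int) -> str:
    """Classify the HTTP status returned for /robots.txt (hypothetical labels)."""
    if status == 200:
        return "use-file"   # file was served; the crawler reads the rules
    if status in (404, 410):
        return "no-file"    # no robots.txt exists; nothing to obey
    return "postpone"       # anything else: the crawler retreats for now


for code in (200, 404, 410, 301, 503):
    print(code, robots_fetch_outcome(code))
```

If your own server answers the robots.txt request with something in the last bucket, that is the response to look into first.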

Do a "fetch as googlebot" on a few high-profile pages and it will soon see that everything is working properly.

But if the problem comes up a lot, have a closer look at your host. If your site has recurring DNS issues, it is not likely that only google and nobody else is affected.

pkKumar




msg:4564098
 8:59 am on Apr 12, 2013 (gmt 0)

@lucy24, then I guess I should worry more about the first error only, i.e. the DNS resolution issue. But if I contact my host, they will check the current status, and it's fine now, so they will say everything is fine on their end. How can I find the exact time and date when Googlebot had problems accessing the site, so that I can mention it to the host?

phranque




msg:4565814
 12:39 am on Apr 18, 2013 (gmt 0)

How can I find the exact time and date when Googlebot had problems accessing the site?


not the exact time, but the date should be on the GWT notice and it was in the previous 24 hours as stated.
also note that your hosting provider is not necessarily responsible for your DNS configuration.

i would suggest trying some of the many tools for checking the health of your DNS configuration.
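
If you would rather collect your own timestamps than rely on an online tool, a small script run on a schedule can record when lookups fail. This is a minimal sketch using Python's standard `socket.getaddrinfo`. The names `check_dns` and `error_rate` are invented here, and it only tests resolution from wherever you run it, which may not match what Googlebot's resolvers see.

```python
import socket
from datetime import datetime, timezone


def check_dns(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname once; return (UTC timestamp, ok flag, detail)."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    try:
        infos = resolver(hostname, 80)
        # Each result is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address.
        addresses = sorted({info[4][0] for info in infos})
        return stamp, True, ", ".join(addresses)
    except socket.gaierror as exc:
        return stamp, False, str(exc)


def error_rate(results):
    """Percentage of failed lookups in a list of check_dns() results."""
    if not results:
        return 0.0
    failures = sum(1 for _, ok, _ in results if not ok)
    return 100.0 * failures / len(results)
```

Calling `check_dns("example.com")` and `check_dns("www.example.com")` every few minutes and appending the tuples to a file would give you dated failures to show a DNS provider. For what it's worth, a 71.4% rate with 5 errors is what you would get from 5 failures in 7 lookups, though the notice doesn't state the denominator.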

lucy24




msg:4567411
 12:59 am on Apr 24, 2013 (gmt 0)

:: bump ::

My turn. Datestamp on gwt message, 22 April. Timestamp on e-mail, 23 April 2013 5:05:46PM PST.

Over the last 24 hours, Googlebot encountered 1 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%.


:: detour to raw logs ::

66.249.75.169 - - [21/Apr/2013:08:52:12 -0700] "GET /robots.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.75.111 - - [21/Apr/2013:16:47:58 -0700] "GET /robots.txt HTTP/1.1" 301 581 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.75.169 - - [21/Apr/2013:16:47:58 -0700] "GET /robots.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.75.169 - - [22/Apr/2013:11:29:01 -0700] "GET /robots.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.169 - - [23/Apr/2013:03:14:12 -0700] "GET /robots.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.111 - - [23/Apr/2013:05:13:26 -0700] "GET /robots.txt HTTP/1.1" 301 581 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.76.169 - - [23/Apr/2013:05:13:26 -0700] "GET /robots.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

(omitting other pages retrieved by assorted googlebots during same time period)
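
For anyone who wants to repeat this detour without eyeballing the log, here is a rough sketch that pulls Googlebot's robots.txt requests out of combined-format log lines and works out what share got something other than 200/404/410. The regex, the function names, and the choice of "acceptable" statuses are assumptions for illustration. The two sample lines are copied from the log above.

```python
import re

# Combined log format: ip ident user [time] "request" status size "referer" "agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def robots_requests(lines):
    """Yield (time, status) for requests to /robots.txt whose agent names Googlebot."""
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m["path"] == "/robots.txt" and "Googlebot" in m["agent"]:
            yield m["time"], int(m["status"])


def robots_error_rate(lines, ok_statuses=(200, 404, 410)):
    """Percentage of matching requests whose status is not in ok_statuses."""
    statuses = [status for _, status in robots_requests(lines)]
    if not statuses:
        return 0.0
    bad = sum(1 for status in statuses if status not in ok_statuses)
    return 100.0 * bad / len(statuses)


sample = [
    '66.249.75.111 - - [21/Apr/2013:16:47:58 -0700] "GET /robots.txt HTTP/1.1" 301 581 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.75.169 - - [21/Apr/2013:16:47:58 -0700] "GET /robots.txt HTTP/1.1" 200 672 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
print(list(robots_requests(sample)))
print(robots_error_rate(sample))
```

Note that matching on the user-agent string alone does not prove the request came from Google. It only filters the log.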

:: further detour to same days' error logs with focus on 66.249 ::

Nope, dead silence here too, though I see the snippet+bot is still around. Now that I've locked it out, it has become sulky and no longer asks for the favicon.

Conclusion: Having exhausted the possibilities of search, google is now embarking on an ambitious campaign to create a new variety of arithmetic, with particular reference to percentages and timekeeping. As a warmup, it is offering new interpretations of

"24 hours"
"postpone"
and
"100.0%"

:: final detour to gwt for obligatory trouble-free fetch of robots.txt ::

Well. That's five minutes of my life I'll never see again.

phranque




msg:4567623
 5:24 pm on Apr 24, 2013 (gmt 0)

which hostname is getting those 301s?
www. or non?
have you tried fetch as googlebot for the reported account?

lucy24




msg:4567693
 9:45 pm on Apr 24, 2013 (gmt 0)

It redirects from without to with. And I realized after posting that the original question involved DNS issues, which wouldn't show up in error logs anyway.

You'll notice that the 301s come in pairs, so you have to assume the original redirect is followed immediately by a 200 request to the correct domain.
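
That pairing assumption can be checked mechanically. A minimal sketch, using the timestamps and status codes from the robots.txt log lines quoted earlier in the thread. The function name is made up, and "same second" is just a stand-in for "immediately followed".

```python
from collections import defaultdict

# (timestamp, status) pairs copied from the robots.txt log lines quoted earlier
requests = [
    ("21/Apr/2013:08:52:12", 200),
    ("21/Apr/2013:16:47:58", 301),
    ("21/Apr/2013:16:47:58", 200),
    ("22/Apr/2013:11:29:01", 200),
    ("23/Apr/2013:03:14:12", 200),
    ("23/Apr/2013:05:13:26", 301),
    ("23/Apr/2013:05:13:26", 200),
]


def unpaired_redirects(entries):
    """Return timestamps where a 301 was logged with no 200 in the same second."""
    by_second = defaultdict(list)
    for stamp, status in entries:
        by_second[stamp].append(status)
    return [
        stamp
        for stamp, codes in by_second.items()
        if 301 in codes and 200 not in codes
    ]


print(unpaired_redirects(requests))  # prints []
```

An empty list means every redirect in that stretch of log has a matching 200 alongside it.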

:: detour to original e-mail ::

D'oh! Thanks, phranque, didn't even notice that. The e-mail names a domain, and it's the "other" form.

:: further detour to wmt under "without" version ::

The message is sent to both forms of the domain name, the error is flagged on both sides, and the messages are listed separately as unread. But when I go to "fetch as googlebot", I find that yesterday's successful fetch-- using the "with" form-- is also listed on the "without" side of gwt.

Which is why I so often deplore the lack of a "noidea" emoticon

RP_Joe




msg:4567891
 11:39 am on Apr 25, 2013 (gmt 0)

I would seriously consider looking for another hosting company.
I work with many different clients using many hosting companies. Whenever I see those errors, there is normally a hosting problem.
If you switch to a company with an nginx server you're going to get much better performance. Faster load times.

tedster




msg:4568365
 10:35 pm on Apr 26, 2013 (gmt 0)

Googlebot can't access your site

Note this message from Matt Cutts on the Google Webmaster Forum in recent days:

Hey everyone, please don't worry about this message at this point. Enough people are getting this message that I suspect it's an issue on our end.

[seroundtable.com...]

lucy24




msg:4568403
 2:00 am on Apr 27, 2013 (gmt 0)

I suspect it's an issue on our end

Really? Ya think?

:)

phranque




msg:4568479
 12:43 pm on Apr 27, 2013 (gmt 0)

Googlebot encountered 5 errors while attempting to retrieve DNS information for your site.

Whenever I see those errors, there is normally a hosting problem.


since googlebot has no idea what the IP address is without a DNS lookup, i would be interested to know how this could be a hosting problem.

I would seriously consider looking for another hosting company.

perhaps you meant another DNS provider?


If you switch to a company with an nginx server you're going to get much better performance.


i would be interested to know how "a company with an nginx server" is going to provide inherently better performance.
what company and what server are you comparing this against?

lucy24




msg:4574327
 8:50 am on May 15, 2013 (gmt 0)

Well, now they've outdone themselves. Here I am on the wmt main page, the one where you pick from a list of sites. In my case it goes

example.org
www.example.com
(the name forms I use)
and then
www.example.org
example.com
(the name forms I don't use).

The last two-- the forms I don't use-- each have the "Googlebot can't access your site" notice. One dated, hmmm, a month and a half ago, the other from several weeks ago.

The listing for www.example.com-- my main site-- has a preview with my old header. The listing for example.com-- the wrong name, the one it claims it can't access-- has a preview with my new header, changed only a week ago.

Hm. Now think hard, google. If you haven't been able to access the site since April 22, how do you know it uses a header image I only created last week?

Har de har.

phranque




msg:4574332
 9:04 am on May 15, 2013 (gmt 0)

a preview with my old header.


i wonder which user agent is used for the GWT preview...
(Google Web Preview?)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved