lucy24 - 7:59 am on Nov 16, 2012 (gmt 0)
Can I just go ahead and add 'Disallow: /directoryname/' to my robots.txt file? Would you suggest that? Will that stop Google from accessing that particular directory the next time it comes around to crawl? So eventually I won't be seeing those errors anymore, right?
Won't do any good now. Google never forgets a URL. The URLs would simply be shifted from the "not followed" tab to the "blocked by robots.txt" tab.
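For anyone following along, the Disallow syntax itself is just two lines, and you can sanity-check what it blocks with Python's built-in robotparser. A quick sketch; the /directoryname/ path comes from the question above and example.com is just a placeholder:

    import urllib.robotparser

    # The two lines the question is asking about, parsed in place:
    rules = [
        "User-agent: *",
        "Disallow: /directoryname/",
    ]
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(rules)

    # Blocked for Googlebot (and every other well-behaved bot):
    print(rp.can_fetch("Googlebot", "http://www.example.com/directoryname/page.html"))  # False
    # Everything else stays crawlable:
    print(rp.can_fetch("Googlebot", "http://www.example.com/index.html"))  # True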
'Server error', 'Soft 404', 'Access denied', 'Not found', 'Not followed', 'Others'
Gosh. Some of those I've never even seen.
But still, doesn't Google follow a URL with 2 redirects? Google says they accept up to 2 redirects.
It's got to be at least three. Mechanical redirects alone can be two separate steps (with/without www and directory-slash) if you've got a sloppy host and/or carelessly written htaccess. Add one more if you throw a "real" redirect into the mix. If they excluded all sites that weren't optimally coded, all results for all queries would drop into the triple digits :)
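If you want to count the hops on one of your own URLs, here's a quick sketch using Python's third-party requests library (the example.com URL is made up, and the hop limit is arbitrary):

    import requests
    from urllib.parse import urljoin

    def trace_redirects(url, max_hops=10):
        """Follow redirects one hop at a time and record each step."""
        hops = []
        while len(hops) < max_hops:
            r = requests.get(url, allow_redirects=False, timeout=10)
            if r.status_code in (301, 302, 303, 307, 308) and "Location" in r.headers:
                nxt = urljoin(url, r.headers["Location"])
                hops.append((r.status_code, url, nxt))
                url = nxt
            else:
                break
        return hops

    # A sloppy setup might show two mechanical hops before any content:
    # example.com/dir -> www.example.com/dir -> www.example.com/dir/
    for status, src, dst in trace_redirects("http://example.com/dir"):
        print(status, src, "->", dst)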
But the same URL redirects to another URL when I copy it and paste it into a web browser?
Do you have any rules that handle requests in different ways depending on the referer? Or cookies? Almost all search engines come in with no referer, and of course no cookies. But some of your pages probably expect people to be coming in from another page, or with some kind of background.
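An easy way to test that: request the same page once like a visitor who clicked through from another page, and once bare like a crawler. Different responses mean your rules care about the referer. Again a requests sketch with made-up URLs:

    import requests

    url = "http://www.example.com/deep/page.html"  # placeholder

    # Like a visitor arriving from another page on the site:
    as_browser = requests.get(
        url,
        headers={"User-Agent": "Mozilla/5.0", "Referer": "http://www.example.com/"},
        allow_redirects=False,
    )

    # Like a search engine: no referer, and a fresh session means no cookies.
    as_crawler = requests.get(
        url,
        headers={"User-Agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
        allow_redirects=False,
    )

    # Differing status codes (or bodies) point to referer/cookie-dependent rules.
    print(as_browser.status_code, as_crawler.status_code)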
:: idly wondering about the disparity between the "1 error/25 attempts" WMT shows for robots.txt under robots.txt fetch, and the 0 errors/1 attempt my logs show for the same date ::
:: not-so-idle wondering about the enormous number of different Googlebot-Mobile user-agents arriving from the correct IP range ::
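(If anyone wants to check whether those Googlebot-Mobile hits really come from Google, the usual reverse-then-forward DNS check is easy to script. A sketch; the IP below is just a sample from Google's published crawl range:)

    import socket

    def is_real_googlebot(ip):
        """Reverse-resolve the IP, check the domain, then forward-confirm."""
        try:
            host = socket.gethostbyaddr(ip)[0]
        except OSError:
            return False
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        try:
            # The name must resolve back to the same IP to rule out spoofing.
            return ip in socket.gethostbyname_ex(host)[2]
        except OSError:
            return False

    print(is_real_googlebot("66.249.66.1"))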
Whoops. Sorry. I'm outta here.