DotBot accounted for well over half of the past month’s redirects, topping even Bing.
Is your site popular with forums or blogs and maybe they're linking to your pages with insecure URLs?
Not that I know of, and definitely not to this wide extent. Especially since, if there were incorrect HTTP links out there, other search engines would be following them too. Other than search engines, most links to the pages I'm especially interested in are from a curated directory, whose listings are correct.
If it was just DotBot I'd say maybe it's a dumb crawler.
Oh, it’s definitely a dumb crawler. I would like to know how they find out about deep-interior URLs, like /directory/subdir/pagename.html, when they have never seen /directory/subdir/, which is the only way to get there. (I just checked. This happens far too often to be accounted for by a few random incorrect links.) Do they get their shopping list from someone else, like scraping a search engine's full listings?
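For anyone who wants to run the same check on their own logs, here is a rough sketch of one way to do it in Python. It assumes you have already pulled out the request paths for a single crawler; the function name and the sample paths are made up for illustration, and it only looks at the immediate parent directory.

```python
import posixpath
from typing import Iterable, List


def orphan_deep_requests(paths: Iterable[str]) -> List[str]:
    """Return page paths whose parent directory index was never requested.

    `paths` is assumed to be the request paths of one crawler, each
    starting with "/". Top-level files such as /robots.txt are ignored,
    since there is no interior directory the crawler had to pass through.
    """
    requested = set(paths)
    orphans = []
    for path in sorted(requested):
        if path.endswith("/"):
            continue  # a directory index request, not a page
        parent = posixpath.dirname(path)
        if parent in ("", "/"):
            continue  # top-level file
        if parent + "/" not in requested:
            orphans.append(path)
    return orphans


# Hypothetical sample: the crawler asked for the page but never for /directory/subdir/
sample = ["/directory/subdir/pagename.html", "/a/", "/a/b.html", "/robots.txt"]
print(orphan_deep_requests(sample))
```

A long list here does not prove where the crawler got its URLs, only that it did not discover them by walking the site's own directory pages.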
especially for robots.txt
I don't know how many sites do this, but I exempt robots.txt from all canonicalization redirects. Some robots seem to get confused if a robots.txt request is redirected, and you don't want to give them any excuse for noncompliance.
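The logic of that exemption, sketched in Python rather than in any particular server's config syntax. The canonical scheme and host below are placeholders, and the function name is invented for this example; it is not how any specific site implements it.

```python
from typing import Optional

# Placeholder values; substitute the site's real canonical scheme and host.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.example.com"


def canonical_redirect(scheme: str, host: str, path: str) -> Optional[str]:
    """Return the URL to redirect to, or None to serve the request as-is."""
    # robots.txt is exempt from all canonicalization redirects, so a robot
    # always gets it directly on whatever scheme/host it asked for.
    if path == "/robots.txt":
        return None
    # Already canonical: nothing to do.
    if scheme.lower() == CANONICAL_SCHEME and host.lower() == CANONICAL_HOST:
        return None
    # Everything else goes to the canonical scheme and host, same path.
    return f"{CANONICAL_SCHEME}://{CANONICAL_HOST}{path}"


print(canonical_redirect("http", "example.com", "/robots.txt"))
print(canonical_redirect("http", "example.com", "/directory/subdir/pagename.html"))
```

The point is the ordering: the robots.txt check comes before any scheme or hostname test, so no later rule can catch it.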