BacklinkCrawler

IP: 5.9.65.19 (given in full because they have used this exact IP for a very long time)
UA: BacklinkCrawler (http://www.backlinktest.com/crawler.html)
robots.txt: yes, and may be compliant
headers: fully humanoid

I met this today (which is to say in yesterday's logs) and it drove me bonkers because I could swear I'd seen the name before, but I couldn't find it anywhere--not in my robots.txt, or the hole-poking section of access controls, or my Header Access checklist.

Turns out its most recent visit was in March 2015, and that was on my old site (now reduced to my personal site), shortly before I changed over to header-based access controls. In the past--going back to 2011, which is as far back as my saved logs go--it has used 46.4.two-different-exact-IPs and 144.76.one-exact-IP. I kinda think they're all Hetzner, which would explain why almost all those earlier visits (robots.txt, sitemap, front page) were blocked. This time, thanks to humanoid headers, it slipped right in.
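For anyone who wants to check their own logs against the same neighborhoods, here is a minimal sketch in Python. The /16 widths are an assumption made purely for illustration (the post only gives the leading octets of the older addresses), and the names SUSPECT_RANGES and in_suspect_range are hypothetical; verify the real allocations against whois before blocking anything.

```python
import ipaddress

# ASSUMPTION: the /16 widths below are illustrative only. The post names the
# leading octets (5.9, 46.4, 144.76) but not the actual allocated ranges.
SUSPECT_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("5.9.0.0/16", "46.4.0.0/16", "144.76.0.0/16")
]

def in_suspect_range(ip):
    """Return True if the address falls inside any of the listed networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in SUSPECT_RANGES)

if __name__ == "__main__":
    # 5.9.65.19 is the one full address given in the post.
    print(in_suspect_range("5.9.65.19"))
```

A range check like this only tells you where a request came from; it says nothing about the headers, which is how this visitor got through.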

It doesn't seem to like URLs ending in pagename.html: all it requested were directories at various depths. In particular, it did not ask for anything in the /boilerplate/ directory, which is roboted-out even though its constituent pages are linked from everywhere. That's why I say, with hesitation, “may be compliant”.
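One way to test the “may be compliant” guess against a log is to run the requested paths through Python's standard robots.txt parser. This is a rough sketch: the robots.txt text is a hypothetical stand-in (only the /boilerplate/ rule comes from the post), and the helper names allowed and violations are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: a stand-in robots.txt. Only the /boilerplate/ rule is taken
# from the post; the real file certainly contains more than this.
ROBOTS_TXT = """\
User-agent: *
Disallow: /boilerplate/
"""

def allowed(path, agent="BacklinkCrawler"):
    """Return True if robots.txt permits this agent to fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)

def violations(requested_paths):
    """List the requested paths that robots.txt disallows."""
    return [p for p in requested_paths if not allowed(p)]

if __name__ == "__main__":
    # Hypothetical request list, standing in for paths pulled from a log.
    print(violations(["/", "/essays/", "/boilerplate/page.html"]))
```

An empty result is consistent with compliance but does not prove it. A robot that only asks for directories would never touch /boilerplate/ pages whether or not it read the rules.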

Oh, and the URL in the UA leads to a “Seite Nicht Gefunden” (page not found) page in two languages. I first wondered if someone else had stolen the UA, but the IP is in the same neighborhood as the earlier visits. I'll see if anyone reads the site's Kontakt form.

lucy24

keyplyr
