Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
findlinks bot
130.83.167.153
Bewenched




msg:4524723
 2:03 am on Dec 4, 2012 (gmt 0)

findlinks/2.6+(+http://wortschatz.uni-leipzig.de/findlinks/)

 

keyplyr




msg:4524772
 5:47 am on Dec 4, 2012 (gmt 0)

What about it?

lucy24




msg:4524795
 6:46 am on Dec 4, 2012 (gmt 0)

It visits me every few days. Not sure what it wants, but I hate to think it could be actively malign. I mean, Leipzig, they've been around forever ...

The goal of FindLinks is to gather the underlying data for NextLinks. To do this, the links found in as many HTML pages as possible (initially from the .de, .at and .ch domains) are analyzed.

Uh... they've used up all of Germany, Austria and Switzerland and are now scraping the barrel of dot com?

The robots.txt file is honored by the FindLinks server. Changes to such a file take effect after about 30 days at the latest.

Oh, wait, I think I've read that before. 30 days?! Are they kidding? Even Googlebot doesn't go much past 24 hours. They're exaggerating anyway. A quick detour to my logs suggests that they alternate between robots.txt and some other request, so it can't be more than a few days. Why they even bother with the separate crawls is anyone's guess.
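For anyone who wants to repeat that log check, here is a rough sketch in Python. It assumes the access log is in Apache "combined" format and that matching the substring "findlinks" in the user-agent field is enough to pick out this bot; both are assumptions, and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout: Apache "combined" log format. Adjust the pattern if your
# server logs differently.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def findlinks_requests(lines, token="findlinks"):
    """Return (timestamp, path) for every request whose UA contains token."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if token in match.group("agent").lower():
            when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
            hits.append((when, match.group("path")))
    return hits

def robots_fetch_gaps(hits):
    """Return the time gaps between consecutive robots.txt fetches."""
    times = sorted(when for when, path in hits if path == "/robots.txt")
    return [later - earlier for earlier, later in zip(times, times[1:])]
```

Feeding it the lines of an access log (for example `findlinks_requests(open("access.log"))`, where the file name is a placeholder) and then passing the result to `robots_fetch_gaps` shows how often the bot really re-reads robots.txt.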

keyplyr




msg:4524825
 8:00 am on Dec 4, 2012 (gmt 0)

I've denied them via robots.txt for years. So far they've always obeyed and stayed away from my files.
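A minimal sketch of what such a denial could look like, checked with Python's standard `urllib.robotparser`. The `findlinks` user-agent token is an assumption taken from the UA string quoted at the top of the thread, not from the bot's own documentation, and `is_allowed` and the example URL are just illustrative.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out findlinks, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: findlinks
Disallow: /

User-agent: *
Disallow:
"""

def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

FINDLINKS_UA = "findlinks/2.6+(+http://wortschatz.uni-leipzig.de/findlinks/)"

print(is_allowed(FINDLINKS_UA, "http://example.com/page.html"))
print(is_allowed("Googlebot", "http://example.com/page.html"))
```

This only shows how a compliant parser reads the rule. Whether the bot keeps obeying it is something only your logs can tell you.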


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved