
Deprecated - Altavista, Alltheweb.com Forum

What is FAST PyCrawler

 12:29 pm on Apr 30, 2003 (gmt 0)

Has anyone seen this in their logs before?

"Denodo Crawler (licensed FAST PyCrawler 2.4)"



 1:10 pm on Apr 30, 2003 (gmt 0)

I've not seen it.

I Googled it [google.com...] and found this: http*//citeseer.nj.nec.com/541220.html ("The Denodo Data Integration Platform", ResearchIndex).

Could you post the full UA String?




 1:12 pm on Apr 30, 2003 (gmt 0)

Which IP does it come from? It doesn't look like a Fast crawler at first sight.


 1:40 pm on Apr 30, 2003 (gmt 0)

I'll have a go. It can be a bit of a mission sorting through the logs.

Seems to me to be some organisation using FAST technology.


 1:58 pm on Apr 30, 2003 (gmt 0)

In the meantime I came across this article on Natural Language Processing by a Dr. Anastasio Molano, Denodo Technologies.


 3:09 pm on Apr 30, 2003 (gmt 0)

The crawler's IP was - which resolves to an IP owned by Denodo Technologies in Spain (pretty much as you'd expect lolol).

The crawler hit our site pretty much every day last month.

That's about all the extra info the log file provided.
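
For anyone else digging through their logs for this bot, here is a minimal Python sketch of the kind of search described above. It assumes the server writes the Apache "combined" log format; the function name `crawler_hits`, the sample lines, and the 192.0.2.x / 198.51.100.x addresses (reserved documentation ranges) are all made up for illustration and are not the real crawler data.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"$'
)

def crawler_hits(lines, needle="PyCrawler"):
    """Return (hits per day, set of client IPs) for user agents containing needle."""
    per_day = Counter()
    ips = set()
    for line in lines:
        m = LOG_LINE.match(line.strip())
        # Case-insensitive substring match on the user-agent field only.
        if m and needle.lower() in m.group("ua").lower():
            per_day[m.group("day")] += 1
            ips.add(m.group("ip"))
    return per_day, ips

# Hypothetical sample lines, not taken from any real log.
SAMPLE = [
    '192.0.2.10 - - [29/Apr/2003:10:15:02 +0000] "GET / HTTP/1.0" 200 5120 "-" "Denodo Crawler (licensed FAST PyCrawler 2.4)"',
    '192.0.2.10 - - [30/Apr/2003:09:01:44 +0000] "GET /robots.txt HTTP/1.0" 404 210 "-" "Denodo Crawler (licensed FAST PyCrawler 2.4)"',
    '198.51.100.7 - - [30/Apr/2003:09:02:10 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    per_day, ips = crawler_hits(SAMPLE)
    for day, count in sorted(per_day.items()):
        print(day, count)
    print(sorted(ips))
```

To run it against a real file, pass an open file object instead of `SAMPLE`. If your log format differs, the regular expression will need adjusting.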


 5:17 pm on Apr 30, 2003 (gmt 0)

Not a real Fast crawler - not from the guys in Norway anyway ;)


 3:57 pm on May 3, 2003 (gmt 0)

So, after doing some research I've come to the conclusion that this most likely is a bot licensed by Fast to Denodo.
It looks like Fast has licensed bots to a handful of organisations, mostly for scientific purposes.

Denodo is a company specialising in natural language search applications, quite an interesting field.
BTW: thanks for the link to the article, richmc. I had it bookmarked long ago but never got around to reading it.

As Fast is one of the engines that has NLP (Natural Language Processing) implemented to a certain degree, there is a chance that some tech transfer between the two companies is involved, though I don't have any information in that regard.

The Py part of the UA seems to indicate that the bot is either programmed in Python or dedicated to crawling Python files, the latter of which I find rather unlikely.
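
If you want to pick the UA apart in a script, a rough Python sketch might look like this. It only handles the string as quoted in this thread; the pattern, the function name `parse_licensed_ua`, and the field names `operator`, `engine`, and `version` are my own assumptions, not any documented format from Fast or Denodo.

```python
import re

# Assumed shape: "<operator> (licensed <engine> <version>)".
# This is inferred from the single UA quoted in the thread, nothing more.
UA_PATTERN = re.compile(
    r'^(?P<operator>[^()]+?)\s*'
    r'\(licensed (?P<engine>.+?) (?P<version>\d+(?:\.\d+)*)\)$'
)

def parse_licensed_ua(ua):
    """Split a 'licensed' style UA into operator, engine and version, or return None."""
    m = UA_PATTERN.match(ua.strip())
    return m.groupdict() if m else None

if __name__ == "__main__":
    print(parse_licensed_ua("Denodo Crawler (licensed FAST PyCrawler 2.4)"))
```

A UA that doesn't follow this shape simply returns `None`, so it is safe to run over a whole list of user agents pulled from a log.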

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved