Cloaking Forum

robots.txt being a PHP file
An efficient way to detect robots?

5+ Year Member

Msg#: 3313385 posted 12:29 pm on Apr 17, 2007 (gmt 0)

Hi there, it's all in the title.
I'm thinking about making the robots.txt file actually a PHP script. Through it I plan to record, in an internal DB, the IP addresses of robots that have not yet been identified.
This would be combined with the usual spider-detection rules based on already-known IP addresses, hosts, and user agents. The goal of this fourth check, on robots.txt access, is to catch a bot that would otherwise have slipped through the first three tests.
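
To make the idea concrete, here is a rough sketch of the four checks. It is written in Python rather than PHP purely for illustration, and everything in it is an assumption: the names (`matches_known_bot`, `handle_request`, `robots_txt_hits`) are made up, the IP range, host suffix, and agent token are placeholders rather than real search engine values, and a plain in-memory set stands in for the internal DB.

```python
import ipaddress

# Hypothetical data, for illustration only: placeholder values built from
# documentation IP ranges and example domains, not real search engine data.
KNOWN_BOT_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
KNOWN_BOT_HOST_SUFFIXES = (".crawler.example",)
KNOWN_BOT_AGENT_TOKENS = ("examplebot",)

# Stand-in for the internal DB of IPs that have requested robots.txt.
robots_txt_hits = set()


def matches_known_bot(ip, host, agent):
    """Checks 1-3: known IP range, known host suffix, known user agent."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in KNOWN_BOT_NETWORKS):
        return True
    if host.lower().endswith(KNOWN_BOT_HOST_SUFFIXES):
        return True
    return any(token in agent.lower() for token in KNOWN_BOT_AGENT_TOKENS)


def handle_request(path, ip, host, agent):
    """Return 'known-bot', 'suspected-bot', or 'visitor' for one request."""
    if matches_known_bot(ip, host, agent):
        return "known-bot"
    if path == "/robots.txt":
        # Check 4: remember every unidentified IP that asks for robots.txt.
        robots_txt_hits.add(ip)
    return "suspected-bot" if ip in robots_txt_hits else "visitor"
```

Note that in this sketch an unknown IP is only flagged from the moment it requests robots.txt, so any page it fetched before that request would already have been served as to a normal visitor.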

Do you think that the SEs' hidden bots, the ones that can detect us cloaking (using unlisted IPs, a hidden user agent, and a hidden host), will still request robots.txt?

volatilegx explained that he also uses some additional checks of his own to identify specific SE behavior, such as not requesting the CSS files, etc. I also read somewhere else that SEs must always request the robots.txt file. Do you think a method based on robots.txt access could be reliable?
I'm just afraid this is precisely the kind of behavior they would not reproduce when trying to hide themselves, since the goal of such a visit is to compare the results served to a hidden bot with those served to their official bots.

Any opinions on this matter?




WebmasterWorld Senior Member 10+ Year Member

Msg#: 3313385 posted 9:18 pm on Apr 18, 2007 (gmt 0)

Welcome to WebmasterWorld, grant_green :)

It's not a good way to verify bots. Sneaky bots won't check the robots.txt file, and even regular bots won't check it every time they request a file.


5+ Year Member

Msg#: 3313385 posted 9:20 am on Apr 20, 2007 (gmt 0)

Thanks for the tips, volatilegx! This is helpful. Nice board, BTW :)

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved