

WebmasterWorld Community Center Forum

Requesting forum name change
add Crawlers please

 8:38 am on Apr 28, 2008 (gmt 0)

"Search Engine Spider Identification" by itself is good, but scope needs to expand to cover benign and creepy crawler identification by user agent, IP ranges and behavior, they are non search engine related, topic can't find a better home on WW than here where it is covered anyway, but always feels like an illegitimate child.

How about changing:
"Search Engine Spider Identification"
to become
"Search Engine & Crawler Identification"

Thank you



 4:04 pm on Apr 29, 2008 (gmt 0)

Thanks for the suggestion.

Crawler, spider or robot (bot): does that really matter to those who know?


 8:45 pm on Apr 29, 2008 (gmt 0)

What I meant was that the title is too limiting and needs to include crawlers that are not originating from search engines.


 7:54 am on May 1, 2008 (gmt 0)

Good point, and becoming more of a problem every day.


 4:15 pm on May 1, 2008 (gmt 0)

It is a good point. I think a lot of questions, including ones in other forums like Webmaster General and sometimes PHP, deal with things like honeypots, ban lists, catching bad bots and such.

Might not be a bad idea to lump the good guys with the bad guys in one set of discussions since IDing the good guys is only an issue because there are bad guys and vice versa.


 6:46 pm on May 1, 2008 (gmt 0)

How about something more inclusive like:
"Search Engine Spider and Other Automated Activity Identification"

With the charter being to help identify automated activity from Search Engines, Crawlers, Link Checkers, Scrapers, Botnets, etc.

Just a thought.


 10:45 pm on May 1, 2008 (gmt 0)

A little clumsy. I like Hobbs' proposal, unless you want to just cut right to it:

Good Bot, Bad Bot Identification


 4:26 pm on May 8, 2008 (gmt 0)

I like Hobbs' short one, too.

"Search Engine & Crawler Identification"

Or focusing on agents being identified:

"Search Engine Spider and Automated Client Identification"



 12:46 pm on May 12, 2008 (gmt 0)

The problem is that we really want to stay away from general purpose bot/IP identification. We don't like to get into scenarios where we are ID'ing private individuals. Posting their IP address is often considered an attack on privacy. We have also had numerous incidents where people would post IPs in the hope that the posted IP would become the victim of a DDoS attack (which has happened several times). The attackee then comes back here and rants, raves, and posts all sorts of threatening stuff.

JD is on the right track there, but I wonder how we avoid the privacy issues?


 8:31 am on May 15, 2008 (gmt 0)

"We don't like to get into scenarios where we are ID'ing private individuals"

And I think that's why, Brett, you guys made that forum pre-moderated. The rule of hiding the last IP octet, except for recognized search engines, also takes care of those scenarios.
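For anyone who wants to apply that rule before posting log excerpts, here is a minimal Python sketch. The function name `mask_last_octet`, the `known_search_engine` flag and the `xxx` placeholder are my own assumptions for illustration, not anything the forum prescribes, and the address used is from a reserved documentation range.

```python
import ipaddress


def mask_last_octet(ip: str, known_search_engine: bool = False) -> str:
    """Hide the last octet of an IPv4 address unless the source is exempt.

    The exemption flag is a hypothetical stand-in for "recognized search
    engine"; deciding who qualifies is left to the caller.
    """
    # Validate first; IPv4Address raises ValueError on malformed input.
    addr = ipaddress.IPv4Address(ip)
    if known_search_engine:
        return str(addr)
    octets = str(addr).split(".")
    # Keep the first three octets and replace the fourth with a placeholder.
    return ".".join(octets[:3] + ["xxx"])


if __name__ == "__main__":
    print(mask_last_octet("192.0.2.57"))
    print(mask_last_octet("192.0.2.57", known_search_engine=True))
```

Masking only the final octet still shows which network block the visitor came from, which is usually what matters for the discussion, without pointing at a single host.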

If anything, we need to spot, analyze and discuss everything new crawling our sites, search engines or not, especially crawlers originating from known hosting data centers. This interest ranges from business survival down to hobby level, like plane and train spotting. I sure would like to know more about them before getting run over by one.
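As a rough illustration of the data center part, something like the following Python sketch could flag visitors whose IP falls inside a list of hosting ranges. The ranges below are reserved documentation blocks standing in for real ones, and `HOSTING_RANGES` and `from_hosting_range` are made-up names; a real list would have to come from your own logs and research.

```python
import ipaddress

# Hypothetical placeholder ranges (reserved documentation blocks),
# standing in for real hosting data center allocations.
HOSTING_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/25"),
]


def from_hosting_range(ip: str) -> bool:
    """Return True if the address falls inside any listed range."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return any(addr in net for net in HOSTING_RANGES)


if __name__ == "__main__":
    for visitor in ("198.51.100.7", "203.0.113.200", "192.0.2.1"):
        print(visitor, from_hosting_range(visitor))
```

A hit here only says where the request came from. Whether the crawler is benign or creepy still needs the user agent and behavior to be looked at.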

Deep down, I wish it could be as simple as "Crawler Identification", with search engines thrown in as a bonus :-)

