Forum Library, Charter, Moderators: lawman

WebmasterWorld Community Center Forum

    
Requesting forum name change
add Crawlers please
Hobbs

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3636996 posted 8:38 am on Apr 28, 2008 (gmt 0)

"Search Engine Spider Identification" is good by itself, but its scope needs to expand to cover identifying benign and creepy crawlers by user agent, IP ranges, and behavior. These crawlers are not search engine related, and the topic can't find a better home on WW than here, where it is covered anyway, but it always feels like an illegitimate child.

How about changing:
"Search Engine Spider Identification"
to become
"Search Engine & Crawler Identification"

Thank you

 

engine

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3636996 posted 4:04 pm on Apr 29, 2008 (gmt 0)

Thanks for the suggestion.

Crawler, spider, or robot (bot): does that really matter to those who know?

Hobbs

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3636996 posted 8:45 pm on Apr 29, 2008 (gmt 0)

What I meant was that the title is too limiting and needs to include crawlers that do not originate from search engines.

engine

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributor of the Month



 
Msg#: 3636996 posted 7:54 am on May 1, 2008 (gmt 0)

Good point, and becoming more of a problem every day.

ergophobe

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3636996 posted 4:15 pm on May 1, 2008 (gmt 0)

It is a good point. I think a lot of questions, including ones in other forums like Webmaster General and sometimes PHP, deal with things like honeypots, ban lists, catching bad bots and such.

It might not be a bad idea to lump the good guys with the bad guys in one set of discussions, since IDing the good guys is only an issue because there are bad guys, and vice versa.

incrediBILL

WebmasterWorld Administrator, Top Contributor of All Time, 5+ Year Member, Top Contributor of the Month



 
Msg#: 3636996 posted 6:46 pm on May 1, 2008 (gmt 0)

How about something more inclusive like:
"Search Engine Spider and Other Automated Activity Identification"

With the charter being "to help identify automated activity from Search Engines, Crawlers, Link Checkers, Scrapers, Botnets, etc."

Just a thought.

ergophobe

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3636996 posted 10:45 pm on May 1, 2008 (gmt 0)

A little clumsy. I like Hobbs' proposal, unless you want to just cut right to it:

Good Bot, Bad Bot Identification

jdMorgan

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3636996 posted 4:26 pm on May 8, 2008 (gmt 0)

I like Hobbs' short one, too.

"Search Engine & Crawler Identification"

Or, focusing on the agents being identified:

"Search Engine Spider and Automated Client Identification"

Jim

Brett_Tabke

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3636996 posted 12:46 pm on May 12, 2008 (gmt 0)

The problem is that we really want to stay away from general-purpose bot/IP identification. We don't like to get into scenarios where we are ID'ing private individuals. Posting their IP address is often considered an attack on privacy. We have also had numerous incidents where people would post IPs in the hope that the posted IP would become the victim of a DDoS attack (which has happened several times). The person attacked then comes back here and rants, raves, and posts all sorts of threatening stuff.

JD is on the right track there, but I wonder how we avoid the privacy issues?

Hobbs

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 3636996 posted 8:31 am on May 15, 2008 (gmt 0)

"We don't like to get into scenarios where we are ID'ing private individuals"

And I think that's why, Brett, you guys made that forum pre-moderated. Also, the rule of hiding the last IP octet, except for recognized search engines, takes care of those scenarios.
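For anyone preparing log excerpts to post, the last-octet rule could be applied along these lines. This is only a rough sketch: the function name `mask_for_posting`, the `xxx` placeholder, and the `RECOGNIZED_ENGINE_NETWORKS` list are assumptions made up for illustration, and the single range in that list is a documentation-only placeholder, not a real search engine range.

```python
import ipaddress

# Hypothetical allow-list of networks belonging to recognized search engines.
# 192.0.2.0/24 is a reserved documentation range (RFC 5737) used as a stand-in;
# real crawler ranges would have to be filled in and verified separately.
RECOGNIZED_ENGINE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def mask_for_posting(ip: str) -> str:
    """Return the IPv4 address with its last octet hidden, unless it falls
    inside one of the recognized search engine networks."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    if any(addr in net for net in RECOGNIZED_ENGINE_NETWORKS):
        return str(addr)  # recognized search engine: leave the address intact
    octets = str(addr).split(".")
    octets[-1] = "xxx"  # hide the last octet
    return ".".join(octets)
```

With the placeholder list above, `mask_for_posting("203.0.113.45")` gives `203.0.113.xxx`, while an address inside the allow-listed range is returned unchanged.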

If anything, we need to spot, analyze, and discuss everything new crawling our sites, search engines or not, especially those originating from known hosting data centers. This interest ranges from business survival down to hobby levels like plane and train spotting. I sure would like to know more about them before getting run over by one.

Deep inside, I wish it could become as simple as "Crawler Identification", with search engines thrown in as a bonus :-)


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved