Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
GlueText
Pfui




msg:4382823
 9:47 pm on Nov 2, 2011 (gmt 0)

One of IncrediBILL's rant-worthy faves [webmasterworld.com...], this bad bot with its sad UA is now hailing from:

208.83.208.179
Mozilla/4.76 [en] (Win98; U)

robots.txt? NO

Courtesy of [robtex.com...] --
IP a.k.a. gluetext.com, www-other.gluetext.com, www-w.gluetext.com, mail.gluetext.com

ISP: Data Centers Canada

 

Staffa




msg:4382841
 10:59 pm on Nov 2, 2011 (gmt 0)

Sheesh!
Out of curiosity I just had a look at gluetext dot com and typed in two good keywords for one of my sites, and their search engine (!?) began processing in real time. Where they get the original URLs from I don't know (maybe one of the big three, because there is no way they can crawl the entire internet in under a minute), but they obviously did find my site and promptly started to rip through it (or at least attempted to), in real time, while I was waiting at their site for the results to show.
Fortunately, I have a good system in place to prevent this from happening; otherwise they would have gotten away with everything. As it is, they got nothing.

UA : Mozilla/4.76 [en] (Win98; U)

208.83.208.179
no referrer
no robots.txt

It goes without saying, the whole range is now blocked!
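
For anyone who wants to flag these hits before deciding on a range block, here is a minimal Python sketch that checks a request against the three traits listed above (IP, UA string, no referrer). The function name and the way an empty referrer is represented are assumptions for illustration; this is not the system described in this post.

```python
# Traits copied from the report above.
SUSPECT_IP = "208.83.208.179"
SUSPECT_UA = "Mozilla/4.76 [en] (Win98; U)"


def looks_like_reported_bot(ip, user_agent, referrer):
    """Return True when a request matches every trait listed in the report.

    Hypothetical helper: many servers log a missing referrer as "-" or an
    empty string, so all of those are treated as "no referrer" here.
    """
    no_referrer = referrer in (None, "", "-")
    return ip == SUSPECT_IP and user_agent == SUSPECT_UA and no_referrer


print(looks_like_reported_bot("208.83.208.179", SUSPECT_UA, "-"))
print(looks_like_reported_bot("192.0.2.10", SUSPECT_UA, "-"))
```

An exact match on a single IP is brittle, since the bot has already moved hosts once, so treat this as a logging aid rather than a defence.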

keyplyr




msg:4382893
 1:06 am on Nov 3, 2011 (gmt 0)

Thanks for the heads-up.

Had the Rackspace range reported by incrediBILL already blocked.

I put up the block on this new range, then searched for my top search terms. Competitors that are below me in the big 3 SERPs showed up, but not me.

208.83.208.176 - 208.83.208.183
208.83.208.176/29

or the wide deny 208.83.208.0/21 but I don't know what else is in there. Anyone?
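
To see how the narrow and wide denies relate, here is a small sketch using Python's standard `ipaddress` module. It only does the arithmetic on the two CIDR blocks quoted above; it says nothing about who else sits inside the /21.

```python
import ipaddress

# Both blocks and the bot's address are taken from the posts above.
narrow = ipaddress.ip_network("208.83.208.176/29")
wide = ipaddress.ip_network("208.83.208.0/21")
bot_ip = ipaddress.ip_address("208.83.208.179")

# First address, last address, and size of each block.
for net in (narrow, wide):
    print(net, "=", net[0], "-", net[-1], f"({net.num_addresses} addresses)")

# The reported IP falls in both, and the /29 is a subset of the /21.
print(bot_ip in narrow, bot_ip in wide, narrow.subnet_of(wide))
```

The /29 covers exactly the 8 addresses in the range above, while the /21 covers 2048, so the wide deny takes in 256 times as many addresses.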

Staffa




msg:4382899
 1:46 am on Nov 3, 2011 (gmt 0)

keyplyr, could you see their attempts to access your site in your log files, or do your log files record nothing when a range is blocked? Just curious.

I blocked the whole
NetRange: 208.83.208.0 - 208.83.215.255
CIDR: 208.83.208.0/21

they all belong to the same server farm ;o)
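
If you only have the NetRange from a whois lookup, the matching CIDR can be derived rather than copied. A minimal sketch with Python's standard `ipaddress` module, using the start and end addresses quoted above:

```python
import ipaddress

# NetRange endpoints as quoted from the whois record above.
first = ipaddress.ip_address("208.83.208.0")
last = ipaddress.ip_address("208.83.215.255")

# summarize_address_range yields the smallest list of CIDR blocks
# that exactly covers first..last.
blocks = list(ipaddress.summarize_address_range(first, last))
print([str(b) for b in blocks])
```

Here the range collapses to a single block, which agrees with the CIDR line in the whois output.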

keyplyr




msg:4382915
 2:29 am on Nov 3, 2011 (gmt 0)

@Staffa

Yes, my log files do show 403 responses when I block ranges; however, I'll need to wait a couple more hours to review the logs. I'll post again then.

keyplyr




msg:4382998
 11:12 am on Nov 3, 2011 (gmt 0)

403'd a total of 32 times, spaced out over 3 minutes.

Requested the top 3 web pages for the search terms I used. These would be the same web pages the big 3 SEs would list for these terms.

208.83.208.179 Mozilla/4.76 [en] (Win98; U)
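
For anyone wanting to pull the same kind of count out of their own logs, here is a rough Python sketch. It assumes the Apache/NCSA "combined" log format, which may not match your server, and the sample lines are synthetic, made up purely to exercise the parser (they are not from my logs).

```python
import re
from collections import Counter

# Assumption: Apache/NCSA "combined" log format. Adjust for other formats.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<ua>[^"]*)"'
)


def count_status(lines, ip):
    """Return a Counter of HTTP status codes for requests made by `ip`."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group("ip") == ip:
            counts[m.group("status")] += 1
    return counts


# Synthetic sample lines for illustration only (hypothetical paths and times).
sample = [
    '208.83.208.179 - - [01/Jan/2000:00:00:01 +0000] "GET /example-a HTTP/1.1" 403 199 "-" "Mozilla/4.76 [en] (Win98; U)"',
    '208.83.208.179 - - [01/Jan/2000:00:00:07 +0000] "GET /example-b HTTP/1.1" 403 199 "-" "Mozilla/4.76 [en] (Win98; U)"',
    '192.0.2.10 - - [01/Jan/2000:00:00:09 +0000] "GET / HTTP/1.1" 200 5120 "-" "ExampleBrowser/1.0"',
]
print(count_status(sample, "208.83.208.179"))
```

Against a real log, pass the open file object as `lines` and read off the "403" entry.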

Staffa




msg:4383163
 7:10 pm on Nov 3, 2011 (gmt 0)

Thank you keyplyr, I was just curious whether their behaviour would be different when hitting a site that had blocked them (yours) and one that had not yet (mine).

After I entered my keywords I waited a moment while their progress thingy was running. Then I thought it was taking its time (in internet terms) to show the first results, and on a hunch I went to look at what was happening at my site in the meantime and saw them trying to grab page after page. Whoa, I immediately closed the browser tab with their website and their visit ended.

During that time they attempted to access 58 pages in about 1 min 46 sec.
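
To put the two sets of numbers in this thread side by side, here is the arithmetic as a short Python sketch. Both durations are approximate ("about 1 min 46 sec" here, "3 minutes" on keyplyr's site), so the rates are rough.

```python
def requests_per_minute(count, seconds):
    """Average request rate over the observed window."""
    return count * 60 / seconds


# Unblocked site: 58 page requests in about 1 min 46 sec.
unblocked = requests_per_minute(58, 1 * 60 + 46)
# Blocked site: 32 requests (all 403'd) spaced over about 3 minutes.
blocked = requests_per_minute(32, 3 * 60)

print(f"unblocked: {unblocked:.1f}/min, blocked: {blocked:.1f}/min")
```

That works out to roughly 33 requests a minute against the unblocked site versus roughly 11 a minute against the blocked one, though with only one observation of each it is hard to say whether the block itself caused the slowdown.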


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved