If it was merl.com why did they try to forge this?
The funny thing about this crawler is that it looked for the robots.txt, then went to some pages and also some graphics. Anyone know more about the bot Larbin?
The class B belongs to Horizon Research Inc. I scanned through the class C to see if there was anything interesting, but there isn't.
somewhere.com -> 66.92.72.194, which is actually a DSL line (dsl092-072-194.bos1.dsl.speakeasy.net). I don't think it is really connected to the spider. There isn't much out there yet on this one. The IP is now off or firewalled.
ip: 64.211.217.66
ua: larbin_2.2.0 crawl@compete.com
From CNNfn: A look at Compete, the benevolent 'Big Brother' of online shopping
Frankly, speaking from the spider's receiving end, I think the concept is kind of sleazy. If my competitors want to know what *I'm* doing they can call me and pay me as a consultant. Or, they can data mine my competitors 'til the cows come home. I don't care. But stop snooping around my sites, for godssake.
I banned them. My logfiles show they get bounced out every day or two or three.
Idiotgirl
I don't think I'm all that suspicious about how the data gets used. They're pretty up front about what type of information they gather for their clients. And if your site is ranked in any search engines, it isn't exactly like you're hidden. I can't turn around without some spam bot cruising around the back alleys of client sites trying to parse email addys and snag images.
It gets to be a drag sometimes.
BUT - I don't feel like freely feeding my competitors with a one-stop shopping source of information about my sites, what I'm doing, how they're structured, how many hits they get, how often they're updated... yadda yadda. That's where my benevolence comes to a screeching halt. You want to know what I'm doing? Ask me! After all, I'm the one who's doing it! Or, visit each site and see for yourself. If it's that interesting and valuable (you gotta be kidding, right?) that you need all that information about my sites - then pay me for my expertise. Or they're free to disappear, go hassle my competition, or whatever. Regardless, I don't welcome their visits.
It's the concept itself I have a problem with.
Ban all user agents except googlebot and a few selected friends? (remember to allow IE :))
Ban specific IPs? We'll then spend the whole time trying to work out which bots are useful and which are not. Plus, thousands of ADSL lines are installed every day.
Nope, we're stuck with it :(
I guess the least we can do is ban the no-brainers who are too stupid to change the UA name.
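For anyone who wants to try the "ban by UA name, ban by IP" idea above, here is a rough sketch of the check in Python. It is only an illustration, not how any particular server does it: the function name `is_blocked` and the two blocklists are made up for this example, and the only entries in them are the UA string and IP quoted earlier in this thread. As noted above, it only catches bots that don't bother to change their UA name or address.

```python
import ipaddress

# Hypothetical blocklists for illustration only. The entries are the
# user agent and IP address reported earlier in this thread.
BLOCKED_UA_SUBSTRINGS = ("larbin",)
BLOCKED_NETWORKS = (ipaddress.ip_network("64.211.217.66/32"),)


def is_blocked(remote_ip: str, user_agent: str) -> bool:
    """Return True if the request matches a blocked UA substring or IP range."""
    ua = user_agent.lower()
    # Case-insensitive substring match, so "larbin_2.2.0 ..." is caught too.
    if any(token in ua for token in BLOCKED_UA_SUBSTRINGS):
        return True
    # Fall back to the IP check for bots that changed their UA name.
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in BLOCKED_NETWORKS)
```

Calling `is_blocked("10.0.0.1", "larbin_2.2.0 crawl@compete.com")` matches on the UA, while a request from 64.211.217.66 with a browser-like UA matches on the IP. Widening the /32 to a larger range is possible but risks bouncing real visitors on the same network.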
Once upon a time I liked what I did for a living. That was in a galaxy far, far away.
Now, instead of designing and scripting for dollars, it seems I'm continually sidetracked at all hours of the day and night - preoccupied with such unpleasantries as guarding content from spam bots, tacky caching SE's, losing and/or keeping SE rankings, thwarting copyright infringers, tracking images that are being directly linked to from various low-life ya-hoos, and checking my logfiles for the onslaught of bogus entries from every script kiddie that felt like creating a virus this week. In my spare time for websurfing I can't swing a dead cat without being beaten senseless with popup ads from Hell, or take time to read my spam-mails (telling me how cheap I can get Viagra this week).
Are you getting my drift? If this sounds like a rant - it IS.
I don't know what happened from the time I got into this way back - whenever - and now - but I'm just about fed up with it.
So, back on task, when some *sleaze-bot* comes tromping through my sites to (basically) spy on how I do things - I'm going to kick that little UA to the curb so fast it'll make its head spin.
I'm not playing the game any more. Unless it's Google, AltaVista, etc. or a major player that can help me achieve my goals (to put my clients' sites on the charts and generate page views) - *they're outta here!* Period. Get lost! Don't come back. (I treat door-to-door salesmen the same way.)
Whew. Man, I feel a little better now.