Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

This 39 message thread spans 2 pages; this is page 2.
Xenu Link Sleuth
Should anonymous linkers be blocked?
grandma genie




msg:4471605
 12:26 am on Jul 2, 2012 (gmt 0)

Hi,

I understand this is a link checker, possibly looking for 404s. Is it normal for the person doing the checking to hide their identity? What do you make of this?

75.128.105.nn - - [01/Jul/2012:13:21:41 -0400] "HEAD / HTTP/1.1" 301 - "-" "Xenu Link Sleuth/1.3.8"

75.128.105.nn - - [01/Jul/2012:13:22:43 -0400] "HEAD /example HTTP/1.1" 301 - "-" "Xenu Link Sleuth/1.3.8"

75.128.105.nn - - [01/Jul/2012:13:27:13 -0400] "HEAD /example/ HTTP/1.1" 200 - "-" "Xenu Link Sleuth/1.3.8"

I'm assuming whoever this is has links to my site, so it would be nice to know who they are. The IP belongs to Charter.

-- GG

 

lucy24




msg:4473185
 7:40 am on Jul 6, 2012 (gmt 0)

Because speed traps, volume traps and other ways of easily detecting stealth UAs don't typically apply to bots that have permission. That's why using one of my allowed user agents to gain access would be a breakthrough for a bad bot: validated access disables all other tests, and you get a free pass.

... and, conversely, I doubt that more than 1% of robots think that far ahead. Granted, those are the 1% you really have to worry about ;) but you gotta concede that your average robot is too stupid to find its access plate with both hands.

keyplyr




msg:4473198
 8:22 am on Jul 6, 2012 (gmt 0)

GG - the person running the link scan is not "hiding their identity."

If they visited your site with a browser, the log entry would contain IP, time stamp, request type, referrer, and UA.

That's exactly what is being displayed here, except that instead of a browser UA it shows the software UA ("Xenu Link Sleuth/1.3.8"), which is what is actually sending the requests. There's no referrer because the software is not coming from another web site; it is installed on the owner's machine.
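For readers who want to see those fields pulled apart, here is a minimal Python sketch that splits one of the log lines from the first post into IP, time stamp, request, referrer, and UA. It assumes the server writes the Apache "combined" log format, which is what the quoted lines look like; the names `LOG_PATTERN` and `parse_log_line` are made up for this illustration.

```python
import re

# Assumed layout: Apache "combined" log format, as in the lines quoted above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


# First log line from the opening post, IP left masked as posted.
sample = ('75.128.105.nn - - [01/Jul/2012:13:21:41 -0400] '
          '"HEAD / HTTP/1.1" 301 - "-" "Xenu Link Sleuth/1.3.8"')
fields = parse_log_line(sample)
print(fields["ip"], fields["method"], fields["referrer"], fields["user_agent"])
```

The referrer comes back as "-", which is simply how the log records an empty Referer header, not a sign of anything being hidden.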

grandma genie




msg:4473284
 4:01 pm on Jul 6, 2012 (gmt 0)

Blend27, in answer to your question, yes. Hmmm. Curiouser and curiouser. I checked back 6 months in the logs and found that Xenu only showed up about once a month, until July. In July it showed up 32 times. So I'm gonna block it. Goodbye Xenu; too much of anything is not good.
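A per-month tally like the one described in this post can be sketched in Python as follows. This is only an illustration, assuming combined-format log lines with the UA as the last quoted field and English month abbreviations in the time stamp; `monthly_hits` is a hypothetical helper name, and the sample input is just the three lines quoted in the first post, not the full six months of logs.

```python
from collections import Counter
from datetime import datetime
import re

# Captures the bracketed time stamp and the final quoted field (the UA).
LINE_RE = re.compile(r'\[(?P<time>[^\]]+)\].*"(?P<user_agent>[^"]*)"\s*$')


def monthly_hits(lines, ua_substring):
    """Count requests per (year, month) whose UA contains ua_substring."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        if ua_substring.lower() not in match["user_agent"].lower():
            continue
        stamp = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        counts[(stamp.year, stamp.month)] += 1
    return counts


log_lines = [
    '75.128.105.nn - - [01/Jul/2012:13:21:41 -0400] "HEAD / HTTP/1.1" 301 - "-" "Xenu Link Sleuth/1.3.8"',
    '75.128.105.nn - - [01/Jul/2012:13:22:43 -0400] "HEAD /example HTTP/1.1" 301 - "-" "Xenu Link Sleuth/1.3.8"',
    '75.128.105.nn - - [01/Jul/2012:13:27:13 -0400] "HEAD /example/ HTTP/1.1" 200 - "-" "Xenu Link Sleuth/1.3.8"',
]
xenu_counts = monthly_hits(log_lines, "xenu")
for (year, month), hits in sorted(xenu_counts.items()):
    print(f"{year}-{month:02d}: {hits}")
```

Run against a real access log (for example by passing an open file object as `lines`), the same function would show whether a UA jumps from roughly one visit a month to dozens.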

wilderness




msg:4473293
 4:32 pm on Jul 6, 2012 (gmt 0)

July 3,
Stop dinking around and simply add Xenu to your UA deny list.


July 6,
In July it showed up 32 times. So I'm gonna block it. Goodbye Xenu


;)

incrediBILL




msg:4473407
 12:05 am on Jul 7, 2012 (gmt 0)

... and, conversely, I doubt that more than 1% of robots think that far ahead.


Depends on the bot. Regular crawlers don't; the scrapers in the underbelly of the internet do, because they'll stop at nothing to get what they want.

Had one scraper that kept getting caught in my traps and actually went to the extreme of reducing his camouflage crawl rate to one page per day. So I expanded my time trap to track IPs for 48 hours and, sure enough, they kept pinging the site around every 24 hours and continued with sequential page requests at that slow pace for hundreds of days.

It was the funniest thing I ever saw.

</thread hijack>
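The post does not say how the time trap is built, so the following Python is only a rough sketch of the general idea: remember each IP for 48 hours after its last request and flag it once it has walked through enough pages in order. The class name `SlowCrawlTrap`, the use of integer page numbers, and the run threshold of 5 are all assumptions made for the illustration, not details from the post.

```python
IDLE_WINDOW_SECONDS = 48 * 60 * 60  # keep an IP's record for 48 hours of silence
RUN_THRESHOLD = 5                   # assumed: consecutive pages needed to flag an IP


class SlowCrawlTrap:
    """Flags IPs that request numbered pages in order, however slowly."""

    def __init__(self, idle_window=IDLE_WINDOW_SECONDS, run_threshold=RUN_THRESHOLD):
        self.idle_window = idle_window
        self.run_threshold = run_threshold
        self.state = {}  # ip -> (last_seen, last_page, run_length)

    def record(self, ip, timestamp, page_number):
        """Store one request; return True if the IP now looks like a crawler."""
        last = self.state.get(ip)
        if (last is not None
                and timestamp - last[0] <= self.idle_window
                and page_number == last[1] + 1):
            run = last[2] + 1  # next page in sequence, still inside the window
        else:
            run = 1            # first sighting, expired record, or out of order
        self.state[ip] = (timestamp, page_number, run)
        return run >= self.run_threshold


DAY = 24 * 60 * 60
trap = SlowCrawlTrap()
# One page per day, in order: the pattern described in the post.
daily_results = [trap.record("203.0.113.7", day * DAY, day + 1) for day in range(5)]
print(daily_results)
```

With a shorter window of, say, an hour, the one-page-per-day visitor would look like a new IP on every request, which is why widening the window to 48 hours catches it.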

lucy24




msg:4473411
 1:36 am on Jul 7, 2012 (gmt 0)

It was the funniest thing I ever saw.

Oh, lord, it's the robotic equivalent of painting the Golden Gate Bridge. By the time they're done they would have to start all over again to pick up the last few years' worth of changes.

Still with us, Grandma? If I remember rightly, you are not in fact anyone's grandma or even grandpa. I suppose it was explained somewhere.

grandma genie




msg:4473737
 4:38 am on Jul 9, 2012 (gmt 0)

The rumors of my age have been highly exaggerated.

blend27




msg:4478318
 12:47 am on Jul 24, 2012 (gmt 0)

It's back from 98.154.143.247 (IP in StopForumSpam) - 20 days later...

followed by 72.93.206.97 - same headers.

wilderness




msg:4478322
 1:24 am on Jul 24, 2012 (gmt 0)

blend,
I've seen this same IP on July 2nd, and perhaps since.
I've simply been passing over the 403s.

I just chalked this IP and another RR IP up to somebody from this forum flexing their muscles, since almost every visit seems to coincide with this thread's activity ;)

FWIW, we used to have a forum participant who would inject absurd UAs and visit sites because he thought it was cute. Don't recall the identity.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved