The bot read robots.txt but still took a disallowed file from each of my sites.
This is apparently the parent company behind Ancestry.com and other such sites.
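For anyone wanting to confirm that a fetched file really was off limits, here is a rough Python sketch using the standard library's urllib.robotparser. The robots.txt rules and the requested paths are made up for illustration (the real ones are not shown in this thread), and disallowed_hits is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the poster's real file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def disallowed_hits(robots_txt, user_agent, requested_paths):
    """Return the requested paths that robots.txt forbids for this user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in requested_paths
            if not parser.can_fetch(user_agent, path)]


# Hypothetical paths, as they might be pulled from an access log for this bot.
hits = disallowed_hits(
    ROBOTS_TXT,
    "MyFamilyBot/1.0",
    ["/robots.txt", "/index.html", "/private/notes.html"],
)
print(hits)
```

Any path it prints is one the bot should have skipped if it honoured the rules it downloaded.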
[edited by: volatilegx at 12:55 am (utc) on July 24, 2006]
[edit reason] fixed broken link - added space between URL and closing parens [/edit]
I see you edited the user agent. I'm not sure why as it's now a different user agent than the one I posted.
Mozilla/4.0 (compatible; MyFamilyBot/1.0; [myfamilyinc.com)...]
That's what I saw in my logs. There is no space after the URL. :)
From my point of view and regarding my widgets ;)
I've had some unique contacts and correspondence result from folks who found older items on my pages while tracing family genealogy.
Last night when this thread came up, I was almost sure that I had a portion of MyFamily's ranges denied. However, I was unable to find any deny based on UA or IP.
Possibly I did have it denied at one time and removed it?
Has anybody seen extensive crawling, or is the bot only hitting a page or two (perhaps a specific page they previously marked and crawled)?
Don
MyFamily.com, Inc. Corporate <---
360 West 4800 North
Provo, UT 84604
Phone: 801-705-7000
FAX: 801-705-7001
pr ¦*--AT--*¦ myfamilyinc.com (hate spambots though their site posts this in the open)
and the response I got back today was:
Your message
To: PR
Subject: MyFamilyBot/1.0
Sent: Fri, 4 Aug 2006 20:34:52 -0600
was deleted without being read on Sat, 5 Aug 2006 16:58:03 -0600
A user with the name "A Hathaway" was the final recipient I believe (first name reduced to single letter by me for privacy).
FYI: The UA that hit me was:
Agent: Mozilla/4.0 (compatible; MyFamilyBot/1.0; http:// www myfamilyinc com) (I pulled the .'s to prevent a link)
host: nat.myfamilyinc.com which resolves to 66.43.16.199
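With the UA and address above, a deny check of the kind mentioned earlier in the thread could be sketched roughly like this in Python. The /24 range is only a guess built around the one address seen here, not MyFamily's actual allocation, and is_denied and the two rule lists are hypothetical names.

```python
import ipaddress

# Assumed deny rules for illustration only.
# The /24 is a guess around the single address reported in this thread.
DENIED_UA_SUBSTRINGS = ["myfamilybot"]
DENIED_NETWORKS = [ipaddress.ip_network("66.43.16.0/24")]


def is_denied(remote_ip, user_agent):
    """Return True if the request matches a denied UA substring or IP range."""
    ua = user_agent.lower()
    if any(token in ua for token in DENIED_UA_SUBSTRINGS):
        return True
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in network for network in DENIED_NETWORKS)


# UA abbreviated from the one in the logs; the IP is the one the host resolved to.
print(is_denied("66.43.16.199", "Mozilla/4.0 (compatible; MyFamilyBot/1.0)"))
```

Checking both the UA and the range means the rule still catches the bot if it changes its user agent but keeps crawling from the same addresses.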
Black hat or unprofessional? Hard to tell at this point - a toss-up.
I've not seen them back though. Yet.