Mozilla/4.0 (compatible; MyFamilyBot/1.0; http://www.myfamilyinc.com)

Took disallowed files

     
10:39 pm on Jul 23, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 17, 2002
posts:2251
votes: 0


Mozilla/4.0 (compatible; MyFamilyBot/1.0; http://www.myfamilyinc.com )
66.43.16.199
nat.myfamilyinc.com

Read robots.txt but still took a disallowed file from each of my sites.

This is apparently the parent company behind Ancestry.com and other such sites.

[edited by: volatilegx at 12:55 am (utc) on July 24, 2006]
[edit reason] fixed broken link - added space between URL and closing parens [/edit]
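
For anyone who wants to confirm this kind of behaviour in their own logs, here is a rough Python sketch using the standard library's urllib.robotparser. The robots.txt content and the request paths are made up for illustration, and the helper name disallowed_hits is hypothetical. It simply lists which requested paths the rules disallow for a given agent token.

```python
from urllib.robotparser import RobotFileParser


def disallowed_hits(robots_txt, user_agent, requested_paths):
    """Return the requested paths that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(robots_txt.splitlines())
    return [p for p in requested_paths if not parser.can_fetch(user_agent, p)]


# Hypothetical robots.txt and request paths, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

SAMPLE_PATHS = ["/robots.txt", "/", "/private/trap.html"]

hits = disallowed_hits(SAMPLE_ROBOTS, "MyFamilyBot", SAMPLE_PATHS)
print(hits)
```

Note that robotparser only looks at the text before the first "/" in the agent string, so pass the bot token ("MyFamilyBot") rather than the full Mozilla/4.0 string if your robots.txt has bot-specific rules.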

5:53 pm on Aug 4, 2006 (gmt 0)

Administrator

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month

joined:Jan 14, 2004
posts:859
votes: 3


Just saw this today.
No robots.txt file requested.
Looked at the homepage and was greeted with 403 Permission Denied. No further requests were made.
Came from 66.43.16.199 as well.

2:24 am on Aug 5, 2006 (gmt 0)

New User

10+ Year Member

joined:July 25, 2006
posts:39
votes: 0


Just hit me for the first time. Got the robots.txt file and the / page.

Why they provide zero information on their website only they know for sure... grrr.

3:00 pm on Aug 5, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Sept 17, 2002
posts:2251
votes: 0


Dan,

I see you edited the user agent. I'm not sure why as it's now a different user agent than the one I posted.

Mozilla/4.0 (compatible; MyFamilyBot/1.0; http://www.myfamilyinc.com)

That's what I saw in my logs. There is no space after the URL. :)

3:50 pm on Aug 5, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5459
votes: 3


The real problem with the genealogy sites is that data is accumulated to be sold on CDs.
I have some family members who are heavily involved in family history, and they have reported that some of the data from their databases (after using similar sites for free hosting) has been included in CDs sold to third parties.

From my point of view and regarding my widgets ;)
I've had some unique contacts and correspondence result from folks who found older items on my pages while tracing family genealogy.

Last night when this thread came up, I was almost sure that I had a portion of MyFamily's ranges denied. I was unable to find any deny based on UA or IP.
Possibly I did have it denied at one time and removed it?

Has anybody had extensive crawling, or is the bot only hitting a page or two (perhaps a specific page they previously marked and crawled)?

Don
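
On the question of whether a deny by UA or IP is actually in place, a quick way to sanity-check a rule set is to run a known request through it. The sketch below is only an illustration in Python: the deny lists and the is_denied helper are hypothetical, and the single-address /32 entry is an assumption based on the one IP reported in this thread, not MyFamily's actual ranges.

```python
import ipaddress

# Hypothetical deny rules, for illustration only.
DENIED_UA_SUBSTRINGS = ["myfamilybot"]
DENIED_NETWORKS = [ipaddress.ip_network("66.43.16.199/32")]


def is_denied(ip, user_agent):
    """Return True if the request matches a denied UA substring or IP range."""
    ua = user_agent.lower()
    if any(token in ua for token in DENIED_UA_SUBSTRINGS):
        return True
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DENIED_NETWORKS)


print(is_denied("66.43.16.199", "Mozilla/4.0"))
```

Matching on a lowercase substring of the UA means the rule still catches the bot whether or not there is a space before the closing parenthesis.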

11:26 pm on Aug 5, 2006 (gmt 0)

New User

10+ Year Member

joined:July 25, 2006
posts:39
votes: 0


I wrote to them via their posted corporate contact e-mail:

MyFamily.com, Inc. Corporate <---
360 West 4800 North
Provo, UT 84604
Phone: 801-705-7000
FAX: 801-705-7001
pr *--AT--* myfamilyinc.com (hate spambots though their site posts this in the open)

and the response I got back today was:

Your message

To: PR
Subject: MyFamilyBot/1.0
Sent: Fri, 4 Aug 2006 20:34:52 -0600

was deleted without being read on Sat, 5 Aug 2006 16:58:03 -0600

A user with the name "A Hathaway" was the final recipient I believe (first name reduced to single letter by me for privacy).

FYI: The UA that hit me was:

Agent: Mozilla/4.0 (compatible; MyFamilyBot/1.0; http:// www myfamilyinc com) (I pulled the .'s to prevent a link)

host: nat.myfamilyinc.com which resolves to 66.43.16.199

Black hat or unprofessional? Hard to tell at this point - a toss-up.

I've not seen them back though. Yet.
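
For checking that a host name and IP really belong together (as with nat.myfamilyinc.com and 66.43.16.199 above), a forward-confirmed reverse DNS lookup is the usual approach. Here is a minimal Python sketch. The function name forward_confirmed and its optional lookup parameters are my own, added so the logic can be tried without touching the network; by default it uses the standard socket calls.

```python
import socket


def forward_confirmed(ip, reverse_lookup=None, forward_lookup=None):
    """Return the hostname if reverse DNS for ip maps back to ip, else None."""
    # Defaults use real DNS; callers may pass stand-ins for offline checks.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)          # IP -> hostname (PTR record)
        addresses = forward_lookup(host)   # hostname -> list of IPv4 addresses
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return host if ip in addresses else None


# Offline example using stand-in lookups built from the values in this thread.
result = forward_confirmed(
    "66.43.16.199",
    reverse_lookup=lambda addr: "nat.myfamilyinc.com",
    forward_lookup=lambda host: ["66.43.16.199"],
)
print(result)
```

A UA string is trivial to fake, so a check like this is worth running before blaming (or whitelisting) a bot by name.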

3:18 am on Aug 6, 2006 (gmt 0)

Administrator

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month

joined:Jan 14, 2004
posts:859
votes: 3


After digging through my First Sightings log files,
I found the following header, which I assume is the address they want everyone to email problems to, etc.

From: SearchBot@myfamilyinc.com