Pages dropped after spider visit?

Has anyone else noticed this?

     
6:03 pm on Jul 6, 2000 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:May 9, 2000
posts:23785
votes: 453


Has anyone noticed pages dropping out of the AV index after this spider visits?
brillo.pa.alta-vista.net
7:52 pm on July 6, 2000 (gmt 0)

Senior Member

joined:June 27, 2000
posts:1548
votes: 0


I wish! I have been trying to kill a page from AV for almost a month, and although the spider has visited, it will not remove the page from the index, even though the page now returns a 404 (file not found).

-G

9:53 pm on July 6, 2000 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 20, 2000
posts:1702
votes: 0


Did the pages have at least 75% of the same content? I haven't experienced this myself, but it reminded me of some pretty interesting info that I've been reading about duplicate detection in vector databases.
7:36 am on July 7, 2000 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:May 9, 2000
posts:23785
votes: 453


Yes, the pages are meant to have the same content. I'm running an experiment on this, trying to get the pages out. Bait laid, Scooter came, and I'm watching closely.
8:07 am on July 7, 2000 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 21, 1999
posts:370
votes: 0


Seth, I have been whacked a number of times over pages with similar content (could be around the 75% mark). These pages obtained fantastic positions, stuck around for approximately 15 days and then boom - gone!

Duplicate elimination tech - possibly. It would be good to know what AV's criteria are for treating pages as sufficiently dissimilar. The REAL gold for me would be to define the exact specs for Inktomi's doorway eliminator.

BUT, this would be suitable for another discussion on another day.

4:24 pm on July 7, 2000 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 23, 2000
posts:1277
votes: 0


Pete, there are several possibilities for duplicate elimination that at least Google, if not AltaVista and Inktomi, can employ.
They have some pretty slick methods of finding mirrors and duplicate pages, including the 75% thing Seth was talking about. These include IP number matching, URL string similarities, similar link structure between two sites, and content matching. You must have unique content to have a chance at staying in these databases. Although I have seen dupes slip by, it is the exception rather than the rule, and one spam notification by a competitor can bring rankings crashing down pretty fast. The key to slipping by the detectors is making them think you have differing, unique sites....
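
To make the "content matching" idea concrete, here is a minimal sketch of one common way to score how much text two pages share: break each page into overlapping word sequences (shingles) and compare the two sets. This is only an illustration. Nothing in this thread confirms that AltaVista, Inktomi, or Google work this way, and the function names, the 4-word shingle size, and the 0.75 cut-off (borrowed from the "75% thing") are all assumptions.

```python
import re


def shingles(text, k=4):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance(a, b, k=4):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty pages are trivially identical
    return len(sa & sb) / len(sa | sb)


def looks_duplicate(a, b, threshold=0.75, k=4):
    """Hypothetical rule: flag the pair when the score reaches the threshold."""
    return resemblance(a, b, k) >= threshold


if __name__ == "__main__":
    page_a = "cheap blue widgets for sale in london buy now"
    page_b = "cheap blue widgets for sale in paris buy now"
    print(resemblance(page_a, page_b), looks_duplicate(page_a, page_b))
```

Note that a score like this is sensitive to the shingle size: changing a single word disturbs every shingle that overlaps it, so small edits can lower the score more than a simple word count would suggest.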
4:44 pm on July 7, 2000 (gmt 0)

Administrator from GB 

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:May 9, 2000
posts:23785
votes: 453


JamesR, welcome.
It'll be interesting to see if the simple duplication technique I'm using gets the pages/site booted out of AV et al. It's AV I'm targeting to measure the "sensitivity."
It's a pretty crude technique but the result will show how close to the line to go, or not go as the case may be.
I actually want the pages out without deleting them from the site.

I'll report back when the test is complete, but I expect to have to wait until the index is updated.

Coming back to the question, did anyone notice which spider is the culprit?
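
For anyone wanting to check their own logs for the spider in question, a rough sketch follows. It assumes the access log has the client hostname as the first whitespace-separated field (as in Common Log Format with reverse DNS lookups turned on); if your server logs raw IP addresses instead, this will not match anything. The function name is hypothetical and the sample lines are made up for illustration.

```python
from collections import Counter


def spider_hits(log_lines, domain_suffix="alta-vista.net"):
    """Count requests per client host whose name ends with domain_suffix."""
    hits = Counter()
    for line in log_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        host = parts[0].lower()
        # Match the domain itself or any host underneath it.
        if host == domain_suffix or host.endswith("." + domain_suffix):
            hits[host] += 1
    return hits


if __name__ == "__main__":
    # Made-up sample lines, not taken from any real log.
    sample = [
        'brillo.pa.alta-vista.net - - [date] "GET /a.html HTTP/1.0" 200 512',
        'dialup-12.example.com - - [date] "GET /a.html HTTP/1.0" 200 512',
        'brillo.pa.alta-vista.net - - [date] "GET /b.html HTTP/1.0" 404 0',
    ]
    for host, count in spider_hits(sample).items():
        print(host, count)
```

Comparing the visit dates for each host against the date a page disappears from the index would be one way to narrow down which spider is involved.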

5:19 pm on July 7, 2000 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 20, 2000
posts:1702
votes: 0


Thanks, James. Did you happen to save that report? I tried going to it today and the site wouldn't pull up (hopefully it's just temporary server problems).
5:51 pm on July 7, 2000 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 23, 2000
posts:1277
votes: 0


Seth, I did not save it, only bookmarked the URL, and I couldn't reach it yesterday either. I hope the site is down due to a system error and not because they figured something out and someone pulled the plug. I had more extraction work to do on that one. Seems to me I may have seen that report somewhere else... if I find it I will email you.

Edited by: JamesR

6:40 pm on July 9, 2000 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 21, 1999
posts:370
votes: 0


James R - thanks for the info. If you get hold of that report, will you include me in that mail?

Thanks
Pete

boyleman

2:48 am on July 19, 2000 (gmt 0)

Inactive Member
Account Expired

 
 


Did you guys ever find that report that Seth was talking about, the one about the 75% thing? I'm interested in that as well. Thanks!