Very closely, maybe not immediately, but sooner or later one or both of the equal pages will get dumped.
If you have many pages, especially from the same domain, with similar wording and structure and keyword placement they will likely get flagged as machine generated (templates). Mostly it is AltaVista you need to be concerned about in this regard, but most of the majors are sensitive to exact or near exact duplicates.
Vary the page length, and page titles, don't repeat big chunks of text from one page to another, and you should be fine on all the engines.
Like Air said, Altavista is very very good at tracking down duplicates. I think they just won a patent on the process they use, so it is sure to work if they can patent it. But they don't seem to notice it right away. It usually takes a few weeks after you have been listed.
I would suggest re-wording your titles, descriptions, and link text. Again, like Air recommended, change some of your body text as well. That way you don't have a block of pages all within a range of X kb.
Altavista Wins Patents [doc.altavista.com]
<blush>I didn't do a very good job of looking</blush>
It's an obvious mistake I've been making. I have been creating pages which are completely different in regard to content, page size, densities, etc.
Often, probably the only constant between these pages is the outgoing links. Really good to know and provides a possible explanation for why some of my pages on Alta have been dropped after being in the results for about 2 - 4 weeks
But seriously -- I'm assuming that "outgoing links" must mean links to another domain, right? Otherwise, it seems to me, that pages from one website with standard menu navigation would all look like duplicates to this method.
Does this mean that they have a predetermined allowance for duplicate links? Like there can be no more than x amount of duplicate links?
or
Does this mean that if there are more duplicate links than non-duplicate links then they will consider it a duplicate page?
On the other hand, if page C has 5 links and page D has 6 links, and all of the links on C also appear on D, then the union is 6, and the ratio is 5/6, or about 0.83.
The value of this ratio is always between 0 and 1 inclusive; what we need to figure out is this: what is the critical value? I would guess about 0.7 .
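To make the arithmetic concrete, here is a minimal Python sketch of that ratio as I read it from the posts above: links shared by both pages divided by the union of links on either page. The function name, the example URLs, and the treatment of two link-free pages are my own assumptions, and the 0.7 cutoff is only the guess made above, not a documented AltaVista value.

```python
def link_overlap(links_a, links_b):
    """Ratio of shared link destinations to all distinct destinations.

    Assumption: the ratio discussed in this thread is
    (links on both pages) / (links on either page).
    """
    a, b = set(links_a), set(links_b)
    union = a | b
    if not union:
        # Neither page has links; treating this as "no overlap" is a choice.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical pages: C has 5 links, D has the same 5 plus one more.
page_c = ["http://example.com/%d" % i for i in range(5)]
page_d = page_c + ["http://example.com/extra"]

GUESSED_THRESHOLD = 0.7  # the guess from this thread, not a known value

ratio = link_overlap(page_c, page_d)  # 5 shared / 6 distinct
looks_duplicate = ratio >= GUESSED_THRESHOLD
```

With these inputs the ratio is 5/6, so the pair would land above the guessed cutoff.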
I know that this method was created especially to eliminate doorway pages on different domains. But it also hurts when different domain names point to the same set of pages. This I know from experience. Got to keep duplicate paths to the same page away from AV, or they all may get dropped.
However, doesn't this patented method also open the door to an unethical form of competition? Seems to me that a low-life competitor doesn't need to plagiarize, just duplicate your links on a very different page. Submit to AV, and your production page would be penalized along with his scam page.
Nah, they must have thought about that, right?
The documentation for their patent makes it clear that they are not looking at the page copy, only at the links themselves. I also think it's only the destinations that get compared, not the link text, and it's "outbound" links that matter, not in-site navigation.
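If that reading is right, the set being compared would be built only from link destinations on other hosts. A rough Python sketch of that filtering step follows; the function name and example URLs are hypothetical, and deciding "outbound" by comparing hostnames is my assumption, not something taken from the patent text.

```python
from urllib.parse import urlparse


def outbound_destinations(page_url, hrefs):
    """Keep only link destinations whose host differs from the page's host.

    Link text is ignored entirely; only the destination URL is looked at.
    Relative links have no host, so they count as in-site navigation.
    """
    own_host = urlparse(page_url).hostname
    destinations = set()
    for href in hrefs:
        host = urlparse(href).hostname
        if host is not None and host != own_host:
            destinations.add(href)
    return destinations


# Hypothetical page with a menu link, an absolute in-site link, and one external link.
links = outbound_destinations(
    "http://site-a.example/friends.html",
    ["/index.html", "http://site-a.example/contact.html", "http://site-b.example/"],
)
```

Under this assumption a standard in-site menu drops out of the comparison, but a mirror on a second domain name would count as a different host, so links between mirrors would still be treated as outbound.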
Whatever the exact method, the bottom line effect I have seen is that all duplicates or near-duplicates get dropped, especially if they are on different domains. I can understand not wanting to index all of the duplicate pages. I just wish that at least one of the set would stick. This method has hit several of my clients who mirror their site on more than one domain name, even though the dropped pages were in no way doorways.
My solution has been to re-submit the pages, being careful to use just one domain. Then I addressed the incoming links, asking webmasters to link only to that one domain. Most are very cooperative, and they appreciate the information about AV as well.
My client's dropped pages are now back in the index and beginning what I hope is a climb through the ranks.
thanks
wayne
Generally speaking pages that have no links pointing at them (internal or external) tend to get flagged as doorway pages. So it is a good idea to have links on optimized pages and have other pages point at the optimized pages from your site.
The thing that seems to be an issue with AV is that if two pages have the exact same outgoing links (from what I can see these are links to external sites) then they get big points as potential duplicate pages.
One site is going to have at least 4 links to the same page, most likely the index page.
Then add hyperlinking to internal pages to this mix and I feel there is going to be a real mess here. Then there are the doorway pages with additional links!
Am I off the track here or what? Can anyone clearly state the rules as you understand them?
Thanks Gary
< So A would have links to B, C, D and E on its friends page, as well as links to pages within Site A. B would have links to A, C, D and E, with links to pages within B...etc. >
Yes, that is correct. Even if you add additional links that the other sites have you still have the "double" page problem I have surmised. Still don't know if that is a problem and I agree that each site having additional different URLs listed is a good idea.
Here is something that I just now wondered about! Do the SE's look at the IP addresses? All of our "friends" are on the same server, therefore same IP address. If these outgoing links were *really* friends on the same server or even with close proximity IP addresses, that could cause a problem if the SE's look for this. I can get around it as I am sure most of you could too, * BUT *, is this a problem?
How about "contact us" similar addresses and phone numbers? Maybe I am a bit paranoid, can anyone expound on this?
Gary