Google Crawled 87 pages deep in a linear fashion!

So much for 3 levels down....


sandpetra

10:20 pm on Apr 27, 2007 (gmt 0)

10+ Year Member



Recently I attempted to duplicate a site's structure (it was a type of 101 site), creating a new site with the same directory and file structure but totally different content in a totally different market.

My main folder is about 100 levels deep (i.e. pages from 1-101), with a single "next" link on each page to get to the next level down. On the older site, these pages were linked to from all over the place (mainly via two "content" pages, though).
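To picture it, each page carried just that one forward link, something like this (filenames hypothetical):

    <!-- page13.html: the only route deeper is this single link -->
    <a href="page14.html">Next</a>

So to reach page 99, a spider would have to follow that single link 98 times from page 1.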

I had published the site with only 13 of these pages complete, so I took the "next" button off page thirteen. Unluckily for me, I managed to put it back while updating from another computer, and guess what: Google spidered to at least page 99.

Unfortunately it was all duplicate content, as the remaining 87 pages hadn't been wiped yet, so hello supplemental results.

Result: every 5th page in the sequence appeared in Google's results.

I know you can never be positive about such things, but I am positive there wasn't enough traffic to this site for anyone to be linking to these other pages. Google found these on its own.

On a side note: can I ever get these pages out of the supplemental index, or should I redirect everything to a new directory and get links to it?
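(If redirecting is the answer, I assume it would be a single rule in .htaccess, something along these lines; the folder names here are made up:)

    # send every URL in the old folder to its twin in the new one
    RedirectMatch 301 ^/oldfolder/(.*)$ /newfolder/$1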

tedster

8:02 pm on Apr 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google continues to amaze me by finding the most tucked-away URLs.

can I ever get these pages out of the supplemental index

Once you've got unique content on those pages, they certainly "can" come out of their supplemental status. Of course, as supplemental pages, they are crawled infrequently. But more than that, you've described a very deep silo structure. If you want them all out of supplemental, I'd say you'll need some other click path to get to the deep ones.
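As a sketch of that extra click path: a contents page that links straight to each deep page, so nothing sits 87 clicks away (filenames are made up):

    <!-- contents.html: every deep page is one click from here -->
    <ul>
      <li><a href="page1.html">Page 1</a></li>
      <li><a href="page2.html">Page 2</a></li>
      <!-- and so on through the sequence -->
      <li><a href="page101.html">Page 101</a></li>
    </ul>

Even splitting the 101 pages across a handful of hub pages like that would cut the maximum click depth from around 100 to two or three.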

g1smd

8:30 pm on Apr 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Use robots.txt or the robots meta tag to keep the duplicates out of the index until they are ready to be indexed with the "real" content.
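For example, assuming the unfinished pages all live in one folder (the folder name here is hypothetical), a robots.txt block looks like:

    User-agent: *
    Disallow: /unfinished/

Or, on each unfinished page itself, a robots meta tag in the head:

    <meta name="robots" content="noindex">

Just don't combine the two on the same URLs: Googlebot has to be able to fetch a page to see the noindex tag, and a robots.txt block stops it before it ever reads the page.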