How do I make sure they are all listed in Google?
Should I create a separate page and place links to all 30,000 files there?
Will Google notice it? (The page will be large because of all the links.)
Or should I place links to all of them from the homepage?
May I use hidden links, or should they all be visible?
How will my ranking increase?
The thing to consider is PR flow through your site. You need to get the maximum to the pages that are targeting competitive phrases, e.g.:
Your "cheap widgets" page should have maximum PR and your "cheap blue and pink widgets with green spots" page should have a lot less, because it will probably be competing with only a few other relevant competitor pages.
Therefore the fewer pages linked from your homepage, the better, because each will gain more PR from it. You can then link from these internal pages to less important pages that need less PR, and so on.
It's important to get a reasonably high PR home page to interest the spider in going deep into your site. The alternative is to get links from other sites pointing deep into your site. This will send the spider directly into your lower-level pages and it will spider on from there.
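To put rough numbers on that, here is a back-of-the-envelope sketch of how the PR a page passes gets split across its outbound links. This is only the textbook even-split view of PageRank; the 0.85 damping factor and the homepage PR value are assumptions for illustration.

```python
# Back-of-the-envelope sketch of PR passed per outbound link.
# Assumes a page's vote is divided evenly among its outbound links;
# d = 0.85 and the homepage PR value are made up for illustration.

d = 0.85
homepage_pr = 6.0

for n_links in (50, 500, 30000):
    pr_per_link = d * homepage_pr / n_links
    print(f"{n_links:>6} links on the homepage -> each target gets "
          f"{pr_per_link:.5f} of its PR")

# 50 links concentrates the homepage's vote on your competitive pages;
# 30,000 links spreads it so thin that each page gets almost nothing.
```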
I would choose the 50 most important pages and link them from your home page. I would also make 5 site maps (linked from the home page) with 100 links each for your next batch of important pages. I would then 'breadcrumb' the rest of the pages with links connecting each page in a tree-like structure, if you follow me. Keeping themes going deep into the site would be good if possible, like:
'blue widgets' links to 'blue widgets + green' links to 'blue widgets + green + brown' links to ....... etc.
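If it helps, here is a rough sketch of how such a themed chain could be laid out. The page titles and URL scheme are invented for illustration only.

```python
# Sketch of a themed breadcrumb chain: each page links down to the next,
# more specific page and back up to its parent. Titles and the URL scheme
# are invented for this example.

theme_chain = ["blue widgets", "blue widgets green", "blue widgets green brown"]

def slug(title):
    return title.replace(" ", "-")

for depth, title in enumerate(theme_chain):
    url = "/" + "/".join(slug(t) for t in theme_chain[:depth + 1]) + "/"
    parent = theme_chain[depth - 1] if depth else "home page"
    child = theme_chain[depth + 1] if depth + 1 < len(theme_chain) else None
    line = f"'{title}' at {url} links up to '{parent}'"
    if child:
        line += f" and down to '{child}'"
    print(line)
```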
Perhaps the users would not notice if all the links were white on white - running up and down the sides of the home page. I've seen others use this quite effectively to achieve high rankings in Google.
You might want to play around with all these suggestions. What's the worst that could happen - loss of traffic or a ban from Google? Come on now, roll the dice.
We build sites with the homepage being predominantly for the spider. If done well, it can make the internal pages all rank well with the homepage not ranking. This can be very good for the user and Google. It gets the right page in the SERPs, rather than the homepage, which means the user has to click again, and possibly again and again.
The concept that users should arrive at the homepage is not always a good one, especially with a 30,000-page site. Making the homepage a type of sitemap for spiders has merit. You can still make it work for direct referrals with a site search at the top, along with good drop-down navigation. Underneath this you can pack in the links, which probably won't even be noticed by the user.
"One guy emailed me and asked to place his articles with backlinks to his site. Articles are the same as on his site.
Can google penalize my site for plcing same pages from other site?"
My input would be to consider a 'tree' arrangement, i.e. menus organized according to hierarchy and
carefully cross-linked by parallel relevance.
Think militarily for a moment. The CIC (president) is your index page.
Your 4-star generals are your main menus.
These should refer (link) to one another if not play golf together.
Your colonels should do likewise, but go bowling and trade fried chicken recipes.
So on down the ranks until you reach the 'raw-assed recruits'.
Even they should only refer to related pages,
BUT all should refer back to
a) The president (main index page) -and-
b) The site-map page.
I cannot offer advice on the SITEMAP page. I suggest that it be relatively small,
referring only to ranks above temporary lieutenant, perhaps.
Page rank is ordered differently, of course. - Larry
I agree with larryhatch although...
>30,000 pages of new content? Oooof! You must be one prolific author.
He may have a huge company that has been in business for 20 years and wants to put more stuff online. These pages may be really focused and good. Possibly much better than the rubbish people write under the guise of "really good helpful original content and I should be top" - Sigh....
I would however love to know of a site map utility that could divide this batch of 18,000 into index pages with no more than 100 links each. Currently it is alphabetized and some letters have thousands of links on the page. Can anyone recommend a program that could slice up a nice sitemap like this?
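If no off-the-shelf tool turns up, a few lines of script will do the slicing. A rough sketch, assuming the articles are available as a flat list of URLs; the file names and the bare-bones HTML are made up and would need adjusting:

```python
# Minimal sketch: split a flat list of article URLs into site map pages of
# at most 100 links each, plus an index page linking to every map page.
# File names, titles and the skeleton HTML are assumptions for illustration.

MAX_LINKS = 100

def write_sitemaps(urls, out_prefix="sitemap"):
    chunks = [urls[i:i + MAX_LINKS] for i in range(0, len(urls), MAX_LINKS)]
    for n, chunk in enumerate(chunks, start=1):
        links = "\n".join(f'<li><a href="{u}">{u}</a></li>' for u in chunk)
        with open(f"{out_prefix}{n}.html", "w") as f:
            f.write(f"<html><body><h1>Site map {n}</h1><ul>\n{links}\n</ul></body></html>")
    index_links = "\n".join(
        f'<li><a href="{out_prefix}{n}.html">Site map {n}</a></li>'
        for n in range(1, len(chunks) + 1))
    with open(f"{out_prefix}-index.html", "w") as f:
        f.write(f"<html><body><h1>Site map index</h1><ul>\n{index_links}\n</ul></body></html>")

# Example: 18,000 URLs become 180 site map pages of 100 links each.
write_sitemaps([f"/articles/article-{i}.html" for i in range(1, 18001)])
```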
So are the articles already on a site? Are they indexed?
Maybe this 'original article guy' has produced a site that has flash, music, 30,000 home page links etc. and has given up trying to get into Google.... hence the request for someone else to publish them. Or maybe he is cunningly trying to get 30,000 links to his site.
Andreww may well have done his homework and reckons it's worth the effort. If the articles are already indexed all over the web, I would personally not bother to add them to my site. The duplicate content issue would deter me. However, if I could easily outrank the original site and the articles were nowhere else, I would do it.
Business is Business. It could be a very clever move by Andreww.
>It could be a very clever move by Andreww
If it is 30,000 articles - Andreww hasn't confirmed and I can understand why - who do you think the winner is going to be...
The guy who puts together 30,000 pages of content, or Andreww who's asking if it's okay to put 30,000 invisible links on his homepage?
Someone has taken 5 years to get 30,000 articles
Andreww takes 5 days to copy them (with permission).
Andreww has noticed some sites have invisible links and rank OK. He is smart enough to post a question about this on Webmasterworld.... he avoids the hidden links and gets some good ideas for doing the site properly. He gets better ranking than the original guy.
5 years versus 5 days.
Andreww then goes on holiday for 5 years to await the next 30,000 articles :)
There could be 25,000 products. 25,000 books. 25,000 anything. As pointed out, maybe it is 25,000 'articles'; there are a lot of legitimate scenarios where you could get access to that amount of content.
This forum seems lately to be offering more cynicism than help to people with questions. If you don't want to assist the guy then don't say anything at all; haw-haw'ing and insinuating he's a spammer is pointless: spammers don't care about ethics anyway.
Google states that they only really take notice of 100 links per page. Reading between the lines, I think they would follow every link but not allocate any PR beyond the first 100 links they see.
No, they recommend a limit of 100 links for usability, with the user in mind. And while they may decide not to rank a page as well if it has more than a few hundred links, as far as PR is concerned, it continues to work the same way with the 1,000th link as it does with the 100th.
Yep, you're right. I just found:
Apart from the above, I've always had the gut feeling that the anchor text, the surrounding words AND the page title are all very important to any link off a page. Therefore having 'breadcrumb' links is not only more user-friendly through a site but can also establish a series of 'on theme' links which are worth more. In other words, pages deep in the site have a series of on-theme links leading to them, established via the title of each page in the sequence.
My problem is keeping Google from indexing all of these pages while still having it index the important content. Plan 'A', just putting the pages out there, appears to have tripped the duplicate content filters: Google crawled over 5,000 pages but didn't pick up any of them. Plan 'B', putting robots noindex tags on the pages with no extra details, worked somewhat better, but there is still a risk of Google having to crawl all these pages just to find out not to index them.
I'm now on Plan 'C': using rel="nofollow" attributes to try to keep the spider from going down paths with no details in the first place. This one is new and hasn't had time to do much yet, but we'll see if it helps.
Currently, the indexable and non-indexable content are in a common tree (/location/sublocation/item.html). In order to use robots.txt I would need to split the content into two trees, which would be a pain to manage. Still, if worst comes to worst, that may end up being Plan 'D'.
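If it does come to Plan 'D', the robots.txt side of it is at least simple. A sketch, assuming the no-detail pages were moved under a hypothetical /stubs/ prefix; the prefix and layout are invented here and the real tree would differ:

```python
# Sketch of the robots.txt for a hypothetical Plan 'D': the no-detail pages
# have been moved under a made-up /stubs/ prefix so they can be excluded
# wholesale, while the detailed /location/... pages stay crawlable.

robots_txt = """\
User-agent: *
Disallow: /stubs/
"""

with open("robots.txt", "w") as f:
    f.write(robots_txt)

# Excluding the prefix means the spider never requests those URLs at all,
# instead of crawling thousands of pages just to read a noindex tag on each.
```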
I am also often puzzled seeing some unknown sites with a few thousand pages indexed at Google, but many of them are perfectly legitimate.
Human interests are vast, and it is impossible to judge the validity of someone's site based on content quantity alone.
It's possible to collect 200,000 articles, recipes, jokes, products or whatever before publishing, but the real question is whether it would be smart to put them all online at once.
Besides duplicate content, it is possible that it would trigger some other filter based on the content/timing ratio.
In any case, I would release them in several phases.
Regarding navigation links, try to research some other sites with a huge index.