sorry to be so forthright, but this is completely untrue.
our main site runs to 5 figure numbers of pages and nearly all pages are indexed by google
knotz there is no need to change your set up but ensure you have decent site map/s.
as i said sitemap/s ... in my view each category or section of a site should have its own site map, or maybe you want to call it an index page, or you could even just integrate site maps into the structure of the site itself. rather like links pages, they need to be implemented imaginatively.
imo one of the largest and best designed sites on the web is the bbc, every section you get to has fantastic site maps, every page links fantastically to many relevant pages and so on.
knotz. here's what to do:
1. make a list of your 1000 urls
2. use your word-processor or perl to turn it into a list of 1000 links to each url (ideally with the title of the page as the wording of the link)
3. use your word-processor or perl to paste 50 links onto the bottom of each of the 40 pages listed
forget about it for at least 1 month so you don't stress out unnecessarily or do anything stupid like go back on yourself.
and presto hey. your site will start to see more and more listings as each month passes. it probably won't all go up straight away, but it will in the end.
step 4 is optional and leads to a world of chaos where you no longer know when robots are coming and going. also, if you follow step 4, you will be carrying out what many people consider to be a form of activity which google may penalise; i don't personally hold with that theory, but it seems to have the majority vote.
4. paste a set of 20 to 50 links to other pages on every page of your site - that way each new link it eats will give it a list of 20 to 50 further links and speed up the rate at which it finds all the 'new' ones.
5. when all the pages are up, remove those extra links from your pages unless they are pages whose content will be regularly updated;
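a rough sketch of steps 1 to 3, in python rather than the word-processor or perl suggested above. the urls, titles and function names are made up for illustration. note that 40 pages with 50 links each gives 2000 slots for 1000 links, so the sketch assumes each link gets pasted twice; that is a guess at what was meant, not something stated above.

```python
from html import escape
from itertools import cycle, islice


def make_links(pages):
    """Turn (url, title) pairs into HTML anchors, using the title as link text."""
    return [
        '<a href="{}">{}</a>'.format(escape(url, quote=True), escape(title))
        for url, title in pages
    ]


def distribute(links, n_pages=40, per_page=50):
    """Deal links out in order, wrapping around, so every page gets per_page links."""
    source = cycle(links)
    return [list(islice(source, per_page)) for _ in range(n_pages)]


# Hypothetical input: 1000 urls on example.com with placeholder titles.
pages = [
    ("http://www.example.com/item{}.html".format(n), "Item {}".format(n))
    for n in range(1, 1001)
]
blocks = distribute(make_links(pages))

# blocks[0] is the list of 50 links to paste at the bottom of the first page.
print("<br>\n".join(blocks[0][:3]))
```

each entry in `blocks` is the set of links for one of the 40 existing pages; how you paste them into the page templates depends on how your site is built.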
nb here is a huge discussion i found on this forum about whether google "punishes" sites for "aggressive link building"...
[webmasterworld.com...]
the only post you should read and obey on that thread is post number 124 (which outlines 'a simple method to get into Googles good graces') - be positive about the whole thing - that seems to be key
I understand you tried your best to optimize your thousands of pages
I get 1.5M page views a month, heading for 2M around June based on current trends.
The bulk of all these pages are dynamic so all of the content, except for maybe 100 static pages, are optimized on the fly and the optimization also changes based on how the content is accessed.
It's actually pretty tricky :)
I have various search shortcut links that are great for customers because they simplify accessing content, and they also give the search engines a specific point of entry, which helps massively with creating more SERPs all over the place and dead-on AdSense targeting. Each search shortcut pulls up specific data streams based on various keyword terms of interest, and the various optimizations performed on the page are specific to the terms of each shortcut query. Not only that, but I make each search shortcut look like a static entry point with a keyworded page name. When I implemented the last draft of this dynamic optimization last summer I got a 300% traffic boost in a month.
Most of the hits to my site are deep link hits because of this optimization, which is why the primary keyword that drives visitors to the home page only accounts for an average of 2.5% of total site traffic, which is still tens of thousands of visitors a month.
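The post does not say how the keyworded page names are built, so the following is only a minimal sketch of one way it could work. The function names, the `.html` suffix, the hyphen-separated slug and the "Example Site" title suffix are all assumptions made for illustration.

```python
import re


def shortcut_page_name(keywords):
    """Build a static-looking page name from a shortcut's keywords."""
    slug = "-".join(re.findall(r"[a-z0-9]+", keywords.lower()))
    return slug + ".html"


def shortcut_query(page_name):
    """Recover the search terms from a page name produced above."""
    suffix = ".html"
    stem = page_name[:-len(suffix)] if page_name.endswith(suffix) else page_name
    return stem.split("-")


def page_title(page_name, site="Example Site"):
    """Per-shortcut title text, one of the on-page items tuned to the terms."""
    words = " ".join(word.capitalize() for word in shortcut_query(page_name))
    return words + " | " + site


name = shortcut_page_name("Blue Widgets, UK")
print(name, shortcut_query(name), page_title(name))
```

The server would map a request for such a page name back to the search terms and run the dynamic query, so the visitor and the spider both see what looks like a fixed page.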
Still not true Spectre. I have a new domain (purchased April 5th) that started with 50 pages, now has 110, all but the ones I added last night have been spidered and are bringing traffic.
I agree with the other suggestions, and in addition, you might try changing your links from relative to absolute, and try acquiring a link from a page that is PR 5 or better.
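For the relative-to-absolute change, here is a small sketch using Python's standard `urllib.parse.urljoin`. The page url and hrefs are hypothetical examples on example.com, and `absolutize` is just an illustrative helper name.

```python
from urllib.parse import urljoin


def absolutize(page_url, hrefs):
    """Resolve each href against the url of the page it appears on."""
    return [urljoin(page_url, href) for href in hrefs]


# Hypothetical page and links on example.com.
page = "http://www.example.com/widgets/index.html"
absolute = absolutize(page, ["blue.html", "../about.html", "/contact.html"])
for url in absolute:
    print(url)
```

The resolved urls are what you would write into the href attributes in place of the relative ones.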
Google claims it indexed 38,500 pages
Yahoo claims it indexed 36,800 pages
They are both wimps, they stopped short
here's a little maths-teacher's demonstration of the way things are (followed by an english teacher's postscript)...
check out the number of kelkoo uk's pages listed on google (a thing i call plog); do it by entering the following in google's search box:
site:www.kelkoo.co.uk
note that there are 459,000 pages listed
now scroll down and click on 'search within results'
(if you're keen, also go to kelkoo.co.uk and note that on the vast majority of pages you go to, the word "shopping" appears right at the bottom middle)
search (within the results) for
"shopping"
(include the quotemarks)
only 66,900 of kelkoo's allegedly listed pages are actually properly indexed;
now, bearing in mind that a number of their pages may NOT say 'shopping', go back and change "shopping" to "on" or "and" or "the"
they all end up being approximately the same.
then i tried amazon.com...
56,500,000 in the site:www.amazon.com search
and
52,100,000 in the search within results for "Home"
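to put the two sets of numbers side by side, here is the arithmetic as a small python sketch. the counts are the ones quoted above; `indexed_share` is just an illustrative name.

```python
def indexed_share(with_word, listed):
    """Share of site: results that also match a common on-page word."""
    return with_word / listed


# Counts as quoted in the post.
kelkoo = indexed_share(66_900, 459_000)
amazon = indexed_share(52_100_000, 56_500_000)

print("kelkoo: {:.1%}".format(kelkoo))  # about 14.6%
print("amazon: {:.1%}".format(amazon))  # about 92.2%
```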
hence, be positive; in the long run, google WILL ingest ALL your pages.
it's got 52 million for amazon. why worry about your 1000 here or 40,000 there? google's strength is derived from its comprehensiveness. it needs to list all your pages.
this time i tested the domain
search.ebay.co.uk
they have approx 1 mill. pages listed on that domain
and approx 200 thousand have content, the others don't.
hence the estimate of 15 to 20% of pages listed is definitely a good conservative approach to business planning.
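as a planning aid, the 15 to 20% rule of thumb can be turned into a range of pages to count on. this is only a sketch: the 40,000-page site is a hypothetical example and `planning_range` is an illustrative name.

```python
def planning_range(total_pages, low=0.15, high=0.20):
    """Conservative range of pages to count on being properly indexed."""
    return round(total_pages * low), round(total_pages * high)


# ebay figures as quoted: approx 200 thousand of approx 1 million listed pages.
ebay_share = 200_000 / 1_000_000

# Hypothetical 40,000-page site.
low_pages, high_pages = planning_range(40_000)
print(ebay_share, low_pages, high_pages)
```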
nb - what this proves, since ebay is one of the most profitable e-sales portals on the web, is that these pages which turn into contentless titles are occurring not as a result of any penalisation system, but as a result of difficulties in swift and accurate keyword ranking;
the more data there is, the harder it becomes for previous keyword scoring algorithms to end up matching the right pages;
hence it is likely that at its current 8 billion pages, google is entering the 'second teething' period of the lifespan of a being - aka adolescence.
the temptation is for most people to imagine that the reason only 5 or 6 thousand of your 40,000 pages are listed, or whatever, is something to do with someone wanting to make things hard for you.
in fact, google are probably MORE pissed off about it than YOU are.
"except for maybe 100 static pages, are optimized on the fly and the optimization also changes based on how the content is accessed."
that's true of any properly designed online data archive;
eg a bog-standard retailer's website contains a script for lists of products in a particular category -
1. the categories script can be accessed in a few dozen to a few hundred ways and depending on how it's accessed, it will give a page with specific words on it, including keywords generated for each particular category - and since each category could have up to a few hundred pages, that's a few thousand optimised-on-the-fly pages to begin with
2. product pages: then you get their 1500 to 15000 product entries, each a database item which is accessed from a single script - in the case of each manifestation of the script, the page produced is optimised by means of category keywords and related item keywords which all end up on the same product page;
it's a totally natural form of 'optimisation' - in fact it's the real optimum, which not-so-optimum sites aim to be like and thereby put themselves through 'optimisation'.
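a minimal sketch of what the two scripts described above might do when they build a page. the field names, the title formats and the sample data are all assumptions for illustration, not any particular shop's code.

```python
def category_meta(category, keywords, page_no=1):
    """Title and meta keywords for one page of a category listing."""
    title = "{} - page {}".format(category, page_no)
    return {"title": title, "keywords": ", ".join([category] + keywords)}


def product_meta(name, category, related_items):
    """Title and meta keywords for one product entry."""
    keywords = [name, category] + related_items
    return {
        "title": "{} | {}".format(name, category),
        "keywords": ", ".join(keywords),
    }


# Hypothetical catalogue data.
cat = category_meta("Garden Tools", ["spades", "rakes"], page_no=2)
prod = product_meta("Steel Spade", "Garden Tools", ["Garden Fork", "Hand Trowel"])
print(cat)
print(prod)
```

the same script serves every category or product, and the words on each page come from the database entry it is showing, which is the on-the-fly part.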
information sites, like for example the classics.edu site, with its many books etc; or some cookery site with thousands of recipes in different categories - ALL (if built correctly) have this same natural optimum presentation of information - which includes repeating the correct keyword metatags in the correct places and repeating the correct html menus on the right pages, to be sure that you have abundant information for robots to work with.
there is no way on earth that google seeks to only partially represent these collections of pages, let alone to penalise them for repeating keywords across pages where relevant (since it's perfectly ethical and correct for those sites to do so)
even if google DOES have methods for countering actual spam, which consists of people mimicking the above sort of structure but with zero content and plenty of irritating popups, which themselves pull in a lot of bucks, sadly,
those methods have to always take second priority to the main purpose of google, which is to comprehensively list everything which IS legitimate;
i think the main reason things have seemed so turbulent in the last couple of years is that there has been, apparently, a big increase in spam and viruses;
but that will surely die down one way or another and then things will be swimming along nicely.
google DOES want to keep all of US internet-businesses in the money, rather than the brick-and-mortar businesses who are, largely, the root cause of the virus turbulence of recent times!