Thousands of pages indexed but not showing in results

Know something similar?

         

kaijohannkursch

3:37 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



We manage a new site (some months old) with a large number of indexable pages. It has been indexed in three steps: first the index page, second several thousand pages, and third...

On 9-11 September googlebot crawled nearly 200,000 different pages, but those pages do NOT show in Google results... and googlebot is visiting the site infrequently (some hits on the index nearly every day... one day a thousand hits...)

Site is PR6 (surely PR7 next update) with a lot of inbound links (and growing daily), most pointing to index, but also to internal pages (some from high PR sites). We've noticed Google showing more pages on other sites, even counting more backlinks since then.

It's been more than a month and we are waiting for the pages to show... Does someone know of something similar? Could we expect Google to show these pages soon?

[edited by: kaijohannkursch at 5:04 pm (utc) on Oct. 12, 2003]

plasma

4:21 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



Sticky me the url

plasma

4:37 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



site:www.yourdomainfds.org -gfgfdsgfdsgfds
returns ~6000 pages, so at least you're in the index.

A normal search for your 'double keyword' returns you at position 5.

Everything looks normal to me, it's just that your pages are not keyword specific enough to rank high in keyword searches.

Remember that inbound linktext is very important and I guess everybody is linking to you with your domain name as linktext :(

kaijohannkursch

4:45 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



Everything looks normal to me, it's just that your pages are not keyword specific enough to rank high in keyword searches.

Have you read my first message?

I was not asking about visits or rankings... I was asking about crawled pages not showing in results (around 200,000 pages for more than a month).

I know which of our pages currently show in results; as I said in the first message, I was not asking about that.

plasma

5:39 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



I was asking about crawled pages not showing in results (around 200,000 pages for more than a month).

Oh, seems like I've misread your post.
Hm, then I have no clue.

Maybe googlebot 'thinks' these pages have duplicate content?

claus

6:36 pm on Oct 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From zero to 200K in three months or so is an amazing growth, and a chunk of 200K pages is a very large amount. Is it a shopping site with a lot of product pages?

A very large site in terms of pages is bbc.co.uk - a G search for "bbc" on that domain yields 3.1M pages. In three months or so you've published 6% of this number of pages. Guess: Perhaps there is some kind of limit on how many pages will be indexed at once.

Actually, i'm surprised that the Gbot spidered all these pages (that's very deep), given that your site is not older, but it's probably your high PR that did it.

/claus

kaijohannkursch

6:53 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



The site is a directory, not a shopping site. And I think you have misunderstood my message (and plasma's). 200,000 pages have been crawled but only ~8,000 are displayed on results...

Why crawl such a number of pages if Google does not display them?

I don't understand.

cabbie

7:36 pm on Oct 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It was only a month ago. Give it a couple more weeks, I reckon.

kaijohannkursch

7:53 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



"Only" a month to show crawled pages?

I have never seen it take so long for crawled pages to appear in search results. It usually takes a few days... and Google has increased indexed pages (even backlinks) since then...

plasma

8:11 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



Why crawl such a number of pages if Google does not display them?

webmap

kaijohannkursch

8:16 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



?

cabbie

8:19 pm on Oct 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>>"Only" a month to show crawled pages?

Google might be updating continuously now but they're in no hurry to index deep pages on a relatively new site.

kaijohannkursch

8:39 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



Google might be updating continuously now but they're in no hurry to index deep pages on a relatively new site.

No problem if Google takes more time to crawl/index our site, I don't complain about that, but... why has Google crawled 200,000 pages if it does not show them in results?

I don't understand googlebot doing hundreds of hits per hour for no reason at all... The question is: should we be worried about that? Is it usual for thousands of pages to be crawled but left in "limbo" for more than a month?

claus

9:05 pm on Oct 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> should we be worried about that?

Not so much as to lose sleep over it. As you have already seen a 33% increase to 8K pages, you should be all right. The Googlebot is probably wondering what hit it and trying to get a hold of the situation. Most likely it's still wondering if it should be worried, sampling a little here and there, trying to make its mind up.

The deep scan was probably retrieving all your links, and as you're running a directory, that would be millions (theoretically max 20M, likely 2-5M) - it has to digest those links and that takes a while. Imho, you have simply moved too fast.

/claus


Added: Thanks a lot for starting this thread; extreme sites are rarely discussed, i'm glad you shared :)

[edited by: claus at 9:32 pm (utc) on Oct. 12, 2003]

SEOPTI

9:07 pm on Oct 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You shouldn't be worried at all. With this amount of pages you will end with a grey toolbar because you are spamming.

plasma

9:15 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



With this amount of pages you will end with a grey toolbar because you are spamming.

Flamebait?

It looks like a decent site to me, many pages is not necessarily spamming.

kaijohannkursch

9:21 pm on Oct 12, 2003 (gmt 0)

10+ Year Member



With this amount of pages you will end with a grey toolbar because you are spamming

High amount of pages = spam? Don't make me laugh
Obviously you don't know the meaning of spam...

asinah

3:04 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



First of all, don't worry: they will show up. But please send me the url as a sticky mail and I will look at the site. The name of the game with Google is to take it easy and wait.

Here comes my story. Last year we launched a travel portal with 2,500 pages of travel content (all pages PR5), an ODP directory of 350,000 pages, and an XML feed to Amazon with another 400,000 pages. We have also added a live weather feed for 9,000 cities in the world (all PHP with mod_rewrite to plain HTML; googlebot loves those pages and we have on average 30% of them in the index).

On average Google crawls about 150,000-200,000 pages per month and we have on average about 100,000 pages in Google. We serve 9 languages, giving us about 10,000 visitors per day, from a database of 14,500 hotels; 14,500 hotels x 10 languages makes about 145,000 pages.

Last month we reached a critical stage and for the first time hit 150 GB of traffic per month. Our server crashed and we moved to a new provider (DNS was lost for almost 7 days as we had a heated debate with our former hosting provider, which just couldn't serve the bandwidth).

We have already moved to a new provider based in Canada, but they somehow had problems with a DNS server, and Google was still indexing the pages but no longer the main index (I am sure it was related to the Google cache, but who knows).

A strange pattern I noticed was that the problem started with the shopping mall (the problem started before our server crashed) of the amazon feed.

Googlebot knows that we are an affiliate, but they seldom put more than 10,000 XML pages in the Google index. They have crawled over 200,000 pages from our Amazon feed but they don't show them. It has been going on for the past 5 months and I don't believe that they will add more than 10,000 pages. (BTW: We have about 2,500 inbound links from other websites, incl. many sites with PR5 and PR6.)

I guess you have to build up more quality inbound links from external sites, but it could be that the site is too new. A Dmoz link is an important factor as well. (We have 9 links on Dmoz but not to the shopping mall pages. In addition we have 6 links on all Yahoo domains, but again not to the shopping mall.)

A couple of months ago it was so easy to get more pages into Google, as they added about 2,000 pages per day, but we haven't had any new pages added in Google for the past 7 weeks.

My feeling is that Google doesn't seem to like shopping mall pages that much if they are pulled from Amazon or from any affiliate site.

I may be wrong, but I still have 14,500 hotel pages in Google, 3,000 weather pages and 5,000 Dmoz directory pages, while the shopping pages seem to have been cut by Google from 25,000 pages to 2,360 pages in a matter of weeks.

What is strange however is that the 2360 pages of the mall I have in google are all linked from different country guides.

Example:
Our travel guides (25 countries) have been online for the past 16 months and have many inbound links from many sites. Every Amazon product page on our server that is linked directly from those travel guides still shows up on Google, but the other pages have been completely removed from level 2 onwards.

So my guess is that, because the site is new, it will take months before the pages are shown throughout the Google index, or Google just doesn't like them. 5 weeks ago Google had 35 million Amazon pages in its index and today it is about 25 million.

We still get about 3,000-4,000 visitors per day on average and still make about 100 dollars in commission per day, but Amazon revenues went down by 85% and we are now slowing down the whole shopping operation, as it takes up too much bandwidth for too little revenue.

Please send me your website as a sticky email.

seofreak

3:51 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



My experience with too many pages has been that you need to wait for at least 2 updates before the maximum start showing up. Even in the next update, I doubt you will see all of them indexed.

kaijohannkursch

4:04 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



I think you have all misunderstood me. I was not talking about crawling times. I manage a lot of sites and have been an SEO for several years; I know time is needed to get sites fully indexed.

I was talking about the time that passes from when the pages are crawled until they are displayed in results... I have never seen it take so long (more than a month) and that's what worries me.

spud01

4:06 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



I was talking about the time that passes from when the pages are crawled until they are displayed in results... I have never seen it take so long (more than a month) and that's what worries me.

From start to finish, i.e. registering the domain name, setting up DNS records to point to the server hosting the site, setting up hosting of course, producing the html pages and making them neat and tidy to conform to the guidelines some SEs like you to abide by:

10 DAYS - I know this as I have done it for one of our clients' sites. This occurred about a month after the last dance... so your guess is as good as mine why the site was indexed so fast and is found in SERPs for the keywords it was optimised for; plus it tops other sites long established in the SE.

Another example... but of a different nature...

After a massive lull/downturn in backlinks showing in Google after the last major re-index, I now see all the backlinks showing up again for a particular site I take care of.

Originally it had 300+ pages indexed, then it went to 200-something, then to 19 and then to nada just a couple of weeks ago.

It now stands at 272 for the site atm.

And I think the hypothesis that index.html pages will not be counted as backlinks has now gone out the window.

asinah

4:34 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



kaijohannkursch,
you could have saved the time of posting. As you put it you have been an SEO for many years but your site shows at marketleap on google with 0 (zero) incoming links. This means not one website links to your site. I would be very surprised if you are able to keep that number up in google and my guess is that within the next 30 days google will chop down your links to Level 2 only which would be 31 links.

Also you should consider removing AdSense from your site and adding real content. Many of the categories in your directory have zero entries and only an AdSense banner.

So your post is a little bit confusing. Many of the googlebot hits you get are actually from the Google AdSense media bot and not the googlebot deep crawler, as you serve AdSense on every single page.

Also, I wouldn't give the PR6 of your main page much credit for getting all your pages indexed, because Google thinks that blabla.com/?c=82-4 is your main page too, even though it is not.

What you should do is get some PR5 or PR6 inbound links and set up a link exchange. On my site I have a link exchange with 970 travel websites that point to my site, and we always investigate each site that gets linked back from our site.

Also you should setup all your links such as blabla.com/World/ and not as blabla.com/?c=63-1

but since google shows that you have not one incoming link, my guess is your links will be dropped until Level 2 which means you will have 31 links left in the index.
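To illustrate the advice above about path-style links such as blabla.com/World/ instead of blabla.com/?c=63-1, here is a minimal Python sketch of the lookup such a rewrite needs. The table contents and the function name `friendly_path` are assumptions for illustration only; the thread does not say which category ID belongs to which path, nor how the site is built.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical lookup table: the thread does not give the real ID-to-name mapping.
CATEGORY_PATHS = {"63-1": "/World/"}

def friendly_path(url):
    """Return the path-style equivalent of a ?c=... URL, or None if unknown."""
    query = parse_qs(urlsplit(url).query)
    ids = query.get("c", [])
    if len(ids) != 1:
        return None  # no category parameter, or an ambiguous one
    return CATEGORY_PATHS.get(ids[0])
```

With a table like this the server can answer requests for the old query-string URLs with a permanent redirect to the new paths, so both forms do not stay in the index side by side.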

Good luck!

asinah

4:48 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



Quoted by kaijohannkursch:
I was talking about the time that passes from when the pages are crawled until they are displayed in results... I have never seen it take so long (more than a month) and that's what worries me.

My response:
No need to worry. Google has been crawling links on my sites for the past 1 1/2 years and they still don't show up.

The crawler reads the page and follows links. It doesn't mean they will show up at all in the index one day.

We run ODP of Dmoz as well with 450,000 pages and about 35,000 pages getting crawled on average per month. They never show up.

As it looks, Google has indexed all pages up to level 3. This is great actually. You should work on moving level 4 into level 3. This should give you at least another 10,000-20,000 pages indexed.

As a last closing note, don't advertise search engine optimization on your site at all and don't offer the script at a price of 20 USD. It could improve your rating on Google a lot.
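The "levels" mentioned here can be measured with a breadth-first search over the site's internal links. This is a minimal sketch under stated assumptions: the link graph is a made-up example, the home page is counted as level 1, and `click_depths` and `pages_at_level` are hypothetical helper names, not anything Google publishes.

```python
from collections import deque

def click_depths(links, home):
    """Breadth-first search: fewest link hops from the home page to each page."""
    depths = {home: 1}  # assumption: the home page counts as level 1
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def pages_at_level(depths, level):
    """List the pages that sit exactly at the given level."""
    return sorted(page for page, depth in depths.items() if depth == level)

# Made-up directory structure for illustration.
site = {
    "/": ["/World/", "/Arts/"],
    "/World/": ["/World/Europe/"],
    "/World/Europe/": ["/World/Europe/Spain/"],
}
```

In this example /World/Europe/Spain/ sits at level 4. Adding one link to it from /World/ would move it to level 3, which is the kind of change being suggested.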

Lightfoot

4:55 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



kaijohannkursch:

Some really helpful posts here, but you're knocking them all back!

You're a real charmer aren't you!

kaijohannkursch

4:56 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



This means not one website links to your site

The directory has PR6 and 1190 backlinks. Those point to the index; we also have dozens of backlinks pointing to internal pages.

The links grow daily (several hundreds a week), as we offer a free script with links pointing to the index and the script itself.

Many of the googlebot crawlers you get is actually the mediaserver from google adsense and not the googlebot deep crawler as you serve on every single page adsense

No. We differentiate between googlebot and the media bot. Googlebot made 200,000 hits to different pages; the media bot makes a lot more.
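For readers who want to make the same distinction in their own logs, a minimal sketch follows. It assumes the server writes Apache-style Combined Log Format and that the two crawlers identify themselves with "Googlebot" and "Mediapartners-Google" in the user-agent field; the function names are hypothetical, and this is not necessarily how the poster separated the two.

```python
import re
from collections import defaultdict

# Assumption: Combined Log Format. Adjust the pattern for other log layouts.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def classify(agent):
    """Return 'mediabot', 'googlebot', or None for any other user agent."""
    if "Mediapartners-Google" in agent:
        return "mediabot"
    if "Googlebot" in agent:
        return "googlebot"
    return None

def distinct_pages_by_bot(lines):
    """Map each Google crawler to the set of distinct paths it requested."""
    pages = defaultdict(set)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        bot = classify(match.group("agent"))
        if bot:
            pages[bot].add(match.group("path"))
    return pages
```

Counting distinct paths rather than raw hits matters here, because repeated fetches of the same page would otherwise inflate the crawl figure.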

Also your PR6 for your mainpage I wouldn't give much credit that google indexes all your pages

Google has indexed thousands of pages, and there is no reason for it not to continue doing that. In fact it crawled the 200,000 pages mentioned (and that's why I posted; we are waiting for these pages to show).

What you should do is get some PR5 or PR6 inbound links

We have links from PR6-7 sites...

Arnett

6:06 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



"Only" a month to show crawled pages?

I added 5,000 keyword optimized static html pages to my site during the period between June & August. The pages were not spidered until September and are now only beginning to show in searches.

Before Dominic/Esmeralda, pages with a PR4 or higher were spidered monthly and updated monthly. Pages with a PR less than 4 were spidered and updated quarterly. I know this because I put an SSI date call in the footer of my pages and check the cached copy to see when the page was spidered.
I'm not saying that this schedule has changed since the new "rolling" update or that it hasn't.
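The date-stamp trick can be checked mechanically. The sketch below assumes the footer is stamped in the form "Served: YYYY-MM-DD" (on Apache, for instance, by an SSI pair such as `<!--#config timefmt="%Y-%m-%d" -->` followed by `<!--#echo var="DATE_LOCAL" -->`); the label, the date format and the function names are illustrative assumptions, not the exact markup used in this post.

```python
import re
from datetime import date

# Hypothetical footer format: "Served: YYYY-MM-DD", written on every request.
STAMP = re.compile(r"Served: (\d{4})-(\d{2})-(\d{2})")

def crawl_date(cached_html):
    """Return the date stamped into a cached copy, or None if no stamp is found."""
    match = STAMP.search(cached_html)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)

def days_since_crawl(cached_html, today):
    """Days between the stamped fetch date and `today`, or None without a stamp."""
    stamped = crawl_date(cached_html)
    return None if stamped is None else (today - stamped).days
```

Because the stamp is generated when the page is served, the date in the search engine's cached copy is the day the spider fetched it, not the day the file was last edited.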

Google has been very slow to bring in regular and timely updates to backlinks and they are the backbone of PR. Check the PR of your new pages. If the PR is less than 4 you may have to wait until all the backlinks are calculated and updated before your new pages become part of the rolling update schedule.

kaijohannkursch

6:21 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



Some really helpful posts here, but you're knocking them all back!

Sorry, but I did not find the posts helpful until the last messages (not only unhelpful... they were not answering what I asked at all).

But the last messages help me to understand what might be happening.

Thanks to Asinah and especially Arnett.

Regards.

onedumbear

6:36 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



"If you want to sticky me your url i would be happy to give you some critiques on your website and a few other pieces of information that don't matter to you"

seriously though
I really would not worry about it. There are lots of changes going on everywhere including google. Patience is one of the most important webmaster "tools", imho. Just keep building your sites and pages.
Sometimes my pages are spidered and then indexed within a few days, sometimes it takes weeks after spidering, and i have a couple that have waited about 6 weeks.

I believe that the way you introduce a new page to your sites also has something to do with the speed of spidering and appearance thereafter. As many people have mentioned PR can affect this too, but i doubt that's your problem.
Really would not worry though.

Net_Wizard

8:31 pm on Oct 13, 2003 (gmt 0)



This is what happened...

Googlebot 'fetches' your pages, it doesn't know what those pages are yet but took them anyway and stored in a raw master index. Which in turn...

Each URL has to wait its turn to be analyzed, graded, PRed, and ranked accordingly. In short, the actual indexing process.

If these are new URLs, never before in the Google index, then the URLs have to run the whole gamut of the so-called 100 factors, and this could take a long time to process.

Be patient; little by little the processed URLs will be included in the various data centers.

If there is no duplicate content then consider yourself lucky that Google actually crawled 200,000 pages.

Cheers

kaijohannkursch

9:09 pm on Oct 13, 2003 (gmt 0)

10+ Year Member



Net_Wizard, I think you are wrong.

Pages don't need to be PRed to be indexed. In fact, there are a lot of pages indexed without PR; as a matter of fact, not only without PR (something Google recalculates approximately each month), but also without backlinks being counted (in recent months this has occurred more frequently than the PR update).

Let's go further. There are indexed pages whose content Google does not even know (those displaying only the url as title and no content at all; surely you have noticed them).

About the 100 factors... they are calculated in REAL TIME when you search for any given term. Obviously Google does not calculate these factors in advance for each of the millions of different search terms a single page may target (think about all the possible combinations).

If Google needed to do that kind of operation on the crawled pages, it would have displayed them gradually. There would be no need to release them in bulk.

Onedumbear, thanks for your message. It's pleasant to know this does not happen only to us.

This 48 message thread spans 2 pages.