Forum Moderators: Robert Charlton & goodroi

Pages Dropping Out of Big Daddy Index

Part 2

         

GoogleGuy

7:59 pm on May 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Continued from: [webmasterworld.com...]


internetheaven, you said:

I had 20,300 pages showing for a site:www.example.com search yesterday and for the past month. Today it dropped to 509 but my traffic is still pretty constant. I normally get around 4,500 - 5,000 to that site per day and today I've already got 4,000.

So, either Google doesn't account for even a small percentage of my traffic (which I doubt) or the way Google stores information about my site has changed. i.e. the 20,300 pages are still there, Google will only tell me about 509 of them. As far as I can tell, I think the other pages have been supplemented.

That resonated with something that I was talking about with the crawl/index team. internetheaven, was that post about the site in your profile, or a different site? Your post aligns exactly with one thing I've seen in a couple of ways. It would align even more if you were talking about a different site than the one in your profile. :) If you were talking about a different site, would you mind sending the site name to bostonpubcon2006 [at] gmail.com with the subject line of "crawlpages" and the name of your site, plus the handle "internetheaven"? I'd like to check the theory.

Just to give folks an update, we've been going through the feedback and noticed one thing. We've been refreshing some (but not all) of the supplemental results. One part of the supplemental indexing system didn't return any results for [site:domain.com] (that is, a site: search with no additional terms). So that would match with fewer results being reported for site: queries but traffic not changing much. The pages are available for queries matching the supplemental results, but just adding a term or stopword to site: wouldn't automatically access those supplemental results.

I'm checking with the crawl/index folks if this might factor into what people are seeing, and I should hear back later today or tomorrow. In the meantime, interested folks might want to check if their search traffic has gone up/down by a major amount, and see if there are fewer/more supplemental results for a site: search for their domain. Since folks outside Google couldn't force the supplemental results to return site: results, it needed a crawl/index person to notice that fact based on the feedback that we've gotten.

Anyone that wants to send more info along those lines to bostonpubcon2006 [at] gmail.com with the subject line "crawlpages" is welcome to. So you might send something like "I originally wrote about domain.com. I looked at my logs and haven't seen a major decrease in traffic; my traffic is about the same. I used to have about X% supplemental results, and now I hardly see any supplemental results with a site:domain.com query."

I've still got someone reading the bostonpubcon email alias, and I've worked with the Sitemaps team to exclude that as a factor. The crawl/index folks are reading portions of the feedback too; if there's more that I notice, I'll stop by to let you know.

[edited by: Brett_Tabke at 8:07 pm (utc) on May 8, 2006]

Pico_Train

5:03 pm on May 9, 2006 (gmt 0)

10+ Year Member



I've been wondering what you guys are talking about regarding pages disappearing from the index and today, for kicks and out of curiosity, I did a check on my sites.

Guess what? I've joined your club. Gone from 150 pages or more to roughly 50 pages.

Great stuff.

Pico_Train

5:07 pm on May 9, 2006 (gmt 0)

10+ Year Member



Relevancy,

The site is quite new. Less than 1 year old, launched in August last year so close to 9 months old.

Some directory links, of course, but others have been obtained as well.

tigger

5:14 pm on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>EVERYONE: Are the sites/domains that are dropping pages fairly new?

12mths

>>Have you used link directories as your main source of link development?

nope, mostly one-way links from friends

but a friend is experiencing the same problems; his site is 3 years old, but I don't know about his links

Relevancy

5:26 pm on May 9, 2006 (gmt 0)

10+ Year Member



Tigger,

Do you know if your friend has changed his registrar info recently? If so, the domain can then again be seen as newish.

Plus, even if you don't get directory links, that doesn't mean there isn't a web of directories with your friends' sites, and therefore the links from their sites are not as powerful anymore.

[edited by: Relevancy at 5:31 pm (utc) on May 9, 2006]

RibaRiva

5:29 pm on May 9, 2006 (gmt 0)

10+ Year Member



Relevancy:
My site is 16 months old--all hand-written, no dupes, no re-directs, nothing wrong I can think of. Pages started peeling away about a month ago. The dropped pages were all added within about a three-month period starting in December, but not all pages added during this period were dropped. Pages that link to my home page were not dropped. On some servers, they've begun coming back. My default server now shows 167 pages out of 789 (including forum posts). The "new results" servers mentioned earlier claim to have 789 results but I can only get to about 165 of them and the last few are supplemental. Traffic has declined only slightly but it should be shooting up at this time of the year. This is such a disheartening mess. I want to add new content but what's the point if nothing is going to show up.

F_Rose

5:31 pm on May 9, 2006 (gmt 0)

10+ Year Member



For the past few weeks we had loads of supplemental results; G listed old pages that had either 301 redirects or 404 error pages.

As of today G got rid of our supplemental results.

However, they are not indexing most of our pages.

Whichever way I do site: (w/ or w/o www), it comes up with 24 pages (as of today).

Now that G has got rid of the supplementals, should I expect them to start indexing more pages of our site?

dramstore

5:37 pm on May 9, 2006 (gmt 0)

10+ Year Member



All my pages that are going are new ones within the last 4 months too

tigger

6:38 pm on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Do you know if your friend has changed his registrar info recently? If so the domain can then again be seen as newish

nothing has changed

>The dropped pages were all added within about a three-month period starting in December

that's about the only thing I'm picking up from this: the dropped pages are pages that have been added over the last 4 months, and that applies to both our sites. It's almost like anything newish has been dropped

wheelie34

6:49 pm on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Which shows it could be a rollback to pre-March 06, as discussed in the Big Daddy thread. My missing pages are also approx 4 - 6 months old. I have seen an improvement over the last 3 days in pages indexed; is anyone else getting any back?

I hear some sites are only now starting to lose pages; mine have been gone for almost 2 months! They have been up and down over the last month, about 5% each way. Currently on a high of 432; last week it was 375; it should be 990 ish.

Seems like a stage thing to me, our moans and groans have been the same, just at different times over the last 2 months or so.

Spanish_eye

6:54 pm on May 9, 2006 (gmt 0)

10+ Year Member



I have a 2 year old site and most of my deeper pages have not been indexed since the start of BD (lost about 1800 pages). Oddly though, it ranks better than ever for the pages that are indexed!

Another site, 8 months old, pages gone from 60,000 down to 252. Pages are coming back at a rate of 10 a week...at this rate I might have to get a job!

As I don't want to get a job, I have been doing some testing. I find that I can get small new sites to rank well within a week and stay there. OK, I would have to make a lot of new sites to get over this Google indexing problem, but I've got to do something. How can new sites have pages indexed before established sites? It just doesn't make any sense.

Come on Google, sort it out.

Spanish_eye

7:03 pm on May 9, 2006 (gmt 0)

10+ Year Member



Just wanted to add that although my main site ranks better than ever (with fewer pages), my home page, which used to be indexed every day, now only gets indexed every two weeks! Something is definitely VERY wrong here!

dramstore

8:04 pm on May 9, 2006 (gmt 0)

10+ Year Member



Someone mentioned this before, but it could be a sandbox type of thing, sandbox being the delay to calculate (or recalculate in this case) new pages since the starting checkpoint.

If that's the case, hopefully they're doing it quickly!

g1smd

9:21 pm on May 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Not only are some sites having fewer pages appear in the index (these are the "experimental" and "cleanup" datacentres as far as I can tell), but some sites are also falling victim to Google showing the same snippet for every page of the site in a site: search (instead of showing the meta description, or whatever) and hence getting a result like 1 to 3 of about x000. Previously a result like that would be an indication of a duplicate content problem, but in this case I guess that Google is just working on old data for the snippet. It has long been apparent that the data for the indexing and ranking, for the snippet, and for the cache itself all come from separate databases.

The "same snippet for every page" appears to be happening in most (if not all) datacentres.

gcc_llc

12:32 am on May 10, 2006 (gmt 0)

10+ Year Member



Interesting. I've had about 20k pages reindexed on half of the DCs. Numbers are actually going up, something that hasn't happened in about 4 weeks.

gcc_llc

12:33 am on May 10, 2006 (gmt 0)

10+ Year Member



I also saw, one time while doing a search, a section on the side (left) breaking down where my pages were indexed. It had Web, Froogle, Images, etc. with status bars showing where the majority of my pages were indexed. I saw this once but I couldn't recreate it again.