The pages haven't been totally removed; instead, it seems that they still exist in the Google SERPs but have no title and no snippet, and therefore no longer appear for any searches.
At first I thought this was some sort of penalty / filter to remove some of the controversial search sites from its index, but it seems this applies to other large sites, e.g. dmoz. I would estimate that dmoz has had around 200,000 pages "nuked".
Has anyone noticed this phenomenon on any other sites?
As I have said before, I think Google is moving large sites, perhaps those with a poor linking structure or low PR (or a combination of the two!), to the supplemental index.
Obviously non-auto-generated content could also be affected, but I believe it's a measure put in place to stop webmasters from filling the index with auto-generated rubbish.
I could of course be completely wrong...
I think there are three different factors involved in this No title/no snippet problem.
1. Penalty - (as experienced by certain sites that link to their own serps), this one is possibly manually applied by the sites being removed from the crawl.
2. Penalty - where pages are just full of affiliate links - probably applied by the algorithm.
3. Change in crawl depth meaning that large sites are having problems having every page indexed - these sites need to increase backlinks but there is probably not a penalty.
I think I had a site with factor 3 and I have increased backlinks and pages are now showing.
That is basically the same symptom as I described and I agree with you. Not enough backlinks / PageRank is not high enough.
Smaller sites with a moderate PR and good link structure generally have a decent level of PR throughout the entire site. These are unaffected...
My problem started a little over a month ago when my 10,000 page site started to show symptoms of pages without titles/descriptions.
Since then things have become progressively worse, with my Google-indexed pages now below 4,000 and the majority of these pages missing titles/descriptions.
Knowing that a lot of people are in the same boat as me, I have been waiting patiently for a 'Google fix', but now I'm getting nervous as my site is sinking into Google oblivion.
The only advice I can offer is to contact Google. I did this 13 times over a six week period and eventually what I describe above happened. I laid it on the line explaining that the problem was killing my business. I got several replies asking me to "be patient" and telling me that my request had been passed to their engineering department, etc.
Because they will not comment on specific problems (even if they take action), I cannot say for sure that this helped, but my gut feeling is that it did. When I say "helped" I must emphasise that the net effect of what has happened is that I am no further forward traffic-wise and I have lost three months of business as a result. But I now at least have some hope.
Incidentally I am a consultant who works from home and I have several links into my site from reference sites in my industry. The traffic from these has kept me afloat. This has taught me a very valuable lesson about who my friends really are.
You need to give it time though, which, if your income is hurting, is obviously not a very helpful suggestion, but it is what we're suffering ourselves just now.
So far, only one thing makes me a bit nervous: I have many double listings due to Case-InsensitivitY.htm / case-insensitivity.htm. In the past Google has been clever enough to merge such double listings. I don't know if it has lost its cleverness in favor of a penalty obsession.
However, I've read most of this thread and so far I haven't found a possible hardware / software cluster issue mentioned. After reading Matt Wells' interview [webmasterworld.com] I've learned that big search engines with server clusters and distributed indices use something called title records that hold the title and the cache copy of each URL.
Could it be possible that the URL-only listings appear because the title records for them are not available - due to a server error, a server update, hardware expansion or simply a partial update of the index? I've read somewhere on the WebmasterWorld ocean that Google is expanding its hardware cluster. Or could it be possible that the crawler simply works faster than the indexer and therefore the title record indexing is a bit behind?
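On the double listings mentioned above: a rough way to see how many case-variant duplicates a site has is to group its URLs by a lowercased host and path. This is only a sketch; the function name `find_case_variants` and the example.com URLs are made up for illustration, and it says nothing about how Google itself merges such listings.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_case_variants(urls):
    """Return groups of URLs that differ only by letter case in host or path.

    Hypothetical helper for illustration: it only compares strings and
    does not check whether the server really serves the same page.
    """
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Lowercase host and path so case variants share one key;
        # the query string is left untouched.
        key = (parts.netloc.lower(), parts.path.lower(), parts.query)
        groups[key].add(url)
    # Keep only keys that more than one distinct URL maps to.
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


# Made-up URLs, e.g. taken from a crawl log or a site: listing.
urls = [
    "http://www.example.com/Case-InsensitivitY.htm",
    "http://www.example.com/case-insensitivity.htm",
    "http://www.example.com/other.htm",
]

for group in find_case_variants(urls):
    print(group)
```

If this turns up many groups, linking internally to a single spelling of each URL is one way to stop feeding the duplicates to the crawler.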
A couple of days ago Google did a very deep crawl and today, surprise surprise, my titles and descriptions have been restored. Not all of them, about 350 pages so far, but that's a start. Go Google :)
BTW, because I got so despondent with Google I didn't update the site.
4. There's no description of my site.
The Google index contains two types of pages--fully indexed and partially indexed pages. Your page is currently partially indexed, which means that although we know about your site, our robots have not read all the content on your page(s) in past crawls. This does not adversely affect your PageRank or your inclusion in our index. It does mean that we don't 'know' what to call your page, so it gets listed with the URL as the title and no description.
We appreciate the frustration this causes webmasters who work hard to make their sites accessible to users. We are working to increase the number of fully indexed pages in our search results to alleviate this problem.
What if this isn't a storage issue so much as a transfer issue? What if their margins have got so low that they need to limit what they spend on bandwidth?!
Wouldn't this also complement their upcoming IPO? It could be for the extra money required to keep Google on top!