cyberax, I think that's the big question of the moment.
When people start reporting something new, I tend to assume it's just some peculiarity with their site. When people claim "New Penalty!", I tend to assume that they've missed some other problem.
This probably isn't just one thing. Google's quest for freshness has at times reduced crawl depth (anyone who remembers the Big Switch last Summer will know what I mean). Still, there are sites/pages that don't fit the normal answers (in this case, get more PR/deeplinks).
If there's a change of crawl-depth habit at the same time as a new crawl-depth-reduction penalty, then this is another goal for Google in the battle to stop people from understanding its mechanics too well.
Yep, I am going to post on this one too.
I had a deepish crawl for one of my sites Friday night and the new serps have updated this morning reflecting this. I know exactly which pages were taken as I cache each page on the server after being viewed and I had only just cleared this cache.
This morning's serps reflect the new pages which were visited, and then I have the no title/no description syndrome on lots of other pages that were not visited (replacing what were full title/description listings before these serps appeared).
As has previously been posted, this tends to happen when Google knows a page exists but has not indexed it! - Well it looks like Google knows of a lot of pages at the moment but is not crawling them :(. (On my site I personally think that this is because my site in question is still being crawled and Google updates the Serps so quickly now - before the site crawled is complete.)
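For anyone wanting to replicate that cache trick, here's a rough sketch of the idea in Python - cache every page view with its user-agent, clear the cache, and whatever reappears with a Googlebot user-agent is what got crawled. All the names here are invented for illustration, not the poster's actual code:

```python
import time

page_cache = {}  # url -> (timestamp, user_agent)

def record_view(url, user_agent):
    """Cache the page view; called from the page-serving code."""
    page_cache[url] = (time.time(), user_agent)

def clear_cache():
    """Wipe the cache so only post-clear visits remain."""
    page_cache.clear()

def pages_crawled_since_clear():
    """URLs fetched by Googlebot since the last cache clear."""
    return sorted(url for url, (_, ua) in page_cache.items()
                  if "Googlebot" in ua)

# Usage: clear, then let traffic (including bots) repopulate the cache.
clear_cache()
record_view("/page-a", "Mozilla/5.0 (Windows; U)")
record_view("/page-b", "Googlebot/2.1 (+http://www.googlebot.com/bot.html)")
print(pages_crawled_since_clear())  # -> ['/page-b']
```

Comparing that crawled list against which pages kept their title/description in the new serps is what lets you line up the crawl with the index update.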
But for this to happen with DMOZ and WW - how many more backward links are required?
When I see a problem again and again and again - I don't just start thinking problem, I start thinking underlying technical problem.
Because where there are technical problems, resources get spread thinner (most problems, if you think about it, come down to how far you can spread your resources).
IMHO, FWIW, IMO, etc.,etc.,e.,(latter e stands for etc. BTW ;)
G has had crawling problems for months. IMHO of course (IMHO used to save electricity :)
"When I see a problem again and again and again - I don't just start thinking problem, I start thinking underlying technical problem. "
Google is maybe refreshing its index and some sites get caught in the middle for a few days.
The more you look around the more evidence you can see of this :(
I have seen some big shopping search sites affected, and perhaps even a big UK bookseller's site (trying to decide if they are affiliate urls or proper urls - there definitely seem to be a fair number of proper urls).
They could always buy more machines to crawl. Maybe the problem they're having, if there is a problem, is updating the index continuously if they have too much data.
Come now, come now - how could they possibly have a problem!
Is it possible? Get real ;) They've been having problems for 6 months or more, and the evidence is on these boards.
>They've been having problems for 6 months or more, and the evidence is on these boards.
I haven't been following the threads. Same conclusion I guess.
Jakpot: as you know, this 'no title/no description' has been going on for months.
It's like taking your pedigree dog back to the breeders saying:
"look, but it's got a leg missing!" and them saying, "well at least it's got three"
Fair enough, you think.
Then you go back next week, and say:
"but look, it's got a leg missing!" and them saying, "well at least it's got three"
So you go back next week...
There's a danger the dog will be dead before someone fixes its leg.
Either that, or the breeder going out of business ;)
It seems to me that they wouldn't put all these abbreviated listings in the SERPS if they could help it. It wouldn't make sense to punish searchers who could use the additional information.
This was first seen in Oct 2003, cannot find the link to that thread but it was long.
For this problem to still be around, and being noticed more and more frequently now, isn't it strange that GG has never commented on this?
How many more sites do we have to see before it's confirmed that Google has a problem getting these sites back in the index?
ok, here is a theory.
-The no title/description occurs when a page has not been updated for a period of time.
-In the last month we have seen many webmasters stating they have seen little of the googlebot (myself included); it seems high PR sites (6+) are getting crawled more and (5-) are getting crawled a lot less.
Thus, as a result of less crawling, the lack of title/description occurs more.
Google would have no response to this as it is a result of them cutting costs in crawling (not something to boast about).
Just a thought.
>result of them cutting costs
Computers have never been this cheap, and I'm sure that they can pay for the bandwidth with AdSense, etc. Of course, they could be getting froogle.
Problem? Sounds like a good way to eliminate spammy sites, affiliate sites, and non-organic sites to me.
|Sounds like a good way to eliminate spammy sites, affiliate sites, and non-organic sites to me. |
Which way? And I know some BIG and clean sites out there that experience the same thing.
|-The no title/description occurs when a page has not been updated for a period of time. |
Seeing that on a site where about 7-10% of the content changes on a daily basis. (statistics, news, ...)
|-In the last month we have seen many webmaster stating they have seen little of the googlebot (myself included), it seems high pr sites (6+) are getting crawled more and (5-) are getting crawled alot less. |
The BIG sites mentioned above have PR 6-8 which makes me think that there must be another reason.
And I personally don't think that it's a problem, because I've heard from somebody playing with SPAM (PubCon people know who I'm talking about) that this kind of penalty exists.
|And I personally don't think that it's a problem, because I've heard from somebody playing with SPAM (PubCon people know who I'm talking about) that this kind of penalty exists. |
But isn't it affecting a lot of quality sites?
Yep, that's why I'm so curious about it. It must be some common practice for big sites that they penalize, but they might filter the good ones manually. That's another thing I've heard at PubCon: there will be more manual SPAM fighting in the future ....
Saying that, I'm really curious how this penalty can still exist. Maybe the BIG (brand) sites have people that can talk to Google about that and the big sites simply lose. No idea ... sorry.
The one site I pay a lot of attention to that is afflicted with this problem is easily the #1 site in its niche, and they use a load-sharing tactic of making www.site.com, us.site.com, uk.site etc., which leads to literally millions of pages of duplicate content and, I'm sure, one very confused Googlebot.
I'm in a hurry:
No duplicate content/sourcecode and no load balancing used.
I couldn't find anything they have in common so far, unfortunately.
>>>Problem? Sounds like a good way to eliminate spammy sites, affiliate sites, and non-organic sites to me.
I used to think this too. However, with more and more sites being affected, including good quality sites, it is starting to point more towards being a bug. :(.
|Problem? Sounds like a good way to eliminate spammy sites, affiliate sites, and non-organic sites to me. |
My consultancy site, which is information based, has been afflicted with this for nearly two months now. I have had no useful response to my requests for assistance. The content on my site does not change on a regular basis but it is an authority site.
Surely sites cannot get punished for not changing their content regularly? I would say that sites with content that does not change frequently would be more likely to be authority sites. Otherwise the Bible, the Koran and the Encyclopaedia Britannica would all have been taken off the bookshelves years ago. Content does not have to change regularly to be useful.
|Content does not have to change regularly to be useful. |
Agreed - that's why at the start it may have looked like it was a spam filter kicking in, but when you take a look at the broad range of sites that are affected - good, clean, no spammy techniques to be seen - this points more and more towards a bug that's growing.
Do you mean DMOZ is not a clean site, since a lot of URL-only listings for it show up in the G SERPs? From what I can see of DMOZ's listed urls in Google, it's the categories containing commercial sites that get the URL-only listings. If a category is for books, guides, or informative sites, it is fully listed in G with title & description, but when it comes to hotels, travel, tourism, or other services, that's where the problem starts.
Any expert comments.
It is not just a big site problem (Obviously on big sites it is more noticeable).
Just checked some smaller sites and there seem to be loads of sites affected.
My work colleague, who only has a site which he has done for fun (200 pages) and which is not at all spammy - he doesn't even know how to optimise (so no fear of an OOP) - has lots of pages showing without title/description too.
Noticed some other members' well-known sites have also got this problem, but they are not posting - are they concerned too, or just waiting to see how this shakes out?
Here's my take on this.
Google introduces an algorithm that favours authority sites / portals etc. Google knows that a side-effect of this algorithm change is that every optimiser on the planet will start creating portals purely for the sake of getting top ranking under a wide range of search terms. It is also aware of these "search engine results" sites that benefited greatly from the algorithmic change.
So, Google introduces a filter whereby in large portal sites it decides to disregard pages that fall into this filter (low PageRank, too little content, too few links, too many links etc etc). It still indexes them as they are unique and need to be indexed but once the filter kicks in it retains no further information on them, including a cached copy of the page, which results in them not performing under any search terms.
This makes a lot of sense to us, particularly since we have many examples of literally thousands of pages that have been fully indexed and other examples of pages that have not been fully indexed and have been that way for quite a few weeks.
If this is correct, the big question is, what implications does it have for someone who is listed in a small dmoz category that has received this treatment? Will they still get the benefit of the link or is it now disregarded?
The other possibility is of course that Google is broken!
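Purely to make the filter theory above concrete, here's a toy sketch of how such a "retain the URL, drop everything else" filter might behave. Every signal name and threshold here is invented for illustration - nobody outside Google knows the real rules, if any exist:

```python
def url_only_listing(page, site_page_count):
    """Under this toy model, return True if a page would lose its
    title/description and cached copy while staying in the index."""
    if site_page_count < 15_000:        # assume only large portals are hit
        return False
    signals = [
        page["pagerank"] <= 3,          # low PageRank
        page["word_count"] < 100,       # too little content
        page["inbound_links"] < 2,      # too few links pointing in
        page["outbound_links"] > 100,   # too many links going out
    ]
    return sum(signals) >= 2            # two or more strikes -> filtered

# A doorway-ish page vs. a real content page on a 50k-page site:
doorway = {"pagerank": 2, "word_count": 40,
           "inbound_links": 0, "outbound_links": 150}
article = {"pagerank": 5, "word_count": 900,
           "inbound_links": 12, "outbound_links": 30}

print(url_only_listing(doorway, 50_000))   # -> True
print(url_only_listing(article, 50_000))   # -> False
```

If something like this were running, the question about a dmoz category link still counting would hinge on whether the filtered page's outbound links are kept or discarded along with the title/description - which the theory above leaves open.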
However, small sites are affected too.
Leaving the only conclusion that Google is broken!
At small sites the official reason (http://www.google.com/help/interpret.html#P) might be true.
I'll try to define sites:
small: < 500 pages
big: 500 - 15k pages
very big: 15k - 100k
BIG: > 100k
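The buckets above, as a quick helper (boundaries as I read the post - upper limits inclusive at 15k and 100k, which the definitions leave ambiguous):

```python
def site_size(pages):
    """Classify a site by page count per the buckets defined above."""
    if pages < 500:
        return "small"
    if pages <= 15_000:
        return "big"
    if pages <= 100_000:
        return "very big"
    return "BIG"

print(site_size(200))      # -> 'small'
print(site_size(50_000))   # -> 'very big'
```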
|It still indexes them as they are unique and need to be indexed but once the filter kicks in it retains no further information on them, including a cached copy of the page, which results in them not performing under any search terms. |
Interesting theory... if that's the case though it would appear that the filter has gone wild because it has removed lots of "good content" pages as well, such as pages from WebmasterWorld.