I too may have found my problem. Several months back the server I had moved to crashed, so we used an older version of the site to get it back up on another server we had moved from, pointed DNS at the older server's IP, then uploaded all the changes to the site to this server. Well, I forgot that some files I had deleted to keep our asp pages from displaying were still on this server. We had gone htm
I was going through my supplemental results just now, did a site:www.mydomain.com and was looking for clues as to the reason, and dubus head saw mydomain.com/products/aspid=222
Shoot. Look at your site hard; it was my fault and Google is correct. I just hope it will correct itself, as I have deleted all the files now so they are dead pages.
ScottD - Not seeing meta tags as the big daddy W'sup problem. Have 2 sup hell sites with ~5000 pages mostly all have unique meta descriptions and keywords - with unique content.
OK, put on your tin-foil hats, gather up, and listen carefully. I'm only gonna say this once! ;)
|It is reassuring to know we are not the only ones this has happened to, and all theories, however crackpot they might seem, are welcome because the more we all chip in the more likely it is someone will have that Eureka moment and the bits will fall into place |
The key thing to remember is that BigDaddy is being deployed to resolve canonical problems.
I suspect sites are going supplemental down to the home page to identify the canonical domain. Then the site is respidered and rebuilt in Google's index using "new infrastructure," revised code that better handles canonical issues.
Many here have reported:
1. A page count drop and/or pages going supplemental.
2. Until the home page is the only page indexed.
3. Heavy spidering.
4. Pages coming back.
5. Site fully indexed.
Google has 20-25 billion pages indexed so this sequence is gonna take some time to repeat on all of them. They can only process "a few" sites per day. Perhaps one percent? And they're doing this on "live" DCs because it's the only place they have enough storage.
I don't follow DCs or try to monitor BigDaddy but I saw traffic tank and then resume on two well established white hat sites in January. The full cycle took about three to four weeks and the sites are now back, fully indexed, and doing as well as ever. One uses a Google site map, the other doesn't.
I'm uncertain about how Google decided which sites to process first or decides which to process next. Any ideas?
Bottom line, when the BigDaddy update hits your site, just sit tight and ride it out. And PLEASE, try not to whine too much, OK? :)
"I suspect sites are going supplemental down to the home page to identify the canonical domain. Then the site is respidered and rebuilt in Google's index using "new infrastructure," revised code that better handles canonical issues."
Good post my man. Could well be.
Some Kind of Shadow Site?
Background - BD HP Good, rest supplemental, default google All Good,
I noticed a supplemental result that looked like [mydomain.com...]
clicked on that it resolved ok but without style sheets, clicking on the menu links I got
All resolving to real pages but with // in the url instead of a single slash.
Looked deeper, and a lot (not all) of the supplementals are // based.
Looking at the site live with an editor and searching I did not find any url or links with a mysite.com//
Any thoughts on how to get rid of those references in the index, and why they point to resolving pages?
If you are curious or think you can help PM for the url.
You can set up a 301 redirect on your site to redirect those mashed URLs over to the correct one whoever asks for them, and then forget about it.
Make sure that all your non-www accesses are 301 redirected to the www version of the same page, site-wide.
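As a rough illustration of the non-www to www rule described above, here is a minimal Python sketch of the decision logic only. It is not tied to any particular server; `CANONICAL_HOST` and `canonical_redirect` are hypothetical names, and `www.mydomain.com` is a placeholder for your own host. On a real site the same rule would live in whatever your server offers (a rewrite rule, or script code at the top of each page).

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: placeholder for the one hostname you want indexed.
CANONICAL_HOST = "www.mydomain.com"


def canonical_redirect(url):
    """Return (301, target_url) if url is on the non-www host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == CANONICAL_HOST:
        return None  # already canonical, serve the page normally
    if "www." + host != CANONICAL_HOST:
        return None  # some unrelated host, not covered by this sketch
    # Keep path and query unchanged so every page maps to its www twin.
    # Note: any port number in the original URL is dropped here.
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )
    return 301, target
```

The point of redirecting page-for-page, rather than sending everything to the home page, is that each non-www URL then points at exactly one www URL.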
Three useful searches:
ONE of those should return ZERO results.
[edited by: g1smd at 11:24 pm (utc) on Mar. 3, 2006]
" You can set up a 301 redirect on your site to redirect those mashed URLs over to the correct one "
I am on a windows shared server, I have done asp 301 redirects on individual pages for non www and other issues, but there I had a real page on the site to add the code to. Here I do not have a page with mydomain.com//page.htm so where do I put the redirect?
watercrazed, I had this issue a while back and I resolved it with a base href; the other options are absolute links and mod rewrite.
A few people have had this, and it may be that someone is trying to break your site apart with a link pointing to .com\\
However given what DaveAtIFG has just said you might want to wait and see if the canonical issue is resolved by Google for your site.
oof that was quick g1smd!
301...I'd look at the global.asa...but it might be easier to wait or set up base hrefs depending on the number of pages/urls affected
[edited by: tantalus at 11:36 pm (utc) on Mar. 3, 2006]
>> Here I do not have a page with mydomain.com//page.htm so where do I put the redirect? <<
Yes you do, I thought that you said that the URL serves content. (If it does not serve content, and simply gives a 404 then ignore it, it will go away on its own soon enough).
So, add a detector and a redirect on the page that says "If we are at the '//' URL then 301 redirect to the '/' URL".
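A minimal sketch of that "//" detector, written in Python purely to show the logic (the poster is on ASP/IIS, so the real thing would be a few lines of script at the top of the page). The function names `collapse_slashes` and `slash_redirect` are made up for this example, and it looks at the path only, not the query string.

```python
import re


def collapse_slashes(path):
    """Collapse any run of two or more slashes in a URL path into one."""
    return re.sub(r"/{2,}", "/", path)


def slash_redirect(path):
    """Return (301, clean_path) if the requested path contains '//', else None."""
    clean = collapse_slashes(path)
    if clean != path:
        return 301, clean  # we are at the '//' URL: redirect to the '/' URL
    return None  # already clean: serve the page as usual
```

For example, a request for `//page.htm` would be answered with a 301 to `/page.htm`, while a request for `/page.htm` is served normally.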
Of course, I should have guessed this was some sort of IIS screwup; you can also use ISAPI_rewrite to fix this (but there is a 99% chance that your host will have no idea what you are talking about, and a 99.9% chance that they won't let you use ISAPI or have any other sort of fix installed [Hint: next time, use an Apache webserver if you can]).
[edited by: g1smd at 11:32 pm (utc) on Mar. 3, 2006]
Good post by DaveAtIFG.
All I can say is
STEP AWAY FROM THE COMPUTER.
Go talk to some real people. Have a beer. Come back Monday.
You can't do anything about it anyway. Strange things have been happening for two years - white hat sites dropping 100 places and then coming back a month later, sites not even found for their own unique names, 302s, Canonical problems, duplicate content filters.
Google will do what Google will do. I hope they sort out all the glitches. At least big changes mean that something is happening - they have been in a mess for a long time now.
By the way one casualty of all this is if you use Google Search on your website. No pages in the index means people using that search button won’t find a thing. Personally, I would have switched to MSN or Yahoo if I knew it would be this bad. Horrific roll out if you ask me.
DaveatIFG makes sense.
However, we have only our index pages listed and we are not seeing any spidering today. Time to be concerned?
" By the way one casualty of all this is if you use Google Search on your website. No pages in the index means people using that search button won’t find a thing."
Yep and this is one of the reasons we removed the google search box.
This search on 18.104.22.168
site:www.res ource-zone.com -inurl:showthread.php -inurl:forumdisplay.php
gave a few "normal" results (as expected) followed by several hundred URL-only results (as expected - they are URLs excluded by robots.txt and are slowly being delisted),
BUT, when I clicked on the "Search for English results only." link at the top left of the page, I suddenly got 22 000 results and all except the very first three were marked as supplemental results. These results are all pages that are disallowed in robots.txt and in a normal search have already dropped out of the index.
As soon as you specify "English only" they all re-appear as Supplemental Results (proof yet again, that Google does not delete old data, but merely hides it from search results, except that numerous glitches bring it back in some searches when it should still remain hidden).
Something is in the works, and I have no idea what it is.
The &sa=N parameter also changed to &sa=X&oi=lrtip7 in the search URL too.
|King of Bling|
Good theory. We've ridden out storms before. Let's hope it's as smooth as you predict :-)
"...but why are you burning the Rum?"
Also, searching on 22.214.171.124 with site:best of the home.c** -inurl:www I see many supplemental results. Some were old pages that no longer exist; however, many were pages that do exist and have been redirected by 301 from non-www to www for quite some time. I checked one of the results, best of the home.c**/metal/sun.html; it went directly to the correct www page and had a cache date of Feb 18, 2006.
Why would it still be shown as supplemental?
>>> The key thing to remember is that BigDaddy is being deployed to resolve canonical problems. I suspect sites are going supplemental down to the home page to identify the canonical domain. Then the site is respidered and rebuilt in Google's index using "new infrastructure," revised code that better handles canonical issues
DaveAtIFG, very sound reasoning...I'm very positive that your scientific prediction would turn out to be true!
I would agree that Dave's explanation sounds logical; however, I also think that is the problem with his theory: it just makes too much logical sense.
Thank you for giving us hope!
Do you mean that after the storm a site should automatically regain all its pages, instead of only the supplemental ones?
Or do we need to clean up all the "lost" pages on our hosting? Do we need a good cleaning?
Hi DaveAtIFG! then this would be happening to many more sites actually. Which would mean that all sites including WebmasterWorld would have to go supplemental until it was reindexed there. As this has only happened to us in our sector I think after some consideration that it is not what you suggested.
Could someone just explain what "canonical" means in this context - thanks.
BTW - While all this is going on one of my competitors is running riot with 4 mirrors, traffic going through the roof and no one in sight to stop them.
>The key thing to remember is that BigDaddy is being deployed to resolve canonical problems.
First up - No traffic loss here.
But, I have a problem with this theory, well actually not with the theory, but with the Google execution!
Why make all these pages sup for this reason?
Why not simply leave the results "as is" and then respider from the known root.....Google knows the root and could remove pages accordingly!
This is a bit like the guy that says he has a water leak somewhere in his house, and someone says the solution is to demolish the house and rebuild brick by brick to eliminate the fault!
I'm not saying that Google isn't doing this, just that it appears like a dumb way to fix the problem.
Listen to Dave ... unless of course you enjoy hand wringing "Google is broken" and "the sky is falling" theories ... then by all means, have at it!
Remember, he said he will only say it once. He is a man of few words and if I know Dave as well as I think I do, he will not post again. ;)
Dave touches on the theory I hinted at earlier and a theory I really want to believe........
Personally I am not sure about that theory - we will see.
>>>>>Hi DaveAtIFG! then this would be happening to many more sites actually. Which would mean that all sites including WebmasterWorld would have to go supplemental until it was reindexed there.
WebmasterWorld has not got a canonical problem so this would not have required a de-indexing and then possible re-indexing. Google should easily be able to tell which sites have canonical issues - a reindexing would be a very very good thing for these sites.
We will see.
>> I checked one of the results, and it went directly to the correct www page and had a cache date of Feb 18, 2006. <<
>> Why would it still be shown as supplemental? <<
As long as the www pages are indexed OK, don't worry too much about some non-www pages still showing as Supplemental. It takes Google one to two years to drop them from the index. It is a bug.
When you do a site:www.domain.com search you want to make sure that you do not hit the "In order to show you the most relevant results, we have omitted some entries very similar to the 2 already displayed. If you like, you can repeat the search with the omitted results included." message until most of your pages have been listed.
If that message comes up after only a few pages, say "1 to 5 of about 50" or "1 to 80 of about 800", then make sure that all of your titles, meta descriptions and page content are vastly different.
A "short" listing means the pages are filtered as being duplicate content. Having the same title and/or meta description on multiple pages is enough to trigger it.
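One way to check your own pages for repeated titles or meta descriptions is sketched below in Python, using only the standard library. This is an illustration under assumptions, not a statement of how Google's filter works: `HeadParser` and `find_duplicates` are names invented for this example, and the input is assumed to be a dict mapping each URL to HTML you have already fetched.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadParser(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() == "description":
                self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_duplicates(pages):
    """pages: dict of url -> html text. Returns (dup_titles, dup_descriptions)."""
    by_title = defaultdict(list)
    by_desc = defaultdict(list)
    for url, markup in pages.items():
        parser = HeadParser()
        parser.feed(markup)
        by_title[parser.title.strip()].append(url)
        by_desc[parser.description].append(url)
    # Keep only values shared by more than one URL.
    dup_titles = {t: urls for t, urls in by_title.items() if len(urls) > 1}
    dup_descs = {d: urls for d, urls in by_desc.items() if len(urls) > 1}
    return dup_titles, dup_descs
```

Pages with no title or no description at all are grouped under the empty string, so they show up as duplicates of each other too, which is probably what you want to know.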
I have four sites, all white hat. Three have a 301 from non-www to www; these are fine and not being hit. The one without the 301 has been stripped back to the home page and has been getting nailed by Google spiders.
Canonical problems are being sorted. Hold on to your hats and ride out the storm.
>> Could someone just explain what "canonical" means in this context - thanks. <<
If you read Post #400 in this other thread [webmasterworld.com] I think your questions will be answered. :-)
|Why make all these pages sup for this reason? |
You are not checking your results properly. Pages which deserve to be supplemental, as judged by Google's algorithm, are now showing as supplemental results.
The legitimate pages are the ones that dropped out of the index; go check again. The only legitimate page Google now has for affected sites is your homepage. Take Dave's word for it; he made the most plausible guess. There is nothing to panic about, just wait this out.
The only ironic thing is that Google should have tested these things before making the Big Daddy DCs live. They had a lot of time to do that. They shouldn't have affected millions of sites and their businesses by making huge changes in live datacenters.
>>They shouldn't have affected millions of sites and their businesses by making huge changes in live datacenters.
That's because Google now believes they are the web.
|King of Bling|
Dave may have a point...
Noticed Supp's returned variations of