|King of Bling|
Dave may have a point...
Noticed Supp's returned variations of
Yeah same here. https etc.
My sites that went supplemental are large affiliate datafeed-type sites. I thought I got whacked by duplicate content.
>>Noticed Supp's returned variations of
Try this one too:
I did some research and found many of my competitors that are also supplemental. Then I did server header checks on those domains and saw they all have one thing in common...they all have 301 redirects set up from non-www to www.
Could it be that when a 301 redirect is set up on a domain from non-www to www, it takes a few months for the G index to recognize this change, and in the meantime you go supplemental?
Just a guess...but I am trying to find something in common with everyone in this "club" so it can be resolved.
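The "server header check" mentioned above can be reduced to a simple decision: request the bare (non-www) hostname without following redirects, and see whether it comes back as a 301 pointing at the www form. A minimal sketch of that logic in Python — the hostnames are placeholders, not real domains from this thread:

```python
# Given the status code and Location header returned for a request to
# http://bare_host/, decide whether the domain 301-redirects non-www
# to www. The example hostnames below are placeholders.

def has_www_redirect(bare_host, status, location):
    """True if a request to the bare host was 301'd to its www form."""
    if status != 301 or location is None:
        return False
    # Compare against the canonical www hostname, ignoring a trailing slash.
    return location.rstrip("/").endswith("www." + bare_host)

# A domain with the canonical redirect in place:
has_www_redirect("example.com", 301, "http://www.example.com/")  # True
# A domain serving the bare host directly:
has_www_redirect("example.com", 200, None)                       # False
```

You would feed this function the results of a HEAD request made with redirect-following disabled; the theory in the post above is only that sites with this setup went supplemental together, not that the redirect itself is wrong.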
Dave may be on target. My entire site is supplemental
and the Google spider visited every web page last night.
This is hurting forum owners especially hard, since most of us, for SEO purposes, have two different templates: one for search engines, which is called the archives, and another one for users :) and unfortunately this is treated as duplicate content :(
dave, while I tend to agree with some of your points, I have to point out this article [theregister.com], which simply confirms the obvious:
|"Based on our various research efforts, we believe that most of Google's near-term server purchases will use AMD's Opteron for the first time," the analyst wrote. |
A source also in the financial analyst community said, "I heard (about the switch) a long time ago. Word is finally leaking out. I heard that Google was in the process of switching to AMD, while Google was on stage at the last Intel Developer Forum."
For some reason the fine members of WebmasterWorld have had enormous difficulty over the years grasping the simple point: the question is not why Google would go to 64-bit Opteron servers, but why they wouldn't.
Big daddy, as matt cutts said, is about new infrastructure, and this is what the new infrastructure is. The only things that surprise me are:
1. Why did it take google so long to switch over?
2. How did they keep it out of the news enough so that only server types in the know knew?
Obviously word must have leaked out; what astounds me is that Google kept this from leaking out here the way you would think it would have. I have to give Google's little group of undercover posters some credit: they successfully redirected threads about this topic more than once.
So think 64 bit systems, just like it was always said would happen, probably with 40 bit indexes, just like it was said would happen. New infrastructure in this case means exactly what it says.
re forum owners: it's hurting forum owners who use that system, which I've always disliked intensely. Optimized forums don't need doubled data to make search engines happy.
That indeed may be part of Big Daddy, but it does not really go against what Dave has said.
Canonical issues are also a big big part of the new big daddy index.
It looks like Google has lost cached dates for the supplemental pages now.
I'm happy to ask someone to check this out. Please send an email to sesnyc06 [at] gmail.com with specific domains and the keyword "gonesupplemental". I have a theory about this, which I'm asking the crawl/index guys to check out, but I'll need 5-10 specific examples to check if my theory holds. If my guess is right, I'll try to get the crawl/index folks to get things back to the previous behavior.
I take it you know about SEOCh@t going this way. There are also a couple of examples at Threadw@tch.
Welcome back to WebmasterWorld :)
This question may be off topic, but I figure it's the best place to ask it.
I was checking my server logs for the month and noticed that both the new Mozilla Googlebot and the older Googlebot download my robots.txt file often. The perplexing thing is that on some occasions when the Mozilla bot downloaded the file, it returned a 301 status code, while all others were 200.
Why would it return a 301 code when the file is not redirected and never has been, while at other times it returns code 200?
Is this something that I need to be concerned about?
Were they all HTTP/1.1 requests or were some of them HTTP/1.0 requests?
sesnyc06 [at] gmail.com has mail. :)
I've gone for some examples from the SEO industry and some of the big forums that have been discussed elsewhere.
Talking about previous behaviour: for me, Mozilla Googlebot has never been great at adding pages to the index, even when it appears to have been working correctly. So perhaps this might be related, something to do with the crawl-to-indexing behaviour?
What about pages that have disappeared from the BD index altogether? Do you need examples on that?
The first one I noticed with the 301 was an HTTP/1.1 request, but others by the Mozilla bot were also 1.1 and returned a status code of 200. As a matter of fact, there were 11 requests for the file in all, and 7 of the 11 were HTTP/1.1.
4 of the requests were by Googlebot over HTTP/1.0, all with a code of 200.
7 were by the Mozilla bot over HTTP/1.1, with 6 giving a code 200 and one giving a 301. One thing I did notice with the 301: there isn't any date and time in the server log, if that makes any difference.
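One hedged guess at the mixed 301/200 pattern above: if the server 301-redirects every non-canonical Host to www (the same canonicalization rule discussed earlier in this thread), then the same /robots.txt fetch returns 301 or 200 depending on which hostname the bot asked for. This is only speculation, and the hostnames are placeholders; a toy model of that vhost behaviour:

```python
# Simulate a server that canonicalizes every Host header to one
# canonical www hostname. The same path then yields different status
# codes depending on the requested host, which could explain a lone
# 301 for robots.txt among a run of 200s.

def respond(host, path, canonical="www.example.com"):
    """Return (status, location) for a request to host + path."""
    if host != canonical:
        # Non-canonical host: redirect to the www form of the same path.
        return 301, "http://" + canonical + path
    # Canonical host: serve the file normally.
    return 200, None

respond("www.example.com", "/robots.txt")  # (200, None)
respond("example.com", "/robots.txt")      # 301 to www
```

If something like this is what happened, checking which hostname appears in the log line for the 301 request would confirm or rule it out.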
Same as newwebster...what about pages that are missing?
All of the sites we set 301 redirects on (after years of being online w/o the 301s) only have their homepages listed. All other pages are not even listed as supplemental. What is up with that? What sign of things to come is this?
GoogleGuy, I sent you some concrete examples; you will be surprised to see them. Please compare those results with non-BD results.
One of my crappiest competitors has almost been completely reduced to supplemental.
Couldn't happen to a nicer copyright infringer, and it's cheaper than a lawyer ;)
|King of Bling|
Thanks for the offer to check this out. I've sent you examples. Hope this works out :-)
|All of the sites we set 301 redirects on (after years of being online w/o the 301s) only have their homepages listed. All other pages are not even listed as supplemental. What is up with that? What sign of things to come is this? |
Same thing with me: all 301'd pages are missing, with the exception of the homepage.
I am hoping that the behavior newwebster and I are seeing is that Google is reindexing our sites, as though they are new.
I just got a response from Google <that my site is not currently banned or penalized> + GG is investigating this matter. I guess it's time to switch that "PANIC" button off and wait until they fix it.
<Sorry, no email quotes.
See Terms of Service [webmasterworld.com]>
[edited by: tedster at 4:59 am (utc) on Mar. 10, 2006]
Another thought... if they are rebuilding the index, and your pages have either gone missing or gone supplemental, how are they going to determine which sites are the original sources of said content?
Textex, sit back man. Go for a texmex and leave it for additional info.
Where they had the data after the crawl and before they indexed it would be my guess.
You can call it by a lot of names. How about accumulated spidered page database.
It has been almost three years now since April 2003, when there was a major Google indexing snafu. This marked the end of Google's monthly update cycle, and Google even had to revert to the previous month's index. URL-only pages started to appear at this time. Word leaked out that it was the 4-byte docID problem, but GoogleGuy and most of WebmasterWorld, and everyone who depended on Google ad money, were quick to deny that there was any sort of problem. In August 2003, the "Supplemental Index" results appeared for the first time. There has never been a logical explanation for this index.
Google's counts for total number of hits have been totally bizarre for nearly two years now. They are not just inaccurate, but sometimes they are inaccurate by an order of magnitude. You cannot believe any number over 1,000 reported by Google, because anything higher than that is not verifiable.
In November 2004, Google increased their total count on their home page from 4 billion to 8 billion overnight. All the Google lovers out there bought it hook, line, and sinker. It's the ad money that distorts their perceptions.
I'm not eager to get flamed for the zillionth time on the 4-byte ID problem, but if GoogleGuy wants to deny it once again, for the record, that would be fine with me. He may also want to explain the last three years of weirdness for Google's generic results.
Let me say that there was some speculation a year or two ago about Google moving to 64-bit computing. This made a lot of sense, because as I've tried to explain many times, moving from a 4-byte (32-bit) docID to a larger docID is not a trivial matter. Not only would you have to rewrite a huge amount of software (the docID, which is unique for every page on the web, is ubiquitous throughout Google's entire system), but you would take a performance hit because it requires extra processing cycles to expand beyond 32-bit numbers if your processor can only chew 32 bits at a time.
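The arithmetic behind the 4-byte docID concern is easy to check. The figures come from the posts above (the "8 billion" home page count, the 4-byte ID); the overflow interpretation is the poster's theory, not something Google has confirmed:

```python
# An unsigned 4-byte (32-bit) docID can label at most 2**32 documents,
# i.e. just under 4.3 billion -- which is why an index claiming
# "8 billion pages" could not fit in unmodified 32-bit IDs.

MAX_32BIT_IDS = 2 ** 32   # 4,294,967,296 distinct docIDs
MAX_64BIT_IDS = 2 ** 64   # vastly more headroom after a 64-bit move

claimed_pages = 8_000_000_000   # Google's home page count, Nov 2004

assert claimed_pages > MAX_32BIT_IDS   # 8B pages overflow 32-bit IDs
assert claimed_pages < MAX_64BIT_IDS
```

None of this proves the theory, of course; it only shows the claimed index size and a 32-bit ID space are numerically incompatible without some workaround.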
It would make a lot of sense, if you are a Google engineer figuring out what to do back in 2003, to stall on the docID problem until you can migrate to 64-bit processors. For one thing, Google got a lot richer and 64-bit processors got a lot cheaper at the same time. For another, there's a new trend toward more processing power per watt, and Google's huge electric bills are a source of concern to them.
As of less than a year ago, Google was still on a 32-bit system, according to this quotation from CNET News.com by Stefanie Olsen, on April 21, 2005:
|Google executives also were asked about innovating in server architecture in the future, given that one of the company's biggest rivals, Microsoft, is developing search tools on a 64-bit architecture. Google currently runs its search service on a 32-bit architecture. Search experts say that platform may allow for advancements such as better personalization. Google co-founder Sergey Brin downplayed the importance of the underlying architecture. 'I do not expect that the particular choice of server architecture is going to be a deciding factor in the success of our service,' he said. |
Now we learn that Google has been moving to AMD 64-bit processing for some amount of time. We are in the middle of another very long and very curious update, with lots of Supplemental Index results showing up in weird places. I propose that Google is in the middle of moving to 64-bit processing on some of its data centers. While it used to be a problem of two or three separate 32-bit indexes that would get melded on the fly, now they also may have a new 64-bit index. The docIDs between the two systems are incompatible without extra software coding, and probably present a major integration headache. Three years ago I tried to think of how I'd make a smooth transition between two different docID systems, and all I got was a headache thinking about it, because it seemed so complex.
Okay, go ahead and flame me. I can take it because I'm used to it.
Well, the DocID appears in the URLs for the cache links, so do we look there for a longer string, or are they going to try to "hide" it?
Thanks for checking this out. I've sent you examples
I just looked up the DOCID on new and old datacentres for an indexed page. They are the same.
I don't think this is the reason.
Thank you for your post!
I sent you mail...
| This 233 message thread spans 8 pages: < < 233 ( 1 2 3 4 5  7 8 ) > > |