But the point is, why is it in the supps in the first place? That's the $million question. At the end of the day, it's unique content that's relevant to potential searches.
I'm going through some of my supplementals right now and asking myself the same question.
- Title/meta descriptions .. unique and descriptive
- Unique URLS .. bogus URLs blocked, entire site checked using Xenu
- No www/non-www issues, though I do have a subdomain.xyz.com vs. www.xyz.com/subfolder/ problem.
- Snippet searches reveal no dupes (though long tail searches wrapped in quotes using text off supplemental pages seem to pull up zero results)
- meaty unique content pages, 300+ words of unlinked text I wrote
- content positioned in source above navigation
- less than 30 outgoing links on a particular page I'm looking at
- pages all validate the last I checked.
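For anyone who wants to automate the first item on that checklist, here is a rough sketch of a duplicate title/meta description check. It assumes you already have the HTML of each page in a dict keyed by URL (fetching is left out), and the names HeadExtractor and find_duplicate_heads are made up for illustration. It only flags exact matches after trimming and lowercasing, so it says nothing about how Google itself decides two pages look too similar.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadExtractor(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only text that appears between <title> and </title> is kept.
        if self.in_title:
            self.title += data


def find_duplicate_heads(pages):
    """pages maps URL -> HTML source (hypothetical input format).

    Returns a list of URL groups that share the same title and description.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        parser = HeadExtractor()
        parser.feed(html)
        key = (parser.title.strip().lower(), parser.description.lower())
        groups[key].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

An empty result means every page has its own title/description pair; any group that comes back is a candidate for rewriting.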
Considering Google has been more picky about indexing pages, I wonder if the supplemental caches were refreshed but Google didn't then evaluate them for inclusion into the main index? In other words, before BD, pages would first appear on the main index then slip into the supps eventually. Supplemental Googlebot periodically re-evaluated those pages and some of them returned to the main index.
So Google refreshed its entire supplemental index, but post BD I see at least two major hurdles to jump over to get back into the main index: 1) no supplemental issues (i.e. identical meta tags, urls resolving to the same content, etc) 2) TrustRank/PageRank.
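On hurdle 1, the "urls resolving to the same content" part can be checked locally before blaming Google. A minimal sketch, assuming you have the HTML for each URL in a dict: it strips tags crudely with a regex, collapses whitespace, and groups URLs whose remaining text hashes identically. The function names are hypothetical, and exact-hash matching is my simplification; it will miss near-duplicates that differ by a word or two.

```python
import hashlib
import re
from collections import defaultdict


def content_fingerprint(html):
    """Hash of the page text with tags removed and whitespace collapsed."""
    text = re.sub(r"<[^>]+>", " ", html)      # crude tag stripping
    text = " ".join(text.split()).lower()     # normalise spacing and case
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def urls_with_same_content(pages):
    """pages maps URL -> HTML source (hypothetical input format).

    Returns groups of URLs whose visible text is identical.
    """
    groups = defaultdict(list)
    for url, html in pages.items():
        groups[content_fingerprint(html)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Each group returned is a set of URLs that should probably be collapsed to one, with the others blocked or redirected.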
In other words, a perfectly structured page with original content can be stuck in the supplemental index just because the domain lacks juice.
At this point I'm *this close* to generating a few non-commercial spam sites to get a better feel for avoiding the supplemental index.
So, the PageRank rests with the URL that is being filtered out as the duplicate. Fix that by always linking to http://www.domain.com/ from within the site.
Within weeks the root index page will gain PR, and that PR will spread to lower pages and help a bit with some types of site spidering issues too.
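If the internal links are generated from templates, the "always link to http://www.domain.com/" rule can be enforced in one place. A minimal sketch: www.domain.com is the placeholder from the post above, while the list of alias hosts and index file names is my own assumption about the usual duplicate variants, so adjust both to match your site.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.domain.com"                        # placeholder host
ALIAS_HOSTS = {"domain.com", "www.domain.com"}           # assumed variants
INDEX_FILES = ("index.html", "index.htm", "index.php")   # assumed file names


def canonical_link(href):
    """Rewrite an absolute internal link to the single canonical form.

    Relative links and links to other hosts are returned unchanged.
    """
    parts = urlsplit(href)
    if parts.netloc.lower() not in ALIAS_HOSTS:
        return href
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]   # keep the trailing slash
            break
    return urlunsplit(("http", CANONICAL_HOST, path, parts.query, parts.fragment))
```

For example, canonical_link("http://domain.com/index.html") gives "http://www.domain.com/", so every internal link to the home page ends up pointing at the same URL.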
I hear ya.. many of us are in this predicament. But wait.. don't worry it's for the "best" as I am sure google knows what they are doing. LOL
Sit tight and watch the last 10% slip away like mine :)
"...this is the same data refresh from June 27th and July 27th, just being refreshed again. We’ve been doing it for ~1.5 years at this point, and I’d continue to expect that we’ll just keep refreshing the data to that algorithm every several weeks or so."
Now how do we go about figuring out what is in our site (or Google's data) that leads to such huge ups and downs with a refresh...
Looks like you can identify a domain with troubles by this search:
Okay, but how can you determine exactly what the troubles are?
As far as my sites are concerned, Google may as well not even exist. None of the site:example.com varieties mentioned in this thread bring up any results that I'd consider to be acceptable. The results at the [gfe-eh.google.com...] version are just horrible.
Look to [gfe-eh.google.com...] for the latest tweaks. Other datacentres are doing all sorts of other things.
Whatever happens over there (gfe-eh) is likely to be the basis for what goes live across the board in the next month or so...
Gosh I hope that doesn't stick. Doesn't look good right now.
I'm worried, and the only thing I can do is sit and watch the progress. Is this right, or do I have to do something to get my position back? Any advice?
Matt Cutts was right all along; just some of the Google-speak was too cryptic to understand what the long term implications of some things really are; and I can see that there are certain types of spam that these actions can severely cripple; as well as legitimate sites where the owner does not take enough care with their site architecture, or cannot interpret the symptoms of what is going wrong.
I'm now sure you're right and this is where google needs to have a re-think. The owners of businesses with good, honest websites like ours seem to have little chance conforming to google's ever-evolving requirements.
The people that will have time, resources and eventual rewards when they take enough care will be specialists and spammers.
It's just like the problems we're encountering all around the world today ... deal with one type of situation and you end up hurting a lot of innocent by-standers. Google used to have the reputation of seeing and nurturing the bigger picture, I think the game has got too big ... even for them.
Spammers Win .... Game Over ;-)
The site is a specific industry news site, not commercial at all. No banners, no adsense or advertising of any kind. Not sure why a data refresh has this effect. I wonder if the next data refresh will see it return again?
I run a directory/search engine site. I get thousands of inbound hits from reciprocal links per day. My site is pretty popular, but still half of our traffic comes from Google. That said, with Google controlling 3/4 of the search market, it is kind of hard not to rely on Google a lot for traffic these days.
Aside from simply moving down versus dropping completely (was solid for a few years before June), it looks like the allinanchor reporting is now either discounting more of my links or is appreciably counting blog links for a few of my competitors whereas I didn't see any big changes on the last data refreshes.
We've all watched the MC video and read every thread on here when it comes to data refreshes, but I think we're still missing something. What is it that changes during that push time that wreaks such havoc?
What I found was that the datacenters showing a cache date of Aug 11 have the most pages and the datacenters showing a cache date of Aug 16 have the fewest. At least for us. The swing in total pages is over 100,000 pages between the two cache dates, and no major changes were made to our site.
I wonder when the next PR update will be.
[edited by: tedster at 6:06 pm (utc) on Aug. 18, 2006]