June 17, July 27, August 17: that is a very significant series of dates, and it has also hit my biggest site, which has loads of visual and text content. Six years old, good PR, etc. At the same time, another site of mine with mainly links and info went up a lot.
There should be some real experts to analyze the reasons for that crazy line of dates, which is doing so much damage to solid publishers.
Why can't publishers organize themselves in order to gain a little bit of strength and independence?
This Google monopoly is much too risky with such fragile technology.
I also have a phpbb forum with about 120.000 posts. Since August 17th there has been a drop in traffic to about 10%, so traffic from Google has almost DIED.
I didn't do anything in the last months, the site was growing very well, every day new useful content.
And above all: NO TRICKS since it has been online (2001), it was always clean, no cloaking, keyword stemming, invisible text, buying links or WHATEVER.
I simply can't see what I've done wrong!
Here's what I observed in our 100k+ page site:
LOW CONTENT:LINKS RATIO
We lost a number of pages from the index altogether. These pages had relatively little paragraph-based text (the unique content is in tables) and a high ratio of internal and external outbound links.
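For anyone who wants to put a rough number on "little text, lots of links", here is a minimal Python sketch. It is only an illustration: the metric (words outside links divided by the number of links) and the names `LinkTextCounter` and `words_per_link` are my own assumptions, not anything Google has published, and nobody knows what threshold, if any, matters.

```python
from html.parser import HTMLParser


class LinkTextCounter(HTMLParser):
    """Counts links and the words that sit outside of them.

    Limitation: text inside <script> and <style> is counted as body words.
    """

    def __init__(self):
        super().__init__()
        self.link_depth = 0   # > 0 while inside an <a> element
        self.link_count = 0   # number of <a href=...> tags seen
        self.body_words = 0   # words outside any link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1
            if any(name == "href" for name, _ in attrs):
                self.link_count += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        if self.link_depth == 0:
            self.body_words += len(data.split())


def words_per_link(html):
    """Return non-link words per link; infinity when the page has no links."""
    counter = LinkTextCounter()
    counter.feed(html)
    counter.close()
    if counter.link_count == 0:
        return float("inf")
    return counter.body_words / counter.link_count
```

Running this over the pages that dropped out and the pages that stayed in would at least show whether the two groups really differ on this measure.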
LIGHT ON CONTENT
We have pages that just list businesses in a city. The ones with only 1 or 2 listings (which previously had ranked well for [city] businesses) fell dramatically. The ones with >10 listings are still doing fine.
DUPLICATE CONTENT (MAYBE)
Our "city pages" operated like search results, so users could sort the businesses by various criteria. This meant it was possible to have the official URL like "/business/city-state.html" and then also have "page=city&state=ST&city=CITY&sort=name&order=1".
Here's what I'm doing about it:
- adding more paragraphs to our city pages
- eliminating some of the outbound links
- replacing the query-parameter-based sorting with AJAX-based sorts that put sort variables in the server-side session, then 301 redirecting anyone who tries to access our city page via query parms to the "official" page
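The redirect step in the last item could look roughly like the Python sketch below. It is a sketch under assumptions: the parameter names come from the example URL above, but the lowercase hyphenated slug format, the `/index.php` path used in the usage line, and the function names are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def canonical_city_path(url):
    """Return the canonical path for a query-parameter city URL, or None."""
    params = parse_qs(urlsplit(url).query)
    if params.get("page") != ["city"]:
        return None
    # Assumed slug format: lowercase, spaces replaced with hyphens.
    city = params.get("city", [""])[0].strip().lower().replace(" ", "-")
    state = params.get("state", [""])[0].strip().lower()
    if not city or not state:
        return None
    return f"/business/{city}-{state}.html"


def redirect_for(url):
    """Return (status, location): a 301 when a canonical URL exists, else 200."""
    target = canonical_city_path(url)
    if target is None:
        return (200, None)
    return (301, target)
```

For example, `redirect_for("/index.php?page=city&state=TX&city=Austin&sort=name&order=1")` returns `(301, "/business/austin-tx.html")`. The sort parameters are simply dropped, so every sorted variant collapses onto the one official URL.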
One thing I'm also doing is spending time parsing our Apache logs:
cat apache_log | cut -d' ' -f7,9 | egrep -v ' (200|301)$'
This produces a list of URLs and response codes. Look for any response codes that are suspicious, especially 302 or 500+. Then figure out what's up with these pages.
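If the one-liner gets unwieldy, the same check can be sketched in Python. This assumes the standard Apache common/combined log format, where splitting on spaces puts the request path in field 7 and the status in field 9 (the same fields the cut command picks). The function name `suspicious_responses` is hypothetical.

```python
from collections import Counter


def suspicious_responses(log_lines, ok_statuses=("200", "301")):
    """Count (url, status) pairs whose status is not in ok_statuses."""
    counts = Counter()
    for line in log_lines:
        fields = line.split(" ")
        if len(fields) < 9:
            continue  # skip malformed or truncated lines
        url, status = fields[6], fields[8]
        if status not in ok_statuses:
            counts[(url, status)] += 1
    return counts
```

Usage would be something like `with open("apache_log") as f: print(suspicious_responses(f).most_common(20))`, which lists the URLs that most often return something other than 200 or 301.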
|I also have a phpbb forum with about 120.000 posts. Since August 17th there has been a drop in traffic to about 10%, so traffic from Google has almost DIED. |
This thread might be of some use to you, especially when using PHP-style forums and CMS software:
Can I point you to these threads too: [webmasterworld.com...] and [webmasterworld.com...] and [webmasterworld.com...] as these also discuss duplicate content issues with popular forum, CMS, and cart systems.
Additional reading: [google.com...]
Okay, here's something new...
Remember that last year we went from page 1 to 700+. I believed at the time it was a duplicate content filter and after 301'ing a second domain to the first, a month later everything was fixed. No idea if that was the issue really.
So when on August 17th our SERP's fell again, I ruled out duplicate content as that's all been handled. Then the other day I stumbled onto something. Our second domain was back but as supplemental results even though it's still 301'd. Hmmm, our second domain is back and we're penalized exactly the same as last year? Can it be?
So I mentioned it to Matt Cutts last night. Note, I didn't tell him my domains.
Tonight I was shocked to find some of our listings back on 184.108.40.206 which happens to be my current www.google.com ip. I then checked to see if our second domain was showing up in the supplemental and it wasn't.
Coincidence? You be the judge.
I can only hope our SERP's are back as this has been a trial as I'm sure all of you can attest. And while the Bible tells us to "count it all joy when we fall into divers temptations..." that's a tough request ;-)
Google hangs on to redirected URLs for a year after the redirect is put in place. It shows them as Supplemental Results. That is the action that I have observed time and time again. While they are in the supplemental Index they are not harming things as far as duplicate content goes, as long as the URL they represent is now either a 301 redirect or is a result for a URL that now returns 404.
The Supplemental Results are cleaned away one full year after the redirect is first actioned: no sooner and no later.
Your result is exactly as I would have expected. The previous Supplemental Updates occurred in August 2005 and February/March 2006, and the current one is in progress right now: some datacentres have been cleaned up more than others, and the work is not yet completed on any of them, as far as I can see.
I tried to explain all this in: [webmasterworld.com...] and several other recent threads.
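To confirm that an old URL really answers with a 301 or a 404 (rather than a 302 or a 200), a small Python sketch like the one below can help. It uses only the standard library's `http.client`, which does not follow redirects, so the status returned is the raw one. The function names are hypothetical, and the "harmless" rule simply restates the observation above rather than anything documented by Google.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url):
    """Return the raw HTTP status for url without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def old_url_is_harmless(status):
    """True when the old URL answers with a permanent redirect or Not Found."""
    return status in (301, 404)
```

Something like `old_url_is_harmless(fetch_status("http://www.example.com/old-page.html"))` would then flag old URLs that still return 200 or a temporary 302.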
Was August 30 another "refresh" of the same algo?
I'd bet the next data refresh will be on Thursday night, 7th September.
Make that a few minutes ago. A lot of pages resurrected three weeks ago are now back stupidly pinned to the last page of the results, while boatloads of spam are added.
I have a reliable source that was able to confirm that our specific troubles that started on 8/17 were related to DUPLICATE CONTENT.
Frankly this seems kind of buggy to me-- for the past 1-2 years Google has been pretty good at just "figuring it out." Then suddenly it's like they said, "we'd rather have 0 pages for the content than choose from among 4 different URLs."
It's also odd that they don't look to our very comprehensive and painstakingly created sitemap to guide them to the canonical pages.
The directive "design your site for users, not search engines" applies to static sites and on-page optimization. For complex 100k page dynamic sites, it's utterly bad advice.
|The directive "design your site for users, not search engines" applies to static sites and on-page optimization. For complex 100k page dynamic sites, it's utterly bad advice. |
Well said in light of the ongoing saga with Google and canonicalization.
[edited by: CainIV at 6:25 am (utc) on Sep. 7, 2006]
As far as I can tell this doesn't have anything to do with a new update just yet, but it very well could... http://gfe-au.google.com
My key money phrase on the site that has been experiencing issues during some of the updates went from #1 to #9 (nothing too drastic). The thing that worried me the most is that on the allinanchor: it went from #1 to AWOL -- how confusing is that?!
It is a day ending in 7 and a full moon, so I wouldn't be terribly surprised if some sort of funky data refresh is going on.
Sorry for the doublepost, but I discovered something odd.
On an allinanchor: search, I come back to the top of the second page when appending &filter=0.
Has anyone else ever seen anything quite like that? Back in June we were doing the "" and &filter=0 tests for the main search results, but this is the first that I've seen anything happen on the allinanchor: query as well.
The normal search filters out all of the duplicates and "similar pages"; that is, any duplicates, wherever they might rank.
The &filter=0 parameter shows all of the results, and it is not uncommon to see stuff appear very close to the top when you do that.
It's your wake up call to fix the problems with the site. Those pages when unfiltered will do really well. Get working.
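For repeating the &filter=0 comparison across many queries, here is a minimal Python sketch that adds the parameter to an existing search URL. The function name `with_filter_off` is hypothetical, and the example URL is only an illustration of the usual query-string form.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_filter_off(search_url):
    """Return search_url with filter=0 set, replacing any existing filter value."""
    parts = urlsplit(search_url)
    # Keep every parameter except an existing "filter", then append filter=0.
    params = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
              if k != "filter"]
    params.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Opening both the original URL and the `with_filter_off(...)` version side by side shows which pages only appear once the duplicate filtering is switched off.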
I get what the &filter=0 is supposed to do in terms of showing filtered results on normal search, but have never seen it occur on an allinanchor: search before.
In that event, what can one get working on fixing?
Edit: I found it. A new scraper popped in that duplicated me on a cloaked redirect and Google nailed us for it...I hate that so much. Thanks g1smd.
[edited by: JoeSinkwitz at 10:09 pm (utc) on Sep. 7, 2006]
This is just nutty... Every day Google gives me a different IP so some days my results are back to normal and other days they're buried at 900+.
<No discussion of specific tools, sorry. See Forum Charter [webmasterworld.com]>
On the scraping, I'd appreciate it if someone could point me to (or explain) how to find and get these removed.
[edited by: tedster at 1:06 am (utc) on Sep. 8, 2006]
Ducki, here's what we're doing (and going to be doing)...if anyone else has ideas in addition to this, I'm all for it:
1. Through G sitemaps, submit a spam report on the scraper.
2. Get more links, build more content.
3. Contact scraper's host and registrar.
4. Get more links, build more content.
5. Send e-mail followed up by certified letter to scraper.
6. Get more links, build more content.
We're up to #5 right now, though all the DCs gfe-* show me in the correct position again; still we're following through because it is a frightening proposition that an external entity can wreak so much havoc on one's ranking. Hopefully Google can lessen the blow of these crazy dupe filters as the algo progressively evolves.
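On judging whether a suspect page really is a copy of yours before sending reports and letters, one common generic technique is word-shingle overlap (Jaccard similarity). The Python sketch below is only that: a way to compare two pieces of text you already have. It is not a claim about how Google's dupe filters work, and the function names and the shingle size of 5 are assumptions.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=5):
    """Return shared shingles divided by all shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

A score near 1.0 between your page text and the suspect page's text is decent evidence to attach to a spam report or a letter to the host.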
As b2net predicted above, it's Thursday evening, Sept. 7th, and a whole lot of my traffic seems to be coming back – as of about 10:30 California time.
I noticed a small improvement on Google traffic but only about 15% increase.
Can anyone confirm if there were any major updates on Sept. 7 like we've seen on the other dates? If there were, I'm not really seeing it.
[edited by: Northstar at 11:37 am (utc) on Sep. 8, 2006]
Nothing new here in Stockholm!
....I'm freezing to death, googleguy!
Northstar >> I've seen some slight moves but nothing like an update or something, nothing unusual.
I still find current results full of low quality sites though. It might stick...surprisingly!
Zero sign of update. Still at about 5% google traffic.. disaster.
One site trying to make a comeback there. I don't know if it's the future or if those DCs are missing some filters.
If those results stick I would be back to pre jun 17 or july 28 to aug 17 traffic.
I guess they are just DCs with the filters taken off for Aug 17.
I can but cross my fingers that these DCs stick.
Does anyone else find these results better, or see more datacentres showing these results?
System: The following message was spliced on to this thread from: http://www.webmasterworld.com/google/3076125.htm [webmasterworld.com] by tedster - 12:21 pm on Sep. 8, 2006 (EDT -4)
I've heard a lot of talk about Google doing data refreshes on Tuesdays and Thursdays that end in a 7. Well Thursday, September 7th has come and gone and I still haven't recovered from August 17th.
Should I keep on waiting for things to iron themselves out and go back to normal, or should I just accept the fact that Google has gone to the dogs and start finding a new way to market my site?
I've seen a small update on some sites / terms that I monitor. On the seventh a few new pages of mine suddenly started to rank, then after a few hours of seeing varying sets of results, some of those pages just disappeared without a trace. Oddly enough, a search term which one of the vanished pages was ranking for now returns two other pages on the same site which point to that page.
Added: the vanished pages are really vanished from Google, as in unable to be found even with a site:example.com "unique phrase" type search.
Also seeing a lot of "hotel spam" results for searches like "widgetville thingyburb".
[edited by: zCat at 4:46 pm (utc) on Sep. 8, 2006]
No major changes in my search results yet. Though some of my SERPs have actually dropped quite a bit since yesterday. It doesn't make a big difference though, since hardly any of my traffic comes from Google after getting hit on June 27 and Aug. 17. =P
I see pre-Aug 17 results on those DCs as well, hopefully they will propagate to others...
A note for people checking different Google IP addresses. See this thread for more recent information: [webmasterworld.com...]
[edited by: tedster at 6:49 pm (utc) on Sep. 8, 2006]