Forum Moderators: Robert Charlton & goodroi
No black hat, nothing against the guidelines. But tonight, I've done searches for very specific things, as Google has suggested, and I know they're on my site, but they aren't showing up. If I dig long enough, my page with that info on it is buried deep in the SERPs.
Google isn't working as intended.
Oh, that same specific search on Yahoo and MSN both turned up my page in the #1 spot...
usually if a site gets buried like you describe, it's indicative of some sort of penalty.
i have had quite a few like that (most of these were directories running off-the-shelf type scripts).
now the fun part for you is to go back through and try to find what may have caused it on your end.
can competitors hurt you?
well i don't know these days.
if i suddenly put up your site & content on a domain name of mine which has been lying around since 1999 + this domain has a fairly decent pagerank and old links... if i proceed to copy your content and site structure and all, which one would google think is the authority? mine or yours?
let's just say your domain was launched in 2002; in this case my domain would have quite a few years' head start in google's eyes?
might actually do some tests to see what actually happens, so if anyone has a site not doing anything and wants to be a guinea pig.. sticky me :-)
On a search tonight for "little widget painting" (without the quotes) I didn't rank anywhere in the top 10 pages and stopped looking.
This is a minor search term returning just 108,000 results.
I have a page called little-widget-painting.shtml that has been in place since last fall.
With the same search using &filter=0, I not only occupy the number one spot - I have the next five spots as well.
#1 is my little-widget-painting.shtml page.
#2 is my home page, where the category of "widget painting in general" is described, "little widget painting" is mentioned and a link provided to this page with anchor text of little-widget-painting.
#3 is a "widget carving" page that happens to contain a link to little-widget-painting.shtml page with anchor text of, guess, what...little widget painting, plus other mentions of widgets in other contexts.
#4 is my sitemap, which contains the same anchor text link.
#5 is my widget section index page, which contains both the anchor text link, and a short description of the project, including the words little widget painting, because, darn it, that's exactly what it IS! The link to this page, cached Sep 28, is www.mysite.com/widgets/index.shtml - the correct link for that section's index page, with 47 backlinks shown in Google
#6 is also my widget section index page, except it is indexed as [mysite.com...] - no extension - cached Sept 25, also with 47 backlinks showing
Obviously Google HAS to apply a filter to determine the importance and priority of these pages. I would have expected the original project page itself to merit a mention on page one. IMHO it deserves that placement because it gives readers landing on that page exactly what they are looking for...original instructions on how to paint little widgets from a well-established PR5 (previously PR6) site.
Instead, Google is seeing SIX pages in my site that feature the same keywords and anchor text links - and can't seem to decide which one is the appropriate listing.
Guess it's an all or nothing scenario, and Google chose to give me nothing.
The listing of both www.mysite/directory/index.shtm and www.mysite/directory/ is disturbing. Where did that come from?
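For anyone who wants to run the same comparison, here is a minimal sketch (Python, not from the original posts) of how the filtered and unfiltered search URLs differ. The `filter=0` parameter name comes from the thread - it asks Google to skip its duplicate/similar-results filtering (behaviour as of 2005); the function name and everything else here is purely illustrative:

```python
from urllib.parse import urlencode

def google_search_url(query, unfiltered=False):
    """Build a Google web-search URL.

    With unfiltered=True, appends filter=0, which (as of 2005) asked
    Google not to collapse duplicate/similar results.
    """
    params = {"q": query}
    if unfiltered:
        params["filter"] = "0"
    return "https://www.google.com/search?" + urlencode(params)

# Compare the two result sets by hand in a browser:
print(google_search_url("little widget painting"))
print(google_search_url("little widget painting", unfiltered=True))
```

If the unfiltered URL shows your pages where the normal one does not, you are seeing the same symptom described above.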
Current search results are mixed...top spot goes to a site with status in DMOZ as a directory. This site dominates the SERPS for all terms in our sector as of now. It offers hand selected links to other sites that offer things like "little widget painting," but these links are buried beneath Amazon books on widget painting, which have an original text description appended for each affiliate link...thus providing the page with its relevant keyword tally.
Neither the Amazon offers nor the text links have been updated in months to my knowledge -- perhaps this now represents "tried and trusted" results somehow to Google - whereas my site is updated manually every few days as I add new, fresh ORIGINAL content!
I just removed a remote-server PHP affiliate product catalog that occupied a directory in my site and created about 1000 pages. I had no idea it would be that large when I installed it. Even so, this was an unimportant add-on for me; I offered it more as a service to readers, never considering the impact this might have in Google until now. I have since used the removal tool to delete the entire directory, sacrificing two innocent pages. All the rest of my site is homegrown HTML - no PHP, ASP, or dynamic anything, as I barely know what any of them are, let alone how to use them.
Perhaps this addition of 1000 pages back in March upset something in Google's filter in this update?
If anyone is seeing similar results on their sites, can you let us know? Finding the trends and identifying the triggers in this mess is the first step to fixing it.
Maybe I should try that!
I decided to offer related products for my visitors, so in July I added an Amazon product feed CGI script with a mod_rewrite .htaccess file. I was really surprised when I saw that Googlebot had crawled more than 9000 pages in August! I got hit on September 22.
That was a horrible mistake, I know that now.
Has anybody had the same experience in the past?
How long is the estimated recovery time from this kind of accident?
What do I have to do now?
Should I remove the new pages with the Google removal tool, or just sit and wait?
When I use filter=0, my site is still in the top 10.
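On the removal-tool question above: as I recall (this is from memory, not from the original posts), the 2005-era Google URL removal tool would only process a directory if it was already blocked from crawling, so the usual first step was a robots.txt entry like this - where /products/ is a hypothetical name standing in for the feed directory:

```
User-agent: Googlebot
Disallow: /products/
```

Once that was in place, submitting the directory to the removal tool would drop those URLs from the index; without the block, the request would simply be rejected.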
>>Hi Reseller,
I totally agree with you. Since yesterday I am busy getting rid of internal text links to the same pages. Whole navigation will be changed.:( <<
I know what that means ;-)
Have done the same (manually) on 137 pages just after 22nd July 2005, when my site got its second hit from Google (the first one was on 2-3 Feb. 2005).
Problem is, if you have a "THEME" directory or site, then you have the keyphrase/keyword "bla..bla..widget(s)" all over the menu bar. And that might result in excessive keyphrase/keyword density on all pages of the said directory/site. I.e. you might be spamming in good faith :(
Two days before the filter I had a visit from Google's manual review IP 65.57.245.11. It hit ~3 pages and the referer was a Google search results page for site:www.mysite.com with the time option as_qdr=m6. Basically someone from Google was checking to see if all my pages were added in a short period of time - the past 6 months. I had banned this IP, so he only saw an error message and I have no idea if he liked my site or not. Probably not.
I don't know if this helps but that IP is really interesting and I'd check my logs if my site was filtered.
Maybe Google has an automatic system that alerts when a site has a sudden increase in pages?
As for the navigation links: it's possible but many sites use templates with navigation bars/menus. Is Google going to ban or filter CNN for using the same link text for their Sports section on thousands of pages?
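If anyone does want to grep their logs for that IP as suggested above, here is a quick sketch (Python; the Apache combined log format is an assumption - adjust the pattern for your server, and the sample paths are purely illustrative):

```python
import re

REVIEW_IP = "65.57.245.11"  # the review IP mentioned above

# Assumes Apache "combined" log format:
# ip ident user [date] "METHOD path proto" status bytes "referer" "agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?:GET|POST|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "(?P<referer>[^"]*)"'
)

def review_hits(log_lines, ip=REVIEW_IP):
    """Return (path, referer) pairs for requests from the given IP."""
    hits = []
    for line in log_lines:
        m = LOG_PATTERN.match(line)
        if m and m.group("ip") == ip:
            hits.append((m.group("path"), m.group("referer")))
    return hits
```

A referer containing site:yourdomain plus as_qdr would match the pattern of visit described above.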
Anyway, cache dates are as follows:
1. 25th September 2005
2. 21st February 2005 (Tripod Free Hosting)
3. 24 December 2004 (Same site as above - eg Tripod)
4. 6 December 2004 (Geocities Free Hosting)
5. 1 December 2004 (With automatic redirect - Lycos Host)
6. 4 December 2004 (Same site as above with redirect)
7. 20 February 2005 (Tripod Free Hosting)
8. 22 February 2005 (Tripod Free Hosting)
9. 24 February 2005
10. 15th July 2005 (Another free Host)
OK - G way to go with the fresh results from quality domains :)
Something is wrong at the plex :(
Can GG at least confirm this is a work in progress - if people knew that it would end up leading to a fix for the canonical url bug and updating of the supplemental index then we could at least give you a breather.
[edited by: Dayo_UK at 9:09 am (utc) on Sep. 30, 2005]
I predict that it won't be fixed for a while, so I'll need to spread them out a bit though ;)
let me help you out with that problem:
$i = 100;
while ($i > 0 && !$responseByGG)
{
    echo 'Listen Google ;) - I will say this only another '.$i.' times. The canonical url for my site is the homepage with the www - I have done the 301 - this is the page with the most backlinks - it is the page that should rank for the company name search. Etc.';
    --$i;
}
;-)
But seriously, those results are sad :( - might as well have kept my old Geocities domain from years back - hmmmz, I wonder if I still have the login details.
Lol - we know that Google gets the canonical url correct for Geocities and Tripod as this came up before.
[webmasterworld.com...]
Good work Google - sort out the free hosted sites and let the proper domains die a slow death?
What do you mean by 'manual review'? I find it impossible to believe that any sane person would visit the site and then penalize it having seen it. This process must be automatic, and obviously erroneous.
The process you describe though makes sense for us, in terms of what's happened, but not if a real person makes the decision.
They don't have much in common at all that I'm aware of; the one common factor among them was this:
Urls with keywords in them. 8 using kw1-kw2.html , kw1_kw2.html some with more than 2 kw.
Titles with kw1 kw2 some with more.
Descriptions with kw1 kw2, again most with more.
On page kw1 kw2 (kw3 kw4) as the main heading (wrong word - sorry) for the page. eg. MY kw1 kw2 Page
The above is the only thing that is common when looking at the html for all the eleven sites. To see this I clicked only 1 link from each home page. All had at least 3 of the above factors, most have all of them.
I personally think there is more to it. I also added a lot of pages (now removed) and added more links in the past month than normal.
I believe a combination of all of this has taken me over the limit in some kind of filter scoring system and triggered it for me.
At someone's suggestion I used a unique search phrase that would only apply to my site, and was outranked by a guy who is using Alexa listings as a scraper for his Amazon aff program (removed now), plus supplemental results, and mine came 9th. Can anyone explain that to me please, as I honestly don't understand it. In the supplemental results listed above me there were an old Lycos search result, a hotlinker to an image, and a site that links to a different site I own. Strange.
So if not fixed I will have to remember where I am at (93):
"Listen Google - I will say this only another 93 times. The canonical url for my site is the homepage with the www - I have done the 301 - this is the page with the most backlinks - it is the page that should rank for the company name search. Etc."
After all Matt did say in his blog:-
As we work down the list of canonicalization issues that people run into and cross them off the list, I wouldn’t be surprised if this issue + 301s taking longer than before is the next thing on the list.
I really hope they will cross this off the list soon. Matt or GG - any point in doing another re-inclusion request? I have done one before but no progress.
Also - when you do sort it - I hope the fixed domains don't then go into the sandbox :( - which I think is likely, unfortunately, as I can't see the domains just getting all the power back without triggering another filter - e.g. the Sandbox.
Lol - quick explanation why I am giving Google a breather from my constant nagging - I just can't deny that a little crawl went out on one of my sites and Google looks like they got it right - just hope it leads to a big crawl.
Though very odd is that some of them on the second and third page are just a page with my site name on them, with metadata from our sections and a link to the site... they also spawned some WinFixer popunder that tried to install something and crashed my browser...
Mighty odd results, obviously still movement...
>>I personally think there is more to it. I also added a lot of pages (now removed) and added more links in the past month than normal. <<
Does anyone else find it pathetic that we can't add new pages as we like, for fear of tripping some unrealistic Google filter, which penalizes the rest of the site?
The tail is wagging the dog!
That hits the point....
Reseller,
When you had to change those 137 pages, did you try to stay out of the sandbox, or did you change everything within a few days?
I am fed up so I think will change everything as fast as possible and forget about Google for some time.
Or at least we need to develop a common set of questions and things to look at. Anyone up for this?
Here are my questions:
- When using the "site:" command, are the counts correct?
(Our answer: No, 10X the number of actual pages.)
- Have you redirected non-www to www? How long ago?
(Our answer: Yes, about 3 months ago.)
- Are you seeing non-www appear in the serps? (site:mysite.com -www)
(Our answer: Sometimes. Somehow Google picked up on our "FTP" subdomain. This resolves to the same address as WWW. This was removed sometime in Feb. However, these pages still pop up.)
- Are you seeing old pages that no longer exist reappear in the serps?
(Our answer: Yes, pages from last year came up in the serps yesterday. I checked this by looking for 404's in our logs. Googlebot was hitting these pages.)
- Are you using Google sitemaps?
(Our answer: Yes.)
- What is the slant of your site: Content, Images, Affiliate, Other?
(Our answer: Content site.)
Please add other questions you think would be valuable.
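On the non-www-to-www question above: the usual Apache fix is a site-wide 301 in .htaccess. A sketch (mysite.com is a placeholder, and this assumes mod_rewrite is enabled on your host):

```
RewriteEngine On
RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC]
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
```

This sends every request for the bare domain to the same path on the www host with a permanent (301) redirect, which is what the posters above mean by "I have done the 301".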