Forum Moderators: Robert Charlton & goodroi
The particulars: the site is 5 years old (so no sandbox issue); it has lots of relevant incoming links (many more than most of the sites ahead of mine in the listings); it is well optimized (natural web copy with lots of keywords); there are no copies of my site between pages or elsewhere (I caught a couple of thieves: one removed the material, I reported the other, and I changed my web text so there would be no duplicate issue); it has a G-friendly site map; and all coding errors have been corrected.
I have also asked G about this (specifically, whether the two dots in my site URL cause issues, whether a few pages having spaces in the URL--%20 in the code--cause issues, and so on), but I get no answer. So I am stumped. The site ranked really well for literally years, and now this.
So, what have I missed that might cause such a quick and extraordinary drop? Or what is the next step in figuring out how to "fix" my site?
My site also gets (or got, given the drop in the listings) added to lots of sites "naturally," because it was on the first page of the listings. Lots of scrapers added it too, but what can you do?
I naturally had the same happen to me: many scraper sites would contact me every day notifying me of a link added on their obviously link-infested spam site, and naturally I would simply ignore it, hit the delete button, and move on. This site is the one that lost its rankings in December.
But the same thing has been going on with another one of my sites too, and it does not seem to have affected it in any way.
Lorel mentioned something interesting:
This happened to a client of mine. The culprit had copied the home page word for word and posted it on several of their pages focusing on different keywords for each page. The owner of the site had no contact data so we contacted the host. They required a DMCA report. The site disappeared within a week. We also sent a DMCA to Google and sent in a reinclusion request explaining what happened. The PR came back within a few weeks but it took 3 months to get the keyword rank back.
I will have to look into it - but would Google really almost completely wipe out your site because of someone else's dirty work? If that is the case, Google has a long way to go ....
I will have to look into it - but would Google really almost completely wipe out your site because of someone else's dirty work? If that is the case, Google has a long way to go ....
Yes, I believe so. The culprit had a date attached to the stolen content, claiming it was a "review" of my client's site (without a link), and that was the same week the Google traffic decreased to next to nothing. The Google index for this site disappeared also, except for a few bogus pages, and the home page went URL-only.
We wrote to Google and they said the site had not been banned, but it had essentially disappeared from Google until we sent the DMCA to the host and to Google and the offending site was removed. Then everything started to recover, but it took 3 months for the keyword rank to return and the site index to come back to normal.
My listing fell this same way several months ago. At the time the discussion was about G penalizing duplicate content within a site, so I revised all my pages and, voilà, back to the top of the listings where my site of course belongs (until 12/01, when the site sank down the results again). One change was to make all internal links complete, from /specificpage to http://mysite.com/specificpage. But these are all non-www links, and my G listing is www. I also submitted a site map for the first time, and again, as non-www.
My theory: G saw the non-www links as a new site, even though it is the same site as the www listing, and the new-site/sandbox phenomenon was activated. My site shot up the listings because it was well done and new (I did not pay attention to which version, www or non-www, was coming up, so this is supposition), but then went into the box after a few months because it was new and supposedly still under development. My site looks to me like it dropped on 12/01, but in reality it dropped months ago, and the non-www version just made it look like I had been quite crafty and gotten back to the top of the heap. By this theory, G views the www version, the version in the G index, as mightily flawed (all internal links pointing to another site entirely, because they are non-www, among other things), with few incoming links (because my incoming links are mostly www), and maybe dishonest, perhaps mirroring another site.
Proposed short-term fix: replace all internal links with www versions, re-do the site map, and resubmit (long-term, I have to figure out the 302/404 thing, I suppose).
Logic or delusion (trying to make a picture of the Eiffel Tower using puzzle pieces for the Brooklyn Bridge)?
I would set up a 301 redirect in .htaccess from www.domain.com to domain.com. Here is what I use (this only works if you are on an Apache server that is set up for .htaccess):
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^PutYourDomainHere\.com
RewriteRule ^(.*)$ http://www.PutYourDomainHere.com/$1 [R=301,L]
It should take Google a few days to a few weeks to pick up the new addresses (depending on how often they crawl your site). Then, no matter which version of your address is entered, people will end up on your domain.com version.
However, all your pages could have gone supplemental already because of the earlier change (Google thinking you have duplicate copies of each page) and that, sorry to say, is not so easy to correct. You may have to change all the text on your pages once again and even that may not remove the supplemental penalties.
Maybe someone else has a better suggestion.
One change was to make all internal links complete, from /specificpage to http://mysite.com/specificpage. But these are all non-www links, and my G listing is www. I also submitted a site map for the first time, and again, as non-www.
Geez, Pro_Editor, that's a bit of a mess, eh? First, what version do your best backlinks use? If it's www, perhaps 301 example.com to that. If vice versa, then vice versa. A real factor is your suspicion that non-www is sandboxed: if it is, and www isn't, then go for www. Definitely, you need only one version in G, but deciding which one is the tricky part. Take a long look at all the parameters before you decide.
Lorel, are you sure that code isn't sending non-www to www? I'm not great at regex, but the rewrite looks that way.
Thanks for the Apache code. First I will need to find out what my provider uses. Another question: Tedster suggested in an earlier post that I need to have a valid 404 header. Does the code you have provided yield this, or does the 404 require another step (I had assumed the latter before all this back and forth)? If so, do you agree that step is necessary?
The easiest way to ensure that non-existent pages return a 404 is to not have a custom error page. Check a bad URL to see if your host kicks out a 404 or not.
Edit - Fixed URL
<html>
<head>
<meta http-equiv="Refresh"
content="5;url=http://subdomain.example.com">
</head>
<body>
<p>
Sorry! We have moved! The new URL is: <a href="http://subdomain.example.com">http://subdomain.example.com</a>
</p>
<p>
You will be redirected to the new address in five seconds.
</p>
<p>
If you see this message for more than 5 seconds, please click on the link above!
</p>
</body>
</html>
Or vice versa, depending on what I decide to do. This is different from the suggested Apache code, so any input would be greatly appreciated.
[edited by: tedster at 6:13 pm (utc) on Jan. 21, 2006]
[edit reason] use example.com in code [/edit]
Lorel, are you sure that code isn't sending non-www to www? I'm not great at regex, but the rewrite looks that way.
Yes, it was. Sorry for the confusion. Just switch the www's around for the other version. I usually check to see which way Google has indexed the site and which version has the most PR, and use that one.
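[Editor's note: before uploading an .htaccess file, it can help to sanity-check which direction a rule actually redirects. Below is a rough Python simulation of the RewriteCond/RewriteRule pair quoted earlier in the thread. It is an illustration only: `simulate_rewrite` is an invented helper, and real Apache matching has more moving parts than a plain regex test.]

```python
import re

def simulate_rewrite(host, path):
    """Roughly simulate the earlier snippet:
         RewriteCond %{HTTP_HOST} ^PutYourDomainHere\\.com
         RewriteRule ^(.*)$ http://www.PutYourDomainHere.com/$1 [R=301,L]
    Returns the redirect target, or None if the condition doesn't match."""
    # The condition matches hosts that START with the bare domain,
    # i.e. the non-www version; www.PutYourDomainHere.com does not match.
    if re.match(r"^PutYourDomainHere\.com", host):
        # In a per-directory .htaccess the rule pattern sees the path
        # without its leading slash.
        m = re.match(r"^(.*)$", path.lstrip("/"))
        return "http://www.PutYourDomainHere.com/" + m.group(1)
    return None

print(simulate_rewrite("PutYourDomainHere.com", "/page.html"))
# -> http://www.PutYourDomainHere.com/page.html
print(simulate_rewrite("www.PutYourDomainHere.com", "/page.html"))
# -> None
```

Running this shows the original snippet sends the non-www host to www, confirming the direction question above; swapping the www's around, as described, reverses it.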
Pro_Editor
I think I am about to enter the great unknown and purchase a different domain name and send both the current non-www and www to it (the current one needs shortening anyway).
If you set up another domain with the same content it may be marked supplemental right out of the gate plus being put into the sandbox for 6-12 months. So consider that carefully.
Thanks for the apache code. First I will need to find out what my provider uses.
You can check whether you're on an Apache server with any server header checker. If you're on a Windows server, you will need to try another approach.
Added: Also, the .htaccess will take care of people linking to you with the wrong version (www or non-www): if they do, there will be an automatic redirect to the right one.
Lorel, I had not considered the sandboxing thing (of course G will find that one and probably expunge the current www listing altogether). I guess I need to leave my site as is (all internals non-www) and set up the redirect from non-www to www (which is how G has the only version coming up at all, albeit way down the listings, which is the reason I started this thread in the first place). The non-www version is there when I do a site:mysite.com search, but it mirrors the www version--one supplemental for a cgibin.stats page and the same G visit date--so I am really confused at present.
the non-www version is there when I do a site:mysite.com search but mirrors the www version
This confuses a lot of people. Here's the thing: a search of ¦ site:example.com ¦ will also include all the www subdomain pages. A search of ¦ site:www.example.com ¦ shows only the pages in the www subdomain. To force G to show you only non-www pages, i.e. example.com/default.htm, you have to do this: ¦ site:example.com -inurl:www ¦. That shows all the pages on the root domain, example.com, that have no www on them (-inurl:www).
Now, you already had www.example.com listed, then you did internal links to example.com versions and submitted them in a site map, so apparently both should be listed. Try the method above and see what you get.
<edit>For clarity</edit>
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
The forum software will strip out the space between {HTTP_HOST} and ! (the exclamation mark), so make sure you put it back in.
To create the file, use Notepad and save it as htaccess.txt. Upload it with FTP to the folder on the server that has all your html pages. When it is there, rename it to .htaccess (note that htaccess is the extension and there is no file name in front of it).
Then, take another look at your situation for other problems.
[If any other members here see errors in my reasoning, please point them out.]
Ahh, 4:00 PM has arrived, Stefan's beer time :-)
<head>
<meta http-equiv="Refresh"
content="5;url=http://a-1writingandediting.writernetwork.com">
</head>
It is my understanding that Google started penalizing sites for using a refresh because spammers use this. Anyone know for sure?
I think using a 301 redirect is your best course of action.
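[Editor's note: the difference matters because a server-side 301 is carried in the HTTP status line and Location header, before any HTML is parsed, whereas a meta refresh is just markup inside a normal 200 page. A minimal, self-contained Python sketch of the 301 side, for illustration only (`RedirectHandler`, `serve_once`, and the www.example.com target are all invented names):]

```python
import http.client
import http.server
import threading

class RedirectHandler(http.server.BaseHTTPRequestHandler):
    """Answer every GET with a proper 301, roughly the server-side
    equivalent of the .htaccess rule discussed above."""
    def do_GET(self):
        self.send_response(301)  # "Moved Permanently"
        self.send_header("Location", "http://www.example.com" + self.path)
        self.end_headers()

    def log_message(self, fmt, *args):  # keep the demo quiet
        pass

def serve_once():
    """Start the server on a free local port, fetch one URL, and return
    the status code and Location header a crawler would see."""
    server = http.server.HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    conn.request("GET", "/page.html")
    resp = conn.getresponse()
    status, location = resp.status, resp.getheader("Location")
    conn.close()
    server.shutdown()
    return status, location

print(serve_once())  # the (status, Location) pair sent before any HTML
```

A crawler reads that 301/Location pair and consolidates the listing onto the target; a meta refresh gives it no such signal at the HTTP level, which is one reason the 301 is the safer course here.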
Stefan,
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
Shouldn't that first URL be without the www? :o)
That is right, IMO.
-phish
phish, a 301 permanent redirect is safer (I think), not a temporary 302, but I have no idea if that's your problem.
bluewidgets, yeah, it looks like that works too - the version I'm using is from jd (mod of the Apache forum), and he usually covers all the angles, so that's what I've gone with. (And welcome to WW.)
Google is notorious for allowing "hijacking" of a page by incorrectly associating the information at the non-www version (in your case) with the www version. The effect could ultimately be that you are hijacking your own www site.
Search the forum on '302 hijacking'
Use the 301 redirect code for .htaccess given above in this thread, and change the 302 over as soon as you can (assuming you are on a Unix server).
Jaid
I guess I will have to use the HTML version of a 301 redirect for each page, but I could use some insight. I have code that works (courtesy of the w3schools site, which has a handy feature that lets you test it), but now: do I need a file for each page containing this code, and if so, what are these files called? Or does this code go within the file for each page (which makes more sense given how a redirect works), and if so, where (the code I have looks self-standing, like a page, with head and body tags)?
My experiments so far have been pretty funny (or maybe I am getting G hysteria).
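[Editor's note: if you do go the per-page route, each stub is a standalone file at the old URL, named exactly like the page it replaces, and they are all identical boilerplate, so they can be generated. A hypothetical sketch, assuming a meta-refresh stub like the one shown earlier in the thread; `write_stubs`, the page list, and the www.example.com target are invented for illustration:]

```python
import pathlib
import tempfile

# One self-contained stub page per old URL; {page} is filled in below.
STUB = """<html>
<head>
<meta http-equiv="Refresh" content="5;url=http://www.example.com/{page}">
</head>
<body>
<p>Sorry! We have moved! The new URL is:
<a href="http://www.example.com/{page}">http://www.example.com/{page}</a></p>
</body>
</html>
"""

def write_stubs(out_dir, pages):
    """Write one redirect stub per old page name; return the file names."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for page in pages:
        (out / page).write_text(STUB.format(page=page))
        written.append(page)
    return written

print(write_stubs(tempfile.mkdtemp(), ["about.html", "contact.html"]))
```

Note that a meta refresh is not a true 301 at the HTTP level, and as mentioned above it may carry risk with Google, so the server-side .htaccess redirect remains the preferable route where the host allows it.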
You are registered with a site that offers you a subdomain, your subdomain being www.domain. The www vs. non-www question does not come into the equation here; the "www" could just as well be abc.domain, or anything else.
Since you are a subdomain of the main site, I don't think you have to worry about the non-www and www issue.
I am off out; theBear will no doubt be able to answer any questions, though.
Why have your rankings dropped? As you are a subdomain, you may have to ride out the smooth and the rocky of the main domain.
[small]Only had a very quick look - so could be wrong[/small]
And Dayo, if you are right about the subdomain issue, that G views my site as a sub of the domain where I rent my site (my search-listing fortunes tied to an outfit with whom I have no relationship whatsoever), then isn't it all the more imperative to purchase a new domain and redirect not only my non-www to www but both to a new site name altogether? (The implications of such a move are making my head spin, non-code expert that I be.) Someone else noted that I would be sandboxed for 6-12 months, but I have also read that PageRank transfers, so I am not sure this is true.
Man, this is getting messy.