Forum Moderators: Robert Charlton & goodroi
My site is on a shared Windows server, and all attempts to get my hosting company to resolve this issue have failed.
So, is the A Records fix OK on a Windows server? I don't want Google or anyone else going to my non-www pages, since nearly all of them don't exist anyway.
As far as I know, I don't have access to set up a 301. All the other info about Windows servers that I found on WW and forwarded to my hosting company has been no help to them.
MC: "Jagger3 ended up being less about supplemental results. I believe we may still do more on that in the future."
I just saw that in the comments on his blog. I'll have to admit I'm pretty disappointed. I found a page in the supplementals today that hasn't been on my site for years. Most are recently deleted pages though.
66.102.7.104 = Research/Information
66.102.9.104 = Shopping/eCommerce
66.102.11.104 = Neutral/Median
On my top keyword I find all three leaning toward the commercial. Only 3 non-commercial sites are in the top 10 on my topic. It has me concerned that I may not be able to stay there long with the commercial competition. Guess I'll enjoy it while I can.
But the commercial sites in the top 10 are good solid sites like the most popular magazines, catalogs and such in the field. There are no made for adsense sites or any other spammy sites.
Your solution would work if the site were new with no pages indexed yet, but you stated that you are finding pages as both www and non-www. The problem is G's inability to let the old pages go. I have several hundred old pages in the index and have tried just about everything to get rid of them. Most of them are URL-only or supplementals, but they are there. I agree with others that a 301 would be far safer than having the bot find a non-existent site. A 404 means the page is gone. A 301 means the page has moved. A "nothing" might give you undesired effects. Not sure about this, but I can see a can of worms here.
I still find it hard to believe that G cannot tell that www and non-www are from the same IP. G was able to in the past, and people had no problem running both. Now there seems to be a big problem with this, and I think G has been scrambling for a fix, but for the past year or so no fix has been implemented.
Kind of weird that a site gets tanked because it has two routes to its pages and G cannot distinguish between the two. Kind of like writing a book, making a copy and then getting charged with plagiarism. Doesn't make sense to me.
Rule 1. Install 301 redirects for domain.com so it redirects to www.domain.com
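For anyone who wants to see what Rule 1 amounts to, here is a minimal sketch in Python of the decision a redirect rule makes. It is not the Apache or IIS configuration itself; `CANONICAL_HOST` and `canonical_redirect` are names made up for illustration, and plain http is assumed.

```python
from urllib.parse import urlunsplit

# Assumption: www.domain.com is the hostname you want indexed.
CANONICAL_HOST = "www.domain.com"

def canonical_redirect(host, path="/", query=""):
    """Return (301, location) if the request host is not canonical, else None."""
    hostname = host.lower().split(":")[0]  # drop any :port suffix
    if hostname == CANONICAL_HOST:
        return None  # already on the canonical host, serve the page normally
    # Keep the path and query string so every page maps to its www twin,
    # rather than sending everything to the homepage.
    location = urlunsplit(("http", CANONICAL_HOST, path or "/", query, ""))
    return 301, location
```

The point of carrying the path and query across is that domain.com/folder/ should land on www.domain.com/folder/, not on the homepage.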
I actually did www to non-www, let's see, about 6 weeks ago with some help from members here.
I'm on a Linux server with FrontPage extensions installed (using FP 20003), so it was a bit of a pain, but everything went well, and as of yesterday, using MarketLeap, I see all backlinks pointing to non-www.
I just prefer example.com to www.example.com. It's cleaner and removes the "dumbing down" of adding www, imo.
Now that everything went well I can do the same across all my sites.
If you forget the trailing / then your link to www.domain.com/folder will first be redirected to domain.com/folder/ {without www!} before arriving at the required www.domain.com/folder/ page.
The intermediate step, at domain.com/folder/ will kill your listings. Luckily, this effect is very easy to see if you use Xenu LinkSleuth to check your site: it shows up as reporting double the number of pages (when you generate the sitemap) that you actually have, with half of the pages having a title of "301 Moved".
The same is true whichever way the redirect flows. If you are redirecting to something that is not the defaultsitename for the site, then the lack of a trailing / on links that point to folders will kill your listings.
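The double hop described above can be made visible by walking the redirect chain one step at a time. Below is a rough Python sketch; `trace_redirects` is a hypothetical helper, and `fetch` is a stand-in you supply that returns `(status, location)` for a URL, so the chain here is simulated from the example in this post rather than fetched from a real server.

```python
def trace_redirects(url, fetch, max_hops=10):
    """Follow redirects one hop at a time and return [(url, status), ...]."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in (301, 302) or not location:
            break  # reached a final response (or a redirect with no target)
        url = location
    return hops

# Simulated responses for a link that forgot the trailing slash.
fake_site = {
    "http://www.domain.com/folder": (301, "http://domain.com/folder/"),
    "http://domain.com/folder/": (301, "http://www.domain.com/folder/"),
    "http://www.domain.com/folder/": (200, None),
}

chain = trace_redirects("http://www.domain.com/folder", fake_site.__getitem__)
for hop_url, hop_status in chain:
    print(hop_status, hop_url)
```

More than one 301 in the chain, or any hop that lands on the host you are trying to get rid of, is the pattern to look for.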
[edited by: g1smd at 1:59 am (utc) on Nov. 7, 2005]
I have Windows servers and there is an ISAPI filter that will do this. Whether your hosting company will do it for you is another matter. I have the luxury of hosting my own DNS and a Class C IP block. Without the filter, the only way to do a 301 is to point the A record in the DNS to one IP and the www A record to another. You then need to set up a blank site and globally 301 the non www to the www. It takes 2 IPs though and I am sure that your hosting company will probably frown on that.
Other than that, you need to use ASP to do the 301. I forgot the code, but I am sure you can find it on the forums here. I am sitting at home right now, and if you sticky me, I can send it to you tomorrow.
Alternatively, you can point both www and non-www to the same server space and add the ASP script to the beginning of every file on your server.
The mod_rewrite (htaccess for Apache) or ISAPI_rewrite (for IIS) works at the server level, adding a simple instruction to the server configuration files. This is usually much easier to set up.
[edited by: g1smd at 2:04 am (utc) on Nov. 7, 2005]
Apologies for slight offtopicness - but with windows boxes, if you have access to IIS on the server (i.e. not shared, but you could ask the hosts to do this) you can do a proper 301 redirect by doing the following, without the need of ISAPI filters.
Create a new site in IIS to catch the host headers of 'yourdomain.com' - make sure the actual website entry only has the 'www.yourdomain.com' version listed in its host headers.
Select the site you just created and hit Properties then the Home Directory tab. Select "A redirection to a URL" and enter the url [yourdomain.com$S$Q...]
The $S$Q part carries any querystrings etc being passed so the URL translates perfectly. Make sure the boxes marked "The exact url mentioned above" and "A permanent redirection for this resource" are checked and the one marked "A directory below url entered" is unchecked.
You can do this on one IP address and you don't need to buy anything. I've had this setup on dozens of sites and it gets crawled/indexed perfectly. The first time I set it up I was dubious, but I used a header checker and this is definitely a 301 solution.
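A header checker boils down to requesting a page and reading the status code and Location header without following the redirect. Here is a minimal Python sketch of that check, assuming plain http on port 80; `check_redirect` is a made-up helper name, and some servers may answer HEAD requests differently from GET.

```python
import http.client

def check_redirect(host, path="/", port=80, timeout=10):
    """Send one HEAD request and return (status, Location header or None)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)  # http.client never follows redirects itself
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling `check_redirect("yourdomain.com", "/page.asp?id=1")` against a setup like the one above should come back with status 301 and a Location on the www host that keeps the path and querystring. A 302 or a 200 means the redirect is not the permanent one you want.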
Hope that helps!
Back to topic :)
[edited by: patc at 2:13 am (utc) on Nov. 7, 2005]
Go to http://www.yoursite.com/make.up.a.page.name.that.does.not.exist and look at the "Error 404" message -- do this in Mozilla or Firefox so that you really do see the server's error message (Internet Explorer hides it and replaces it with its own).
The error message should say something like this at the bottom:
Apache/2.0.47 (Unix) mod_ssl/2.0.47 OpenSSL/0.9.6b PHP/4.3.2 Server at www.yoursite.org Port 80
If you see something else, like IIS then you are not using Apache. The .htaccess file only works on Apache servers.
For IIS you need to use ISAPI_rewrite instead. Google for ISAPI 301 redirect for more information on that.
All the search engines have problems figuring out canonicals. Webmasters who naively think their sloppy webmastering will be figured out have only themselves to blame.
Anyway, doubt if it's significant, but it's curious eh?
That's got to be the best synopsis of this canonical fiasco yet. Well done webdude! LOL
Actually, I'm not happy about it, because it's like Google tolerating and favoring this kind of no-content, nonsense website with a spammy domain and title, and letting it rank high. And my website became a tool for doing that.
So, Glad to hear G is working correctly in this instance.
Back to Watching
WW_Watcher
> sloppy webmastering
OK, I feel pretty stupid. One site that I inherited and placed on the back burner has some links to widget/index.html. All of the supplementals that Google shows for this domain have internal links ending in /index.html.
I'll clean them up and see what happens. If Google drops the supplementals, I may be eating crow at their lunch in Vegas.
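Cleaning up the /index.html links is mostly a matter of rewriting each link to point at the folder itself. A small Python sketch of that rewrite follows; `strip_default_doc` and the `DEFAULT_DOCS` list are illustrative names, and it assumes the server serves index.html as the default document for a folder.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these filenames are served as the folder's default document.
DEFAULT_DOCS = ("index.html", "index.htm")

def strip_default_doc(url):
    """Rewrite .../index.html to .../ so each folder has a single URL."""
    parts = urlsplit(url)
    head, sep, tail = parts.path.rpartition("/")
    if tail.lower() in DEFAULT_DOCS:
        # A bare "index.html" (no slash) becomes "./" to stay relative.
        new_path = head + "/" if sep else "./"
        parts = parts._replace(path=new_path)
    return urlunsplit(parts)
```

Run over every internal href, this leaves ordinary page links alone and collapses widget/index.html to widget/, so there is only one address per folder for the bot to find.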
The solution assumes no custom 404 error pages are used.
The easiest way is to get Firefox, then install the extension:
Live HTTP Headers
This will allow you to view the headers sent back and forth between client and server. The second header is the server information from the site, that will in most cases tell you what the server is.
For example, for this page, this is the output:
HTTP/1.x 200 OK
Date: Mon, 07 Nov 2005 03:17:26 GMT
Server: Apache/2.0.52
Cache-Control: max-age=0
Pragma: no-cache
X-Powered-By: BestBBS v3.39
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html
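If you save header output like the above, picking out the server type is easy to automate. A minimal Python sketch, assuming the server sends an honest `Server:` header (it can be hidden or altered); `server_family` is a made-up helper name.

```python
def server_family(raw_headers):
    """Guess 'apache', 'iis', 'other' or 'unknown' from a raw header block."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            lowered = value.strip().lower()
            if "apache" in lowered:
                return "apache"  # .htaccess / mod_rewrite territory
            if "iis" in lowered:
                return "iis"     # look at ISAPI_rewrite or IIS settings instead
            return "other"
    return "unknown"  # no Server header was sent

sample = """HTTP/1.x 200 OK
Date: Mon, 07 Nov 2005 03:17:26 GMT
Server: Apache/2.0.52
Content-Type: text/html"""

print(server_family(sample))
```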
The second way, also very easy, is to create a file called say check.php
<?php
phpinfo();
?>
That's it, just those 3 lines. Upload it to your root directory, then type in yoursite.com/check.php
If the site is running Apache, it's almost certainly running PHP, and this page will tell you all the information about the server software: Apache, MySQL, etc.
If all you see when you run it is the above literal text, <?php... then PHP is not being executed, and it's very unlikely you have Apache running.
Once you have determined your Apache information, delete that file; it's useful info for hackers.
[edited by: 2by4 at 3:21 am (utc) on Nov. 7, 2005]
I told the host company about the ISAPI thing last spring... and they made some noises about ' that will redirect all pages to the homepage'.
On the '9' dc my www homepage is showing up #1 and my phantom dup content homepages seem to be slowly going away, so I'm not 100% sure I'm getting hammered to any degree by the non-www. Before Feb 2 I had wonderful serp rankings, including my homepage. On Feb 2 the accidental dups of my homepage caused the site to be penalized... the penalty has been removed as of the 1st week of Oct.
Plus, GG knows of my specific case and he says 'hang on', it should be fixed.
<added>
I'm scared to death to touch anything now since visitors are coming back.