Here's the setup:
His homepage is www.example.com and that's how it's listed in Google.
Google has just about every page in their index (and the site is doing well), but every page except the homepage is listed as example.com/subpage (i.e. with the www. removed).
I've looked at his HTML and it's clean as a whistle.
In a browser, any link from www.example.com takes me to *www*.example.com/subpage
Arguably, Google has decided that example.com/subpage and www.example.com/subpage are duplicates and has suppressed the wrong one.
Like my sites, it's possible to browse his site with and without the www.
The one difference, which is why I'm asking for help, is that my sites are all Apache based, and his is Windows IIS.
Any ideas how we can get Google to list the www.example.com/subpage versions of his pages, instead of the example.com/subpage versions?
Thanks in advance
DerekH
In case his service provider won't allow him to edit his IIS configuration, and assuming he is using ASP, he can do the following:
Set up a server-side include at the top of each page with the following ASP script.
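<%
' Send any request that did not arrive on the www host over to the www host.
PathInfo = Request.ServerVariables("PATH_INFO")
QueryString = Request.ServerVariables("QUERY_STRING")
ServerName = LCase(Request.ServerVariables("SERVER_NAME"))
IsWWW = InStr(ServerName, "www.mysite.com")
If IsWWW < 1 Then
    NewLocation = "http://www.mysite.com" & PathInfo
    ' Keep any query string so deep links survive the redirect.
    If QueryString <> "" Then NewLocation = NewLocation & "?" & QueryString
    Response.Status = "301 Moved Permanently"
    Response.AddHeader "Location", NewLocation
    ' Stop here so the page body is not sent along with the redirect.
    Response.End
End If
%>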
Just replace mysite in the example with the domain name.
I had to do this with my server since they don't allow access to the IIS configuration.
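If you want to check the redirect logic away from the server, here is a rough sketch of the same decision in Python. It is not the ASP script itself, just the rule it applies; the function name canonical_redirect and the www.mysite.com default are placeholders I made up for illustration.

```python
def canonical_redirect(host, path, query="", canonical_host="www.mysite.com"):
    """Return the 301 target for a request, or None if no redirect is needed.

    canonical_host is a placeholder; swap in the real www host name.
    """
    # Host names are case-insensitive, so compare in lower case.
    if host.lower() == canonical_host:
        return None
    location = "http://" + canonical_host + path
    # Carry the query string over so deep links keep working.
    if query:
        location += "?" + query
    return location


if __name__ == "__main__":
    print(canonical_redirect("mysite.com", "/subpage"))
    print(canonical_redirect("www.mysite.com", "/subpage"))
```

Requests that already use the www host come back as None, so they are served normally and cannot loop.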
The problem was a canonicalization issue, which Google has admitted can happen, albeit not often. Their advice was the 301 solution, and it has worked out, after about 2 months of waiting.
I should explain that my experience was related to terrible ranking due to the canonicalization issue. If your friend's site ranks ok, but you just want the www vs. non-www issue cleared up, you might have a different situation.
In either case, you probably should try these steps:
1. Do the 301 redirect.
2. Make all internal links full URLs, including the www.
3. Try to find out whether anyone out there is linking to the site without the www. in their link code, and get them to change the link (this is not easy, but give it a try).
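For step 2, something like the following could list the links in a page that still need changing. It is only a sketch using Python's standard library; links_to_fix and the www.example.com default are names I picked for the example, and it only looks at <a href> attributes.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_to_fix(html, www_host="www.example.com"):
    """Return hrefs that are relative or point at the non-www host."""
    bare_host = www_host[len("www."):]  # assumes www_host starts with "www."
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        # Skip in-page anchors and non-web schemes such as mailto:.
        if href.startswith("#") or parts.scheme not in ("", "http", "https"):
            continue
        host = parts.netloc.lower()
        if host == "" or host == bare_host:
            flagged.append(href)
    return flagged
```

Links to other sites and links that already use the www host are left alone.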
There are at least 2 threads on this board with much more detail, including advice from G-guy. I am extremely grateful to the people here who helped me out of this mess.
It can also be the result of an attempted takedown.
All it takes is one or more links, each using a different valid alias of the site, pointing to a page that uses relative URLs.
Googlebot sees those links, then walks the relative links under each alias, and the result is massive content duplication.
This can be done by any form of linking from any site.
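To see why relative URLs make this possible, here is a small Python sketch that resolves the same relative links against two aliases, the way a crawler would. The host names and the crawl_view helper are made up for the example.

```python
from urllib.parse import urljoin


def crawl_view(entry_urls, relative_links):
    """Resolve each relative link against each entry URL, as a crawler would."""
    seen = set()
    for entry in entry_urls:
        for link in relative_links:
            seen.add(urljoin(entry, link))
    return sorted(seen)


if __name__ == "__main__":
    aliases = ["http://example.com/", "http://www.example.com/"]
    for url in crawl_view(aliases, ["subpage", "about/"]):
        print(url)
```

Two aliases and two relative links give four distinct URLs for two real pages, and every extra alias adds another full copy.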
I've put a 301 redirect from the IP to the www, along with a site map listing the old IP and non-www URLs that are showing up in Google, in the hope that Google will crawl it and say "oh, there's a 301, let's update the listings to www". It appears that Googlebot has now visited the site map and followed those links, though the index looks the same.
The question is what to do now.
Leave the sitemap of non-www 301 redirects up for a while, with the old IP still active?
Take the sitemap down so Google doesn't get any more links pointing to the invalid URLs?
Turn off the server on the IP to tell Google that those URLs are invalid?
If I were in that situation, I would direct www traffic to the non-www domain instead of the other way round. It doesn't matter for rankings or anything else whether you have the www or not, as long as you are consistent (i.e. you use only one of them).
If you link to a folder, always include the trailing / at the end. It is important.
I just sorted out a mess where Xenu generated a massive site map for a site: it was much larger than expected, contained every page twice, and had loads of entries with a title of "301 Moved". It turned out that although the site uses domain.com as the base in all the internal links, the host name is configured as www.domain.com, and many of the internal links did not include a trailing / on folder names. There was a valid, correctly set up .htaccess file redirecting requests for www.domain.com over to domain.com.
So, when you link to domain.com/folder, the server automatically redirects to www.domain.com/folder/ (remember the host name is set to www.domain.com here), and then the 301 redirect in the .htaccess file takes over and sends the visitor to domain.com/folder/ instead. Including the trailing / would have avoided this.
Changing the server host name so that it does not include the www is also a good idea, but even then any request for domain.com/folder would still trigger an automatic redirect to domain.com/folder/. So, always include the / on the end to avoid any redirect happening at all.
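The double hop described above can be modelled in a few lines of Python. This is a toy model of that one setup (host name set to www.domain.com, .htaccess redirecting to domain.com), not a description of how any particular server is implemented; redirect_chain and the folders tuple are invented for the illustration.

```python
def redirect_chain(host, path, server_name="www.domain.com",
                   canonical_host="domain.com", folders=("/folder",)):
    """Return every URL visited, starting with the one requested."""
    hops = [host + path]
    # Hop 1: the server adds the missing slash, using its own host name.
    if path in folders:
        host, path = server_name, path + "/"
        hops.append(host + path)
    # Hop 2: the .htaccess rule sends any other host to the canonical one.
    if host != canonical_host:
        host = canonical_host
        hops.append(host + path)
    return hops


if __name__ == "__main__":
    print(redirect_chain("domain.com", "/folder"))
    print(redirect_chain("domain.com", "/folder/"))
```

The first call goes through two redirects before it lands on domain.com/folder/, while the second, with the trailing slash already in the link, has no redirect at all.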
Also part of this redirect issue: if you link to an index page inside a folder, do NOT include the filename in the link. End with the folder name and a trailing / at the very end.
So, is domain.com/folder/index.html always a bad link, or only if your site has the server name set to the non-www version?
Link to www.domain.com/folder/ or to /folder/ to avoid that problem.
Make sure the foldername ends in a / to avoid the server having to do a redirect from www.domain.com/folder to www.domain.com/folder/ which may go via domain.com/folder/ if the host name is not the same as the one in, or implied by, your link.
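As a rough way to apply both rules to a list of links, here is a Python sketch that drops index filenames and adds the trailing slash. The INDEX_NAMES set is my own guess at common names, and treating any last path segment without a dot as a folder is an assumption that will not hold for every site.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed index filenames; adjust to whatever the server actually serves.
INDEX_NAMES = {"index.html", "index.htm", "index.php", "index.asp", "default.asp"}


def tidy_link(url):
    """Strip an index filename, or add a trailing / to a folder name."""
    parts = urlsplit(url)
    head, _, last = parts.path.rpartition("/")
    if last.lower() in INDEX_NAMES:
        # /folder/index.html becomes /folder/
        path = head + "/"
    elif last and "." not in last:
        # Assumption: a final segment with no dot is a folder name.
        path = parts.path + "/"
    else:
        path = parts.path
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

Links that already end in / or point at an ordinary page such as page.html come back unchanged.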
g1smd...
It's a bad link because when you change the technology of your site over to PHP or ASP, all your links will be instantly broken. Link to www.domain.com/folder/ or to /folder/ to avoid that problem.
Good point. Anyone out there know how to get Dreamweaver to support that style of link? Every time I've tried it, I *have* to point to a file inside the folder in order to make a link...
DerekH