But I do not understand if this applies to my site.
Here we go.
I have www.example.com and can browse the site using this format
and when I remove the www and go to example.com I can still browse the site as example.com, for example example.com/category.aspx. Then perhaps I will visit a URL I have added myself, and I will go back to www.example.com/wherever.aspx
So I kind of have two versions.
Is this the kind of duplicate problem that needs fixing?
Is this the kind of duplicate problem that needs fixing?
In a word, yes. Any time you have the same page displayed under two different urls, you have a duplicate content problem, and you should not let it stand. For more information, take a look at the Hot Topics [webmasterworld.com] section, pinned to the top of the Google Search forum home page, and look at the Duplicate Content section.
The following threads might be particularly helpful, but I suggest you take a look at all the articles, as chances are very good that you have other duplicate content issues as well....
Why Does Google Treat "www" & "no-www" As Different?
[webmasterworld.com...]
Good summary threads:
Duplicate content in forum and articles - will I get penalized?
[webmasterworld.com...]
Canonical URL Issues - including some new ones
[webmasterworld.com...]
It would probably take a non-www inbound link to cause you problems, but there's a pretty good possibility of that happening.
Even though I myself can manually remove the www from my URL and go to the site, does that mean the SEs have a problem?
I have checked:- site:example.com -inurl:www and this test produces no results, so I guess Google has just the www versions in its database.
Also when I look in my dns records I see:-
example.com IP Address 3600 A Record
*.example.com IP Address 3600 A Record
www.example.com IP Address 3600 A Record
Which I think means that any visiting spider will be sent to the www version only.
The only problem I foresee is if there is a link in the site which is the non-www version - that would make the spider go wrong.
Am I understanding this correctly?
Which I think means that any visiting spider will be sent to the www version only.
The only problem I foresee is if there is a link in the site which is the non-www version - that would make the spider go wrong.
Am I understanding this correctly?
What the DNS records mean is that an inbound spider can be sent to any of the above... ie, to...
- example.com with no www
- example.com with a www subdomain
- or to any wildcard subdomain.
This is a fine dns setup, because it assures your site will respond to a likely variety of requests, but dns is only half of what you need to take care of.
You must also set up the proper 301 redirects on your web hosting server (i.e., where your site is hosted), so that only one canonical version of your site is served. That way, if there's a link to either a non-www version, or to some subdomain that you don't specifically want, the request for that link will be redirected to the desired version of your domain, and you won't have a duplicate content problem.
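As a rough illustration of the canonical-host logic described above, here is a minimal Python sketch. It is not the configuration for any particular server (the site in question uses .aspx pages, so the real rule would live in the web server's own config). The names `CANONICAL_HOST` and `canonical_redirect` are made up for this example, and it assumes the www form is the preferred one.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www form is the version you want indexed.
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url):
    """Return the URL a 301 should point to, or None if the host is already canonical."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == CANONICAL_HOST:
        return None  # already the preferred version, serve it normally
    # Swap only the host; keep scheme, path, query and fragment unchanged.
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
    )
```

With this sketch, a request for http://example.com/category.aspx would be answered with a 301 pointing at http://www.example.com/category.aspx, while a request that already uses the www host returns `None` and is served as usual.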
Right now, you're very open to errors and mischief from others.
A rewrite connects the external URL request to an internal server filepath that is different to that which may have been suggested by the path shown in the URL.
So, the browser asks the server for "A" and the server responds with a 301 redirect that tells the browser to make a new request for "B". The browser asks for "B" and the server gives it the content.
For a rewrite, you ask for "X" and the server fetches the content of file "Y" but does not reveal the fact that the path was internally changed.
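To make the distinction concrete, here is a small Python sketch using the same "A"/"B" and "X"/"Y" names. The two lookup tables and the `handle` function are hypothetical and only model the behaviour; a real server does this in its own rewrite or redirect configuration.

```python
# Hypothetical tables, for illustration only.
REDIRECTS = {"/A": "/B"}  # external: the client is told to make a new request
REWRITES = {"/X": "/Y"}   # internal: the client never learns about the mapping


def handle(path):
    """Return (status, headers, internal_path) for a requested URL path."""
    if path in REDIRECTS:
        # Redirect: answer with 301 and a Location header; no content is served yet.
        return 301, {"Location": REDIRECTS[path]}, None
    # Rewrite: quietly map the requested path to a different internal file.
    internal_path = REWRITES.get(path, path)
    return 200, {}, internal_path
```

A request for "/A" gets a 301 with `Location: /B`, so the browser's address bar changes. A request for "/X" gets a 200 with the content of "/Y", and the address bar still shows "/X".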
Also when I look in my dns records I see:-
example.com IP Address 3600 A Record
*.example.com IP Address 3600 A Record
www.example.com IP Address 3600 A Record
Your DNS records are unrelated to how your web server responds to requests for your site (I doubt you want a wildcard DNS subdomain entry in there, incidentally - depending on your web server configuration, that could mean [anything].example.com would return your site!).
All those records mean is that the web server at [IP address] will respond to requests for those (sub) domains - that web server decides what happens when the requests get there.
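Since the web server makes the decision, one way to see what it actually does is to request each hostname and look at the status code and Location header. The sketch below is only an assumption about how you might check this from Python's standard library. `probe` and `describe` are names invented for the example, and "www.example.com" stands in for whichever host you treat as canonical.

```python
import http.client
from urllib.parse import urlsplit


def probe(host, path="/"):
    """Send one HEAD request to host and return (status, Location header or None)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status, location, canonical_host="www.example.com"):
    """Summarise the server's answer to a request for a non-canonical host."""
    if status == 301 and location and urlsplit(location).hostname == canonical_host:
        return "redirects to canonical host"
    if status == 200:
        return "serves content directly (possible duplicate)"
    return "other response"
```

For example, `describe(*probe("example.com"))` would report whether the non-www host is being 301-redirected or is serving the pages itself. `http.client` does not follow redirects, so you see the server's first answer.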
If you have the same content available on different URLs (any difference whatsoever, even a single character), then you are trusting the search engine's duplicate handling not to affect your site's performance.
Google's handling of dupes may not necessarily work in your favour - it could result in effects like lost links or devalued content. Here's one simplified scenario to exemplify how this can affect a site's performance:
This isn't something that should really be a concern for the average person with a site, but unfortunately, website configurations (from a technical perspective) are far from ideal, and it ends up being the webmaster who has to address the issues.