That particular page has information on red widgets, blue widgets, green widgets, yellow widgets, etc. If you want to take a visitor to the specific point on that page that talks about yellow widgets, you can use a named anchor to do it (<a name="yellow">). But to get them there from another page on your site, you would have to link to it like this, would you not? example.com/foldera/index.shtml#yellow
I don't see any other way to do this, so that page would have two different URLs pointing to it, and both perfectly legitimate and proper. So how does Google deal with this? Is an exception made for anchors within a page, or do they still penalize your site for some reason?
Is there another way to link in this manner that I'm not aware of?
Edit to add: Thanks to everyone for the advice, I appreciate it.
[edited by: AndyA at 8:45 pm (utc) on Nov. 17, 2006]
Check your log files. Do bots even request the # part?
I suspect not.
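If you want to check the logs rather than guess, here is a minimal Python sketch. It assumes the access log is in Common or Combined Log Format; the function name and the two sample lines are made up for illustration and are not from any real log.

```python
import re

# Matches the request line in Common/Combined Log Format: "METHOD target PROTOCOL".
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+"')

def requests_with_fragment(log_lines):
    """Return every request target that contains a '#' character."""
    hits = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match and "#" in match.group(1):
            hits.append(match.group(1))
    return hits

# Hypothetical sample lines, for illustration only.
sample_log = [
    '192.0.2.1 - - [17/Nov/2006:20:45:00 +0000] "GET /foldera/index.shtml HTTP/1.1" 200 5120',
    '192.0.2.10 - - [17/Nov/2006:20:46:12 +0000] "GET /foldera/ HTTP/1.1" 200 5120',
]
print(requests_with_fragment(sample_log))
```

To run it against a real log, pass an open file object in place of sample_log. An empty list means no logged request carried a # part.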
I believe that it is only the browser that makes use of the named anchor in knowing where to "jump to" within the page.
The # part has no bearing on how the page is served.
Does a browser even send the # part of the URL to the server?
I don't see any other way to do this, so that page would have two different URLs pointing to it, and both perfectly legitimate and proper. So how does Google deal with this? Is an exception made for anchors within a page, or do they still penalize your site for some reason?
Google doesn't see the fragment identifier or named anchor. The fragment is handled client-side by the browser and is not sent to the server as part of the request.
I don't think you have anything to worry about in this instance. Google is not going to see duplicates. Google will see the primary URI and not the fragment identifier.
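To see why both links resolve to the same resource, here is a small sketch using Python's standard urllib.parse. The helper request_target is a made-up name for illustration, and the http:// scheme is an assumption added so the URLs parse cleanly. It only models what a client would put on the request line; it is not a claim about how any particular crawler is implemented.

```python
from urllib.parse import urldefrag, urlsplit

def request_target(url):
    """Return the path (plus query) a client would put on the HTTP request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target  # the fragment is deliberately left out

plain = "http://example.com/foldera/index.shtml"
anchored = "http://example.com/foldera/index.shtml#yellow"

# urldefrag splits a URL into the part before '#' and the fragment itself.
base, fragment = urldefrag(anchored)
print(base)
print(fragment)

# Both links produce the same request target.
print(request_target(plain) == request_target(anchored))
```

With the fragment dropped, the server receives an identical request for both links, which is why only one URL is involved as far as the server is concerned.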
What you do need to be concerned with is using /index.html as opposed to /. Google will see those as two different locations (at least initially), and the one with the most links wins. It is best practice not to use /index.html when formatting URIs. When it comes to root level pages (e.g. index.html), you should trim back to the trailing forward slash.
www.example.com/index.html#fragment
www.example.com/#fragment
Use the second example above. It keeps things short and sweet, and it doesn't expose the underlying technology of the site (e.g. html, asp, php, jsp, etc.).
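If you generate internal links from a script, trimming back to the trailing slash can be done automatically. This is a rough Python sketch: trim_index and the INDEX_NAMES list are made up for illustration, the list of default document names is an assumption that should match your own server configuration, and the http:// scheme is added so the URL parses cleanly.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default document names; adjust to match the server's configuration.
INDEX_NAMES = ("index.html", "index.htm", "index.shtml", "index.php")

def trim_index(url):
    """Drop a trailing index file name so the URL ends at the directory slash."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename in INDEX_NAMES:
        parts = parts._replace(path=directory + "/")
    return urlunsplit(parts)  # query and fragment are kept unchanged

print(trim_index("http://www.example.com/index.html#fragment"))
print(trim_index("http://www.example.com/foldera/index.shtml#yellow"))
```

Only the final path segment is checked, so a directory or page that merely contains the word "index" elsewhere in its name is left alone.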