The entire shop sells widgets, and a typical URL looks like this:
www.example.com/cgi-bin/shope/blue-widgets.cgi/round-blue-widgets/round-blue-widget-product1
Here blue-widgets.cgi is a category page listing all blue widgets, but it also acts as a directory: the path after it leads to a sub-category page containing round blue widgets.
The problem is that Google currently has several of the .cgi URLs listed twice, as:
www.example.com/cgi-bin/shope/blue-widgets.cgi (no trailing slash)
AND
www.example.com/cgi-bin/shope/blue-widgets.cgi/ (with trailing slash)
I was originally going to redirect one to the other, but based on the replies I got, I decided not to for the moment and to just rewrite everything to the new URLs I've been planning.
I need the format of Home/Main Category/Sub-Category/Product Pages, so I'm thinking the best format would be:
www.example.com/blue-widgets/round-blue-widgets/round-blue-widget-product1.htm
Question is, how should I handle things like
www.example.com/blue-widgets/round-blue-widgets/
where blue-widgets/ and round-blue-widgets/ can be either a page or a directory? Do I use a trailing slash or not? I want to avoid adding to the duplicate content issues that I probably already have.
Question is, how should I handle things like
www.example.com/blue-widgets/round-blue-widgets/
where blue-widgets/ and round-blue-widgets/ can be either a page or a directory?
Are you saying that in some cases, "round-blue-widgets/" might be a page, and in others, you want the server to produce a directory listing of a directory called "round-blue-widgets/" or to serve the index page in that directory, such as "round-blue-widgets/index.html"?
To be 'correct' according to the HTTP 'rules,' a trailing slash indicates a directory -- which may or may not have an index file in it. If there is an index file, then it will be served when that directory is requested. If there is no index file, then the server will generate a listing of all files in that directory if "Options +Indexes" is set, or it will produce a 403-Forbidden error response if "Options -Indexes" is set.
If there is no trailing slash, the URL refers to a 'page' or a 'file.'
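As a rough illustration of the behavior described above, here is a minimal Python sketch of the decision made for a trailing-slash (directory) request. The function name, the set of file names, and the default index name are all assumptions made up for the example; it only models the three outcomes and is not how Apache is actually implemented.

```python
def resolve_directory_request(files, indexes_enabled, index_names=("index.html",)):
    """Model the outcome of requesting a directory URL (one ending in a slash).

    files           -- names of the files in the directory (hypothetical data)
    indexes_enabled -- True models "Options +Indexes", False models "Options -Indexes"
    index_names     -- index file names to look for, in order (an assumed default)
    """
    # 1. An index file, if present, is served for the directory request.
    for name in index_names:
        if name in files:
            return 200, name
    # 2. No index file: produce a listing only if listings are enabled.
    if indexes_enabled:
        return 200, sorted(files)
    # 3. No index file and listings disabled: 403 Forbidden.
    return 403, None
```

For example, a directory holding only product1.htm and product2.htm would give a listing of both files with listings enabled, and a 403 with them disabled.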
You can use characters, strings, character-types, number of directory-levels in the URL (count slashes), presence or absence of trailing slashes, or any number of other things to differentiate between products or categories and to differentiate between URLs that should be rewritten to your script and those that should not. And don't be afraid of being explicit -- using "cat" and "prod" in the URL itself, for example.
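To make that concrete, here is a small Python sketch that tells categories, sub-categories, and products apart by counting path segments and checking for a trailing slash. It assumes the URL format proposed earlier in the thread (categories and sub-categories end in a slash, products end in .htm at the third level); the function name and the returned labels are hypothetical, and a real site would apply the same tests in its rewrite rules or script.

```python
def classify(path):
    """Classify a URL path under the assumed scheme:

    /                                  -> "home"
    /blue-widgets/                     -> "category"
    /blue-widgets/round-blue-widgets/  -> "subcategory"
    /blue-widgets/round-blue-widgets/round-blue-widget-product1.htm -> "product"
    """
    # Non-empty path segments, i.e. the number of directory levels.
    segments = [s for s in path.split("/") if s]
    # A trailing slash marks a directory-style URL.
    is_dir = path.endswith("/")

    if not segments:
        return "home"
    if is_dir and len(segments) == 1:
        return "category"
    if is_dir and len(segments) == 2:
        return "subcategory"
    if not is_dir and len(segments) == 3 and segments[-1].endswith(".htm"):
        return "product"
    # Anything else does not fit the scheme and should not go to the script.
    return "unknown"
```

Because every URL type has exactly one shape, a request such as /blue-widgets (no trailing slash) falls through to "unknown" and can be redirected or rejected instead of being served as a second copy of the category page.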
The trick is to pick a good, consistent, expandable, and maintainable URL "system," and to stick with it.
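One way to "stick with it" is to define a single canonical form for each URL and send a 301 redirect for anything else, so the slash and no-slash versions never both get indexed. The Python sketch below assumes one possible convention: a last segment containing a dot is a page and is left alone, and everything else is a directory that must end in a slash. The function names are hypothetical and the convention is only one of several that would work.

```python
def canonical_path(path):
    """Return the canonical form of a URL path under the assumed convention."""
    last_segment = path.rsplit("/", 1)[-1]
    # Already directory-style, or a page with an extension: leave as-is.
    if path.endswith("/") or "." in last_segment:
        return path
    # Extensionless last segment: treat as a directory and add the slash.
    return path + "/"


def redirect_for(path):
    """Return (301, target) if the request needs redirecting, else None."""
    target = canonical_path(path)
    return (301, target) if target != path else None
```

With this in place, a request for /blue-widgets is answered with a 301 to /blue-widgets/, while /blue-widgets/ and any .htm product URL are served directly.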
Jim
A Rewrite does not rewrite one URL to a new URL.
Are you saying that in some cases, "round-blue-widgets/" might be a page, and in others, you want the server to produce a directory listing of a directory called "round-blue-widgets/" or to serve the index page in that directory, such as "round-blue-widgets/index.html"?
round-blue-widgets by itself can be a category page, or it can have product pages under it, like
round-blue-widgets/round-blue-widget-product1.htm
round-blue-widgets/round-blue-widget-product2.htm
etc.
To be 'correct' according to the HTTP 'rules,' a trailing slash indicates a directory -- which may or may not have an index file in it. If there is an index file, then it will be served when that directory is requested. If there is no index file, then the server will generate a listing of all files in that directory if "Options +Indexes" is set, or it will produce a 403-Forbidden error response if "Options -Indexes" is set.
If there is no trailing slash, the URL refers to a 'page' or a 'file.'