My old URLs were:
/widgets.php?color=blue
/widgets.php?color=red
/widgets.php?color=green
I now have these pages:
/blue_widgets.php
/red_widgets.php
/green_widgets.php
Using Apache, how do I redirect the old pages to the new pages?
What is the score with search engines and redirects like this? Will there be a duplicate penalty?
In Webmaster Tools I can submit a file to be removed from the search engine. Clearly this is not possible if a file is redirected, as Google would check for the old file and be served the new one. Or does it detect the 301/302 status and thus drop the old file?
The redirect will need to test the QUERY_STRING as a part of its operation.
You'll need to include the domain name in the target URL, and ensure that the parameters are stripped off too.
Post your best effort code here, as a basis for discussion.
Don't confuse URLs and files, or you will have a lot of trouble understanding mod_rewrite. They represent two different and only loosely associated naming systems for two very different name-spaces: URLs on the Web, and filepaths within your server. As dynamic URLs like "/widgets.php?color=blue" make clear, a URL or a "page" is not necessarily a file. In this case, one PHP script (a file) accepts many parameter values and generates many pages. In other cases, such as on multi-language sites using content-negotiation, one URL may reference many pages, each in a different language.
So, what has changed -- The filenames? The URLs? The links on your pages? We need to know what the old and new values of all of these things were and are in order to propose a complete and correct solution.
Jim
I don't think the code needs to be fancy and strip out the color to build the new URL. There are only a dozen pages, so I guess I could just do single entries for each:
redirect 301 /widgets.php?color=blue http://www.example.com/blue_widgets.php
.
.
redirect 301 /widgets.php?color=cyan http://www.example.com/cyan_widgets.php
Serving every color from one widgets.php script with a query string is not good for SEs, so I wanted to split them up into specific pages.
Now users can access info on purple widgets via a specific purple_widgets.php page and thus SEs will index each page separately.
It's something I should have done years ago.
[edited by: Frank_Rizzo at 12:50 am (utc) on Dec. 10, 2008]
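What you may have missed is that all you had to do was to change the links on your pages, and then add a bit of mod_rewrite code. You did not need to change the PHP scripts or the filepaths used within the server to access them -- Not at all. Rather, the procedure would be to change the links on your pages to the new URLs, internally rewrite requests for those new URLs to the existing script, and externally redirect direct client requests for the old dynamic URLs to the new URLs.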
There is no need to change the function or name of any script files, and no need to carry the legacy ".php" into your new URL names. (Dropping ".php" from the URLs now would mean that you would never have to change those URLs again, even if you changed your site technology from PHP to ASP or to ColdFusion, or some other future server-side scripting language.)
You can't use the mod_alias "Redirect" directive to do what you need, because "Redirect" cannot 'see' the query strings appended to URLs. You'll need mod_rewrite. Something like:
RewriteCond %{QUERY_STRING} ^color=([a-z]+)$
RewriteCond %{DOCUMENT_ROOT}/%1_widgets.php -f
RewriteRule ^widgets\.php$ http://www.example.com/%1_widgets.php? [R=301,L]
Jim
Think of it like this: a website which has information on each month of the year. Visitors can read about April, May, November etc. and can do so either by navigating from the main menu, or from a link posted on an external site.
Unfortunately, a while ago I set up navigation based on one script, with a query string used to determine the page returned:
<ul>
<li><a href="/months.php?month=Jan">January</a></li>
<li><a href="/months.php?month=Feb">February</a></li>
<li><a href="/months.php?month=Dec">December</a></li>
</ul>
The months.php would then pull info for the desired month based on the query string. This worked well (all transparent to users) but is clearly not ideal for SE purposes.
Now what I have changed to is this:
<ul>
<li><a href="/month_january.php">January</a></li>
<li><a href="/month_february.php">February</a></li>
<li><a href="/month_december.php">December</a></li>
</ul>
This is much better for SE as engines can now see and index a month for a specific page. If someone were to google for "the month of january" it is more likely this new page will be returned rather than months.php?month=Jan
So what I need to ensure is that
a) Search engines do not now see two pages for, say, November (there is no duplicated page accessed via both /month_november.php and months.php?month=Nov)
b) All existing links and bookmarks will call up the correct page if accessed via the old structure. e.g. Month Gazette website wrote about our review of November and they link to us via months.php?month=Nov
It looks as if the rewrite cond you have suggested will do the trick.
[edited by: Frank_Rizzo at 10:13 am (utc) on Dec. 10, 2008]
All you needed to do was implement a simple rewrite such that when the user asked for a URL like www.example.com/events/february (notice that you don't even need the .php on the end), your server would simply retrieve the information from /months.php?month=Feb without revealing what that internal file location actually is.
You would also set up a 301 redirect so that a user asking for the old URL at example.com/months.php?month=Feb is redirected to make a new request for the new URL at www.example.com/events/february instead.
To make it all work all you needed to do was implement the rewrite and then make sure that all the links on your own pages pointed to URLs using the new format, as it is links that "define" URLs.
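As a rough sketch of the rewrite-plus-redirect pair just described, something along these lines could go in the .htaccess file in the document root. It assumes mod_rewrite is enabled, that www.example.com is the canonical hostname, and that the /events/february URL scheme from the example above is the one you adopt. Only February is shown; the other months would each need an equivalent pair of rules, because the old parameter values ("Feb") and the new URL names ("february") differ.

```apache
RewriteEngine On

# Externally redirect direct client requests for the old dynamic URL
# to the new extensionless URL. Testing THE_REQUEST (the original
# request line) keeps the internal rewrite below from triggering
# this redirect and causing a loop.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /months\.php\?month=Feb\ HTTP/
RewriteRule ^months\.php$ http://www.example.com/events/february? [R=301,L]

# Internally rewrite the new URL to the existing script.
# The visitor never sees /months.php in the address bar.
RewriteRule ^events/february$ /months.php?month=Feb [L]
```

The trailing "?" on the redirect target strips the old query string, so the new URL is requested clean.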
However, it looks like you have now made a lot more work for yourself by having twelve instances of your script on the server, one for each month. That's going to be much harder to maintain in the long run.
.
*** This is much better for SE as engines can now see and index a month for a specific page. ***
Again, Google does not see "pages", it sees content returned for individual URLs. That's a slightly different, but absolutely crucial, concept to understand.
*** If someone were to google for "the month of january" it is more likely this new page will be returned rather than months.php?month=Jan ***
For a URL with three or fewer parameters that is not true.
Where people hit problems with parameters, it is because of:
- inconsistent ordering creating duplicates &y=2&x=1 vs. &x=1&y=2.
- not returning a 404 header for non-valid URLs like &option=value-does-not-exist
- returning the same content for &value=AnyCase and &value=anycase and &value=AnYcAsE
- and so on.
That's not a problem with the parameters, per se, but a general design failure that can be seen in non-parameter-based URLs too.
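One possible way to guard against the invalid-value and mixed-case cases in the list above, sketched in mod_rewrite for the widgets example: answer with a 404 unless the query string is exactly one known, lowercase color. The list of valid colors here is an assumption made for illustration, so substitute the real set. It also assumes the rule lives in the .htaccess file in the document root.

```apache
RewriteEngine On

# Return 404 unless the query string is exactly "color=" followed by
# one of the known values (assumed list). The match is case-sensitive
# because no [NC] flag is used, so color=Blue and color=BLUE also get
# a 404, as do unknown colors and any extra or reordered parameters.
RewriteCond %{QUERY_STRING} !^color=(blue|red|green|cyan|purple)$
RewriteRule ^widgets\.php$ - [R=404,L]
```

With a status code outside the 3xx range, mod_rewrite ignores the substitution and stops rewriting, which is why the target is just "-".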
# Externally redirect direct client requests for old dynamic URLs to new static URLs
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /widgets\.php\?color=([a-z]+)\ HTTP/
RewriteRule ^widgets\.php$ http://www.example.com/%1_widgets.php? [R=301,L]
#
# Internally rewrite new static URLs to server script filepath
RewriteRule ^([a-z]+)_widgets\.php$ /widgets.php?color=$1 [L]
Note also that this subject has been previously covered in great detail in this thread [webmasterworld.com] in our Apache Forum Library and in several other previous threads.
Jim
The old widgets file looked like this:
** widgets.php
$colour = $_GET['color'];
# some other var settings
include('/db/master_file.php');
Basically it was receiving the color var, testing a few other vars (visitor registration level) and then requesting data from the db.
The new files are essentially the same, except that there is no need to $_GET the color.
** blue_widgets.php
$colour = 'blue';
# some other var settings
include('/db/master_file.php');
** green_widgets.php
$colour = 'green';
# some other var settings
include('/db/master_file.php');
This took less than 5 mins to create :-)
Jim, I had checked some of the archived threads, but it is always wise to get the very latest info on this subject.
Cheers.
[edited by: Frank_Rizzo at 2:28 pm (utc) on Dec. 10, 2008]