Rewrite widgets.php?color=blue to blue_widgets.php

         

Frank_Rizzo

10:24 pm on Dec 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I had a menu structure which had pages for different colors of widgets

/widgets.php?color=blue
/widgets.php?color=red
/widgets.php?color=green

I now have pages

/blue_widgets.php
/red_widgets.php
/green_widgets.php

Using Apache, how do I redirect the old pages to the new pages?

What is the score with search engines and redirects like this? Will there be a duplicate penalty?

In Webmaster Tools I can submit a file to be removed from the search engine. Clearly this is not possible if a file is redirected, as Google would check for the old file and be served the new one. Or does it detect the 301/302 status and thus drop the old file?

g1smd

11:49 pm on Dec 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You need a 301 redirect, to ensure that the Duplicate problem is fixed. Once a URL returns a redirect there is no content to be indexed at that URL.

The redirect will need to test the QUERY_STRING as a part of its operation.

You'll need to include the domain name in the target URL, and ensure that the parameters are stripped off too.

Post your best effort code here, as a basis for discussion.

jdMorgan

12:29 am on Dec 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google doesn't do anything at all with files. It recognizes only URLs, nothing else -- Not files, not sites, not domain, not pages, just URLs.

Don't confuse URLs and files, or you will have a lot of trouble understanding mod_rewrite. They represent two different and only loosely associated naming systems for two very different name-spaces: URLs on the Web, and filepaths within your server. As dynamic URLs like "/widgets.php?color=blue" make clear, a URL or a "page" is not necessarily a file. In this case, one PHP script (a file) accepts many parameter values and generates many pages. In other cases, such as on multi-language sites using content-negotiation, one URL may reference many pages, each in a different language.

So, what has changed -- The filenames? The URLs? The links on your pages? We need to know what the old and new values of all of these things were and are in order to propose a complete and correct solution.

Jim

Frank_Rizzo

12:43 am on Dec 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Previously I have edited old pages and used a meta refresh to redirect. I always thought this is more efficient as the 'code is only run' when the old page is accessed. I guess it's efficient but not effective.

I don't think the code needs to be fancy and strip the color to build the new url. There are only a dozen pages so I guess I could just do single entries for each:

redirect 301 /widgets.php?color=blue http://www.example.com/blue_widgets.php
.
.
redirect 301 /widgets.php?color=cyan http://www.example.com/cyan_widgets.php

Frank_Rizzo

12:47 am on Dec 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jdMorgan. The pages which have info on widgets have changed. Previously visitors could read about widgets by selecting a specific page via the menu. But the menu called a single page (widgets.php) which then pulled info on the colour referenced by color=

That is not good for SE and I wanted to split them up into specific pages.

Now users can access info on purple widgets via a specific purple_widgets.php page and thus SEs will index each page separately.

It's something I should have done years ago.

[edited by: Frank_Rizzo at 12:50 am (utc) on Dec. 10, 2008]

jdMorgan

1:36 am on Dec 10, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Frank, I don't know what you mean by a "page" -- I use only the terms "URLs", "files", and "filepaths". I don't insist on this out of pedantry, but rather because it is impossible to get the right answer without separating these concepts.

What you may have missed is that all you had to do was to change the links on your pages, and then add a bit of mod_rewrite code. You did not need to change the PHP scripts or the filepaths used within the server to access them -- Not at all. Rather, the procedure would be:

  • Change all on-page links to reference "/blue-widgets" instead of "/widgets.php?color=blue"

  • Use mod_rewrite to internally rewrite a request for the URL "/blue-widgets" to the filepath "/widgets.php?color=blue"

  • Optionally add a redirect, so that if a browser or robot directly requests the old "/widgets.php?color=blue" URL, the server would generate a redirect to the new "/blue-widgets" URL. (Although not required for the site to function with the new URLs, this would 'save' old bookmarks, preserve the PageRank/Link-pop of old links, and speed up search engines' replacement of the old URL with the new URL in their search results listings.)

    There is no need to change the function or name of any script files, and no need to carry the legacy ".php" into your new URL names. (Dropping ".php" from the URLs now would mean that you would never have to change those URLs again, even if you changed your site technology from PHP to .asp or to coldfusion, or some other future server-side scripting language.)
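A minimal sketch of the procedure above, assuming the rules live in a .htaccess file in the document root, that the canonical hostname is www.example.com, and that color values consist of lowercase letters only (all three are assumptions for illustration). The redirect tests THE_REQUEST, the request line the client originally sent, so it does not fire on requests that the second rule has already rewritten internally:

```apache
RewriteEngine On

# Optional step: externally redirect direct client requests for the old
# dynamic URL to the new extensionless URL. The trailing "?" on the
# target drops the old query string.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /widgets\.php\?color=([a-z]+)\ HTTP/
RewriteRule ^widgets\.php$ http://www.example.com/%1-widgets? [R=301,L]

# Internally rewrite the new URL to the existing script filepath.
# The client never sees /widgets.php?color=... in its address bar.
RewriteRule ^([a-z]+)-widgets$ /widgets.php?color=$1 [L]
```

With these two rules the script and its filepath stay untouched; only the on-page links need to change to the "/blue-widgets" form.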

    You can't use the mod_alias "Redirect" directive to do what you need, because "Redirect" cannot 'see' the query strings appended to URLs. You'll need mod_rewrite. Something like:


    RewriteCond %{QUERY_STRING} ^color=([a-z]+)$
    RewriteCond %{DOCUMENT_ROOT}/%1_widgets.php -f
    RewriteRule ^widgets\.php$ http://www.example.com/%1_widgets.php? [R=301,L]

    The second RewriteCond prevents redirecting to a <color>_widgets.php URL if that URL will not resolve to an existing script file. To allow the redirect to occur with any arbitrary "word" as the color would be to risk having your site exploited by "malicious linkers."

    Jim

    Frank_Rizzo

    9:26 am on Dec 10, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Jim,

    Think of it like this: a website which has information on each month of the year. Visitors can read about April, May, November etc. and can do so either by navigating from the main menu, or from a link posted on an external site.

    Unfortunately, a while ago I had used a navigation method based on one script with a query string used to determine the page returned

    <ul>
    <li><a href="/months.php?month=Jan">January</a></li>
    <li><a href="/months.php?month=Feb">February</a></li>
    <li><a href="/months.php?month=Dec">December</a></li>
    </ul>

    The months.php would then pull info for the desired month based on the query string. This worked well (all transparent to users) but is clearly not ideal for SE purposes.

    Now what I have changed to is this:

    <ul>
    <li><a href="/month_january.php">January</a></li>
    <li><a href="/month_february.php">February</a></li>
    <li><a href="/month_december.php">December</a></li>
    </ul>

    This is much better for SE as engines can now see and index a month for a specific page. If someone were to google for "the month of january" it is more likely this new page will be returned rather than months.php?month=Jan

    So what I need to ensure is that

    a) Search engines do not now see two pages for, say, November (there is no duplicated page accessed via both /month_november.php and months.php?month=Nov)

    b) All existing links and bookmarks will call up the correct page if accessed via the old structure. e.g. Month Gazette website wrote about our review of November and they link to us via months.php?month=Nov

    It looks as if the rewrite cond you have suggested will do the trick.
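Because the old parameter values (Jan, Feb, ...) are abbreviations while the new filenames spell the month out, a single back-reference cannot build the new URL in this case. One way to handle that, sketched here for a .htaccess file in the document root and assuming the hostname www.example.com, is one condition/rule pair per month. Only the months named in this post are shown; the rest would follow the same pattern:

```apache
RewriteEngine On

# Old URL: /months.php?month=Jan  ->  new URL: /month_january.php
RewriteCond %{QUERY_STRING} ^month=Jan$
RewriteRule ^months\.php$ http://www.example.com/month_january.php? [R=301,L]

RewriteCond %{QUERY_STRING} ^month=Feb$
RewriteRule ^months\.php$ http://www.example.com/month_february.php? [R=301,L]

RewriteCond %{QUERY_STRING} ^month=Nov$
RewriteRule ^months\.php$ http://www.example.com/month_november.php? [R=301,L]

RewriteCond %{QUERY_STRING} ^month=Dec$
RewriteRule ^months\.php$ http://www.example.com/month_december.php? [R=301,L]
```

The trailing "?" on each target drops the old query string, so each old URL maps to exactly one new URL, which addresses both (a) and (b).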

    [edited by: Frank_Rizzo at 10:13 am (utc) on Dec. 10, 2008]

    g1smd

    10:46 am on Dec 10, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    What JD said is that you didn't have to do any work at all to the FILES on your webserver in order to have new URLs for that content.

    All you needed to do was implement a simple rewrite such that when the user asked for a URL like www.example.com/events/february (notice that you don't even need the .php on the end) your server would simply have retrieved the information from /months.php?month=Feb without revealing what that internal file location actually was.

    You would also set up a 301 redirect, so that a user asking for the old URL at example.com/months.php?month=Feb is redirected to make a new request for the new URL at www.example.com/events/february instead.

    To make it all work all you needed to do was implement the rewrite and then make sure that all the links on your own pages pointed to URLs using the new format, as it is links that "define" URLs.

    However, it looks like you have now made a lot more work for yourself by having twelve instances of your script on the server, one for each month. That's going to be much harder to maintain in the long run.

    .

    *** This is much better for SE as engines can now see and index a month for a specific page. ***

    Again, Google does not see "pages", it sees content returned for individual URLs. That's a slightly different, but absolutely crucial, concept to understand.

    *** If someone were to google for "the month of january" it is more likely this new page will be returned rather than months.php?month=Jan ***

    For a URL with three or fewer parameters that is not true.

    Where people hit problems with parameters it is because of
    - inconsistent ordering creating duplicates &y=2&x=1 vs. &x=1&y=2.
    - not returning a 404 header for non-valid URLs like &option=value-does-not-exist
    - returning the same content for &value=AnyCase and &value=anycase and &value=AnYcAsE
    - and so on.

    That's not a problem with the parameters, per se, but a general design failure that can be seen in non-parameter-based URLs too.
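A hedged sketch of the second and third points in the list above, assuming a .htaccess file in the document root, a script that is only ever valid with exactly one lowercase color parameter, and a hypothetical list of valid values (blue, red, green). Any other query string, including a different letter case or extra parameters, gets a 404 instead of content. It relies on the R flag accepting a non-3xx status code, which the Apache 2.4 mod_rewrite documentation describes; older versions may behave differently:

```apache
RewriteEngine On

# Anything other than exactly "color=<known value>" is not a valid URL.
# The match is case-sensitive, so "color=Blue" is rejected as well.
# The value list (blue|red|green) is illustrative only.
RewriteCond %{QUERY_STRING} !^color=(blue|red|green)$
RewriteRule ^widgets\.php$ - [R=404,L]
```

The same check could equally be done inside the script itself; doing it in the server configuration simply keeps invalid requests from reaching the script at all.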

    jdMorgan

    2:15 pm on Dec 10, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    So, using your example of URL="/blue_widgets.php" and filepath="/widgets.php?color=blue", you'd just use two rules; one a redirect and one an internal rewrite:

    # Externally redirect direct client requests for old dynamic URLs to new static URLs
    RewriteCond %{THE_REQUEST} ^[A-Z]+\ /widgets\.php\?color=([a-z]+)\ HTTP/
    RewriteRule ^widgets\.php$ http://www.example.com/%1_widgets.php? [R=301,L]
    #
    # Internally rewrite new static URLs to server script filepath
    RewriteRule ^([a-z]+)_widgets\.php$ /widgets.php?color=$1 [L]

    The redirect is the one I described as "optional" in my previous post.

    Note also that this subject has been previously covered in great detail in this thread [webmasterworld.com] in our Apache Forum Library and in several other previous threads.

    Jim

    Frank_Rizzo

    2:28 pm on Dec 10, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    g1smd, don't worry that I may have had to rewrite dozens of pages or files on the server. The files were just a few lines long and called a master db script to serve the pages.

    The old widgets file looked like this:

    ** widgets.php
    $colour = $_GET['color'];
    # some other var settings
    include '/db/master_file.php';

    Basically it was receiving the color var, testing a few other vars (visitor registration level) and then requesting data from the db.

    The new files are basically exactly the same except that there is no need to $_GET the color.

    ** blue_widgets.php
    $colour = 'blue';
    # some other var settings
    include '/db/master_file.php';

    ** green_widgets.php
    $colour = 'green';
    # some other var settings
    include '/db/master_file.php';

    This took less than 5 mins to create :-)

    Jim. I had checked some of the archived threads but it is always wise to get the very latest info on this subject.

    Cheers.

    [edited by: Frank_Rizzo at 2:28 pm (utc) on Dec. 10, 2008]

    jdMorgan

    4:41 pm on Dec 10, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    OK, if you keep these as separate files, then you'll only need to use the first rule in my last post -- with modifications to match your actual URLs and filepaths.

    Jim