RewriteRule to remove a subdirectory from URL

Need help with syntax for RewriteRule!

         

milo2man

7:33 pm on Nov 19, 2008 (gmt 0)

10+ Year Member



I've searched the posts and racked my brain and am still not any wiser about the syntax I need. Hope someone can help.

I want to make

http://www.example.com/blog/category/blog/?cattag=events

appear as

http://www.example.com/events/

in the user's browser.

If this is not possible I would be happy with

http://www.example.com/blog/events/

I would like to have a general rule that would work on all cattag possibilities such as: events, learning, media, etc.

I am using WordPress 2.6.1 with "pretty" permalinks turned on, and my .htaccess file currently looks like:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /blog/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>

jdMorgan

10:45 pm on Nov 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you want to change the URLs that the user sees, then edit the WordPress scripts to change those URLs on the WordPress pages. The URLs are defined by the links on Web pages, and mod_rewrite cannot "change" them.

If you don't have the knowledge to edit the WordPress scripts, or if you don't want to because you'll have to re-edit them every time a new version is released (a good reason), then look into getting (or buying) and installing an "SEO-friendly-URL", "Search-engine-friendly-URL", or "SEF URL" plug-in for WordPress.

Jim

[edited by: jdMorgan at 10:52 pm (utc) on Nov. 19, 2008]

milo2man

6:50 am on Nov 20, 2008 (gmt 0)

10+ Year Member



Thanks for the reply, Jim.

I don't want to change the link itself, if it can be avoided, but just what is displayed in the address bar of the browser.

So the user clicks on a link generated by WordPress with the URL
http://www.example.com/blog/category/blog/?cattag=events

and "sees" in the address bar
http://www.example.com/events/

which still leads to the location
http://www.example.com/blog/category/blog/?cattag=events

Is this not possible? Is this not good practice? I thought that this was exactly what the rewrite rules did?

jdMorgan

7:39 am on Nov 20, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The link defines what is displayed in the address bar of the browser. So there is no choice there.

Now you can indeed redirect client requests to change the URL shown in the browser's address bar. But keep in mind what that means: Every request for a page on your server that is the result of a click on one of these "wrong-looking" WordPress links will get redirected to the "right-looking" URL. This redirect response terminates the client's HTTP transaction with your server, and the client will have to re-issue the page request using the "right-looking" URL provided in that redirect response.

So, you will get two HTTP transactions logged for each 'real page' request, and your users will suffer the latency of two HTTP requests to get one page, making your site look slow compared to others. The double-logging will result in your page counts being doubled in your "Stats" reports as well.

Strongly suggest you look for that plug-in and pay for it if it's not a freebie.

The usual setup is this:

The link on the page is www.example.com/events/

The user clicks that link, and that URL is requested from the server.
But that URL doesn't really exist, because content is generated by a script at /blog/index.php

So the server uses an internal rewrite to translate a request for the URL www.example.com/events/ into an internal request for the server filepath /blog/index.php?cattag=events

and the index.php script in the /blog subdirectory happily generates the content and sends it back to the user.
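As a rough sketch of that internal rewrite: the snippet below assumes it lives in the .htaccess file in the site root, and that the only category names are the three mentioned earlier in this thread. Both are assumptions for illustration. It also assumes the script accepts the cattag query string on /blog/index.php directly, which depends on the WordPress configuration and is not established here.

```apache
# Site-root .htaccess (assumed location: the document root of www.example.com)
RewriteEngine On

# Internal rewrite: no [R] flag, so the browser's address bar is not changed.
# A request for /events/ is served by /blog/index.php?cattag=events.
# The category list is illustrative; extend it to match the real cattag values.
RewriteRule ^(events|learning|media)/$ /blog/index.php?cattag=$1 [L]
```

Listing the category names explicitly keeps the rule from capturing requests for other top-level directories that really do exist on the server.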

An optional third step would be to detect a direct client request for "example.com/blog/index.php?cattag=events" and externally redirect that back to the friendly "example.com/events/" for the purposes of obscuring your internal architecture and/or for cleaning up search results if that unfriendly path has been previously indexed by search engines. This redirect is only done as an optional "clean-up" step after the previous changes have been made to a site.
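A minimal sketch of that clean-up redirect, assuming it is placed in /blog/.htaccess ahead of the existing WordPress rules (the placement is an assumption; a subdirectory .htaccess with its own RewriteEngine directive normally replaces the parent's rewrite rules rather than adding to them):

```apache
# /blog/.htaccess, before the WordPress rules (assumed placement)
# THE_REQUEST holds the original request line sent by the client, for example
#   GET /blog/index.php?cattag=events HTTP/1.1
# so this condition matches only direct client requests, not requests that
# were internally rewritten to index.php. That is what prevents a loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /blog/index\.php\?cattag=([^&\ ]+)\ HTTP/
# %1 is the category captured by the RewriteCond above.
# The trailing "?" drops the original query string from the redirect target.
RewriteRule ^index\.php$ http://www.example.com/%1/? [R=301,L]
```

This only makes sense once the links on the pages already use the friendly URLs; otherwise every click pays for the extra redirect described above.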

I hope this helps. This stuff can be massively confusing. A few key points are:

  • The links on pages define URLs
  • Web clients (browsers, search engine robots, etc.) request URLs from servers
  • Servers define filepaths. (The filepaths are defined by the server configuration and by the Webmaster's placement of files in the directory space allocated to the site.)
  • Servers map URLs to filepaths
  • This mapping may be direct: Remove domain name prefix, add document_root disk directory path, serve that file or run that script to generate content
  • This mapping may be indirect: Remove domain name, pass to mod_rewrite, do magic, add document_root directory path, serve that file or run that script to generate content
  • A client request may result in the requested content being returned, or it may result in a redirect response from the server
  • If a redirect response is received, the only 'content' in that response is the new URL. The client must start a new request using that new URL to get what it asked for in the first place

    Jim

    milo2man

    10:46 am on Nov 20, 2008 (gmt 0)

    10+ Year Member



    Thanks for the detailed reply. I will look into SEO plugins for WP. Can you recommend any one in particular?

    [quote]An optional third step would be to detect a direct client request for "example.com/blog/index.php?cattag=events" and externally redirect that back to the friendly "example.com/events/" for the purposes of obscuring your internal architecture and/or for cleaning up search results if that unfriendly path has been previously indexed by search engines.[/quote]

    I thought that this was in fact all I needed to change the URL that the user sees without changing the site structure?

    For example, I have in my .htaccess for the site root:

    RewriteCond %{THE_REQUEST} ^.*/index\.html
    RewriteRule ^(.*)index\.html$ http://www.example.com/$1 [R=301,L]

    and this succeeds in changing the URL displayed in the browser from www.example.com/index.html to www.example.com/

    Do I understand you right that this is not a good thing to do because it taxes the server unnecessarily? I have it in there so that anyone who has linked to my old index.html will not get a 404 error but still land on my new index.php page.

    jdMorgan

    2:06 pm on Nov 20, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Yes, you understand correctly. Your "index to /" redirect should be used only as a fix-up for old search engine listings and for old links from other sites over which you have no control. You should make sure that all links on your own site point directly to "/" when referring to your index page.

    Otherwise, every time a visitor clicks your home page link, or every time a search engine spider follows one of those links, it will get no content, but instead it will receive a redirect response, telling the browser or robot to ask again for that page at the "/" URL. So, you will log two requests for each such click, and the 'real' URL will remain "/index.html" as far as the Web is concerned.

    It's important to note that mod_rewrite does not change links embedded in your pages. It only "re-maps" or redirects incoming HTTP requests to your server. It executes after a request is received by your server, and before any content is served or any scripts are executed.

    So to "change the link on a page," you have to do just that: Modify the source code of the HTML page or of the script that generates that HTML page, and change the link.

    I cannot recommend an SEF plug-in for two reasons. First, I'm not a WP user, and second, we do not allow commercial product promotion of any kind (or the posting of links to commercial sites) here on WebmasterWorld. The primary purpose of this policy is to prevent the forums from being polluted by posts whose only purpose is to promote a product or service, keeping the forums focused and bloat-free. Secondarily, the policy prevents members from accidentally creating a thread here that out-ranks their own site for a search on their domain name 20 minutes after they post to a thread; this can easily happen to pages which have a PR of 4 or less, maybe 5. Therefore, posts with 'real' URLs in them must and will be edited by the volunteer moderators here, which is a rather large time-waster. So we use only "example.com" in code examples, and we don't allow product recommendations for any specific commercial products or requests for product recommendations in the public threads.

    A search on the term "WordPress SEF plug-in" and the other similar phrases I mentioned in my previous post should give you some results for providers. Then research forum threads containing the provider and/or product names to assess the quality of the plug-ins you find and learn which work well, and which have problems requiring complex mod_rewrite code to fix. Pay particular attention to the plug-ins' handling of "search," "category view," and "previous/next" URLs, as these URLs tend to be complex and problematic.

    Jim

    milo2man

    2:24 pm on Nov 21, 2008 (gmt 0)

    10+ Year Member



    Thanks to your detailed explanation, I am starting to understand this stuff.

    I now have all my internal links pointing to "/" instead of index.html. Thanks for the tip!

    I will look into SEO plugins for WordPress and take it from there.

    g1smd

    2:28 am on Nov 23, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    Have you also done the non-www vs. www fixes?

    milo2man

    5:31 pm on Nov 23, 2008 (gmt 0)

    10+ Year Member



    I have my web host redirect any requests to

    example.com

    to

    www.example.com

    I set this up through my control panel for the domain.

    Is this sufficient, or do I need to set up something in my .htaccess?

    g1smd

    6:27 pm on Nov 23, 2008 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



    You need to check they have implemented a 301 redirect, not a 302 redirect.

    You also need to check that the implementation does not cause a redirection chain for some inputs.
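If the control-panel redirect turns out to be a 302, or to cause a chain, one alternative is to do the hostname fix in .htaccess. This is only a sketch, assuming a site-root .htaccess and www.example.com as the canonical hostname:

```apache
# Site-root .htaccess (assumed location)
RewriteEngine On

# Redirect any hostname other than www.example.com to the canonical one.
# The "?" after the group also lets an empty Host header (HTTP/1.0 clients)
# pass through, so those requests are not redirected in a loop.
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

To avoid a chain, put more specific redirects such as the index.html rule above this one and have each of them point straight at the www hostname, so a request for example.com/index.html reaches www.example.com/ in a single hop. Requests under /blog/ are handled by the .htaccess in that directory, so check those URLs separately.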