URL Rewrite like Wordpress

         

Poor_Knight

10:24 pm on Jul 11, 2008 (gmt 0)

10+ Year Member



Hello,

I'm trying to learn more about this topic while I solve my particular problem. So at the moment I am not sure where to begin with this.

I have a dynamic url that I simply want to replace with the page's title, similar to one of the methods available in Wordpress.

http://example.com/?action=view&id=1
change to
http://example.com/article-title

I've had some success changing the URL to http://example.com/view/1, but that's not what I need.

I assume that there has to be some query to get the article's title since it isn't in the URL to begin with. So this leads me to believe that this cannot be done in the .htaccess file since some server side script will need to be processed before the URL can be determined. Correct?

So do I need to do this processing and do the redirect within my code?

Some guidance would be appreciated.

jdMorgan

10:57 pm on Jul 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Your script must take the "id=1" and look up the article-title in your database, and use article-title to create the new links on your pages.
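To make the "look up the title and build the link" step concrete, here is a minimal Python sketch. The thread does not say which language the script is written in, so Python is used purely for illustration; `slugify`, `link_for`, and the `ARTICLES` dictionary (a stand-in for the database) are hypothetical names, not anything from the original site.

```python
import re
import unicodedata


def slugify(title):
    """Turn an article title into a lowercase, hyphen-separated URL path part."""
    # Strip accents so "Café" becomes "Cafe" instead of losing the letter.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


# Hypothetical stand-in for the database table: id -> title.
ARTICLES = {1: "Article Title", 2: "URL Rewrite like Wordpress"}


def link_for(article_id):
    """Build the on-page link for an article from its numeric id."""
    return "http://example.com/" + slugify(ARTICLES[article_id])
```

Because every non-alphanumeric character is replaced, a slug built this way never contains a period, which keeps it compatible with a "no period in the final path-part" rewrite rule. In practice the slug would probably be stored alongside the article when it is saved, so that a later title edit does not silently change the URL.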

When a user clicks on a link and sends a request to your server, you will rewrite to your script every request that is not for a reserved file (such as robots.txt), that does not contain a period in the final path-part (as favicon.ico, styles.css, logo.gif, or hitcounter.js do), and that does not resolve to an actual file. Your script will then look up article-title and use it to generate the page, just as it did before using id=1.
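The decision described above would normally live in mod_rewrite rules, but the logic itself can be sketched in Python to show the order of the checks. This is only an illustration under assumptions: the `RESERVED` set, the `goes_to_script` name, and the injectable `file_exists` parameter are all hypothetical.

```python
import os

# Assumed list of reserved paths; adjust to the site.
RESERVED = {"/robots.txt", "/sitemap.xml"}


def goes_to_script(url_path, doc_root, file_exists=os.path.isfile):
    """Return True if a request for url_path should be handed to the script."""
    if url_path in RESERVED:
        return False
    # A period in the final path-part suggests a static file (styles.css, logo.gif).
    final_part = url_path.rsplit("/", 1)[-1]
    if "." in final_part:
        return False
    # Leave requests for files that really exist to the server.
    candidate = os.path.join(doc_root, url_path.lstrip("/"))
    return not file_exists(candidate)
```

The two string checks come first because they cost nothing; the filesystem check only runs for requests that survive them.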

So, in fact, you could use the default WordPress .htaccess code (although it's not well-optimized) -- All the other changes will be in your script.

You may also want to add some code to your script to 301-redirect all old URLs to the new URLs, in order to speed up search engine re-indexing of your site and to preserve the "link juice" of all of your old inbound links. Since you will be changing almost all of your URLs, you should expect a drop in rankings for anywhere from a few weeks to several months -- possibly almost a year; it depends on your PageRank/Link-Popularity, and how often your site is spidered.
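A rough Python sketch of that redirect decision follows. It assumes the old URLs all have the form ?action=view&id=N shown earlier in the thread; `redirect_for` and the `SLUGS` dictionary (standing in for a database lookup) are hypothetical names.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical stand-in for the database: id -> slug.
SLUGS = {1: "article-title"}


def redirect_for(request_url):
    """Return (301, new_url) for an old-style URL, or None if no redirect applies."""
    parts = urlsplit(request_url)
    query = parse_qs(parts.query)
    if query.get("action") != ["view"]:
        return None
    try:
        article_id = int(query["id"][0])
    except (KeyError, ValueError):
        # Missing or non-numeric id: not an old article URL we recognise.
        return None
    slug = SLUGS.get(article_id)
    if slug is None:
        return None
    return 301, f"{parts.scheme}://{parts.netloc}/{slug}"
```

Returning None for anything unrecognised lets the script fall through to its normal handling (or a 404) rather than redirecting to a URL that does not exist.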

The ranking drop may be slight or dramatic, depending on whether your inbound links have good link-text or not. Plan ahead for this drop, and plan to operate on decreased revenue --possibly for an extended period-- while your site is re-indexed and re-ranked.

Jim

Poor_Knight

12:19 am on Jul 12, 2008 (gmt 0)

10+ Year Member



Thanks a bunch!

So the URL http://example.com/?action=view&id=1 will never be used in any links then (or shouldn't be)? The URLs in the links will be the 'pretty' URLs.

Is id=1 a moot point now since I need to do the content look up with the title?

Now my URLs should be http://example.com/article-title;
check for reserved files, etc.;
extract the title from the URL;
query DB with title;
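As a rough illustration of the last two steps in this list, here is a Python sketch using an in-memory SQLite database. The table layout, the `slug` column, and the function names are assumptions made for the example; the actual schema is not given in the thread.

```python
import sqlite3


def open_demo_db():
    """Create a throwaway in-memory database with one sample article."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, slug TEXT UNIQUE, body TEXT)"
    )
    conn.execute(
        "INSERT INTO articles (id, slug, body) VALUES (1, 'article-title', 'Hello')"
    )
    return conn


def find_article(conn, url_path):
    """Extract the title part from the URL path and look the article up by it."""
    slug = url_path.strip("/")
    # Parameterised query: the slug comes from the request and must not be
    # pasted into the SQL string directly.
    return conn.execute(
        "SELECT id, body FROM articles WHERE slug = ?", (slug,)
    ).fetchone()
```

A unique constraint on the slug column is what makes the title usable as a lookup key; two articles with the same title would otherwise be indistinguishable by URL.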

Do I understand correctly?
I'm not clear on where the 'rewriting' comes in.

jdMorgan

2:35 pm on Jul 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> So the URL http://example.com/?action=view&id=1 will never be used in any links then (or shouldn't be)?
Correct: "id=1" must not be used in any links.

>The URLs in the links will be the 'pretty' URLs.
Yes, on-page links *define* the URLs, as far as the Web is concerned.

>Is id=1 a moot point now since I need to do the content look up with the title?
The phrase "moot point" is commonly misused and misunderstood (as here), but the "id=1" may or may not now be irrelevant/obsolete, depending on how your script works and whether such 'keys' are needed for other types of lookups, e.g. 'category' lookups, etc.

The rewriting comes in because you will need to rewrite *many but not all* URL-requests to the script, so that they can be handled.

Exceptions would likely be: reserved files (robots.txt, .htaccess, labels.rdf, w3c/p3p.xml, sitemap.xml), media files (images, video, audio), external JavaScripts, CSS stylesheets, and all files in 'admin' subdirectories. WordPress uses 'file exists' and 'directory exists' checks to make this distinction -- a simple and effective, but highly inefficient, way of doing things. Each and every request to the server invokes *three* requests to the OS to go and check the filesystem (the two WP checks, plus the usual server check), possibly resulting in actual disk reads, which are very slow compared to HTTP connection and data transfers.

Jim