Using .htaccess to return a 404 error for out-of-date dynamic pages

/index.php?module=article&view=73&MMN_position=62:29

         

james96

5:14 pm on Feb 1, 2007 (gmt 0)

10+ Year Member



Greetings.

I was using a content management system (CMS) on a website for quite some time. The site was fully crawled by MSN, Yahoo, Google, etc.

I took down the CMS (which happened to be phpWebSite) and have been rebuilding the site with more intelligible links. However, I am having trouble getting the old links out of the search engines, because I still use index.php as the main index page for the site. Any request for index.php?... is therefore treated as a "valid" URL, even though the old pages no longer exist.

For instance, here's a page from the old site:

/index.php?module=article&view=73&MMN_position=62:29

When the search engines try to check that page, they think that the content changed and that the page still exists.

Is there a way in .htaccess to make links like that return a true 404 error?

I would appreciate your help!

jdMorgan

5:48 pm on Feb 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This can be done using Apache mod_rewrite [httpd.apache.org]. For that specific URL and query string:

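Options +FollowSymLinks
RewriteEngine on
# Match only this exact query string
RewriteCond %{QUERY_STRING} ^module=article&view=73&MMN_position=62:29$
# Rewrite to a path that does not exist, so the server answers 404
RewriteRule ^index\.php$ /some_path_that_does_not_exist.lmth [L]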


For all requests to index.php with *any* query string:

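Options +FollowSymLinks
RewriteEngine on
# "." matches any non-empty query string
RewriteCond %{QUERY_STRING} .
# Rewrite to a path that does not exist, so the server answers 404
RewriteRule ^index\.php$ /some_path_that_does_not_exist.lmth [L]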

By rewriting the request to a URL that is known not to exist, you force a 404-Not Found error.
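
If your server runs a newer Apache release, there may be a more direct route than pointing at a missing file. As far as I know, mod_rewrite in Apache 2.2 and later accepts a non-redirect status code in the R flag, so a sketch like the following should return the 404 itself. Treat the version requirement as an assumption and test it on your own server first:

```apache
Options +FollowSymLinks
RewriteEngine on
# Any non-empty query string on index.php
RewriteCond %{QUERY_STRING} .
# "-" means no substitution; R=404 sends a 404 status and stops rewriting
RewriteRule ^index\.php$ - [R=404,L]
```

If that flag is not honored on your server, the rewrite-to-a-missing-path approach does not depend on it.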

You could also generate a 410-Gone response:


Options +FollowSymLinks
RewriteEngine on
# Match only this exact query string
RewriteCond %{QUERY_STRING} ^module=article&view=73&MMN_position=62:29$
# "-" means no substitution; [G] returns 410 Gone
RewriteRule ^index\.php$ - [G]

Between those three methods, you will likely find the right solution. An exact match is not required: mod_rewrite's regular-expression pattern matching lets you match partial query strings, or query strings that meet specific conditions.
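
As a rough sketch of a partial match: assuming every old phpWebSite link carried a module= parameter and none of your new URLs do (both are assumptions about your site), something like this would return 410 for any of them, whatever the other parameters were:

```apache
Options +FollowSymLinks
RewriteEngine on
# Match "module=" at the start of the query string or right after an "&"
RewriteCond %{QUERY_STRING} (^|&)module=
# "-" means no substitution; [G] returns 410 Gone
RewriteRule ^index\.php$ - [G]
```

Anchoring on (^|&) keeps the pattern from matching a parameter whose name merely ends in "module".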

For more information, see the documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].

Jim

james96

6:03 pm on Feb 2, 2007 (gmt 0)

10+ Year Member



Thanks for the links and code! I'll get to work on trying some of your options.

encyclo

6:10 pm on Feb 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It also depends on how you are using /index.php now. If you don't mind calls for /index.php (without any variables) returning a 404 as well, you could simply switch your default directory index to another name and remove index.php completely. For example, if you decide to use "default.php" instead of index.php, add:

DirectoryIndex default.php index.php index.html index.htm

Then rename your index.php to default.php.

james96

4:31 pm on Feb 3, 2007 (gmt 0)

10+ Year Member



jdMorgan, I went with this:

Options +FollowSymLinks
RewriteEngine on
#
RewriteCond %{QUERY_STRING} .
RewriteRule ^index\.php$ /some_path_that_does_not_exist.lmth [L]

And it works very nicely!