How to do a simple 410?

I have searched, honest

         

oddsod

7:15 pm on Jan 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My htaccess currently says

RewriteEngine On
RewriteCond %{HTTP_HOST} ^example.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
redirect 301 /folder/index.html http://www.example.com/resources/
(and some other redirects)

Where do I put the RewriteRule ^\products\discontinued\.html$ - [G]? Is it after the RewriteEngine On line?

I'm a complete htaccess idiot. Please bear with me as I don't know even the simplest terminology. I have read threads here like this one [webmasterworld.com] and this Apache page [httpd.apache.org] (well, I didn't quite read it, it's mostly not in English :(). Even the nice simple explanation here [diveintomark.org] doesn't seem to answer the question.

[edited by: jdMorgan at 9:28 pm (utc) on Jan. 12, 2008]
[edit reason] example.com [/edit]

Marcia

8:37 pm on Jan 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>I have searched, honest

You haven't found the easy way because the easy way isn't mod_rewrite. In .htaccess, for a simple 410:

Redirect gone /filegone.html
or
Redirect gone /thisisremoved/
or
Redirect gone /directory/removedthefile.html

ErrorDocument 410 /discontinued.html

oddsod

8:40 pm on Jan 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks, Marcia. Would that go above everything else in the .htaccess?

jdMorgan

11:20 pm on Jan 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It doesn't matter, really, except that the RewriteEngine on line must precede any and all other mod_rewrite directives.

Redirect and RedirectMatch are mod_alias directives, and are not affected by mod_rewrite directives. Furthermore, the server will run all mod_rewrite directives first, followed by all the mod_alias directives, or vice-versa, depending on its configuration. So within the set of directives handled by either module, the order you specify matters, but the order of directives belonging to different modules in your file does not strictly control their execution order.

Note also that one of your slashes is backwards, and one of them shouldn't even be there:


RewriteRule ^products/discontinued\.html$ - [G]

Jim

oddsod

1:17 pm on Jan 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks, jdMorgan, you're the king. And thanks for spotting the slash. I was vaguely uncomfortable about it, as it seemed too easy to specify a directory path like I would in "normal" talk :) (unless, of course, I'm using Marcia's example for a single file).

jdMorgan

5:09 pm on Jan 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Not sure I understood that reply, but be aware that the URL-path syntax for Redirect (mod_alias) and RewriteRule (mod_rewrite) is different in .htaccess.

Jim

oddsod

8:34 pm on Jan 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My understanding is that redirect uses "normal" paths and rewrite uses "complicated" ones :)

This is how it looks now with your help. Cheers.


RewriteEngine On
RewriteCond %{HTTP_HOST} ^example.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
redirect 301 /folder/index.html http://www.example.com/resources/

#File that exists no more
Redirect gone filea.html

#Folder that exists no more
RewriteRule ^products/discontinued\.html$ - [G]

[edited by: jdMorgan at 9:17 pm (utc) on Jan. 12, 2008]
[edit reason] example.com [/edit]

Marcia

9:20 pm on Jan 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>My understanding is that redirect uses "normal" paths and rewrite uses "complicated" ones

The Redirect directive uses "normal" paths because it doesn't involve regular expressions. mod_rewrite is different because it does use regular expressions, which have their own syntax and a number of reserved characters.

For example:

#Folder that exists no more
RewriteRule ^products/discontinued\.html$ - [G]

discontinued.html would be OK with Redirect, but it has to be discontinued\.html for mod_rewrite because in a regular expression the period (or full stop) character has a special meaning: . means "any character."

So with the final line of your example, the \ being used is an "escape" character telling Apache to treat the period . that follows it as a normal character instead of the reserved character.
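To see the difference the backslash makes, here is a minimal sketch using Python's re module as a stand-in. Apache uses the PCRE library rather than Python, so treat this as an illustration only; for these two simple patterns the behaviour should be the same. The test path discontinuedXhtml is made up for the demonstration.

```python
import re

# Unescaped dot: "." matches any single character.
loose = re.compile(r"^products/discontinued.html$")
# Escaped dot: "\." matches only a literal period.
strict = re.compile(r"^products/discontinued\.html$")


def matches(pattern, path):
    """Return True if the compiled pattern matches the URL-path."""
    return pattern.search(path) is not None


for path in ("products/discontinued.html", "products/discontinuedXhtml"):
    print(path, "loose:", matches(loose, path), "strict:", matches(strict, path))
```

The unescaped pattern also accepts the made-up path, because the dot stands for "any character"; the escaped pattern accepts only the real filename.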

jdMorgan

9:26 pm on Jan 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The mod_alias directive Redirect uses prefix-matching, while mod_rewrite's RewriteRule and mod_alias's RedirectMatch use regular-expression patterns.
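The two matching styles can be modelled roughly as follows. This is a simplified Python sketch, not Apache's actual code: the function names are hypothetical, it assumes the .htaccess file sits in the document root (so the RewriteRule pattern is tested against the path without its leading slash), and it assumes Redirect matches only on whole path segments.

```python
import re


def redirect_matches(url_path, request_path):
    """Simplified model of mod_alias Redirect: a prefix match on the URL-path."""
    if not request_path.startswith(url_path):
        return False
    rest = request_path[len(url_path):]
    # Assumption: the prefix must end on a path-segment boundary.
    return rest == "" or url_path.endswith("/") or rest.startswith("/")


def rewrite_rule_matches(pattern, request_path):
    """Simplified model of RewriteRule in a document-root .htaccess file."""
    # Assumption: the leading slash is stripped before the regex is applied.
    local = request_path[1:] if request_path.startswith("/") else request_path
    return re.search(pattern, local) is not None


print(redirect_matches("/filea.html", "/filea.html"))            # leading slash: matches
print(redirect_matches("filea.html", "/filea.html"))             # no leading slash: never matches
print(rewrite_rule_matches(r"^filea\.html$", "/filea.html"))     # no leading slash: matches
print(rewrite_rule_matches(r"^/filea\.html$", "/filea.html"))    # leading slash: never matches
```

In this model the Redirect path needs its leading slash and the RewriteRule pattern must not have one, which is the syntax difference mentioned earlier in the thread.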

The code should look like this:


RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
#
# Folder that exists no more
RewriteRule ^products/discontinued\.html$ - [G]
#
Redirect 301 /folder/index.html http://www.example.com/resources/
#
# File that exists no more (note the leading slash)
Redirect gone /filea.html

However, this leaves it up to the server whether the RewriteRules or the Redirects will be processed first. Directives are processed module-by-module, not in strict line-by-line order, and either mod_alias or mod_rewrite may be configured to run first. This can cause problems with your domain canonicalization rule if mod_rewrite is configured to run first: a request for http://example.com/folder/index.html would first be redirected to http://www.example.com/folder/index.html by your mod_rewrite rule, and then redirected again to http://www.example.com/resources/ by your mod_alias Redirect. Having two back-to-back redirects is inefficient and undesirable from an SEO perspective.
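The double redirect can be traced with a small simulation. This is only a sketch in Python under simplifying assumptions: each module is reduced to a single hypothetical function holding the one rule discussed here, and the first module that issues a redirect answers the request.

```python
import re
from urllib.parse import urlsplit


def mod_rewrite(host, path):
    """Canonicalization rule: redirect example.com to www.example.com."""
    if re.search(r"^example\.com", host):
        return "http://www.example.com" + path
    return None


def mod_alias(host, path):
    """Redirect 301 /folder/index.html http://www.example.com/resources/"""
    if path == "/folder/index.html":
        return "http://www.example.com/resources/"
    return None


def follow(url, modules, max_hops=5):
    """Follow redirects like a browser would; return every URL visited."""
    chain = [url]
    for _ in range(max_hops):
        parts = urlsplit(chain[-1])
        target = None
        for module in modules:  # modules run in the configured order
            target = module(parts.netloc, parts.path)
            if target is not None:
                break
        if target is None:
            break
        chain.append(target)
    return chain


start = "http://example.com/folder/index.html"
print(follow(start, [mod_rewrite, mod_alias]))  # two redirects
print(follow(start, [mod_alias, mod_rewrite]))  # one redirect
```

With mod_rewrite first, the request takes two hops to reach /resources/; with mod_alias first, it takes one. Since the module order is not under your control in .htaccess, the outcome is not either.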

For this reason, it would be best not to mix module usage, and instead use all mod_rewrite or all mod_alias directives. Since mod_alias cannot check the requested hostname, that means that you need to use mod_rewrite:


RewriteEngine on
#
# Non-existent URLs (Removed files)
RewriteRule ^products/discontinued\.html$ - [G]
RewriteRule ^filea\.html$ - [G]
#
# Redirect index page of /folder to /resources/
RewriteRule ^folder/index\.html$ http://www.example.com/resources/ [R=301,L]
#
# If not already redirected above, redirect non-canonical domain requests to the canonical domain
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

Note that the rules are now in order from most-specific to least, and that we do not bother doing the domain canonicalization redirect if the requested URL is Gone. Now if a request comes in for http://example.com/folder/index.html, you can be sure it will be redirected straightaway to the proper domain and folder.
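As a rough check of the ordering logic, the ruleset can be modelled as a list that is walked top to bottom, stopping at the first match. This is a hedged Python sketch, not how Apache is implemented: the RULES table and the respond function are hypothetical names, the path /about.html is a made-up example, and the status 200 simply stands for "no rule matched, serve the request normally".

```python
import re

# (pattern, action, target), in the same order as the ruleset; first match wins.
RULES = [
    (r"^products/discontinued\.html$", "gone", None),
    (r"^filea\.html$", "gone", None),
    (r"^folder/index\.html$", "redirect", "http://www.example.com/resources/"),
]


def respond(host, path):
    """Return (status, location) for a request, checking the rules in order."""
    # In a document-root .htaccess the pattern sees the path without its leading slash.
    local = path[1:] if path.startswith("/") else path
    for pattern, action, target in RULES:
        if re.search(pattern, local):
            if action == "gone":
                return 410, None
            return 301, target
    # Least-specific rule last: canonicalize the hostname if nothing above matched.
    if re.search(r"^example\.com", host):
        return 301, "http://www.example.com/" + local
    return 200, None


print(respond("example.com", "/folder/index.html"))
print(respond("example.com", "/products/discontinued.html"))
print(respond("example.com", "/about.html"))
print(respond("www.example.com", "/about.html"))
```

The first request gets a single 301 straight to /resources/ on the canonical host, the removed page gets a 410 with no hostname redirect first, and only requests that match nothing more specific fall through to the canonicalization rule.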

Jim

oddsod

4:00 pm on Jan 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Wow, that's incredible. And thanks for taking the time to explain the significance of the \ (Marcia) and the logic in the order of commands (Jim).

I owe you guys more than the odd beer or two!