Rewrite 410 not working

Old filenames return 404 when they should return 410 Gone

         

dstiles

10:38 am on Sep 10, 2019 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have recently moved a site from ASP to Apache 2.4 on Linux. I have expanded and modified the site considerably, so the old file names no longer exist and requests for them (naturally) return 404.

I have tried (and failed) to catch the old filenames and force a 410 on them. Filenames follow the general pattern index.htm, views.htm, maps-01.htm, views-t.htm, views-b01.htm. I have attempted to match these using the rule below in .htaccess:

RewriteRule "/(index|maps|views)-?[a-z]?(\d\d)?\.htm" "-" [G]

Any thoughts on this, please?

penders

11:13 am on Sep 10, 2019 (gmt 0)


In .htaccess, the URL-path matched by the RewriteRule pattern never starts with a slash, so you need to remove the slash prefix on the regex.

For example:

RewriteRule "(index|maps|views)-?[a-z]?(\d\d)?\.htm" "-" [G]


You should probably have some start/end anchors on the regex. The surrounding quotes are optional here (they are only needed if the regex contains unescaped spaces).

If the hyphen, when present, is always followed by something (e.g. "-b01"), then I would make the whole of that last part optional rather than just the hyphen. Making only the hyphen optional is arguably matching too much. (?)
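
To illustrate the leading-slash point, here is a minimal sketch of how the same rule would look in the two contexts. It assumes the .htaccess file sits in the document root and that mod_rewrite is already enabled; neither detail is stated in the thread.

```apache
# Assumption: the rewrite engine is not already switched on elsewhere.
RewriteEngine On

# In .htaccess (per-directory context) the directory prefix is stripped,
# so a request for /views-b01.htm is matched as "views-b01.htm".
RewriteRule (index|maps|views)-?[a-z]?(\d\d)?\.htm - [G]

# In the main server config or a <VirtualHost> block (outside <Directory>),
# the same request is matched as "/views-b01.htm", so the slash would belong
# in the pattern. Shown commented out for comparison only.
# RewriteRule /(index|maps|views)-?[a-z]?(\d\d)?\.htm - [G]
```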

dstiles

6:38 pm on Sep 10, 2019 (gmt 0)


Thanks for the reply, penders.

Point taken about the hyphen etc. as well.

I have modified it to...

RewriteRule ^(index|maps|views)(-[a-z]?(\d\d))?\.htm "-" [G]

I will see what happens next. :)

penders

9:29 pm on Sep 10, 2019 (gmt 0)



RewriteRule ^(index|maps|views)(-[a-z]?(\d\d))?\.htm "-" [G]


You'll still need to make the digits (\d\d) optional in order to match "views-t.htm".

For example:


RewriteRule ^(index|maps|views)(-[a-z]?(\d\d)?)?\.htm "-" [G]

phranque

11:35 pm on Sep 10, 2019 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



the double quotes are optional and unnecessary in this case.

i would also make the regular expression more specific with an end anchor:


RewriteRule ^(index|maps|views)(-[a-z]?(\d\d)?)?\.htm$ - [G]
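
For reference, the anchored rule can be read piece by piece as in the annotated sketch below. The RewriteEngine line is an assumption (it is not shown in the thread), and the non-matching examples are hypothetical filenames chosen for illustration.

```apache
# Assumption: mod_rewrite is loaded; the engine may already be on elsewhere.
RewriteEngine On

# ^                    start of the URL-path as seen in .htaccess (no leading slash)
# (index|maps|views)   one of the three old base names
# (-[a-z]?(\d\d)?)?    optional suffix: a hyphen, then an optional letter
#                      and an optional pair of digits
# \.htm$               a literal ".htm" at the very end of the path
#
# Matches:        index.htm  views.htm  maps-01.htm  views-t.htm  views-b01.htm
# Does not match: index.html  views-b1.htm  old/index.htm
RewriteRule ^(index|maps|views)(-[a-z]?(\d\d)?)?\.htm$ - [G]
```

Note that this form still accepts a bare hyphen (e.g. "views-.htm"), since both parts after the hyphen are optional. That is probably harmless for a 410 rule.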

dstiles

10:22 am on Sep 11, 2019 (gmt 0)


penders - thanks, well spotted! :)

phranque - Double quotes now removed. I deliberately avoided the end anchor as I've noticed some bad bots append querystrings to the end. Or would they be not included in the rule anyway? I see nothing to indicate that.

penders

11:05 am on Sep 11, 2019 (gmt 0)


...some bad bots append querystrings to the end. Or would they be not included in the rule anyway? I see nothing to indicate that.


Query strings are "not included in the rule anyway". The RewriteRule directive matches against the URL-path only, which notably excludes the query string, so the rule will match "any" query string by default.

Aside: In order to match a query string you would need a RewriteCond directive and match against the QUERY_STRING server variable.
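
As a rough sketch of that aside, the following would return 410 only when one of the old URLs is requested with a non-empty query string. This is a hypothetical variation for illustration, not something the original rule needs.

```apache
# QUERY_STRING holds everything after the "?", without the "?" itself.
# The pattern "." matches any single character, so the condition is true
# only when the query string is non-empty.
RewriteCond %{QUERY_STRING} .
RewriteRule ^(index|maps|views)(-[a-z]?(\d\d)?)?\.htm$ - [G]
```

Without the RewriteCond line, the rule applies regardless of any query string the bot appends.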

phranque

12:18 pm on Sep 11, 2019 (gmt 0)


phranque - Double quotes now removed. I deliberately avoided the end anchor as I've noticed some bad bots append querystrings to the end. Or would they be not included in the rule anyway?

as penders mentioned, the query string isn't matched to the rewriterule pattern, only the url path is.

... The Pattern will initially be matched against the part of the URL after the hostname and port, and before the query string (e.g. "/app1/index.html").

source: https://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewriterule

dstiles

9:42 am on Sep 12, 2019 (gmt 0)


Many thanks, both. End anchor now added.

And thanks for solving my problem. Log says I am now pushing out 410 for the relevant pages. :)