RewriteRule and [G]

         

chowse

11:50 am on Apr 19, 2009 (gmt 0)

10+ Year Member



Hi,
I'm new, this is my first post.

mod_rewrite is "just out of the box" here.
I think my issue won't be too difficult to solve for those of you familiar with mod_rewrite, but I'm not getting anywhere with it.

At one time, I was serving 3 rss feeds from my site, but now, only 1.
Google FeedFetcher continues to bug my server looking for the missing feeds.
I've used Google's Webmaster Tools to request that those feeds be removed from the site index, and it shows they have been removed, but still my httpd-access.log is filled with:

72.14.199.106 - - [19/Apr/2009:05:41:22 -0500] "GET /blog.rss HTTP/1.1" 404 206 "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 1 subscribers; feed-id=6739455439618173225)

So, as a challenge to myself, what if I used mod_rewrite to send a "Gone" to those requests?
In httpd.conf, I currently have:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} ="blog.rss" [OR]
RewriteCond %{REQUEST_URI} ="adesnse.rss"
RewriteRule \.rss [G]
RewriteLog /var/log/httpd-rewrite.log
RewriteLogLevel 3
</IfModule>

but it's not working; I get a 404 when I try it.
the rewrite log says:

192.168.254.3 - - [19/Apr/2009:06:25:27 --0500] [curly/sid#80a7f10][rid#8179058/initial] (2) init rewrite engine with requested uri /blog.rss
192.168.254.3 - - [19/Apr/2009:06:25:27 --0500] [curly/sid#80a7f10][rid#8179058/initial] (3) applying pattern '\.rss' to uri '/blog.rss'
192.168.254.3 - - [19/Apr/2009:06:25:27 --0500] [curly/sid#80a7f10][rid#8179058/initial] (1) pass through /blog.rss

I assume "pass through" means that the pattern didn't match, or that I told mod_rewrite to skip that rule...?

How should this be configured?
(I wouldn't mind verbosity from anyone who can help) :-)

TIA,
Charles

FreeBSD 6.4-STABLE, Apache 2.2.11

chowse

1:07 pm on Apr 19, 2009 (gmt 0)

10+ Year Member



Sorry to reply to myself, but I have solved it.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/blog\.rss$ [OR]
RewriteCond %{REQUEST_URI} ^/adsense\.rss$
RewriteRule .* - [G]
RewriteLog /var/log/httpd-rewrite.log
RewriteLogLevel 3
</IfModule>

I didn't have my regexp correct, nor did I have my rule correct. Guess that sorta crashes and burns the whole thing, eh? ;-)
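
For anyone comparing the two versions, here is how I read the differences, written as comments against the old and new lines. This is my own understanding of why the first attempt fell through, not something taken from the rewrite log (at level 3 the log does not show the condition tests).

```apache
# Before: "=" is a plain string-equality test, and REQUEST_URI begins with
# a slash, so the value "/blog.rss" could never equal the pattern as written.
# (The second condition also misspelled "adsense".)
#RewriteCond %{REQUEST_URI} ="blog.rss" [OR]
#RewriteCond %{REQUEST_URI} ="adesnse.rss"

# After: regular expressions anchored at both ends, leading slash included.
RewriteCond %{REQUEST_URI} ^/blog\.rss$ [OR]
RewriteCond %{REQUEST_URI} ^/adsense\.rss$

# Before: with only two arguments, "[G]" is read as the substitution string,
# not as a flag list.
#RewriteRule \.rss [G]

# After: "-" means "no substitution", so [G] is now parsed as the flag
# and a matching request gets a 410 Gone response.
RewriteRule .* - [G]
```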

I'd still be very interested to hear any comments on anything I mentioned upthread.

Thanks to this thread:
[webmasterworld.com...]

g1smd

4:24 pm on Apr 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Since rule processing starts with the pattern on the left of the RewriteRule, and not with the RewriteCond lines, using .* for the rule pattern means that mod_rewrite has to evaluate both RewriteCond lines to see if either of them matches, and has to do this for every URL request (images, CSS, JavaScript, etc.) hitting the server.

I'd look at the rule being:

RewriteRule ^(blog|adsense)\.rss$ - [G]

and then I don't know if you actually need the REQUEST_URI test in the RewriteCond at all.

If you do, it would be something like:

RewriteCond %{REQUEST_URI} ^/(blog|adsense)\.rss$
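
To make the evaluation order concrete, here is a rough sketch of the two variants side by side. The rule pattern `\.rss$` in the second variant is only an illustrative choice (it is not the rule posted above); I picked it because it has no leading anchor and so should behave the same in httpd.conf and in .htaccess.

```apache
# Order of evaluation for one rule and its conditions:
#   1. the RewriteRule pattern is tested against the URL-path
#   2. only if it matches are the RewriteCond lines above it evaluated
#   3. only if the conditions pass is the substitution and flag applied

# Variant A: catch-all pattern. Step 1 succeeds for every request, so both
# conditions are evaluated for every image, stylesheet and script as well.
#RewriteCond %{REQUEST_URI} ^/blog\.rss$ [OR]
#RewriteCond %{REQUEST_URI} ^/adsense\.rss$
#RewriteRule .* - [G]

# Variant B: specific pattern. Requests that do not end in ".rss" fail at
# step 1, and the condition is never looked at for them.
RewriteCond %{REQUEST_URI} ^/(blog|adsense)\.rss$
RewriteRule \.rss$ - [G]
```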

chowse

4:38 pm on Apr 19, 2009 (gmt 0)

10+ Year Member



Hi g1smd, thanks for the reply.
I can see the value in changing the rule. I will take care of that right away.

I don't understand why you might think that I don't actually need the REQUEST_URI test in the RewriteCond.
Could you explain that please?

g1smd

6:34 pm on Apr 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It depends whether omitting it leads to some sort of infinite rewrite loop.

chowse

6:50 pm on Apr 19, 2009 (gmt 0)

10+ Year Member



Ok, so for safety, I will leave it there until further notice, and change the RewriteCond as you suggested.

Thanks very much!

jdMorgan

7:39 pm on Apr 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Checking the same thing twice (%{REQUEST_URI} in a RewriteCond and the URL-path in the RewriteRule) is redundant and unnecessary. The stand-alone rule posted above is sufficient.
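
As a sketch, the whole block then reduces to something like the following. One detail here is my own addition and not part of the rule posted above: the optional leading slash (`^/?`). In server or virtual-host context (httpd.conf) the pattern is matched against a URL-path that begins with a slash, while in .htaccess or <Directory> context that leading part is stripped, so `^/?` is a common way to let the same line work in both places.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# Answer 410 Gone for the two retired feeds.
# "-" means no substitution; [G] sets the response status.
RewriteRule ^/?(blog|adsense)\.rss$ - [G]
</IfModule>
```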

The only time you need to check "almost the same thing" in a RewriteCond and RewriteRule is when you are checking to see if the current Req_Rec (the URL-path examined by RewriteRule) has been previously rewritten.

In these cases, it is required to check %{THE_REQUEST} to be sure that the current URL-path matched by the RewriteRule was directly-requested by the client, and did not occur as the result of a previously-executed internal rewrite (in the context of this current HTTP transaction).

Note that there's another way to detect previously-executed internal rewrites: examining %{ENV:REDIRECT_STATUS} to see if it's blank or not.
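
A minimal sketch of both tests, assuming a made-up setup in which /feed is internally rewritten to /feed.php and direct client requests for /feed.php should be redirected back to /feed. The paths are hypothetical and only there for illustration. The loop this guards against arises mainly in .htaccess or <Directory> context, where a rewritten URL is passed through the rules again.

```apache
RewriteEngine On

# Test 1: THE_REQUEST holds the original request line from the client,
# e.g. "GET /feed.php HTTP/1.1", and is not changed by internal rewrites.
# The pattern is quoted because it contains spaces.
RewriteCond %{THE_REQUEST} "^[A-Z]+ /feed\.php[ ?]"
RewriteRule ^/?feed\.php$ /feed [R=301,L]

# Test 2 (alternative to Test 1): REDIRECT_STATUS is empty on the first pass
# and is set once the request has gone through an internal redirect.
#RewriteCond %{ENV:REDIRECT_STATUS} ^$
#RewriteRule ^/?feed\.php$ /feed [R=301,L]

# The internal rewrite itself. Without one of the tests above, the redirect
# rule would also fire on the rewritten /feed.php and loop.
RewriteRule ^/?feed$ /feed.php [L]
```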

However, in the case at hand, neither test is needed since the result of a match is a 410-Gone response.

Jim

g1smd

7:53 pm on Apr 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think that means: 'omit the RewriteCond', you only need the RewriteRule.

chowse

8:54 pm on Apr 19, 2009 (gmt 0)

10+ Year Member



Ok, I will change that. BTW, it's been working! Now if Google FeedFetcher will just give up and go away!