Forum Moderators: Ocean10000 & phranque

410 pages with certain parameters

     
9:28 am on Jan 7, 2019 (gmt 0)

Junior Member

10+ Year Member

joined:May 23, 2005
posts: 99
votes: 0


Hi,
I have a number of pages for which I would like to return a 410. The URLs all include one or more of the following parameters:
cur_page
price-min
beds-min
baths-min
price-max

Here is an example URL:
http://www.example.com/index.php?cur_page=0&action=searchresults&price-min=000&price-max=200000&beds-min=1

From looking through posts on the forum I have so far come up with the following:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{QUERY_STRING} url=(cur_page|price-min|beds-min|baths-min|price-max)
RewriteRule ^$ - [G]

Unfortunately it does not work. Any idea what might be wrong or if I am even on the right path with this? Thanks for your help!
10:54 am on Jan 7, 2019 (gmt 0)

Full Member

Top Contributors Of The Month

joined:Apr 11, 2015
posts: 328
votes: 24


RewriteCond %{QUERY_STRING} url=(cur_page|price-min|beds-min|baths-min|price-max)


What is "url=" intended to match? (This does not appear in the example URL/query string you posted)

RewriteRule ^$ - [G]


(Assuming you are using .htaccess.) The regex ^$ matches only an empty URL-path, whereas the URL-path in your example is "index.php".

If you have other directives in your config file then these could also be a factor, as the order could be important.

<IfModule mod_rewrite.c>


Aside: The IfModule wrapper is not required here.
3:44 pm on Jan 7, 2019 (gmt 0)

Junior Member

10+ Year Member

joined:May 23, 2005
posts: 99
votes: 0


Thank you for your reply!
I must have misunderstood the thread I got the code from; I thought it would be an easy way to 410 everything with just one line.
I have now added separate directives like the following:
RewriteCond %{QUERY_STRING} \bcur_page\b [NC]
RewriteRule ^ - [G]

This seems to work ok.
7:01 pm on Jan 7, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15753
votes: 826


"I thought it would be an easy way to 410 everything with just one line"
Yes, you can do that. The question was just where the heck the "url" part came from.
RewriteCond %{QUERY_STRING} (cur_page|price-min|beds-min|baths-min|price-max)
without anchors should work just fine.

But wait! Are there other parameters containing "min" or "max" that you need to keep? If not, all you'd need is
RewriteCond %{QUERY_STRING} \b(cur_page|min|max)\b
taking advantage of the fact that - (hyphen) is a non-word character.
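To illustrate the word-boundary point, here is a minimal .htaccess sketch. The query strings in the comments are made-up examples, not taken from the site in question, and the RewriteRule is the catch-all one from the earlier post.

```apache
RewriteEngine On

# \b matches between a word character ([A-Za-z0-9_]) and a non-word character.
# Hyphen, "=" and "&" are all non-word characters, so "min" in "price-min=000"
# has a boundary on both sides. Underscore is a word character, so "cur_page"
# is treated as a single word.
#
# Hypothetical query strings and whether the condition matches:
#   cur_page=0&action=searchresults   -> matches (cur_page)
#   price-min=000&price-max=200000    -> matches (min and max)
#   beds-min=1                        -> matches (min)
#   minimum=5                         -> no match ("min" is followed by "i")
#   admin=1                           -> no match ("min" is preceded by "d")
#   sort=min                          -> matches (a value, not a parameter name)
RewriteCond %{QUERY_STRING} \b(cur_page|min|max)\b
RewriteRule ^ - [G]
```

The last example is the catch: the short pattern does not distinguish parameter names from values, so it is only safe if nothing you want to keep uses "min" or "max" as a standalone word anywhere in the query string.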
12:30 am on Jan 8, 2019 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


RewriteRule ^ - [G]


this rule will fire on every request (including images, for example), requiring the evaluation of the conditional.
if you specify a more restrictive pattern you can make things more efficient.

i would go with something like this:
RewriteCond %{QUERY_STRING} \b(cur_page|price-min|beds-min|baths-min|price-max)=
RewriteRule ^index\.php$ - [G]
8:50 am on Jan 8, 2019 (gmt 0)

Junior Member

10+ Year Member

joined:May 23, 2005
posts: 99
votes: 0


Thanks for all your replies and information!

Sorry about the confusion regarding the 'url=' in the code I used. I had another look at the thread I got it from, and it was actually part of the URL in the example that was used. I had thought it was a placeholder or expression...

The suggestions work very well, thank you! I like
RewriteCond %{QUERY_STRING} \b(cur_page|min|max)\b
but I cannot rule out that some links in the backend of the CMS use the 'min' or 'max' parameters. I decided to go with the more restrictive rule from the last post, and all seems to work great.
Thanks again!
10:53 am on Jan 9, 2019 (gmt 0)

Junior Member

10+ Year Member

joined:May 23, 2005
posts: 99
votes: 0


Hi again,
Unfortunately I have come across another issue I need to fix with a 410 response:
There are loads of pages with two constant terms followed by a random number, like
http://www.example.com/constant1/constant2/2345.html
I have now put this into the .htaccess file:
RewriteRule constant1/constant2 - [G]

It seems to work ok, but I am wondering if this is safe / sufficient or if I need to add any extra characters?
Thanks!
11:39 am on Jan 9, 2019 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


RewriteRule constant1/constant2 - [G]

It seems to work ok, but I am wondering if this is safe / sufficient or if I need to add any extra characters?

it's generally most efficient to make the pattern as restrictive as possible:
RewriteRule ^constant1/constant2/[0-9]+\.html$ - [G]

[edited by: phranque at 10:11 pm (utc) on Jan 9, 2019]

3:03 pm on Jan 9, 2019 (gmt 0)

Junior Member

10+ Year Member

joined:May 23, 2005
posts: 99
votes: 0


Thank you, but unfortunately it does not seem to work.
It returns a '200' response. (The URLs are all links to pages with an individual photo. For some reason the CMS never returns a '404', whether or not the URL / picture ever existed.)
Could that be why?
9:23 pm on Jan 9, 2019 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:15753
votes: 826


If you're using a CMS, make sure all your own rules are located before the CMS segment, which typically involves sending all requests for files-that-don't-physically-exist to index.php. You are right that server logs will always show a 200 response, because all it means is that the server has successfully rewritten to a file that does exist, namely index.php or whatever it may be. If it's a properly coded CMS, it will then send out a 404 response; you just won't see it in logs. (It took me a long time to wrap my brain around the fact that the response the server records is not necessarily the response the user receives.)

If the part you quote is the beginning of the URL-path, then put in an opening anchor:
RewriteRule ^constant1/constant2/ - [G]
If nothing fitting this pattern is still in use, you don't need a closing anchor or a longer pattern, because the server already has all the information it needs.
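As a rough sketch of the ordering described above, assuming the .htaccess file sits in the site root and the CMS uses a front-controller catch-all (the catch-all block here is a generic example and an assumption, not taken from any particular CMS):

```apache
RewriteEngine On

# 1. Own rules first. [G] returns 410 Gone and implies [L], so a matching
#    request never reaches the CMS block below.
RewriteRule ^constant1/constant2/ - [G]

# 2. CMS segment last (generic front-controller example, an assumption).
#    Any request that is not an existing file or directory goes to index.php.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]
```

If the order were reversed, the catch-all would rewrite /constant1/constant2/2345.html to index.php first, and the 410 rule would then only ever see "index.php", which its pattern does not match.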
1:05 am on Jan 10, 2019 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


but unfortunately it does not seem to work.

i see that i had a typo which i have edited to correct.

if lucy24's solution is sufficient, go with that:
RewriteRule ^constant1/constant2/ - [G]

if that isn't specific enough then use my version:
RewriteRule ^constant1/constant2/[0-9]+\.html$ - [G]
8:38 am on Jan 10, 2019 (gmt 0)

Junior Member

10+ Year Member

joined:May 23, 2005
posts: 99
votes: 0


Thank you both very much for your help!

Regarding the CMS: as Google has loads of non-existent pages either indexed, flagged as soft 404s, or at least listed as 'not indexed', they must be returning a 200 response.
I do have the CMS code below my other rules, and I guess it might be the following rule from the CMS that deals with the rewrite of pages that do not exist:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]

But it is maybe best to leave it alone. I think that, thanks to your help, most if not all non-existent pages should now be 'gone'!
9:05 am on Jan 10, 2019 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:11771
votes: 224


RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . index.php [L]

this is a typical catchall rewrite ruleset used by a cms.
these directives check to see if the requested filename is a file or directory and if not, all requests (other than "GET /") are internally rewritten to index.php

index.php must check that any url path rewritten to it is legitimately a canonical url.
otherwise the response should be a 301 or a 404/410.