Blocking query-string URLs

     
10:32 pm on Jan 6, 2011 (gmt 0)

encyclo (Senior Member from CA)


I've recently updated a site which had some URLs in the form of:

example.com/filename.php?do=something


These URLs coexisted with different content under the URL /filename.php (with no query string). All the query-string URLs were robots-excluded (they were basically login pages for a CMS linked from each page).

In the new site, there are no URLs with query strings anywhere. However, the server now returns the same content for filename.php with or without a query string, which creates duplicate content.

My fix is as follows:

RewriteCond %{QUERY_STRING} .
RewriteRule ^.* - [G,L]


All URLs with a query string now return a 410 Gone response, which is good. But is there a better way? Specifically, can the server issue a 404 for filename.php?something without having to "patch" with mod_rewrite?

10:47 pm on Jan 6, 2011 (gmt 0)

g1smd (Senior Member)


The L flag is not needed. Use just [G], as G implies L.

The .htaccess file is the best way to fix this.

The server never gets as far as invoking PHP, passing data to that process, doing disk reads, processing PHP instructions, and so on, so the .htaccess version is very much quicker and way more efficient.

The doorman rejects the requests right at the front door.

12:20 am on Jan 7, 2011 (gmt 0)

jdmorgan (Senior Member)


Yes, trim it down:

RewriteCond %{QUERY_STRING} .
RewriteRule ^ - [G]

Or if there are any valuable 'straggler' links to those query-stringed URLs out on the Web, then just 301-redirect all such requests to remove the query string:

RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]

(This also helps speed up the cleanup of the SERPs and your Google Webmaster Tools (GWT) reports.)
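
For readers on Apache 2.4.0 or later (an assumption, since the thread predates that release), the same redirect can probably be written with the QSD ("query string discard") flag instead of the trailing question mark. A minimal sketch, reusing the www.example.com placeholder from the rule above; test it on your own server before relying on it:

```apache
# Assumes Apache 2.4.0 or later, where the QSD flag is available.
# Match any request that carries a non-empty query string...
RewriteCond %{QUERY_STRING} .
# ...and 301-redirect it to the same path with the query string discarded.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L,QSD]
```

On older versions the trailing "?" form shown above remains the way to strip the query string; QSD only makes the intent easier to read.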

The server cannot issue a 'native 404 response' to a request for /filename.php?bogus-query because /filename.php does in fact exist as a file. The query string is not part of either the URL-path or the filename; it is simply data appended to the URL to be passed to the resource located at that path. In this case, that path resolves to an existing file. So forcing a 404 on these requests takes the same amount of work as generating the 410 Gone response does:

Apache 1.x or 2.x:

# Force a 404-Not Found response to any URL with any query string appended
RewriteCond %{QUERY_STRING} .
RewriteRule ^ /some-filepath-that-will-never-exist.lmth [L]

Apache 2.x only:

# Force a 404-Not Found response to any URL with any query string appended
RewriteCond %{QUERY_STRING} .
RewriteRule ^ - [R=404,L]

Jim