
Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Blocking query-string URLs
encyclo

Msg#: 4250056 posted 10:32 pm on Jan 6, 2011 (gmt 0)

I've recently updated a site which had some URLs in the form of:

example.com/filename.php?do=something

These URLs coexisted with different content at /filename.php (with no query string). All of the query-string URLs were robots-excluded; they were basically login pages for a CMS, linked from each page.

In the new site, there are no URLs with query strings anywhere. However, the server configuration now returns the same content for filename.php with or without a query string, which creates duplicate content.

My fix is as follows:

RewriteCond %{QUERY_STRING} .
RewriteRule ^.* - [G,L]


All URLs with a query string now return a 410 Gone response, which is good. But is there a better way? Specifically, can the server issue a 404 for filename.php?something without having to "patch" it with mod_rewrite?

 

g1smd

Msg#: 4250056 posted 10:47 pm on Jan 6, 2011 (gmt 0)

The L flag is not needed. Use just [G], as G implies L.

The .htaccess file is the best way to fix this.

The server never gets as far as invoking PHP, passing data to that process, doing disk reads, processing PHP instructions, and so on, so the .htaccess version is very much quicker and way more efficient.

The doorman rejects the requests right at the front door.
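
For anyone copying this, here is a minimal sketch of how the trimmed rule might sit in a complete .htaccess file. It assumes the file lives in the document root, that mod_rewrite is loaded, and that AllowOverride permits FileInfo; none of that is stated in the thread, so adjust as needed.

```apache
# Hypothetical complete .htaccess for the document root.
# Assumes mod_rewrite is loaded and AllowOverride permits FileInfo.
RewriteEngine On

# Any request carrying a non-empty query string gets 410 Gone.
# [G] ends rule processing by itself, so no [L] is needed.
RewriteCond %{QUERY_STRING} .
RewriteRule ^ - [G]
```

The single dot in the condition matches any query string of at least one character, so a bare trailing "?" with nothing after it is not affected.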

jdMorgan

Msg#: 4250056 posted 12:20 am on Jan 7, 2011 (gmt 0)

Yes, trim it down:

RewriteCond %{QUERY_STRING} .
RewriteRule ^ - [G]

Or if there are any valuable 'straggler' links to those query-stringed URLs out on the Web, then just 301-redirect all such requests to remove the query string:

RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]

(This also helps speed up the cleaning-up of the SERPs and your Google Webmaster Tools reports.)
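
On Apache 2.4 or later (an assumption; the thread does not say which version is in use), the same redirect can probably be written with the QSD flag instead of the trailing question mark. This is only a sketch of the alternative spelling, using the same example hostname as above:

```apache
# Assumes Apache 2.4+, where the QSD (query string discard) flag exists.
# Redirects any request that has a query string to the same path without it.
RewriteCond %{QUERY_STRING} .
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L,QSD]
```

On older versions QSD is not available, so the trailing "?" form remains the one to use there.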

The server cannot issue a 'native' 404 response to a request for /filename.php?bogus-query because /filename.php does exist as a file. The query string is not part of the URL-path or the filename; it is simply data appended to the URL to be passed to the resource located at that path. In this case, that path resolves to an existing file. So forcing a 404 on these requests takes the same amount of work as generating the 410 Gone response does:

Apache 1.x or 2.x:

# Force a 404-Not Found response to any URL with any query string appended
RewriteCond %{QUERY_STRING} .
RewriteRule ^ /some-filepath-that-will-never-exist.lmth [L]

Apache 2.x only:

# Force a 404-Not Found response to any URL with any query string appended
RewriteCond %{QUERY_STRING} .
RewriteRule ^ - [R=404,L]
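
The separation between path and query string also shows up in how the rules themselves are written. As a rough sketch, a narrower rule that targets only the old login URLs from the first post (filename.php?do=something) might look like the following; the .htaccess location in the document root and the "do=" prefix test are assumptions for illustration.

```apache
# Sketch for a .htaccess file in the document root (assumed location).
# The RewriteRule pattern is matched against the URL-path only, so a
# pattern such as ^filename\.php\?do= can never match. The query string
# has to be tested in a RewriteCond instead.
RewriteCond %{QUERY_STRING} ^do=
RewriteRule ^filename\.php$ - [G]
```

Requests for /filename.php with no query string, or with a query string that does not start with "do=", pass through this rule untouched.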

Jim


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved