| How to deal with a bogus query string? |
chazeo

msg:3381683 | 1:32 am on Jun 29, 2007 (gmt 0) | The site in question runs on Apache, with PHP running as CGI. The site is also a WP blog. How would I go about dealing with this bogus query string? Thanks, Chaz <snip> [edited by: jdMorgan at 7:13 pm (utc) on June 29, 2007] [edit reason] Removed *private* message... [/edit]
|
chazeo

msg:3381688 | 1:40 am on Jun 29, 2007 (gmt 0) | Jim, As I mentioned in the email, the site in question is WordPress 2.0.9. Below is the .htaccess file contents:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} ^/(stats|failed_auth\.html)/?(.*)$ [NC]
RewriteRule ^.*$ - [L]
</IfModule>
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
RewriteCond %{HTTP_REFERER} bogus-query-domain.com\.com
order allow,deny
deny from (server 1 IP address of bogus query domain)
deny from (server 2 IP address of bogus query domain)
allow from all
|
jdMorgan

msg:3382416 | 7:14 pm on Jun 29, 2007 (gmt 0) | Please answer all the questions I listed in our private exchange. Thanks. Jim
|
chazeo

msg:3382470 | 8:14 pm on Jun 29, 2007 (gmt 0) | Jim, When you ask about query strings, is that only content that Google/SE's index or all possible query strings such as admin links (for logging in, editing posts, etc...)? I assume the only area of the site that deals with query strings is the /wp-admin/ folder (WordPress), correct?
|
jdMorgan

msg:3382622 | 1:39 am on Jun 30, 2007 (gmt 0) | That's what we need to know, since your server isn't like my servers, so I have no idea what your admin links might look like. If they are all in the same directory, then just knowing the directory name is enough -- I mean, we don't have to exclude each individual URL that does use query strings one at a time! The trick is to find and use just enough of the local URL-path to exclude URLs that *do* use query strings on your site from being modified.
Also, since I don't use WP myself, and don't read many blogs, I have no idea yet if your site is all-static (except for /admin) or if it's a full-blown dynamic site. The problem can be solved either way, but knowing exactly what the problem is will reduce the time required to days instead of months... :)
The basic idea is to exclude folders and files on your site that *do* require query strings, and redirect the rest, like this:
# If the query-string is non-blank
RewriteCond %{QUERY_STRING} .
# and if not an admin folder request
RewriteCond %{REQUEST_URI} !/admin/
# and if not a "birdpressed" script request
RewriteCond %{REQUEST_URI} !/birdpressed/
# and if not a widgetizer.php script request
RewriteCond %{REQUEST_URI} !/scripts/widgetizer.php
# then externally redirect to the requested URL, but with the query string cleared.
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
Things can get very complex if your site is fully-dynamic and someone is injecting random name/value pairs into your query strings or scrambling up their order. These problems can be fixed, but it's complicated to do on a forum or in e-mail -- Fully and precisely describing the problem in terms of "good" and "bad" URL-query combinations becomes 99% of the work! Jim
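For the WordPress setup described earlier in the thread, the same pattern might be adapted roughly as follows. This is only a sketch. The excluded paths (/wp-admin/ and the top-level wp-*.php scripts such as wp-login.php) are assumptions about a typical WordPress install and were not confirmed by the poster; /stats/ is taken from the posted .htaccess file, and www.example.com stands in for the real hostname. It also assumes the public side of the blog needs no query strings at all. If the site uses WordPress search (normally an `s` parameter) or post previews, those requests would need their own exclusions.

```apache
# Sketch only: the excluded paths are assumptions, not confirmed in this thread.
# Place this ahead of the "# BEGIN WordPress" block so the external redirect
# is evaluated before the internal rewrite to /index.php.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
# If the query string is non-blank
RewriteCond %{QUERY_STRING} .
# and the request is not for the admin area (assumed path)
RewriteCond %{REQUEST_URI} !^/wp-admin/
# and not for a top-level WordPress script such as wp-login.php (assumed)
RewriteCond %{REQUEST_URI} !^/wp-[a-z-]+\.php$
# and not for the stats directory from the posted .htaccess file
RewriteCond %{REQUEST_URI} !^/stats/
# then externally redirect to the same URL with the query string cleared
RewriteRule ^(.*)$ http://www.example.com/$1? [R=301,L]
</IfModule>
```

The trailing `?` on the substitution is what clears the query string. Anchoring the exclusions with `^` keeps a bogus request from slipping through just because one of the excluded names happens to appear somewhere later in the path.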
|
|
|