I have an unusual problem with Google spidering pages that technically exist but have no content on them, just a header/logo and footer.
The site is PHP/MySQL driven and lists widgets by category, 10 to a page. It uses /category.php for the 1st page of widgets, .php?offset=10 for the 2nd page, .php?offset=20 for the 3rd page, etc. Due to poor programming, URLs with negative offsets (offset=-365) are being returned with no content and spidered by Google.
I need a short-term fix to 301 redirect each of these negative offsets to the correct main category page until I can get the PHP/DB recoded properly. (I'm not a programmer.)
I've tried this which does not work:
redirect 301 /sub-dir/redwidget.php?offset=-325 h**p://www.domain.com/sub-dir/redwidget.php
This format is working successfully with other pages/directories.
I have:
Options +FollowSymLinks
RewriteEngine On
before this directive (and others which are working).
What I basically need to do is:
redirect: /sub-dir/redwidget.php?offset=any-negative-value
to
domain.com/sub-dir/redwidget.php
I've successfully done .htaccess redirects for non-www to www, old directory names to new ones, etc., but am stumped on this one.
Suggestions?
Thanks.
See RewriteCond [httpd.apache.org] directive used with %{QUERY_STRING} server variable.
Jim
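For anyone following along, the likely reason the Redirect line above never fires is that mod_alias's Redirect directive matches only the URL-path, not the query string, so "?offset=-325" is never seen by it. A minimal sketch of the QUERY_STRING approach follows, assuming the .htaccess file sits in the document root, mod_rewrite is enabled, and the category scripts all live under /sub-dir/ (the hostname and directory are taken from the question; adjust to suit):

```apache
Options +FollowSymLinks
RewriteEngine On

# Only act when the query string starts with a negative offset, e.g. offset=-325
RewriteCond %{QUERY_STRING} ^offset=-[0-9]+
# Send any .php page under sub-dir/ back to itself with a 301.
# The trailing "?" on the target drops the original query string.
RewriteRule ^(sub-dir/[^/]+\.php)$ http://www.domain.com/$1? [R=301,L]
```

Because the rule captures the script name, one rule should cover redwidget.php, greenwidget.php and the rest, rather than needing one line per category.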
Thanks for the QUERY_STRING advice, had missed that, and for the apache link. I started at apache.org but unfortunately much of that doc is above my comprehension level so I'm back to WebmasterWorld.
From your QUERY_STRING hint I've made some progress, uncovered and fixed a problem, thought of a few more issues but haven't gotten it right yet.
First, I added: RewriteRule ^\.htaccess$ - [F] because the file was exposed. That part works.
Then, to address the initial problem, I picked up part of your regex example from forum92/830.htm post 8 and edited it. I also came to the conclusion that I should be showing Google a true 404 (not a custom 404 page) rather than a 301.
I've edited your regex example to the following:
RewriteCond %{QUERY_STRING}!^offset=([0-9] {1,4})$ (there's a space before the !)
I don't want negative offsets spidered and want a 404 returned, so above =
offset not positive integer, total offset length up to 4
so, rewrite negative offsets to 404:
RewriteRule ^$ [R=404]
I've tested various combinations of the above to no avail. I also tested patterns like: ^\/subdir\/greenwidget\.php?offset=\-$ which also doesn't work, and then realized I'd have to do this for all widget categories anyway. Not really sure which way to go next. The part of the .htaccess file addressing this issue is:
Options +FollowSymLinks
RewriteEngine On
RewriteBase /
RewriteRule ^\.htaccess$ - [F]
#RewriteCond %{QUERY_STRING} (SPACE)!^offset=([1-9] {1,4})$
#RewriteRule ^$ [R=404]
# return a 404 code, not custom 404 error page
# above for any offset not starting from positive 1 to 9
# for all red, blue, green, etc. widget.php files
#http://www.domain.com/subdir/redwidget.php?offset=negative number
The pattern string may be correct, but the 404 rewrite rule seems to kill the whole site, so it's remmed out for now. Am I even close?
Any further advice appreciated.
Thanks,
Jim
The closest you can come to a [R=404] is to use 410-Gone which is even more specific than 404-Not Found.
Also, it seems to me that you're over-thinking this problem, in that it doesn't matter at all what URL is requested; negative offset queries aren't acceptable no matter what the URL is. So, simplifying, you could use something like:
RewriteCond %{QUERY_STRING} ^offset=[^0-9]
RewriteRule .* - [G]
There is no difference in server response signalling between a "custom 404" and a "server 404." However, a common error is that Webmasters will use a canonical URL in an ErrorDocument directive instead of the (documented) local URL-path, and that creates a 302-Found response instead of the desired 404. It's clearly documented, but...
Jim
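To illustrate the ErrorDocument point above, here is a small sketch of the two forms. The path /errors/notfound.html is a hypothetical example, not something from this thread; the hostname is the placeholder used earlier.

```apache
# Local URL-path (the documented form): Apache serves the custom page
# itself and the client still receives a 404 status.
ErrorDocument 404 /errors/notfound.html

# Full URL (the common mistake): Apache redirects the client to that
# address, so the response becomes a 302 instead of a 404.
# ErrorDocument 404 http://www.domain.com/errors/notfound.html
```

Checking the response headers for a known-bad URL is a quick way to confirm which status the server is actually sending.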