Google is crawling and indexing the following URLs (thanks to people linking to me like this) and giving me a duplicate-content penalty:
example.com// (notice double slashes, server responds code 200 anyway)
example.com/?nonsense_query_string_that_never_existed
example.com/page.php/
example.com/index.php
What I need to do is 301 all of those to the proper pages, i.e. example.com and example.com/page.php without a trailing slash, all without upsetting the major search engines with multiple redirects and without harming my site's search section, which has a URL like:
http://example.com/cgi-bin/search/search.pl?p%3Apm=1&Terms=searchterm
There are many old pages I moved a few months ago, and I've redirected www to non-www, so I already have this in the .htaccess:
Options +FollowSymLinks
RewriteEngine on
rewriterule ^oldpage\.shtml$ http://example.com/folder/newpage.php [R=301,L]
rewriterule ^oldpage2\.shtml$ http://example.com/folder2/newpage2.php [R=301,L]
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^example\.com
RewriteRule (.*) http://example.com/$1 [R=301,L]
.htaccess frightens me at the best of times, and this one is so far over my head I can't even see it. Any help is appreciated.
# Remove multiple slashes anywhere in URL (less efficient than next rule)
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . http://example.com%1/%2 [R=301,L]
#
# Remove multiple slashes after domain (more efficient, but not for use in httpd.conf):
RewriteRule ^/(.*)$ http://example.com/$1 [R=301,L]
#
# Remove query strings on *all* requests:
RewriteCond %{QUERY_STRING} .
RewriteRule (.*) http://example.com/$1? [R=301,L]
#
# Remove trailing slash if filetype present in URL
RewriteRule ^(.+\.[^/]+)/$ http://example.com/$1 [R=301,L]
A little while ago, after my (shared) server moved up to Apache 2, I found that testing:
RewriteCond %{QUERY_STRING} .
no longer worked against "/foo.html?" ie. an empty query string.
Since then I've used:
RewriteCond %{THE_REQUEST} [?]
RewriteRule ^(.*)$ http://www.mysite.net/$1? [R=301,L]
Peter.
Options +FollowSymLinks
RewriteEngine on
rewriterule ^oldpage1\.shtml$ http://example.com/folder1/newpage1.php [R=301,L]
rewriterule ^oldpage2\.shtml$ http://example.com/folder2/newpage2.php [R=301,L]
#
# Remove multiple slashes anywhere in URL (less efficient than other rule but works on inner folders?)
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . http://example.com%1/%2 [R=301,L]
# Remove trailing slash if filetype present in URL
RewriteRule ^(.+\.[^/]+)/$ http://example.com/$1 [R=301,L]
#
# Remove query strings on *all* requests (will this kill my site search?)
RewriteCond %{THE_REQUEST} [?]
RewriteRule ^(.*)$ http://example.com/$1? [R=301,L]
#
# Redirect all www to non www
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^example\.com
RewriteRule (.*) http://example.com/$1 [R=301,L]
Any visible errors?
Is that the right order so there won't be multiple redirects? I.e. www.example.com/?stuff should go in one step to example.com/ etc.
Also, the site search uses a ? in the URL:
http://example.com/cgi-bin/search/search.pl?p%3Apm=1&Terms=searchterm
Also outgoing links look like:
http://example.com/l/go.php?id=333
http://example.com/link/?o=id
Will the removal of all query strings kill those or just ones after example.com/? If it does kill them, is there a way to allow it in specific folders or something?
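As written, the query-string rule pairs a condition that matches any request containing a ? with a RewriteRule pattern that matches every path, so it would strip the query string from the search script and the outgoing-link scripts too. If the only unwanted query strings are the ones tacked onto the home page, one possible alternative is to limit the rule to the root URL. This is a minimal sketch, assuming the rules sit in the document-root .htaccess:

```apache
# Sketch only: strip the query string from the home page and nowhere else.
# Assumes a document-root .htaccess, where the path that RewriteRule sees
# for a request to "/" is the empty string.
# THE_REQUEST is the raw request line, e.g. "GET /?stuff HTTP/1.1".
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/\?
# The trailing "?" on the target discards the original query string.
RewriteRule ^$ http://example.com/? [R=301,L]
```

With this form, /cgi-bin/search/search.pl, /l/go.php and /link/ keep their query strings without needing any exclusion conditions.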
To be sure I understand: you mean add the exclusions in here like this, right?
RewriteCond %{THE_REQUEST} [?]
RewriteCond %{REQUEST_URI} !^/cgi-bin/search/search\.pl$
RewriteCond %{REQUEST_URI} !^/link/$
RewriteCond %{REQUEST_URI} !^/l/go\.php$
RewriteCond %{REQUEST_URI} !^/l/admin/search\.php$
RewriteCond %{REQUEST_URI} !^/l/admin/view_cat\.php$
RewriteCond %{REQUEST_URI} !^/l/admin/view_stats\.php$
RewriteRule ^(.*)$ http://example.com/$1? [R=301,L]
Is something like this allowed, just to let the entire script in the /l/ folder use query strings, or is it necessary to go through each page in the admin? (There are quite a few, 50 or so I'm guessing from a glance, so I want to be as efficient as possible for my server's sake.)
RewriteCond %{REQUEST_URI} !^/l/*\.php$
If this is possible, is that how it's written? How, if that's not quite it?
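In a regular expression, * repeats the character before it, so ^/l/*\.php$ matches /l.php or /l/.php but not /l/go.php; a wildcard match needs .* instead. A simpler option is to anchor only the start of the pattern, so that anything under the folder is exempt. The following is a minimal sketch of that idea, reusing the paths from the posts above and assuming every script under /l/ (including /l/admin/) should keep its query string:

```apache
# Sketch only: exempt whole paths from the query-string removal rule.
RewriteCond %{THE_REQUEST} [?]
RewriteCond %{REQUEST_URI} !^/cgi-bin/search/search\.pl$
RewriteCond %{REQUEST_URI} !^/link/$
# No "$" at the end: any URL beginning with /l/ is exempt,
# which covers /l/go.php and everything under /l/admin/.
RewriteCond %{REQUEST_URI} !^/l/
RewriteRule ^(.*)$ http://example.com/$1? [R=301,L]
```

One prefix condition replaces the fifty or so per-page conditions, so the server has less to evaluate on each request. To exempt only .php files under the folder, the condition could be written as !^/l/.*\.php$ instead.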
Again, thank you so much for helping.
The address that needs to work looks like http://example.com/link/?o=id
So far the rest is working (oddly, the search is working fine even without an exclusion rule), but after hours of messing with this line I can't see what's wrong. I've tried with and without the space before !^ and a bunch of other, more-than-likely newbish, variations.
Finally got index.php redirecting properly (I think) to / as well =)
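The rules used for the index.php redirect are not shown in the thread. One common way to do it is sketched below, assuming index.php is the DirectoryIndex file and the rules are in the document-root .htaccess. The condition tests the original client request line rather than the current URL, so the internal mapping of / to index.php does not trigger the redirect again and cause a loop:

```apache
# Sketch only: redirect direct client requests for index.php to the folder URL.
# THE_REQUEST holds the original request line, e.g. "GET /index.php HTTP/1.1",
# and is not changed by internal DirectoryIndex handling.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^/?]+/)*index\.php[?\s]
# $1 is the folder path (empty for the site root).
RewriteRule ^(([^/]+/)*)index\.php$ http://example.com/$1 [R=301,L]
```

As written, any query string on the index.php request is carried over to the target; adding a ? to the end of the target URL would drop it.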
Your RewriteCond pattern is correct and should work for the example URL you provided, so by circular argument, I have to ask: Did you flush your browser cache before testing?
Jim