I had looked back and someone previously discussed a "%20" problem.
I have tried
RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
-and-
RewriteEngine on
RewriteCond %{HTTP_HOST} ^\%09www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
No success
What do you mean?
How did you test?
What was the result?
How does that differ from your expectations?
Those basic questions/answers will help us help you.
The next thing to determine is whether a click on that malformed link will even reach your site. If so, then a solution can be found. If not, then you can try setting up wild-card DNS, and hope that "%09www" will be treated as a valid subdomain and passed to your server.
FYI, %09 is a Tab character, so it's likely the URL was found in or copied from a non-html document.
Jim
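As a quick check of the %09 point, here is a minimal Python sketch. The sample link and the variable names are made up for illustration; it only decodes the escape and confirms that it is a horizontal tab.

```python
from urllib.parse import unquote

# Hypothetical malformed link of the kind discussed above.
malformed = "%09www.example.com/mypage.html"

decoded = unquote(malformed)

# %09 is the percent-encoded form of byte 0x09, the horizontal tab.
leading_char = decoded[0]
is_tab = leading_char == "\t"

# Stripping the whitespace recovers the link that was presumably intended.
cleaned = decoded.strip()
print(repr(leading_char), is_tab, cleaned)
```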
216.113.181.67 - - [25/Mar/2006:10:59:54 +0000] "GET ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦/mfc/pb.html#MR2800 HTTP/1.0" 200 47473 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461; .NET CLR 1.1.4322)" In:- Out:-:-pct.
216.113.181.67 - - [12/Mar/2006:10:49:07 +0000] "GET ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦/mfc/pb.html#MR2800 HTTP/1.0" 200 47473 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461; .NET CLR 1.1.4322)" In:- Out:-:-pct.
216.113.181.67 - - [14/Mar/2006:18:16:50 +0000] "GET ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦/mfc/pb.html#MR2800 HTTP/1.0" 200 47473 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461; .NET CLR 1.1.4322)" In:- Out:-:-pct.
(Note: to prevent this board from compressing them, I have replaced multiple spaces (or tabs?) with a `¦'.)
httpd.conf for the relevant VirtualHost contains:
#
# redirect all non-``www.my-site.co.uk' requests
#
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.my-site\.co\.uk
RewriteRule ^/(.*) http://www.my-site.co.uk/$1 [L,R=permanent]
I would have expected the above to redirect requests preceded by a space/tab/white-space, but clearly it does not.
What is missing here?
Try dumping the environment variables to see what the server actually receives. Does it show the malformed URL or the clean version?
You can do it easily with PHP:
<?php phpinfo();?>
or in Perl:
#!/usr/bin/perl
print "Content-type: text/html\n\n";
print join '', map "$_ = $ENV{$_}<br>\n", keys %ENV;
> I would have expected the above to redirect requests preceded by a space/tab/white-space, but clearly not.
> What is missing here?
Your domain redirect code won't affect your problem, because the tabs/spaces are not in your domain, they're in your URL-path. You can catch that kind of stuff by using %{THE_REQUEST} in a RewriteCond, and looking for one or more hex-encoded entities after the method (GET, POST, etc.) and before the first "/":
RewriteCond %{THE_REQUEST} ^[A-Z]+\ (\%[0-9a-f]{2})+/([^\ ]*)\ HTTP/
RewriteRule .* http://www.example.com/%2 [R=301,L]
For reference, %{THE_REQUEST} holds the full request line as sent by the client, for example:
GET /index.html HTTP/1.0
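To see what the %{THE_REQUEST} condition above does and does not catch, here is a rough Python sketch that runs the same pattern against a few request lines. It assumes Python's `re` behaves like Apache's regex engine for this particular pattern; the function name and the sample request lines are made up for illustration.

```python
import re

# The RewriteCond pattern from the post above, copied as-is.
PATTERN = re.compile(r"^[A-Z]+\ (\%[0-9a-f]{2})+/([^\ ]*)\ HTTP/")


def redirect_target(request_line):
    """Return the URL the rule would redirect to, or None if the condition fails."""
    m = PATTERN.search(request_line)
    if m is None:
        return None
    # %2 in the RewriteRule is the second group captured by the RewriteCond.
    return "http://www.example.com/" + m.group(2)


samples = [
    "GET %09%09/mfc/pb.html HTTP/1.0",  # percent-encoded tabs before the path
    "GET /index.html HTTP/1.0",         # clean request
    "GET %0A/mfc/pb.html HTTP/1.0",     # upper-case hex digit
    "GET \t\t/mfc/pb.html HTTP/1.0",    # literal (unencoded) tabs
]
for line in samples:
    print(repr(line), "->", redirect_target(line))
```

In this sketch only the first sample is redirected. If upper-case hex digits turn up in practice, adding [NC] to the RewriteCond would be one way to cover them. Whether the tabs reach the server encoded or literal is not settled by the log lines alone, and the pattern as written only handles the encoded form.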
If you wish to continue this discussion of bad URL-paths, let's do so in a separate thread, so as not to hijack discussion of Wally_Books' (related-but-different) problem.
Thanks,
Jim
>What do you mean?
Google has 5000+ pages from my site in the format
%09www.mysite.com/mypage.html
>How did you test?
My redirects? As in my example. I'm new to .htaccess so very likely was not doing something correctly.
>What was the result?
Didn't work, "could not be found"
>How does that differ from your expectations?
I recently changed hosts, and the previous host had a Zeus server that could not do a 301 redirect. These links are pointing to my site, and I was getting traffic from them at my previous host. According to my previous host: "Currently your site is set to redirect anything.mysite.com"
My current host is going to look into it but I'm not optimistic.
>The next thing to determine is whether a click on that malformed link will even reach your site.
They were being redirected at my previous host, and I am trying to find out how. They do not have a phone number, only email, and are very slow to respond. Again, they have a Zeus server, so their redirect may not apply to Apache.
>FYI, %09 is a Tab character, so it's likely the URL was found in or copied from a non-html document.
Yes, I've heard this.
Please let me know if there is any other information you might need to help me resolve this. I'm in supplemental h*ll as it is. I could use the little traffic these links were generating.
Wally
RewriteEngine on
RewriteCond %{HTTP_HOST} .
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
Options +FollowSymLinks
Jim
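For readers new to these rules, the logic of the host redirect above can be sketched in Python. This is only an illustration under assumptions: the function names are hypothetical, and it mimics the two RewriteCond lines and the RewriteRule rather than reproducing everything Apache does.

```python
import re

# The negated pattern from the second RewriteCond; note it is anchored only at the start.
CANONICAL = re.compile(r"^www\.example\.com")


def needs_redirect(host):
    """Mimic both conditions: Host is non-empty AND does not start with the canonical name."""
    has_host = re.search(r".", host) is not None  # RewriteCond %{HTTP_HOST} .
    is_canonical = CANONICAL.search(host) is not None
    return has_host and not is_canonical


def redirect_url(host, path):
    """Return the 301 target, or None when the request is left alone."""
    if not needs_redirect(host):
        return None
    # In .htaccess context the path matched by (.*) has no leading slash.
    return "http://www.example.com/" + path.lstrip("/")


print(redirect_url("example.com", "mypage.html"))
print(redirect_url("www.example.com", "mypage.html"))
```

The first condition exists so that requests with an empty Host header (such as some HTTP/1.0 clients) are not redirected. A request only gets this far if DNS and the server configuration deliver it to this site in the first place.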