RewriteRule Problems

Lame Lame Lame Lame

         

ahmedtheking

6:51 pm on Nov 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, I've just had enough now! Grr! I'm just trying to get my .htaccess file to let people use 'static' URLs (yoursite.com/this/this/this instead of yoursite.com/?get=this).

This is my htaccess file. It's big. A lot of the stuff is from a previous thread here (to block bad bots, good stuff!)

# this is to stop the htaccess file being loaded
<Files 403.shtml>
order allow,deny
allow from all
</Files>

# options
Options +FollowSymLinks

# turn rewrite engine on
RewriteEngine On

#this is to block bad bots and browsers
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
RewriteCond %{HTTP_USER_AGENT} ^bandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^BatchFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Buddy [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Copier [OR]
RewriteCond %{HTTP_USER_AGENT} ^DA [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Wonder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Drip [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FileHound [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetSmart [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^gotit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Iria [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC [OR]
RewriteCond %{HTTP_USER_AGENT} ^JustView [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^lftp [OR]
RewriteCond %{HTTP_USER_AGENT} ^likse [OR]
RewriteCond %{HTTP_USER_AGENT} ^Magnet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mag-Net [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Memo [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mirror [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZip [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey [OR]
RewriteCond %{HTTP_USER_AGENT} ^Pump [OR]
RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Reaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Recorder [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^Snake [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpaceBison [OR]
RewriteCond %{HTTP_USER_AGENT} ^Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Vacuum [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Whacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^MSFrontPage [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*Indy [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
#RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetSeer.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^Ping [OR]
RewriteCond %{HTTP_USER_AGENT} ^Link [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule /*$ [microsoft.com...] [L,R]

# this is for virtual static linkage
# for about section
RewriteRule about(/?)$ main.php?goto=ab_index

# for services section
RewriteRule services/(.*)/$ main.php?goto=sv_%1
RewriteRule services(/?)$ main.php?goto=sv_index

# this is a redirect for da cp
RewriteRule da(/?)$ [66.246.229.21:2222...] [R]

# this is for php error reporting
php_flag register_globals on
php_flag display_errors on
php_value error_reporting 7

- END -

The area that's annoying me is this:

# this is for virtual static linkage
# for about section
RewriteRule about(/?)$ main.php?goto=ab_index

# for services section
RewriteRule services/(.*)/$ main.php?goto=sv_%1
RewriteRule services(/?)$ main.php?goto=sv_index

# this is a redirect for da cp
RewriteRule da(/?)$ [gohere.com...] [R]

The first three rules just don't work! The fourth rule does work, but only for www.this.com/da, NOT www.this.com/da/. It's so lame! Please can someone help me?

jdMorgan

5:59 pm on Nov 29, 2005 (gmt 0)




There are several problems and needed tweaks here. But you didn't explain what you mean by 'it doesn't work', and there are hundreds of ways for something to 'not work', so all I can do is offer the following suggestions. I'd also suggest that you spend some time reading the references cited in our forum charter [webmasterworld.com], and use them to understand your code line by line.

# this is to stop the htaccess file being loaded
<Files 403.shtml>
order allow,deny
allow from all
</Files>

This doesn't stop the .htaccess file from being requested via HTTP. It only applies to "403.shtml", and it allows access rather than denying it. This replacement or additional code

# this is to stop the htaccess file being loaded
<Files ~ "\.htaccess$">
order allow,deny
deny from all
</Files>

will probably work better...
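
As a rough sketch of the same idea, a FilesMatch block can cover every file whose name starts with ".ht". The IfModule tests are an assumption about the server, not something taken from this thread: the first branch uses the Require syntax of Apache 2.4 and later (mod_authz_core), the second the older Order/Deny syntax.

```apache
# Sketch only: deny HTTP access to .htaccess, .htpasswd and similar files.
<FilesMatch "^\.ht">
    <IfModule mod_authz_core.c>
        # Apache 2.4 and later
        Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
        # Apache 2.2 and earlier
        Order allow,deny
        Deny from all
    </IfModule>
</FilesMatch>
```

Many default server configurations already contain an equivalent block, so this may be redundant on some hosts.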


# options
Options +FollowSymLinks
# turn rewrite engine on
RewriteEngine On
#
#this is to block bad bots and browsers
RewriteCond %{HTTP_USER_AGENT} ^attach [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackWeb [OR]
...
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
#
# UAs beginning with "Web"
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]

You could shorten the above section to a single line:

RewriteCond %{HTTP_USER_AGENT} ^Web(\ Image\ |\ Sucker|Auto|Copier|Fetch|Reaper|Sauger|site|ster|Stripper|ZIP) [NC,OR]

and apply that technique to many other groups in your list as well. If the pipes show up as broken-bar "¦" characters when you copy this, change them to solid pipe "|" characters before use; posting on this board can modify them. Also, note the [NC] flag, which makes the string comparison case-insensitive.
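
For instance, three more groups from the list might be collapsed the same way. This is a sketch rather than a drop-in replacement: the groupings are only an illustration, and because of [NC] they match slightly more user-agents than the original case-sensitive lines did.

```apache
# "Download Demon", "Download Wonder", "Downloader"
RewriteCond %{HTTP_USER_AGENT} ^Download(\ Demon|\ Wonder|er) [NC,OR]
# "NetAnts", "NetSpider", "Net Vampire", "NetZip"
RewriteCond %{HTTP_USER_AGENT} ^Net(Ants|Spider|\ Vampire|Zip) [NC,OR]
# "GetRight", "GetSmart" -- last condition in the chain, so no [OR]
RewriteCond %{HTTP_USER_AGENT} ^Get(Right|Smart) [NC]
# Refuse matching requests with 403 Forbidden
RewriteRule .* - [F]
```

Only the last RewriteCond before the RewriteRule omits [OR]. A trailing [OR] on the final condition is a common cause of rules that never behave as expected.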


RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
...
#RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]

Duplicate line?

RewriteCond %{HTTP_USER_AGENT} ^sitecheck.internetseer.com [OR]
...
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule /*$ http://www.microsoft.com [L,R]

It's rather a waste of time to try to *redirect* bad-bots. They typically won't follow a redirect, and even if they did, it's not right to pass your problems on to someone else. Even if you don't like MS, you would still be causing extra wasted traffic on the internet. Just 403 these user-agents and be done with it:

RewriteRule .* - [F]


# this is for virtual static linkage
# for about section
RewriteRule about(/?)$ main.php?goto=ab_index

This should work, assuming that your code is in a directory *above* the /about directory, and that the requested URL ends with "about" or "about/". I'd suggest you add an [L] flag, though, unless you want to further process this URL after it has been rewritten.


# for services section
RewriteRule services/(.*)/$ main.php?goto=sv_%1

The "%1" back-reference refers to a RewriteCond pattern, and there is no RewriteCond preceding this rule. If you want to refer to the part of the URL-path matched by the RewriteRule pattern, use "$1", not "%1". Otherwise, this should work, as long as your code is in a directory above "services/" and the requested URL-path ends with slash <anything> slash. Again, add an [L] flag unless you're sure you don't want it. As long as there's always going to be something between the last two slashes, this would be faster:

# for services section
RewriteRule services/([^/]+)/$ main.php?goto=sv_$1 [L]

This requires at least one character other than a slash to be present between the two slashes, and it can be processed faster than "(.*)/$" because the regular-expression engine doesn't have to run to the end of the string and then back off to find the final slash.
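
To make the difference between the two back-reference styles concrete, here is a small sketch. It assumes the .htaccess file sits in the site's root directory, so the patterns can be anchored with "^". The "hosting" and "plans" names and the second rule pair are made-up examples, not part of the original file.

```apache
# $1 is filled from the parentheses in the RewriteRule pattern itself.
#   services/hosting/        -> main.php?goto=sv_hosting
#   services/hosting/plans/  -> no match, because [^/]+ cannot cross a slash
RewriteRule ^services/([^/]+)/$ main.php?goto=sv_$1 [L]

# %1 is filled from the parentheses in the preceding RewriteCond pattern.
#   /?get=this               -> main.php?goto=this
RewriteCond %{QUERY_STRING} ^get=([^&]+)$
RewriteRule ^$ main.php?goto=%1 [L]
```

With "(.*)" in place of "([^/]+)", the second services URL would match as well, and $1 would then contain "hosting/plans".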


RewriteRule services(/?)$ main.php?goto=sv_index

This rule should work as long as the code is above the services directory, and the requested URL-path ends with "services" or "services/". But there is no need to use parentheses here, since you're not creating a back-reference and only the single trailing "/" is optional:

RewriteRule services/?$ main.php?goto=sv_index [L]


# this is a redirect for da cp
RewriteRule da(/?)$ http://66.***.229.21:2222/ [R]

This should work as long as the requested URL ends with "da" or "da/". Again, you should add an [L] flag.

# this is for php error reporting
php_flag register_globals on
php_flag display_errors on
php_value error_reporting 7

No comment, see PHP forums if any trouble with this part... :)

Jim

ahmedtheking

6:22 pm on Nov 29, 2005 (gmt 0)




OK, I've rewritten my code, but when I request, say, www.this.com/about (with or without a trailing '/', and without having a directory called 'about'), I just get a 404 error. If I create the about dir, the request just goes to that dir. It doesn't want to rewrite for some reason!

jdMorgan

6:35 pm on Nov 29, 2005 (gmt 0)




Try adding

RewriteOptions inherit

just after

RewriteEngine on

Jim

ahmedtheking

1:53 pm on Nov 30, 2005 (gmt 0)




Doesn't work! This is the contents of my htaccess file:

# this is to stop the htaccess file being loaded
<Files ~ \.htaccess$>
order allow,deny
allow from all
</Files>

# options
Options +FollowSymLinks

# turn rewrite engine on
RewriteEngine On
Rewriteoptions inherit

[...] (bot stuff)

# this is for virtual static linkage
# for about section
RewriteRule about/?$ main.php?goto=ab_index [L]

# for services section
RewriteRule services/([^/]+)/$ main.php?goto=sv_$1 [L]
RewriteRule services/?$ main.php?goto=sv_index [L]

# this is a redirect for da cp
RewriteRule da/?$ [****...] [L]

# this is for php error reporting
php_flag register_globals on
php_flag display_errors on
php_value error_reporting 7

Have a look if you would like: <snip>

[edited by: jdMorgan at 5:10 pm (utc) on Nov. 30, 2005]
[edit reason] No URLs please, See TOS. [/edit]

ahmedtheking

12:43 pm on Dec 6, 2005 (gmt 0)




Anyone? It's still not working!