Forum Moderators: DixonJones

Message Too Old, No Replies

redirecting visitors using anonymizers?

         

NotSoSavvy

11:55 pm on Feb 22, 2003 (gmt 0)

10+ Year Member



I'm not exactly well versed in website design but I know some basics.

I have a problem with a stalker. Long story, but it involves multiple restraining orders, etc., and has continued into internet stalking.

After police reports were filed about the violations of Restraining Order via the internet, my stalker has learned to use anonymous proxies.

I want to redirect all visitors to my site who use anonymous proxies to another page; however, some services disable JavaScript, so a JavaScript redirect doesn't work. Anonymizer is one of those services.

Is there some way of doing this?

Ideally, I would like all visitors using anonymous services to be redirected to a frames page, where the top frame would (falsely) claim the user's IP address has been logged, and the bottom frame would be www.fbi.gov. I'm thinking that might scare them off for a while.

From the behavior of my stalker, I can see that the ONLY visitor I have using anonymous services is the stalker themselves, so I need not worry about sending a random visitor to this page.

Any suggestions?

And does anyone know about ban bots, and whether they could help me?

jdMorgan

12:27 am on Feb 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy,

Welcome to WebmasterWorld [webmasterworld.com]!

You can block by IP address, user-agent, remote host, or referrer, and that's what I suggest you do: either block access completely, or redirect to a page with a nondescript message like "fatal error" and nothing else.
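As a rough sketch of the "block access completely" option (this uses mod_access directives rather than the mod_rewrite approach discussed later in this thread, assumes your host runs Apache and honors .htaccess, and uses a placeholder IP address, not one from your logs), a .htaccess file could contain:

```apache
# Deny one specific IP address; everyone else is allowed.
# 203.0.113.45 is a placeholder - substitute the address from your logs.
Order Allow,Deny
Allow from all
Deny from 203.0.113.45
```

Blocked visitors receive a 403-Forbidden response instead of your pages.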

Forget the FBI thing - it's too transparent and obviously untrue... The FBI is not in the habit of warning criminals. Doing this might just make your problem worse.

If you describe the information - the fingerprint - this person leaves in your log files, and also your server configuration, you can get more detailed help here.

On a personal note, I hope you are equipped to take care of this person in the absence of the authorities - do not count on them for emergency help (also a long story).

Jim

NotSoSavvy

1:28 am on Feb 23, 2003 (gmt 0)

10+ Year Member



>>Welcome to WebmasterWorld!

Thanks!

>>You can block by IP address, user-agent, remote host, or referrer, and that's what I suggest you do: either block access completely, or redirect to a page with a nondescript message like "fatal error" and nothing else.

I sort of understand this but not completely. Is it possible to redirect based on IP WITHOUT using JavaScript? Because some of the anonymous services automatically block JavaScript.

>>Forget the FBI thing - it's too transparent and obviously untrue... The FBI is not in the habit of warning criminals. Doing this might just make your problem worse.

You'd be surprised how stupid my stalker is. Besides, that was just an example. Another idea would be a redirect to a fake "journal" in which I would describe my "plans for the evening", sending the stalker on a wild-goose chase trying to find me, successfully wasting their time and energy, of which they seem to have too much. I have a few ideas, I just don't know how to implement such a redirection.

I also don't want to just block them completely, as the logs are important for reporting to the police (not that they actually DO anything about it). But still, in this situation what I really need to do is make it so this person sees a different "site" than the actual site my genuine visitors see.

>>If you describe the information - the fingerprint - this person leaves in your log files, and also your server configuration, you can get more detailed help here.

I don't know what a server config is, but the "fingerprint" is basically accessing the site same time every day, then immediately accessing the same pages (having to do with my whereabouts, my messageboard, and my guestbook), and then going directly to my spouse's website and repeating the behavior. Sometimes I can see the operating system and browser, sometimes not, but it is very clearly the same person.

>>On a personal note, I hope you are equipped to take care of this person in the absense of the authorities - do not count on them for emergency help (also a long story).

Trust me, I'm aware of the lack of help from authorities. LOONNG story. Involves 5 years of stalking, having me and my spouse followed, breaking into my apartment, stealing my mail and forging my signature on checks sent to me, harassment by email, repeated legal attempts to overturn the Restraining Orders against them (always fails at that but keeps me spending money on attorneys and court dates), hundreds and hundreds of phone messages, etc. In short, an ongoing nightmare.

Anyway, Thanks for any help.

jdMorgan

2:09 am on Feb 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy,

If you are on an Apache server, you can do the redirection at the server level: the "fake page" is served instead of the real one, and the site visitor is none the wiser. This can also be done with IIS, although I have no experience with it. No scripting of any kind is involved, and the results do not depend in any way on the visitor's client (browser).

If you don't know what server your site is hosted on, check it here [webmasterworld.com].

The fake journal sounds OK, but basically, never tease a nut-case. Posting details of your personal life might be construed in a trial as "asking for it," so you are already somewhat legally compromised in that respect. So, if this person "cracks," make sure you have done nothing to "make things worse." I realize this is not fair and I do not subscribe to the twisted ideas that may be used against you in court - I'm just advising caution here. :)

In addition to the link above, try a WebmasterWorld site search (see link at the top of this page) on "redirect" and "redirection" along with the name of your server software as given by the header-checker link I cited above. That will allow you to ask more pointed questions.

What you want to do is probably easy, but the difficulty lies in precisely defining what you want to do, and what information you want to base the action on.

Jim

NotSoSavvy

3:39 am on Feb 23, 2003 (gmt 0)

10+ Year Member



Thanks again.

No, of course I don't want to do anything that could be construed as "making things worse" or "inciting" or anything like that. But as I said, this person has all the time and energy in the world to make our lives hell, so something that wastes that time and energy, a la the fake online journal sending them to track us down at locations where we definitely aren't, is a good thing.

The link you provided says I'm on an Apache server, which I knew; I just didn't understand the question.

I'll investigate the redirection posts you refer to, and see if I can make any sense of them.

Thanks again on that.

Another related question that may help...

A couple of months ago our stalker filed papers attempting to overturn the restraining order (again), claiming that the police reports filed regarding the harassing emails sent to us were "fabricated", that the access logs showing the harassing messages on my messageboard originating from my stalker's IP address were also "fabricated", and that we were, in fact, the ones doing the "harassing" by filing "false" police reports.

Any suggestions, not including the very expensive process of subpoenaing records, on refuting those claims?

jdMorgan

4:09 am on Feb 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy,

If you are on Apache, and if you have access to mod_rewrite [httpd.apache.org] via .htaccess, then you can redirect based on the request parameters I listed earlier.

Check to see if you have a file called .htaccess in the root directory of your site. If so, you can simply add some code to it to implement the redirect. If not, you can create a .htaccess file, and use the test code below.

I'm not a lawyer, and can't advise on the validity of evidence. But you should save backup copies of your site's raw access logs, and consult your attorney on whether you need to contact your ISP about retaining their copy - since there is no way you could tinker with their copy, the ISP's copy is "untainted."

If you wanted to redirect a visitor based on a user-agent string, let's say "Anonomyzer III" just for example, then you could use the following code in .htaccess:


Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^Anonomyzer\ III$
RewriteRule ^journal\.html$ /bogusjournal.html [L]

This would redirect anyone using a user-agent called "Anonomyzer III" from "journal.html" to "bogusjournal.html" in a transparent manner. They would just see the bogusjournal page if they requested the journal page. Using a normal browser, they would see the real page. Therefore, you would also want to cover any known IP addresses as well, regardless of the user-agent used:

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^Anonomyzer\ III$ [OR]
RewriteCond %{REMOTE_ADDR} ^127\.0\.0\.0$ [OR]
RewriteCond %{REMOTE_ADDR} ^192\.168\.0\.0$
RewriteRule ^journal\.html$ /bogusjournal.html [L]

The key is to test your server to see if you are permitted to use the mod_rewrite facility. A simple test would be something like this:


Options +FollowSymLinks
RewriteEngine on
RewriteRule ^testindex\.html$ /index.html [L]

You can install this in your .htaccess file, and then test it: Request the (non-existent) page "testindex.html" from your site. If the above code is working, it should serve your normal index.html page in response. If it does not work, or if you get a server error, you will need to remove the code immediately, and contact your hosting provider to see if they can enable mod_rewrite for you. Do this testing when your site traffic is lowest, usually late at night. If you cannot use mod_rewrite, then your options are limited to blocking access only - you won't be able to redirect conditionally.

Jim

NotSoSavvy

4:31 am on Feb 23, 2003 (gmt 0)

10+ Year Member



Wow. Thank you so much for your time.

I'm not sure if I have the know-how to implement it, but I'm sure going to try.

I don't have a .htaccess file that I can see. However, my host says this:

The following modules are installed with our webserver's Apache:

Compiled-in modules:
http_core.c
mod_env.c
mod_define.c
mod_log_config.c
mod_mime.c
mod_negotiation.c
mod_include.c
mod_autoindex.c
mod_dir.c
mod_asis.c
mod_actions.c
mod_speling.c
mod_alias.c
mod_rewrite.c
mod_access.c
mod_auth.c
mod_auth_dbm.c
mod_expires.c
mod_unique_id.c
mod_setenvif.c
mod_ssl.c
mod_cgiwrap.c
mod_phpcgiwrap.c
mod_frontpage.c
mod_dosevasive.c
suexec: enabled; valid wrapper /powweb/apache/bin/suexec

I'm assuming this includes what you're talking about.

It also says this:
We currently have the AllowOverride directive for Apache set on all of our Web hosting servers. Any commands that typically work in a .htaccess file should function properly.

I'm off to work now. I'll try to tackle this following your advice in the morning. Thanks again.

NotSoSavvy

12:29 pm on Feb 23, 2003 (gmt 0)

10+ Year Member



Your "test" was:

>>>The key is to test your server to see if you are permitted to use the mod_rewrite facility. A simple test would be something like this:

Options +FollowSymLinks
RewriteEngine on
RewriteRule ^testindex\.html$ /index.html [L]

You can install this in your .htaccess file, and then test it: Request the (non-existent) page "testindex.html" from your site. If the above code is working, it should serve your normal index.html page in response. If it does not work, or if you get a server error, you will need to remove the code immediately, and contact your hosting provider to see if they can enable mod_rewrite for you. Do this testing when your site traffic is lowest, usually late at night. If you cannot use mod_rewrite, then your options are limited to blocking access only - you won't be able to redirect conditionally.<<<

I did this, I think correctly, and received a "500" error.

Am I correct in assuming I can't do what I'm trying to do?

NotSoSavvy

12:35 pm on Feb 23, 2003 (gmt 0)

10+ Year Member



WAIT!

IGNORE MY LAST POST!

The "test" worked and returned the index page, with everything appearing as it should; the only difference is that the URL ends with "/testindex.html". This is good enough, and I'm on to trying the whole thing now.

I'm all excited!

NotSoSavvy

2:57 am on Feb 24, 2003 (gmt 0)

10+ Year Member



Jim:

I must thank you again for spending the time to answer my questions.

Your information is greatly appreciated, and seems to be working exactly as you indicated.

Everything is running just as you said. Now my only questions are:

Can I put "wild cards" in the IP addresses? I know enough to be careful with this, but I don't understand the syntax of the slashes and everything, so I don't know exactly where/how to do that. After testing, I'm finding that some of the anonymizers used by my stalker have a number of IP addresses in the same little block.

Also, I don't understand what to put in to block specific "user agents" as you have so graciously provided. Is that the referring URL? Or what exactly would I put in there? Are wildcards possible there?

Thanks again for all the help.

jdMorgan

6:53 am on Feb 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy:

Can I put "wild cards" in the IP addresses? I know enough to be careful with this, but I don't understand the syntax of the slashes and everything, so I don't know exactly where/how to do that. After testing, I'm finding that some of the anonymizers used by my stalker have a number of IP addresses in the same little block.

Sure... In order to block or redirect 123.45.67.xxx, just leave off the digits corresponding to xxx:

RewriteCond %{REMOTE_ADDR} ^123\.45\.67\.

If you need to get down into sub-ranges, things get a bit more complicated, but here's an example you can dissect that specifies IP address range 123.45.78.27 - 123.45.78.54:

RewriteCond %{REMOTE_ADDR} ^123\.45\.78\.(2[7-9]|[34][0-9]|5[0-4])$

(See reference citation below.)
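To build such a range pattern yourself, split the range into sub-ranges that each vary in only one digit position; for 27-54 that gives 27-29, 30-49, and 50-54. Annotated (written with solid vertical pipes):

```apache
# 123.45.78.27 - 123.45.78.54, decomposed sub-range by sub-range:
#   2[7-9]    matches 27, 28, 29
#   [34][0-9] matches 30 - 49
#   5[0-4]    matches 50 - 54
RewriteCond %{REMOTE_ADDR} ^123\.45\.78\.(2[7-9]|[34][0-9]|5[0-4])$
```

The same decomposition technique works for any contiguous range of final octets.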

Also, I don't understand what to put in to block specific "user agents" as you have so graciously provided. Is that the referring URL? Or what exactly would I put in there? Are wildcards possible there?

This would be the user-agent from the user-agent field of your raw log file, typically something like one of these:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; AT&T CSM7.0) 
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.0.2) Gecko/20021120 Netscape/7.01

The above are all common browsers. The term "user-agent" refers to any client that can connect to your server - browsers, search engine robots, e-mail address harvesters, etc.

You could use the referring URL as well - it all depends on what your "guest's" log fingerprint looks like. I don't know which specific bits of information you are using to identify your guest, so I'm trying to cover all the bases here. Basically, you want your block to be specific enough to avoid catching legitimate visitors, but not so specific that it is easily bypassed, and not so complicated that it becomes inconvenient to modify, improve, understand, or read.

Have a look at the Apache mod_rewrite documentation cited in this Introduction to mod_rewrite [webmasterworld.com] post. It describes all of the request fields and variables that mod_rewrite can test and act upon.

HTH, Jim

NotSoSavvy

9:20 pm on Feb 24, 2003 (gmt 0)

10+ Year Member



Jim:

Please forgive my ignorance, but my knowledge is very limited. I've been trying to follow, but this is now one step beyond my actual capabilities.

To be more specific about redirecting the user by user agent, these are the log entries I'd like to redirect via the "user agent":

130.94.107.233 - - [09/Feb/2003:11:39:04 -0800] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.78 (TuringOS; Turing Machine; 0.0)"

209.234.157.51 - - [09/Feb/2003:11:39:05 -0800] "GET /frame1.html HTTP/1.1" 304 - "-" "Mozilla/4.78 (TuringOS; Turing Machine; 0.0)"

209.234.157.43 - - [09/Feb/2003:11:39:06 -0800] "GET /frame2NYE.html HTTP/1.1" 304 - "-" "Mozilla/4.78 (TuringOS; Turing Machine; 0.0)"

200.64.191.49 - - [09/Feb/2003:12:15:50 -0800] "/index.html HTTP/1.1" 200 407 "-" "http://@nonymouse.com/ (Unix)"

24.198.24.56 - - [16/Feb/2003:13:26:07 -0800] "GET /graphics/tree.jpg HTTP/1.1" 200 5750 "http://jproxy.uol.com.ar/jproxy/http://www.mysite.com" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"

Obviously, I don't want to block all users with MSIE 5.5 Windows 98, as the above example has in it. But I would like to redirect all that have the phrase "Turing", "nonymouse", and "jproxy" in there.

I feel like I'm being a little selfish with your advice, as you've helped so much already, but if you can, I'd greatly appreciate an example of how to redirect those examples.

Also, a quickie for you, I'm sure: how would I redirect all IP addresses that begin with 209.234? I know you answered, but my understanding is barely hanging in there and I'm having trouble making it work.

jdMorgan

10:20 pm on Feb 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy,

I feel like I'm being a little selfish with your advice, as you've helped so much already, but if you can, I'd greatly appreciate an example of how to redirect those examples.

No problem - I don't like bad people.

Obviously, I don't want to block all users with MSIE 5.5 Windows 98, as the above example has in it. But I would like to redirect all that have the phrase "Turing", "nonymouse", and "jproxy" in there.

Blocking/redirecting by referrer instead of by user-agent is the solution for this problem.

I can't advise blocking 209.234.xxx.xxx, since that would block 65,534 possible visitors. The code I provide below will block up to 254 users at 209.234.157.xxx. If you really do need to cast a wider net, change the fifth RewriteCond below (the 209.234.157 line) to:


RewriteCond %{REMOTE_ADDR} ^209\.234\. [OR]

The following code will redirect user-agents containing "Turing Machine" or starting with "http://@nonymouse.com". Any visitor referred from "http://jproxy.uol.com" will also be redirected, as will anyone trying to access your site from machines connected (or proxied) through 130.94.107.233, 209.234.157.xxx, 200.64.191.49, or 24.198.24.56.

Note that there must not be an [OR] on the last RewriteCond. Also, the use of "^" and "$" is intended precisely as shown. These are start and end anchors and specify where the regular-expressions pattern-matching is to start and/or end. Any pattern with both a start and an end anchor will only match a request containing exactly that pattern - nothing more, nothing less.


Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} Turing\ Machine [OR]
RewriteCond %{HTTP_USER_AGENT} ^http://@nonymouse\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://jproxy\.uol\.com [OR]
RewriteCond %{REMOTE_ADDR} ^130\.94\.107\.233$ [OR]
RewriteCond %{REMOTE_ADDR} ^209\.234\.157\. [OR]
RewriteCond %{REMOTE_ADDR} ^200\.64\.191\.49$ [OR]
RewriteCond %{REMOTE_ADDR} ^24\.198\.24\.56$
# redirect journal.html file requested by lusers above
RewriteRule ^journal\.html$ /bogusjournal.html [L]

- Or use this rule instead -

# deny all requests from lusers above, return 403-Forbidden server response
RewriteRule .* - [F]

That should take care of your examples nicely.

Again, I encourage you to reference the Apache mod_rewrite documentation [httpd.apache.org], and use it to "decode" the above examples piece-by-piece. That way, you will be better prepared to immediately respond if a new gambit is attempted.

HTH,
Jim

NotSoSavvy

4:50 am on Feb 25, 2003 (gmt 0)

10+ Year Member



Thanks so much again. Yes, I keep reading and reading and reading trying to grasp the commands and syntax involved but it's just a bit beyond me.

Your example, once again, worked like a charm. However, when I try to add IPs to redirect it seems to randomly decide whether it will recognize them or not.

Here's what I have now:

Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} Turing\ Machine [OR]
RewriteCond %{HTTP_USER_AGENT} ^http://@nonymouse\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://jproxy\.uol\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://www.space.net.au/~thomas/quickbrowse.html [OR]
RewriteCond %{REMOTE_ADDR} ^130\.94\.107\.233$ [OR]
RewriteCond %{REMOTE_ADDR} ^209\.234\.157\. [OR]
RewriteCond %{REMOTE_ADDR} ^216\.127\.82\. [OR]
RewriteCond %{REMOTE_ADDR} ^216\.140\.249\. [OR]
RewriteCond %{REMOTE_ADDR} ^65\.161\.65\. [OR]
RewriteCond %{REMOTE_ADDR} ^200\.64\.191\.49$ [OR]
RewriteCond %{REMOTE_ADDR} ^200\.65\.25\.191$ [OR]
RewriteCond %{REMOTE_ADDR} ^65\.19\.131\.218$ [OR]
RewriteCond %{REMOTE_ADDR} ^64\.69\.79\.212$ [OR]
RewriteCond %{REMOTE_ADDR} ^64\.246\.11\.102$ [OR]
RewriteCond %{REMOTE_ADDR} ^24\.198\.24\.56$
RewriteRule ^index\.html$ /index2.html [L]

It seems to work, except, as I said, sometimes it just randomly doesn't. For example, 64.69.79.212 (which comes from snoopblocker.com) looks to me like it should clearly be redirected, but when I test it I find that it is not redirected at all. The access logs confirm that that IP address is the one not being redirected. Any ideas? I am using Fetch and uploading the file in text mode, so I don't think that's it.

jdMorgan

5:26 am on Feb 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy,

Well, I stared at it for ten minutes, and I don't see anything, so I'm stumped.

There is nothing seriously wrong with your code - the only technicalities I see are the unescaped dots in the space.net referrer pattern, and an extra space at the very end of the RewriteRule line.

From what you have said, I can't find any reason why it should work intermittently. The only way you'll be able to tell with the internal redirect you're using is if the file size of index.html and index2.html are different. There is no reason a RewriteCond would be intermittent, so make sure you are testing remote_addr or http_referer appropriately.

Remember, if you can't block/redirect by remote_addr IP, you can block/redirect by http_referer or http_user_agent - or some or all of these.

Jim

NotSoSavvy

6:15 am on Feb 25, 2003 (gmt 0)

10+ Year Member



Well thanks for the consideration anyway.

>>There is nothing seriously wrong with your code - the only technicalities I see are the unescaped dots in the space.net referrer pattern,<<

I'm laughing at just how out of my league I am here. Unescaped dots? Is it something I should change? Could it affect anything?

>> and an extra space at the very end of the RewriteRule line. <<

Do spaces screw things up? I'm not sure whether there's supposed to be a space at the end of each line.

Also, I should be clearer about it working "intermittently" or "randomly". As far as I can tell from testing, it consistently redirects all requests that have "Turing Machine" or "@nonymouse", but I'm not sure about the IP addresses being redirected, and I DO find that the IP I specifically referred to (64.69.79.212) seems to consistently NOT be redirected.

Anyway, if it's beyond you, I can't really ask for more help. You've been more than patient, and I can't tell you how much it's been appreciated.

NotSoSavvy

6:39 am on Feb 25, 2003 (gmt 0)

10+ Year Member



You know, I just may have figured something out.

After some more testing, I'm realizing that perhaps SOME of the time snoopblocker just serves up a cached copy of the page. I tested this by changing some content on the site and trying to access it through snoopblocker, only to find the OLD pages coming through.

Still doesn't explain the times when the IP was logged though. That means it was definitely getting through, and getting through to the NON-redirected pages.

jdMorgan

4:18 pm on Feb 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy,

That means it was definitely getting through, and getting through to the NON-redirected pages.

How can you tell? We are using a transparent redirect here, so you won't see the "index2.html" page listed in your access log. As I stated above, the only way to tell is if the byte counts for index.html and index2.html are different (because the pages are different) and if the byte count shown in the access log for an "index.html" request matches the byte count for index2.html, not that for index.html. Mod_rewrite is simply replacing the requested index.html page with the index2.html file, and you won't see a request for index2.html in your logs because the browser never requested that file, mod_rewrite simply substituted it.
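If you want the substitution to be visible in your access log while testing (my own suggestion, not something the setup above requires), you can temporarily make the rewrite an external redirect; the visitor's browser is then sent back to request index2.html itself, which does show up in the log:

```apache
# Temporary, for testing only: [R=302] turns the silent substitution
# into a visible external redirect, logged as a request for /index2.html.
RewriteRule ^index\.html$ /index2.html [R=302,L]
```

Remove the R=302 once testing is done - an external redirect exposes the /index2.html URL in the visitor's address bar, which defeats the purpose here.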

Unescaped dots? Is it something I should change? Could it affect anything?

Mod_rewrite uses regular-expressions pattern-matching. The text on the right-hand side of a RewriteCond and on the left-hand side of a RewriteRule is a "regular expressions pattern". In regex patterns, some characters have special meanings; For example, "." means, "any single character." and ".*" means, "any number of any characters." In order to tell mod_rewrite that you want it to look for and match a literal character - such as the period used in domain names or IP numbers - any "special character" present in the pattern must be preceded by a backslash, thus: "\."

The line in question is this one:


RewriteCond %{HTTP_REFERER} ^http://www.space.net.au/~thomas/quickbrowse.html [OR]

which should be:

RewriteCond %{HTTP_REFERER} ^http://www\.space\.net\.au/~thomas/quickbrowse\.html [OR]

and actually, I'd recommend this, so as not to be overly-specific:

RewriteCond %{HTTP_REFERER} ^http://www\.space\.net\.au/~thomas/ [OR]

This "escaping" requirement applies to the regex special characters: . [ ] ( ) { } * + ? ^ $ | \ and <space>

You have taken on an "advanced" project here, and I really can't emphasize strongly enough the usefulness of reading this Introduction to mod_rewrite [webmasterworld.com] thread and following the linked citations for the regular-expressions (regex) tutorial and the Apache mod_rewrite documentation. Read these documents thoroughly and print them out; I refer to my printed copies on a daily basis, despite having worked with this stuff for a long time. I've had to re-print them, too - I have literally worn out several copies. :)

I intend the above paragraph in the friendliest of ways, and will continue to assist wherever possible. But mod_rewrite is a powerful and dangerous tool, and uses the powerful regular-expressions notation. Both require extraordinary attention to detail and the frequent use of reference material to avoid surprises.

Extra spaces are not needed, especially at the end of lines, and only serve to slow things down (very slightly) and bloat your .htaccess file.

If you are really seeing requests by a particular IP address getting through, and those IP addresses are present in a RewriteCond %{REMOTE_ADDR} statement, I can't explain that. If, however, the IP address is appearing in the referrer field of your log entry, then you must use RewriteCond %{HTTP_REFERER}, those fields being separate and distinct. Example:


24.198.24.56 - - [16/Feb/2003:13:26:07 -0800] "GET /graphics/tree.jpg HTTP/1.1" 200 5750 "http://jproxy.uol.com.ar/jproxy/http://www.mysite.com" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)"

breaks down into the following variables, testable by mod_rewrite:

{REMOTE_ADDR} = 24.198.24.56
{THE_REQUEST} = GET /graphics/tree.jpg HTTP/1.1
{REQUEST_METHOD} = GET
{REQUEST_URI} = /graphics/tree.jpg
{HTTP_REFERER} = http://jproxy.uol.com.ar/jproxy/http://www.mysite.com
{HTTP_USER_AGENT} = Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)

There are others, such as {REMOTE_HOST}, {REMOTE_USER}, {REMOTE_IDENT}, and {QUERY_STRING} which are either blank or not present in this example request, but which also can be tested by mod_rewrite in .htaccess.
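For example, %{REMOTE_HOST} can catch a proxy by its reverse-DNS hostname rather than its IP address. Note two assumptions in this sketch: it only works if the server performs hostname lookups (HostnameLookups enabled), and "example-proxy.net" is a made-up name, not one from the logs above:

```apache
# Block any visitor whose reverse-DNS hostname ends in example-proxy.net
# (hypothetical name; requires hostname lookups to be enabled).
RewriteCond %{REMOTE_HOST} \.example-proxy\.net$ [NC]
RewriteRule .* - [F]
```

This can be handy when a proxy service rotates through many IP addresses under one domain.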

The caching issue is likely caused by your browser. If using IE, change your Temporary Internet Files General Settings to "check every time" until you finish testing. You can also use mod_expires and mod_headers to better control caches by "tagging" the files returned by your server to expire after a given time and force caches to reload, but that's another subject... :)
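A minimal sketch of the mod_expires side of that (mod_expires does appear in your host's module list above; the exact policy here is just an illustration, not a recommendation from this thread):

```apache
# Mark HTML pages as immediately stale so browsers and proxy caches
# revalidate them on every request.
ExpiresActive On
ExpiresByType text/html "access plus 0 seconds"
```

Other content types (images, etc.) can be given longer lifetimes with additional ExpiresByType lines.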

HTH,
Jim

NotSoSavvy

8:39 pm on Feb 25, 2003 (gmt 0)

10+ Year Member



Jim:

I cannot thank you enough. I am impressed with your generosity to a complete stranger. You have helped more than you know.

Just to clarify, I read through the info you referenced, as well as searching for more. I have learned a little, but as I said, it's still just beyond my grasp.

Obviously I'm not sure why, but sometimes the logs do show that "index2.html" has been nabbed, although looking over the logs I see that you are correct in that it usually doesn't show up. The reason I was seeing the hits is that I had the foresight to add a few 2x2 jpgs to "index2.html", just to make sure I could see when this person had been redirected, which of course I saw when testing.

Thank you, Thank you, Thank you, again.

jdMorgan

9:04 pm on Feb 25, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK,

So you see the index2.html as the referrer for the .jpg images - Good method. I'm still stumped as to how your IP detection could be bypassed, though... But I'm no expert on using open proxies to "hide."

I hope this works out for ya!

Jim

Shawn Collins

2:52 am on Feb 28, 2003 (gmt 0)

10+ Year Member



I've got a question coming from the other side of this issue. I run the site for a ticket broker, and a while back he said he was being blocked from accessing the site of a company where everybody buys their tickets.

At the time, I suggested he use an anonymizer, and it did the trick for a while.

But now, he says he cannot get to their site even with the anonymizer.

Any ideas how to enable somebody to access a site when they have apparently been blocked? I was going to suggest he try a spyware remover, but I figured it might be more complicated than that.

Any thoughts?

jdMorgan

3:12 am on Feb 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Shawn,

Well, first go ahead and run the spyware/scumware removers - It's doubtful that spyware is the problem, but get his machine 'clean' first.

Then, if he still can't access the site, I'd suggest he contact them and find out why he's been blocked - Politely.

One of two possibilities: First, he was inadvertently blocked as part of a larger IP address group that was causing them problems, or second, he was blocked specifically, in which case he needs to find out why and what he needs to do to avoid annoying them and getting blocked again.

It's doubtful he can get back in from his current IP address unless they take action to allow it. And if he repeats whatever behaviour it was that got him in trouble initially from a new IP address, they'll block that one, too.

If my sites are abused in certain ways, such blocks are invoked automatically - I only have to "count the bodies" once a month. So, it may have been an honest mistake on either party's part, and should not be taken personally.

Jim

NotSoSavvy

9:32 pm on Mar 2, 2003 (gmt 0)

10+ Year Member



Jim:

I have another question if you don't mind. I think it's a simple one.

As it is, I have the .htaccess set up following your advice. The last line looks like this:

RewriteRule ^index\.html$ /fakeindex.html [L]

How would I add more RewriteRules that follow the same conditions?

In other words, if the conditions set up are met (based on IP address or UserAgent), then not only would "index.html" be switched to "fakeindex.html", but ALSO if that IP address or UserAgent tried to go to "journal.html" it would serve them "fakejournal.html" and if they tried for "calendar.html" they would be served "fakecalendar.html".

Could I simply add Rewrite rules like this:

RewriteRule ^index\.html$ /fakeindex.html [L]
RewriteRule ^journal\.html$ /fakejournal.html [L]
RewriteRule ^calendar\.html$ /fakecalendar.html [L]

I ask because although setting up the htaccess following your advice seems to be working most of the time, this person still occasionally gets through, and it appears that it may be because they have index.html cached in their browser (or some anonymizers have it cached), and then they can click a link to, say, "journal.html" and still get there. Or they may just have bookmarked the "journal.html" page and be skipping the index entirely.

Thanks again for any input.

jdMorgan

10:12 pm on Mar 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy,

No, you can't really do it this way, because the preceding RewriteConds only apply to the first RewriteRule:


<existing RewriteCond list from above>
RewriteRule ^index\.html$ /fakeindex.html [L]
RewriteRule ^journal\.html$ /fakejournal.html [L]
RewriteRule ^calendar\.html$ /fakecalendar.html [L]

In this case, only the first RewriteRule would be qualified by the RewriteConds above it. The next two would be applied unconditionally.
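As a sketch of why this matters (abbreviated to a single sample address from the list above), applying the same conditions to several rules means repeating the full RewriteCond block before each RewriteRule, since conditions govern only the rule that immediately follows:

```apache
RewriteEngine on

# Each rule needs its own copy of the full condition list -
# a RewriteCond block qualifies only the very next RewriteRule.
RewriteCond %{REMOTE_ADDR} ^24\.198\.24\.56$
RewriteRule ^index\.html$ /fakeindex.html [L]

RewriteCond %{REMOTE_ADDR} ^24\.198\.24\.56$
RewriteRule ^journal\.html$ /fakejournal.html [L]
```

With a long condition list this duplication gets unmanageable fast, which is what motivates collapsing the page names into a single rule instead.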

For a limited number of page names, you can use this construct:


Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} Turing\ Machine [OR]
RewriteCond %{HTTP_USER_AGENT} ^http://@anonymouse\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://jproxy\.uol\.com [OR]
RewriteCond %{HTTP_REFERER} ^http://www\.space\.net\.au/~thomas [OR]
RewriteCond %{REMOTE_ADDR} ^130\.94\.107\.233$ [OR]
RewriteCond %{REMOTE_ADDR} ^209\.234\.157\. [OR]
RewriteCond %{REMOTE_ADDR} ^216\.127\.82\. [OR]
RewriteCond %{REMOTE_ADDR} ^216\.140\.249\. [OR]
RewriteCond %{REMOTE_ADDR} ^65\.161\.65\. [OR]
RewriteCond %{REMOTE_ADDR} ^200\.64\.191\.49$ [OR]
RewriteCond %{REMOTE_ADDR} ^200\.65\.25\.191$ [OR]
RewriteCond %{REMOTE_ADDR} ^65\.19\.131\.218$ [OR]
RewriteCond %{REMOTE_ADDR} ^64\.69\.79\.212$ [OR]
RewriteCond %{REMOTE_ADDR} ^64\.246\.11\.102$ [OR]
RewriteCond %{REMOTE_ADDR} ^24\.198\.24\.56$
RewriteRule ^(index|journal|calendar)\.html$ /fake$1.html [L]

Jim

NotSoSavvy

11:02 pm on Mar 2, 2003 (gmt 0)

10+ Year Member



Do you know what the "limited number" of pages is?

And, (once again, sorry for my ignorance), but in the example you give:

RewriteRule ^(index|journal|calendar)\.html$ /fake$1.html [L]

what is "fake$.html"?

Does that mean that, if a condition is met, that "index.html" will be served as "fakeindex.html", and "journal.html" will be served as "fakejournal.html"?

OR, does it mean that a request for any of the specified pages will all go to the same page defined as "fake$1.html".

jdMorgan

11:22 pm on Mar 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



NotSoSavvy,

Do you know what the "limited number" of pages is?

Yeah, it's as many as you can stand before the code gets messy, runs off the right side of your screen, etc. :)
Or, it's the number of alternate pages that you wish to create.

At some point, it becomes a game of diminishing returns, and using the RewriteRule


RewriteRule .* - [F]

with the same list of RewriteConds becomes attractive. In other words, just ban them instead of redirecting.
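A sketch of that ban variant, keeping the same condition list (abbreviated here to one sample address from the list above):

```apache
RewriteEngine on

# Same RewriteCond list as before (one sample condition shown)
RewriteCond %{REMOTE_ADDR} ^24\.198\.24\.56$
# Deny every URL on the site with a 403 Forbidden response;
# "-" means no substitution is performed.
RewriteRule .* - [F]
```

No fake pages to create or maintain; the visitor just gets the server's standard 403 error page.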

Does that mean that, if a condition is met, that "index.html" will be served as "fakeindex.html", and "journal.html" will be served as "fakejournal.html"?

Yes, exactly. The $1 represents whatever was matched by the first pair of parentheses in the pattern, and is filled in by mod_rewrite. This is called a back-reference, as discussed in the Apache mod_rewrite documentation. Verrrrry useful. :)
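To illustrate the back-reference with a couple of hypothetical requests (this is the same rule as above, written with solid pipes):

```apache
RewriteRule ^(index|journal|calendar)\.html$ /fake$1.html [L]
# Request for /journal.html  -> parentheses capture "journal" into $1
#                            -> served as /fakejournal.html
# Request for /calendar.html -> $1 = "calendar" -> /fakecalendar.html
```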

Jim