ciml

msg:1527727 | 3:42 pm on Sep 7, 2002 (gmt 0) |
dave, that's not a spider; it's highly likely to be a person. Google's WAP proxy is a bridge that lets individual people access Web pages via the WAP protocol used in mobile telephones.
|
Key_Master

msg:1527728 | 3:44 pm on Sep 7, 2002 (gmt 0) |
Congratulations! You just banned a cell phone visitor. [google.com...]
|
bill

msg:1527729 | 3:58 pm on Sep 7, 2002 (gmt 0) |
We had a related thread [webmasterworld.com] a few days ago.
|
carfac

msg:1527730 | 4:17 pm on Sep 7, 2002 (gmt 0) |
Key_Master: But isn't it weird, it only requested two documents- my main page and the spider trap (which is a hidden URL)? You would have to look at the code of the page to even know that link existed, the file name is pretty obscure... Should I un-ban the IP? dave
|
Key_Master

msg:1527731 | 4:32 pm on Sep 7, 2002 (gmt 0) |
carfac, Yes, you should un-ban the IP and modify your trap to exclude these types of visits. The best way to test your script is to browse your own site through the Google WAP proxy. I'd bet that cell phone visitor was shown your spider trap link and clicked on it purely out of curiosity.
|
carfac

msg:1527732 | 4:42 pm on Sep 7, 2002 (gmt 0) |
Key_Master: Very good... I will edit and edit! Thanks for the advice! Dave
|
carfac

msg:1527733 | 4:52 pm on Sep 7, 2002 (gmt 0) |
Hi: Probably not the place for Perl questions, so sorry if this is inappropriate... I wrote this:
$visitor_ua = $ENV{'HTTP_USER_AGENT'};
if ($visitor_ua =~ /WAP/) {
    print "Content-type: text/html\n\n";
    print "<html>\n";
    print "<head>\n";
    print "<title>Forward On</title>\n";
    print "</head>\n";
    print "<body>\n";
    # Inner double quotes must be escaped inside a double-quoted string
    print "<p><b>Please <A HREF=\"http://www.mydomain.com/\">Click Here</A> to continue!</b></p>\n";
    print "</body>\n";
    print "</html>\n";
    exit;
} else {
    CODE
}
and inserted it above the logging part of the trap... look good? dave [edited by: carfac at 5:56 pm (utc) on Sep. 7, 2002]
|
Key_Master

msg:1527734 | 5:28 pm on Sep 7, 2002 (gmt 0) |
I wouldn't do it by user agent. Too easy to spoof. Use IP addresses instead. Here is a list of Google WAP proxy and translator IP addresses:
216.239.33.5
216.239.35.4
216.239.37.5
216.239.39.5 |
|
|
carfac

msg:1527735 | 5:41 pm on Sep 7, 2002 (gmt 0) |
Key_Master: Perfect- thanks! I am a bit shaky on regular expressions; can you tell me if this is correct to match all those numbers:
if ($visitor_ua =~ '(216.239.33.5¦216.239.35.4¦216.239.37.5¦216.239.39.5)' {
    code blah blah
Sorry- I get mixed up sometimes about whether to use single or double quotes, or the ^ anchor... Thank you! Dave
|
Key_Master

msg:1527736 | 5:51 pm on Sep 7, 2002 (gmt 0) |
This should do it:
$visitor_ip = $ENV{'REMOTE_ADDR'};
if ($visitor_ip =~ /^216\.239\.3[379]\.5$¦^216\.239\.35\.4$/) { |
| Remember to change the pipe (¦) to the proper character.
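For anyone adapting this, here is a minimal, self-contained sketch of the IP check, assuming the four addresses posted above are the only ones to exempt (they are simply the ones listed in this thread and may not be current). The helper names `is_wap_proxy` and `is_wap_proxy_re` and the placeholder page are hypothetical, not part of anyone's actual trap script. The first helper swaps the regex for an exact hash lookup, which sidesteps the quoting and anchoring questions entirely; the second shows the regex form with a real pipe, escaped dots, and anchors.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Google WAP proxy / translator addresses as listed in this thread.
# Assumption: this list is complete for your purposes.
my %WAP_PROXY_IP = map { $_ => 1 } qw(
    216.239.33.5
    216.239.35.4
    216.239.37.5
    216.239.39.5
);

# Hypothetical helper: exact string match against the list above.
sub is_wap_proxy {
    my ($ip) = @_;
    return 0 unless defined $ip;
    return exists $WAP_PROXY_IP{$ip} ? 1 : 0;
}

# Hypothetical helper: the same test as a regex. The dots are escaped so
# they match only a literal ".", and \A ... \z anchor the whole string so
# that an address such as 216.239.33.50 does not slip through.
sub is_wap_proxy_re {
    my ($ip) = @_;
    return 0 unless defined $ip;
    return $ip =~ /\A216\.239\.(?:3[379]\.5|35\.4)\z/ ? 1 : 0;
}

# In a trap script this check would sit above the logging/banning step.
my $visitor_ip = $ENV{'REMOTE_ADDR'};
if (is_wap_proxy($visitor_ip)) {
    print "Content-type: text/html\n\n";
    print "<html><body><p>Please continue to the main page.</p></body></html>\n";
    exit;
}
# ... logging and banning code would continue here ...
```

The hash lookup is easier to extend when another proxy address turns up: add one line to the list rather than reworking the pattern.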
|
carfac

msg:1527737 | 5:57 pm on Sep 7, 2002 (gmt 0) |
Key_Master: Thank you! I am a mere Gate_Keeper... :) Dave
|
carfac

msg:1527738 | 6:35 pm on Sep 7, 2002 (gmt 0) |
Couple of minor edits, and it works.... Tested on the WAP emulator, and got the spider, but it did not log the IP... Should I post the final, fixed code? dave
|
mbauser2

msg:1527739 | 8:10 pm on Sep 7, 2002 (gmt 0) |
| But isn't it weird, it only requested two documents- my main page and the spider trap (which is a hidden URL)? |
| Not that weird. The HTML-to-WML proxy normally strips out graphics and inserts placeholders. It's probably "un-hiding" your hidden link.
|
|