How to "deny from" WebCollage?

         

misosoph

5:01 am on Apr 4, 2002 (gmt 0)

10+ Year Member



I would like to deny access to the WebCollage agent by using lines in my .htaccess file that simply say

deny from env=webcollage/1.82
deny from env=webcollage/1.87

But these lines do not work. Is there an uncomplicated way to do what I would like to do? I have tried to study the "Apache HTTP Server" texts, but I don't understand them.

littleman

5:15 am on Apr 4, 2002 (gmt 0)



You sure you want to do that? WebCollage is tied to Yahoo's link of the moment. I have seen evidence that a site could be delisted from Yahoo if WebCollage gets an error, though that was some time ago.

misosoph

5:54 am on Apr 4, 2002 (gmt 0)

10+ Year Member



Yes, thanks, but I think I do.

I have never gotten a WebCollage request via Yahoo! Everything comes from AltaVista and looks like this:

129.162.1.32 - - [03/Apr/2002:19:53:59 -0500] "GET /logwitt/logwit15.html HTTP/1.1" 200 53578 "http://www.altavista.com/sites/search/web?pg=q&kl=XX&search=Search&q=indeterminacy%20OR%20eros%20OR%20incredulous%20OR%20atoms%20OR%20phototypesetters&pgno=3&stq=20" "webcollage/1.87"

The files requested are all text with a single 134-byte image. This happens several times every day, always the same type of mindless request: "indeterminacy + eros + incredulous + atoms + phototypesetter".

This is not what I go to the trouble of maintaining a Web site for, and I want to stop it. Here is another example:

216.23.11.254 - - [26/Mar/2002:13:26:16 -0500] "GET /valente/outline.html HTTP/1.0" 200 78022 "http://www.altavista.com/sites/search/web?pg=q&kl=XX&search=Search&q=sorts%20orthopedic%20dictatorship%20unconstitutional%20elephants&pgno=3&stq=20" "webcollage/1.82"

Often the IPs belong to colleges, but they are almost always different IPs.

[edited by: Brett_Tabke at 7:52 am (utc) on June 4, 2002]

littleman

6:08 am on Apr 4, 2002 (gmt 0)



Okay, then put this in your .htaccess file:
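# Flag any request whose User-Agent starts with "webcollage"
SetEnvIf User-Agent ^webcollage keep_out
# Let everyone in, then turn away the flagged requests
order allow,deny
allow from all
deny from env=keep_out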
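
A note on why the earlier attempt did nothing: "deny from env=" tests for an environment variable by name, not for the User-Agent text itself, so "deny from env=webcollage/1.87" never matches until SetEnvIf has set a variable. For anyone on Apache 2.4 or later, where order/allow/deny survive only through mod_access_compat, the same idea can be sketched with Require. This is a minimal sketch, not something tested against this site. The variable name keep_out is the one from the lines above. Switching to SetEnvIfNoCase is my own assumption, since the logs so far only show lowercase "webcollage". As far as I know it needs mod_setenvif and mod_authz_core, and the host has to allow FileInfo and AuthConfig overrides in .htaccess.

```apache
# Flag any request whose User-Agent starts with "webcollage", in any letter case
SetEnvIfNoCase User-Agent ^webcollage keep_out

# Grant access to everyone except requests carrying the keep_out flag
<RequireAll>
    Require all granted
    Require not env keep_out
</RequireAll>
```

Either version should answer flagged requests with a 403 and leave all other visitors alone.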

misosoph

6:46 am on Apr 4, 2002 (gmt 0)

10+ Year Member



Thank you for sharing your knowledge with me. I have added your lines to my .htaccess file, and tomorrow I hope to see the happy number 403. I lose my sense of humor when participation is compulsory.

Next day:

66.68.66.225 - - [04/Apr/2002:04:40:33 -0500] "GET /logwitt/logwitt9.html HTTP/1.0" 403 219 "http://www.altavista.com/sites/search/web?pg=q&kl=XX&search=Search&q=ingenious%20OR%20monoprogramming%20OR%20untested%20OR%20guessing%20OR%20constellation&pgno=2&stq=10" "webcollage/1.87"

The size of the file denied is 85,019 bytes.

[edited by: Brett_Tabke at 7:51 am (utc) on June 4, 2002]