Forum Moderators: open

webcollage variation

collage.cgi/1.82

misosoph

5:30 pm on May 12, 2002 (gmt 0)

10+ Year Member



64.124.236.141 - - [12/May/2002:05:11:40 -0400] "GET /verbsap/verbsap.html HTTP/1.0" 200 59097 "http:/ /www.altavista.com/sites/search/web? pg=q&kl=XX&search=Search&q=assonance%20determinacy%20recollections %20unmask%20disadvantageous &pgno=2&stq=10" "collage.cgi/1.82"

This seems to work exactly like webcollage. The above search string is: assonance OR determinacy OR recollections OR unmask OR disadvantageous

(The IP belongs to Abovenet Communications, Inc., so as with webcollage this could be any user. And as with webcollage the request came from AltaVista.)
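For anyone who wants to pull the random search words out of a referer like the one above, here is a minimal Python sketch. It assumes the spaces shown inside the logged URL are only display breaks added by the forum, so the referer is retyped without them. The function name `search_terms` and the `param` argument are made up for illustration. AltaVista carries the words in `q`, and the same function would take `param="query"` for a referer that uses that name instead.

```python
from urllib.parse import urlsplit, parse_qs

# Referer from the log line above, retyped without the display spaces
# (assumption: those spaces were not part of the real request).
referer = (
    "http://www.altavista.com/sites/search/web?"
    "pg=q&kl=XX&search=Search"
    "&q=assonance%20determinacy%20recollections%20unmask%20disadvantageous"
    "&pgno=2&stq=10"
)

def search_terms(referer_url, param="q"):
    """Return the list of search words carried in the given query parameter."""
    query = urlsplit(referer_url).query
    values = parse_qs(query).get(param, [])
    # parse_qs already decodes %20 into spaces, so a plain split is enough.
    return values[0].split() if values else []

terms = search_terms(referer)
print(" OR ".join(terms))
```

A referer with no matching parameter simply yields an empty list, so the function can be run over a whole log without special cases.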

[edited by: Brett_Tabke at 7:53 am (utc) on June 4, 2002]

Marius

1:52 am on May 16, 2002 (gmt 0)



I've also run into webcollage. In recent weeks I found requests like this:

boskoop.iwr.uni-heidelberg.de - - [14/May/2002:13:25:51 -0500] "GET /NABA/WXLX.html&rpos=31 HTTP/1.0" 404 294 "http://www.altavista.com/cgi-bin/query? ipht=1&igrph=1&iclr=1 &ibw=1&micat=1&imgset=1&stype=simage &mmW=1&q=governance%20 OR%20leroy%20OR %20airliner%20OR%20author%20OR%20fronts&pgno=3&stq=24" "webcollage/1.90"

The part that is most annoying is the '&rpos=31' that is appended to the URL: it results in a 404.

I did a little research and found that the problem is not due to AltaVista. The webcollage agent is using AltaVista as an image database.

Webcollage is a module for xscreensaver (X Window System): see [jwz.org]

[edited by: Brett_Tabke at 7:53 am (utc) on June 4, 2002]

misosoph

3:28 am on May 16, 2002 (gmt 0)

10+ Year Member



Thank you. I only knew that this was a random-word page searcher, but I did not know that it was only interested in images to add to a screensaver.

I have never seen &rpos=31 or anything similar added to the requested URL (i.e. it's always just /files/filename.html). -- However, I have not been visited by version 1.90 yet; maybe there is a problem with that version? (Wishful thinking.)

What angers me about Webcollage is that it treats the Internet as a big trough that people ought to feed from like pigs: mindlessly take whatever you want, no matter that someone's hard work went into what you're taking and that someone has to pay to make it available to you.

I had about 1 MB of pages -- HTML pages of long text that contain only a 134-byte mascot image -- devoured every day by Webcollage requests from all over the globe, until this forum's moderator littleman told me how to deny them by adding this to my .htaccess file:

SetEnvIf User-Agent ^webcollage keep_out
order allow,deny
allow from all
deny from env=keep_out

To this I have now added the line:

SetEnvIf User-Agent ^collage.cgi keep_out

... which I hope is correct, but I don't know because collage.cgi hasn't come back. Thanks again for the new information.
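One way to check the patterns without waiting for the agent to come back is to try them against the User-Agent strings already seen in the logs. The following is only a rough Python sketch: it assumes Python's `re.search` behaves like Apache's regex matching for patterns this simple, and the names `PATTERNS` and `is_blocked` are made up for illustration. It suggests that `^collage.cgi` does match `collage.cgi/1.82`. The unescaped dot matches any character, so `^collage\.cgi` would be the stricter form, though either should catch this agent.

```python
import re

# The two patterns from the SetEnvIf lines above, copied as written.
PATTERNS = [r"^webcollage", r"^collage.cgi"]

def is_blocked(user_agent, patterns=PATTERNS):
    """True if any pattern matches; anchoring comes only from the '^' in each pattern."""
    return any(re.search(p, user_agent) for p in patterns)

# User-Agent strings taken from the log lines in this thread,
# plus one ordinary browser string added for contrast (hypothetical).
samples = [
    "collage.cgi/1.82",
    "webcollage/1.90",
    "webcollage/1.87",
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)",
]

for ua in samples:
    print(("deny " if is_blocked(ua) else "allow"), ua)

# The unescaped dot also matches other characters; the escaped form does not.
loose = bool(re.search(r"^collage.cgi", "collageXcgi/1.0"))
strict = bool(re.search(r"^collage\.cgi", "collageXcgi/1.0"))
```

Note that SetEnvIf matches case-sensitively, so an agent calling itself `WebCollage` would slip past these patterns. SetEnvIfNoCase is the directive to use if that ever matters.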

jdMorgan

2:09 pm on May 17, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here's one from my logs that used Lycos to "search":

137.112.0.58 - - [17/May/2002:06:48:56 -0400] "GET / HTTP/1.0" 200 59533 "http:/ /lycospro.lycos.com/srchpro/? lpv=1&t=any&query=fractions%20 mantel%20defensive%20nonmilitary%20 seclusion&start=1" "webcollage/1.87"

So, it's no longer an "AltaVista-referer-only" thing.

Jim

[edited by: Brett_Tabke at 7:53 am (utc) on June 4, 2002]