Wget

Why 2 Wgets in a row?

         

dolphin

11:18 pm on Jul 6, 2001 (gmt 0)




From my log:
2001-07-06 13:51:11 63.166.100.25 - xxx.xxx.xx.xx GET /page1.asp - 200 32622 81 16 HTTP/1.0 www.domain.com Wget/1.6 - -
2001-07-06 13:54:24 209.114.200.145 - xxx.xxx.xx.xx GET /page2.asp - 200 14903 96 15 HTTP/1.0 www.domain.com Wget/1.6 - -

Someone at 63.166.100.25 has been hitting my site with Wget quite frequently. The Wget entries always come in pairs: the first is always from 63.166.100.25, and the one right below it is always from a different, seemingly dynamic IP requesting a different page.
Can someone explain what's going on here? After a hit from 63.166.100.25, why does Wget also come to my site from another IP? Also, does anyone know who 63.166.100.25 actually is? I know it isn't a search engine crawler.
How can I ban the address on an NT server?
Thanks a lot for your replies.

209.114.200.14 I traced to evilkiwi.net
63.166.100.25 I traced to seanmcpherson.com
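
To see which addresses the Wget hits come from, the log can be grouped by client IP. Below is a minimal Python sketch that uses the two log lines quoted above. The column positions (client IP in the third field, requested page in the seventh, user agent in the fifteenth) are an assumption read off those two lines, and the names `LOG_LINES` and `wget_hits_by_ip` are made up for illustration.

```python
from collections import defaultdict

# The two entries quoted in the post, copied as-is.
LOG_LINES = [
    "2001-07-06 13:51:11 63.166.100.25 - xxx.xxx.xx.xx GET /page1.asp - 200 32622 81 16 HTTP/1.0 www.domain.com Wget/1.6 - -",
    "2001-07-06 13:54:24 209.114.200.145 - xxx.xxx.xx.xx GET /page2.asp - 200 14903 96 15 HTTP/1.0 www.domain.com Wget/1.6 - -",
]

# Assumed zero-based column positions, inferred from the lines above.
CLIENT_IP, URI, USER_AGENT = 2, 6, 14


def wget_hits_by_ip(lines):
    """Group requests whose user agent starts with 'Wget' by client IP."""
    hits = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) <= USER_AGENT:
            continue  # skip short or malformed lines
        if fields[USER_AGENT].startswith("Wget"):
            timestamp = fields[0] + " " + fields[1]
            hits[fields[CLIENT_IP]].append((timestamp, fields[URI]))
    return dict(hits)


hits = wget_hits_by_ip(LOG_LINES)
for ip, requests in hits.items():
    print(ip, requests)
```

Run over a full log file, the same grouping would show whether 63.166.100.25 really is the only address that repeats.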

littleman

1:54 am on Jul 7, 2001 (gmt 0)



Have you been to [primary.seanmcpherson.com...]? It looks like a personal website for an ultrageek. Could it be that he took a liking to your site and is downloading the pages to read later?

When I was poking around, 209.114.200.14 (meehans2.airwire.net) was dead. I can't really link the two together. It could be that he uses the other address as a proxy.

toolman

2:06 am on Jul 7, 2001 (gmt 0)




I get hit by seanmcpherson too. Regularly. Can't figure out why.

dolphin

8:14 am on Jul 7, 2001 (gmt 0)




I still don't understand how and why they trigger two Wgets against my site at once, and why the first Wget is always from the same IP but the second IP is always different. Is there a way to ban IPs on an NT server?

icehousedesigns

2:44 pm on Jul 7, 2001 (gmt 0)



From my post below:

BTW the site I am referring to was seanmcpherson.com...I should have stated that.

OK, after contacting a guy who has a server running Wget (primary.seanmcpherson.com) (I ended up banning his whole domain via .htaccess), this is the e-mail I got back from him:


The machine is acting as a client for a distributed search engine, and
is crawling sites sent down to it from a central server. You can hit the
web site of the project at www.grub.org, and probably submit your URL to
be placed on a "Do not crawl" list *grin* You might want to email Kord
(the head of the project) with any suggestions as to throttling and
such.

Send an e-mail to support@grub.org and complain. I did. I haven't gotten a response back yet, but I did :)
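
Since the e-mail describes a distributed crawler, banning one address may not stop requests that arrive from other clients, which would fit the second IP always being different. Here is a minimal Python sketch of the blocking decision itself. It is not IIS or .htaccess configuration, just the logic: block if the address is on a list, or if the user agent starts with "Wget". The names `BLOCKED_NETWORKS`, `BLOCKED_AGENT_PREFIXES` and `is_blocked` are hypothetical, and the only values used are the ones from the log at the top of the thread.

```python
import ipaddress

# Hypothetical block lists, filled with values from the quoted log lines.
BLOCKED_NETWORKS = [ipaddress.ip_network("63.166.100.25/32")]
BLOCKED_AGENT_PREFIXES = ("Wget",)


def is_blocked(client_ip, user_agent):
    """Return True if the request matches a blocked address or user agent."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in network for network in BLOCKED_NETWORKS):
        return True
    # The user agent check catches clients that come from unlisted addresses.
    return user_agent.startswith(BLOCKED_AGENT_PREFIXES)


print(is_blocked("63.166.100.25", "Wget/1.6"))
print(is_blocked("209.114.200.145", "Wget/1.6"))
print(is_blocked("209.114.200.145", "Mozilla/4.0"))
```

Keep in mind that a client can send whatever user agent string it likes, so a check like this only catches crawlers that identify themselves honestly.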

theperlyking

3:20 pm on Jul 7, 2001 (gmt 0)




Sad to see grub cropping up with such regularity at WmW, and that they don't seem to care too much what webmasters think. They should be careful that they don't start to fall into the same category as emailsiphon, etc. and get banned by many sites.

It's also hard to believe they are going to make a success of this if they aren't interested in producing nicely behaved software.

littleman

8:01 pm on Jul 7, 2001 (gmt 0)



It is interesting that GRUB is a commercial project based on GNU software.

theperlyking

8:44 pm on Jul 7, 2001 (gmt 0)




Hm, didn't know that. Doesn't sound like a recipe for success to me. How many of the kind of geeks who will run this will appreciate their (free) work going into a commercial product?

icehousedesigns

11:46 pm on Jul 7, 2001 (gmt 0)



In my book they fell into that dreaded category long ago, along with emailsiphon, webzip, etc. Now it is their job to dig out.