So, since I liked the site, I offered to farm out a lot of the images onto a server of mine, and everything was fine.
Then my friend's site went down again, and it turned out an abusive user had turned some sitegrabber (using a legit browser u_a, which of course is trivial to do) on us.
I was wondering if I could modify one of those scripts used to catch abusive users and instead of blocking the user (wouldn't help) send the user a 301 to a page which explains our policy (heh) on abusive users. If the page 301'd to had links off (which of course would just 301 back there) it would trap the sitegrabber for a while, at least, putting less load on the Geocities part of the site. (And when they viewed the site offline later they could see a page telling them what a bad person they are.)
Any chance? And is there anything I can do for the Geocities site, aside from moving it to another server? Somehow I doubt Geocities lets you upload your own custom .htaccess file. It would probably reject it as an invalid file name. (Well, if they're smart.) At any rate .htaccess alone can't save you from spoofed u_a's.
I'm not sure I want to hose this guy with something like blackflag for sitegrabbers (heh) but it would help if I could slow a sitegrabber down long enough so that the Geocities site isn't knocked offline.
I don't know if that's possible. For one thing, I guess I don't understand so well how sitegrabbers work. I know wget would probably keep going if it got 301's and come back to them later -- annoying to the user, but wouldn't solve the problem. Hmm.
(The page on the Geocities site has links to about 100 images, and about 50 of them are on my server. If you were running recursive wget, it would probably try every single link on that page before it tried any link beyond.)
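For what it's worth, the redirect idea above could look roughly like the sketch below on a server where you can run your own code (so not on Geocities itself). It counts requests per IP address rather than trusting the u_a, since the u_a is spoofed anyway. The class name, the policy URL, and the thresholds are all made up for illustration, and a real sitegrabber may or may not bother to follow the 301.

```python
import time
from collections import defaultdict, deque

# Hypothetical values, chosen only for illustration.
POLICY_URL = "/abuse-policy.html"
MAX_HITS = 20          # requests allowed per window
WINDOW_SECONDS = 10.0  # length of the sliding window


class GrabberDetector:
    """Flags clients by request rate per IP, ignoring the user agent."""

    def __init__(self, max_hits=MAX_HITS, window=WINDOW_SECONDS):
        self.max_hits = max_hits
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def respond(self, ip, path, now=None):
        """Return (status, headers) for one request."""
        now = time.time() if now is None else now
        recent = self.hits[ip]
        recent.append(now)
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        # Too fast: send them to the policy page. The policy page itself
        # is always served, otherwise the redirect would loop forever.
        if len(recent) > self.max_hits and path != POLICY_URL:
            return 301, {"Location": POLICY_URL}
        return 200, {}
```

The point of keying on IP and rate is that it still works when the grabber claims to be an ordinary browser. It only protects the server running it, though, so it would cut the load from the images hosted there but not the hits on the Geocities pages.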
I doubt wget will loop back from one page to another and back to the same.
I don't like the sleep idea. Many site grabbers pull multiple pages a second and won't wait for one page to finish loading before making a new request. And if I remember correctly, sleeper scripts will cause delays on all CGI processes running on the server, which would affect the visitors on your site as well.
I would block the abusive member from grabbing the images from your server. Tell your friend to place a hidden link on each one of his pages that points to a trap script on your server and to contact Geocities about blocking the abusive visitor from his site.
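A rough sketch of what such a trap script does, assuming the hidden link points somewhere under a path like /trap/ on your server (the path and the class name here are hypothetical). A human visitor never sees the link, so anything that requests it is treated as a grabber and refused from then on. You would probably also want to disallow the trap path in robots.txt so that well-behaved search engine spiders stay out of it.

```python
class TrapBlocklist:
    """Blocks any IP address that has ever requested the trap path."""

    def __init__(self, trap_path="/trap/"):
        self.trap_path = trap_path  # hypothetical location of the trap script
        self.blocked = set()        # IPs that followed the hidden link

    def handle(self, ip, path):
        """Return the HTTP status to send for this request."""
        if path.startswith(self.trap_path):
            # Only something crawling every link ends up here.
            self.blocked.add(ip)
            return 403
        if ip in self.blocked:
            return 403  # already caught; refuse images and pages alike
        return 200
```

This keeps the blocklist in memory, so it resets whenever the script restarts. A real setup would write the caught addresses to a file or to the server's deny rules.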
Do you think the abusive user is a competitor intentionally draining the bandwidth of the Geocities account?
Nope, just an idiot, the apprentice using a wizard's tool ineptly.
If I thought it was malicious--whoa, we'd be in another ballgame altogether. (But that would be fine. I'd just out the malingerer to everyone within the little hobby widgets community and they'd be stung for life. Long memories there.)