robot attack

         

J64sqs

9:32 pm on Sep 25, 2004 (gmt 0)

10+ Year Member



I think a bad robot got my site a couple days ago.

Looking at my stats (with webalizer 2.01), I see that one particular IP address created 12,000 hits in just 3 visits on 1 day. My site usually has fewer than 400 hits daily, so this really increased my bandwidth this month. And it looks like the robot went back and forth between only 6 or so pages. All the pages it hit were part of the message board on my site and were directly requested.

So, is there anything I can do about this? Or can I do anything to prevent this in the future? Is there an easy answer to what such robots are trying to do?

Thanks in advance.

tacheman

3:21 pm on Sep 28, 2004 (gmt 0)

10+ Year Member



Try blocking it with a robots.txt file. If that doesn't work, ban its IP address using your hosting control panel.

J64sqs

8:24 pm on Sep 28, 2004 (gmt 0)

10+ Year Member



Can you tell me more about robots.txt files? How do I create one, where do I put it, and what do I put in it?

fiestagirl

8:36 pm on Sep 28, 2004 (gmt 0)

10+ Year Member



A place to start:

[robotstxt.org...]
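To make that concrete, here is a minimal sketch rather than a definitive setup. robots.txt is a plain text file that lives at the top level of the site (so it is served as /robots.txt), and only robots that choose to honor it will obey it. The robot name "BadBot" and the /forum/ path below are made-up placeholders, not taken from the original poster's logs. The Python part just uses the standard library's urllib.robotparser to check how a well-behaved robot would read the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "BadBot" and "/forum/" are
# placeholder names chosen for illustration only.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /forum/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Return True if a compliant robot named `agent` may fetch `path`."""
    return parser.can_fetch(agent, path)


# A robot identifying itself as BadBot is asked to stay away entirely.
print(allowed("BadBot", "/index.html"))
# Every other robot is asked to skip the message board only.
print(allowed("SomeOtherBot", "/forum/index.php"))
print(allowed("SomeOtherBot", "/index.html"))
```

Keep in mind this only describes what a polite robot would do. A robot that ignores robots.txt has to be blocked some other way, such as by IP address.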

Matt Probert

3:48 pm on Sep 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Are you now clear on "robots" and robots.txt?

Most robots are actually search engine spiders indexing your site for their search engine, which is often a desirable occurrence.

However, there are malicious robots. Some are used to download an entire site for offline reading, and others to cost you money by exceeding your transfer allowance or to carry out DoS attacks.

Do you see an identifying "user agent" string in the log files for the robot in question?
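If the stats package does not show it, the raw access log usually does. The following is a rough sketch that assumes the server writes the common Apache "combined" log format; the sample lines, IP addresses, and the "ExampleBot/1.0" name are invented for illustration. It counts hits per IP address and user agent pair so the heaviest visitor stands out.

```python
import re
from collections import Counter

# Assumed layout (Apache combined log format):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def tally_hits(lines):
    """Count hits for each (ip, user agent) pair; skip unparseable lines."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            hits[(match.group("ip"), match.group("agent"))] += 1
    return hits


# Invented sample data, using documentation-only IP ranges.
SAMPLE = [
    '192.0.2.10 - - [23/Sep/2004:10:00:01 +0000] "GET /forum/index.php HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [23/Sep/2004:10:00:02 +0000] "GET /forum/view.php HTTP/1.1" 200 4096 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [23/Sep/2004:10:05:00 +0000] "GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/4.0"',
]

# Print the busiest visitors first.
for (ip, agent), count in tally_hits(SAMPLE).most_common():
    print(count, ip, agent)
```

To run it against a real log, pass an open file object to tally_hits instead of SAMPLE. The file's location depends on the host.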

Matt

J64sqs

9:17 pm on Sep 29, 2004 (gmt 0)

10+ Year Member



Thanks for your replies. I'm starting to get a better understanding of all this, but I still have some questions...

What does it mean to "index" a website? And how is it possible for a search engine robot to find my site since I haven't submitted my site to any search engines?

Also, what is a "user agent string"?

richlowe

10:27 pm on Sep 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



And how is it possible for a search engine robot to find my site since I haven't submitted my site to any search engines?

Search engines look for links. If you've been linked to, you'll be found.

what is a "user agent string"?

It is a piece of text that identifies the program (spider, browser, or whatever) that fetches a page or object from your site. Thus, Google's spider might have its user agent set to "Googlebot". You can examine your server logs to see what is accessing your pages. Note that the user agent is handed to the server by the program itself and thus is not necessarily accurate. For example, a home-grown spider could claim to be Googlebot just by setting the appropriate user agent.
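As a small illustration of why the user agent cannot be trusted, here is a sketch using Python's standard urllib. The URL and the "Googlebot/2.1" string are placeholders, and the request is only built, never sent. It shows that the client simply writes whatever value it likes into the header.

```python
from urllib.request import Request


def build_request(url, user_agent):
    """Build (but do not send) a request carrying a chosen user agent."""
    return Request(url, headers={"User-Agent": user_agent})


# Any program can label itself however it wants; nothing verifies this.
req = build_request("http://example.com/", "Googlebot/2.1")

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

This is why a user agent in the logs is a hint about who visited, not proof.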