Why are the hooligans inserting gibberish into my forms?

limoshawn

2:52 pm on Mar 11, 2008 (gmt 0)

Are they trying to build a bot to insert spam, or is the gibberish the end result? My first thought is that they are trying to reverse engineer a bot, but I'm not sure. I've been doing the IP-blocking dance: as soon as I get a gibberish-filled form submission, I block that group of IPs.

physics

4:28 pm on Mar 11, 2008 (gmt 0)

Is the gibberish all alphanumeric, or are there other characters like ' \ |, etc.?

limoshawn

9:57 pm on Mar 11, 2008 (gmt 0)

There doesn't appear to be any code being injected. It simply looks like someone typing random keys just to make sure each field has an entry, like: jkdslhflhoi

thanks

physics

10:15 pm on Mar 11, 2008 (gmt 0)

Are they linking anywhere?

limoshawn

10:42 pm on Mar 11, 2008 (gmt 0)

Sometimes they put fake links in:

<a href="http://khfjklashfkhskjh">hfdslflei</a>

which is one of the reasons I think they are trying to create a bot for spamming purposes. They seem to be trying to find out if they can get a good link.

physics

11:26 pm on Mar 11, 2008 (gmt 0)

limoshawn, you're probably right on that one.
Besides blocking, you can reduce problems from this with a CAPTCHA, or even with a simple field that says something like:

Type the word hi into the box on the right:

And then check if the word hi exists in the submitted form data.
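A minimal sketch of that check in Python, assuming the submitted form arrives as a plain dict. The field name `challenge` and the function name `passes_challenge` are made up for illustration; use whatever your form processor actually calls them.

```python
def passes_challenge(form, field="challenge", expected="hi"):
    """Return True if the visitor typed the expected word into the challenge box.

    `form` is assumed to be a dict of submitted field names to string values.
    """
    # A missing field counts as a failure; ignore case and stray whitespace.
    answer = form.get(field, "")
    return answer.strip().lower() == expected


# Reject the submission before doing anything else with it.
submission = {"name": "jkdslhflhoi", "challenge": "hfdslflei"}
if not passes_challenge(submission):
    print("challenge failed, submission dropped")
```

This only stops bots that fill every field with random text. A bot written for your form specifically can read the prompt, so treat it as a cheap first filter rather than a replacement for a CAPTCHA.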

limoshawn

1:17 am on Mar 12, 2008 (gmt 0)

So by suggesting a CAPTCHA, do you think that this is already a bot? I guess it could be one that searches for web forms to insert links into. It runs around the net inserting gobbledygook into forms until one gives them a good link.

rocknbil

5:07 pm on Mar 13, 2008 (gmt 0)

This is how it "starts." Someone may manually or with a program throw data at your forms. If they get a response, they'll leave you alone for a while, then in a month or two their bot will come back and start dumping link spam on a regular basis.

The other scenario I've seen is an obvious bot "feeling out" the script. That is, on the first hit it throws some data at it and gets an invalid data error (a missing required field, for example). It hits again, populating that field. Through trial and error, it figures out which fields are email addresses and, worst of all, which fields get injected directly into a mail header. The "subject" line is an Achilles heel in this respect - if the bot can inject into any of the mail headers, it can add its own BCC field and email a few thousand addresses at a clip, and you'll never know - you only get one message. To clarify, "we don't have a BCC field in our mail program" won't help you. They attempt to inject a newline and their own BCC, or even a multipart mail header.
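Since the injection depends on getting a line break into a header, the simplest defense is to refuse any header-bound value that contains one. A rough sketch in Python; `has_header_injection` and `safe_header` are hypothetical helper names, not part of any mail library, and this assumes the values have already been URL-decoded by your form parser.

```python
def has_header_injection(value):
    """Return True if the value contains a carriage return or line feed.

    Either character lets the sender start a new header line
    (for example an extra "Bcc:") inside Subject, From, or Reply-To.
    """
    return "\r" in value or "\n" in value


def safe_header(value):
    """Return the trimmed value, or raise ValueError if it contains a line break."""
    if has_header_injection(value):
        raise ValueError("line break in header value")
    return value.strip()


# The kind of subject line an injecting bot submits.
attack = "Hello\r\nBcc: victim1@example.com, victim2@example.com"
print(has_header_injection(attack))
```

Run every field that ends up in a mail header through a check like this, not just the subject. Rejecting the whole submission is safer than stripping the line breaks and sending it anyway.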

Reviewing the log file for this "transaction" shows that the entire process consists of 10 to 20 hits in under a minute: an obvious bot.
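If you want to spot that pattern without reading the log by eye, something along these lines would flag it. This is only a sketch: it assumes you have already parsed your log into (ip, timestamp) pairs, and the name `find_bursts` and the default of 10 hits in 60 seconds are my own choices based on the numbers above.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def find_bursts(hits, threshold=10, window=timedelta(seconds=60)):
    """Return the set of IPs with at least `threshold` hits inside any `window`.

    `hits` is an iterable of (ip, datetime) pairs in any order.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ip, when in sorted(hits, key=lambda h: h[1]):
        q = recent[ip]
        q.append(when)
        # Drop timestamps that fell out of the window ending at `when`.
        while when - q[0] > window:
            q.popleft()
        if len(q) >= threshold:
            flagged.add(ip)
    return flagged


start = datetime(2008, 3, 13, 17, 0, 0)
# Twelve hits three seconds apart from one address, as a bot would do.
bot_hits = [("203.0.113.5", start + timedelta(seconds=3 * i)) for i in range(12)]
print(find_bursts(bot_hits))
```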

You can see this kind of stuff by logging all raw input from your forms. Not server or mail logs; these (to me) are often cryptic and tell only part of the story. Add a routine to your processor that writes all input data to a file in a safe location, with a time stamp on each entry. Review it often; you'll be surprised at what you find.
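Such a routine can be very small. Here is one way it might look in Python, assuming the form data is a dict; the function name `log_submission`, the file name, and the one-JSON-object-per-line layout are illustrative choices, not what any particular form processor does.

```python
import json
from datetime import datetime, timezone


def log_submission(form, log_path, remote_addr="unknown"):
    """Append one time-stamped JSON line holding the raw form input."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "ip": remote_addr,
        "form": dict(form),
    }
    # json.dumps escapes line breaks, so an injected newline shows up as \n
    # in the log and cannot split an entry across lines.
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


# Call this first thing in the processor, before any validation:
# log_submission(form_data, "/path/outside/webroot/form_input.log", visitor_ip)
```

Keep the file outside the web root so it cannot be fetched with a browser, and log before validating so that rejected attempts are recorded too.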