Forum Moderators: phranque

Creative new search form spam attack

redirecting google to do the spamming

10:45 am on Dec 21, 2005 (gmt 0)

Senior Member from MT 

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 1, 2003
votes: 0

Recently I started seeing log hits from Google that included strange HTML codes and URLs similar to this:


After some checking I realised the spamming site must have learned that the search form on my site doesn't filter out HTML codes. It then apparently set up links to my search form that would cause Google to check it out, turning the phrase:

We didn't find "search keyword"

Into a link to their spam site.

Unfortunately I no longer have access to the source code of this script, and until it can be replaced I am considering a mod_rewrite solution.

Current candidates include:

- check for the spammer's domain and rewrite all URLs containing it to 404 (how?)
- check for "<" and ">" and rewrite them to entity references.
- check for URL elements such as http or html and remove those.

What would you do?
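
The first two candidates could be sketched in mod_rewrite roughly as follows. This is an illustration under stated assumptions, not a tested configuration: `spammerdomain\.com` is a placeholder for the real domain, and the rules refuse the request outright instead of rewriting it. `[G]` answers 410 Gone, the closest built-in flag to the 404 asked about; `[F]` would answer 403 Forbidden.

```apache
# Sketch only: spammerdomain\.com stands in for the real spam domain.
RewriteEngine On

# THE_REQUEST is the raw request line ("GET /path?query HTTP/1.1"),
# so this one condition covers both the path and the query string.
RewriteCond %{THE_REQUEST} spammerdomain\.com [NC]
# "-" means no substitution; [G] answers 410 Gone.
RewriteRule ^ - [G]

# The rule pattern is matched against the URL-decoded path, so "<" and ">"
# are caught whether they arrive raw or as %3C / %3E.
# This rule looks at the path only, not the query string.
RewriteRule [<>] - [F]
```

Refusing the request avoids having to build a safe replacement URL, at the cost of turning away any legitimate search that happens to contain an angle bracket.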


12:15 pm on Dec 21, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 4, 2002
votes: 0

I ensure that *anything* I display that comes from an untrusted source has "<" and ">" encoded.

Untrusted sources include:
-- webpage forms
-- emails
-- other people's databases
-- etc

1:52 am on Dec 22, 2005 (gmt 0)

Senior Member from MT 

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 1, 2003
votes: 0

That's obvious... now ;) My problem is how to deal with legacy code. This stuff is several years old, with the last recompile over 3 years ago, but some of the code is 5 years old or more.

The mod_rewrite rule seems to have worked in the end.

Another complicating factor is that this search page accepts searches via either the path or a query string. So it can be called as "/search/keywords" or as "/search?q=keywords". My mod_rewrite rules fix the first, but not the second. Can I rewrite query strings in mod_rewrite?


2:28 am on Dec 22, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
votes: 0

Yes, using RewriteCond %{QUERY_STRING} to test and create back-references to the query string parts.
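
A minimal sketch of that approach, assuming the search script is reached at `/search` with its terms in the query string, as described above. It refuses the request instead of rewriting it, and has not been tested against the site in question.

```apache
RewriteEngine On

# QUERY_STRING is not URL-decoded, so look for both raw and
# percent-encoded angle brackets. The parentheses matter: a CondPattern
# that starts with "<" or ">" is read as a string comparison, not a regex.
# Anything captured in parentheses here is available as %1, %2, ...
# in the RewriteRule substitution.
RewriteCond %{QUERY_STRING} (<|>|%3C|%3E) [NC]
# "/?" lets the pattern work in both server config and .htaccess context.
RewriteRule ^/?search$ - [F]
```

The RewriteRule pattern itself never sees the query string, which is why the test has to live in a RewriteCond.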


7:21 am on Dec 22, 2005 (gmt 0)

Junior Member

joined:Mar 13, 2005
votes: 0

Hold on, if the spammer has used links to YOUR site to get Google to check things out, then I cannot see how this benefits the spammer, though it could possibly hurt you.

Yes, I understand Google creates a link to their site, but that's only after YOU back-check things; it doesn't do a thing for the spammer that I know of, nor does it help them in any other way, except that Google files, someplace, a few error links... So much for blackhat techniques, how stupid. This really is no more effective than referral spam, it's just plain dumb.

Here is the only part that bothers me:
If an SEO creates deceptive or misleading content on your behalf, such as doorway pages or "throwaway" domains, your site could be removed entirely from Google's index.
Might this affect you?
You did not hire them, but...
I don't know...

To be on the safe side, you may wish to file a report:

7:43 am on Dec 22, 2005 (gmt 0)

Preferred Member from ES 

10+ Year Member

joined:Nov 13, 2005
votes: 0

The only way I can understand it is:
a) the spammer searches the web looking for redirection scripts (redirect.cgi, redir.php and so on)
b) he sends a bot to make a few hits every day
c) he links to the stats page of this script
d) he waits for Google to index the stats page and count a link to his site

It's the same scheme as referral spam (at least as far as I can see).

But... what if my stats pages are password protected?

10:44 am on Dec 22, 2005 (gmt 0)

Senior Member from MT 

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 1, 2003
votes: 0

Ok, let me re-explain, perhaps I wasn't clear before. This is NOT referral spam. I get NO accesses from any bot or entity connected to the spammer.

This site is a large business directory, running on old scripts. One of these scripts, a search, does not escape incoming search queries. It also repeats the query on the "entry not found" error page, which returns a 200 HTTP code.

Now basically the spammer must have created links to this site (I only know indirectly from Google's indexing behavior) in this form:

[mydomain.com...] href=http://spammerdomain.com/pron-keyword.html>the</a>

This causes my search script to not find anything, and report that it hasn't found any results for the keyword "the", where the word is hyperlinked to his site.

The purpose of this spam operation is to create backlinks to his site from this one (which is a strong PR6 with tens of thousands of pages with high PR).

I have now solved the issue by using mod_rewrite to encode the "<" and ">" characters and by deleting all occurrences of "spamdomain.com" from incoming URL strings.

The spammer must have investigated this site and deliberately targeted it after finding out how the script works.

Also, Googlebot hit it with thousands of different porn keywords as target pages, so it was a wide attack.


11:33 am on Dec 22, 2005 (gmt 0)

Senior Member from MT 

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 1, 2003
votes: 0

Another thing, how would I file a spam report? This is not connected to a query, just planned, intentional spam. The spam site itself appears to be a realty site, while it tries to push thousands of porn pages on the same domain into Google. If this has just started they will not show yet. But the negative association might stick, i.e. it might get my site associated with porn and perhaps even excluded from filtered results and so on.

Would the spam report still be a good place to report this? And what would I use to fill in the query-related items?