

Webmaster General Forum

    
Creative new search form spam attack
redirecting google to do the spamming
killroy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 10538 posted 10:45 am on Dec 21, 2005 (gmt 0)

Recently I started seeing log hits from Google that included strange HTML codes and URLs similar to this:

http://example.com/porn-keyword.html

After some checking I realised the spamming site must have learned that the search form on my site doesn't filter out HTML codes. It then apparently set up links to my search form that would cause Google to check it out, turning the phrase:

We didn't find "search keyword"

Into a link to their spam site.

Unfortunately I no longer have access to the source code of this script, and until it can be replaced I am considering a mod_rewrite solution.

Current candidates include:

- check for the spammer's domain and rewrite all URLs containing it to 404 (how?)
- check for "<" and ">" and rewrite them to entity references.
- check for URL elements such as http or html and remove those.

What would you do?

SN

 

victor

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 10538 posted 12:15 pm on Dec 21, 2005 (gmt 0)

I ensure that *anything* I display that comes from an untrusted source has "<" and ">" encoded.

Untrusted sources include:
-- webpage forms
-- emails
-- other people's databases
-- etc
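
As an illustration of the encoding step described above, here is a minimal Python sketch (not the poster's code, and not the language of the legacy script). The function name `not_found_message` and the hostile query are made up for the example; the "We didn't find" wording is taken from the first post.

```python
from html import escape


def not_found_message(query: str) -> str:
    """Build the 'no results' line with the untrusted query HTML-escaped."""
    # escape() converts &, < and > (and, by default, quotes) to entity
    # references, so markup in the query is shown as text, not rendered.
    return 'We didn\'t find "' + escape(query) + '"'


# Hypothetical hostile query of the kind described in this thread.
hostile = "<a href=http://spammerdomain.com/page.html>the</a>"
print(not_found_message(hostile))
```

With the query escaped, the page still repeats what was searched for, but the anchor tag arrives in the browser (and in a crawler) as plain text rather than as a link.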

killroy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 10538 posted 1:52 am on Dec 22, 2005 (gmt 0)

That's obvious... now ;) My problem refers to how to deal with legacy code. This stuff is several years old, with the last recompile over 3 years ago, but some of the code is 5 years and older.

The mod_rewrite rule seems to have worked in the end.

Another complicating factor is that this search page can accept searches both via the path and via a query string. So it can be called as "/search/keywords" or as "/search?q=keywords". My mod_rewrite rules fix the first, but not the second. Can I rewrite query strings in mod_rewrite?

SN

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 10538 posted 2:28 am on Dec 22, 2005 (gmt 0)

Yes, using RewriteCond %{QUERY_STRING} to test and create back-references to the query string parts.

Jim
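
The kind of test a query-string condition would perform can be sketched in Python. This is only an illustration of the check, not a mod_rewrite rule and not anything posted in the thread; the names `MARKUP` and `query_has_markup` are hypothetical. It assumes the query string is examined in its raw, still percent-encoded form, so both the literal and the encoded spellings of the angle brackets are covered.

```python
import re

# Matches a literal "<" or ">" as well as the percent-encoded forms
# %3C and %3E (in either letter case).
MARKUP = re.compile(r"<|>|%3C|%3E", re.IGNORECASE)


def query_has_markup(raw_query: str) -> bool:
    """Return True if the raw query string carries angle brackets."""
    return MARKUP.search(raw_query) is not None


# Hypothetical query strings for illustration.
print(query_has_markup("q=%3Ca+href%3Dhttp://spammerdomain.com/%3Ethe%3C/a%3E"))
print(query_has_markup("q=plumbers+london"))
```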

topsites



 
Msg#: 10538 posted 7:21 am on Dec 22, 2005 (gmt 0)

Hold on, if the spammer has used links to YOUR site to get Google to check things out, then I cannot see how this benefits the spammer, though it could possibly hurt you.

Yes, I understand Google creates a link to their site, but that's only after YOU back-check things. It doesn't do a thing for the spammer that I know of, nor does it help them in any other way, except that Google files, someplace, a few error links... So much for blackhat techniques, how stupid. This really is no more effective than referral spam, it's just plain dumb.

Here is the only part that bothers me:
If an SEO creates deceptive or misleading content on your behalf, such as doorway pages or "throwaway" domains, your site could be removed entirely from Google's index.
Might this affect you?
You did not hire them, but...
I don't know...

To be on the safe side, you may wish to file a report:
[google.com...]

Lexur

5+ Year Member



 
Msg#: 10538 posted 7:43 am on Dec 22, 2005 (gmt 0)

The only way I can understand it is:
a) the spammer searches the web looking for redirection scripts (redirect.cgi, redir.php and so on)
b) he sends a bot to make a few hits every day
c) he links to the stats page of this script
d) he waits for Google to index the stats page and count a link to his site

It's the same scheme as referral spam (at least as far as I can see).

But... what if my stats pages are password protected?

killroy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 10538 posted 10:44 am on Dec 22, 2005 (gmt 0)

Ok, let me re-explain, perhaps I wasn't clear before. This is NOT referral spam. I get NO accesses from any bot or entity connected to the spammer.

This site is a large business directory, running on old scripts. One of these scripts, a search, does not escape incoming search queries. It also repeats the query on the "entry not found" error page, which returns a 200 HTTP status code.

Now basically the spammer must have created links to this site (I only know indirectly from Google's indexing behavior) in this form:

[mydomain.com...] href=http://spammerdomain.com/pron-keyword.html>the</a>

This causes my search script to not find anything, and report that it hasn't found any results for the keyword "the", where the word is hyperlinked to his site.

The targeted purpose of this spam operation is to create backlinks to his site from this one (which is a strong PR6 with tens of thousands of pages with high PR).

I have now solved the issue by using mod_rewrite to encode the "<" and ">" characters and by deleting all occurrences of "spamdomain.com" from incoming URL strings.

The spammer must have investigated this site and deliberately targeted it after finding out how the script works.

Also, Googlebot hit it with thousands of different porn keywords as target pages, so it was a wide attack.

SN
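
The two-part fix described in the post above (encode the angle brackets, delete the spam domain) can be sketched in Python for both ways the search page is called. This is an illustration of the logic only, not the mod_rewrite rules actually used, which the thread does not show. The function names `clean` and `search_keywords` are hypothetical; the "/search/" prefix, the `q` parameter and the "spamdomain.com" placeholder come from the posts.

```python
import re
from urllib.parse import parse_qs, unquote, urlsplit

SPAM_DOMAIN = "spamdomain.com"  # placeholder name taken from the post above


def clean(text: str) -> str:
    """Delete the spam domain, then encode angle brackets as entities."""
    text = re.sub(re.escape(SPAM_DOMAIN), "", text, flags=re.IGNORECASE)
    return text.replace("<", "&lt;").replace(">", "&gt;")


def search_keywords(url: str) -> str:
    """Return cleaned keywords from "/search/keywords" or "/search?q=keywords"."""
    parts = urlsplit(url)
    if parts.query:
        # Query-string form: parse_qs percent-decodes the values.
        keywords = parse_qs(parts.query).get("q", [""])[0]
    else:
        # Path form: everything after the "/search/" prefix, percent-decoded.
        keywords = unquote(parts.path[len("/search/"):])
    return clean(keywords)


# Hypothetical hostile requests in both forms.
print(search_keywords("/search/%3Ca%20href=http://spamdomain.com/x.html%3Ethe%3C/a%3E"))
print(search_keywords("/search?q=%3Ca+href%3Dhttp://spamdomain.com/x.html%3Ethe%3C/a%3E"))
```

Encoding the angle brackets is what stops the link from rendering; deleting the domain is an extra layer on top of that.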

killroy

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 10538 posted 11:33 am on Dec 22, 2005 (gmt 0)

Another thing, how would I file a spam report? This is not connected to a query, just a planned/intended spam. The spam site itself appears to be a realty site, while it tries to push thousands of porn pages on the same domain into Google. If this has just started they will not yet show. But the negative association might stick, i.e. it might get my site associated with porn and perhaps even excluded from filtered pages and so on.

Would the spam report still be a good place to report? And what would I use to fill in the query-related items?

SN

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved